You may also view a Postscript file of the abstracts of the latest papers or see a menu for all abstracts.

Abstract: We consider the problem of planning curvature constrained paths amidst polygonal obstacles, connecting given start and target configurations. Let the critical curvature Rc be the minimal curvature for which a constrained path exists. We describe an algorithm, which approximates the critical curvature and finds a corresponding path. Further, we give an efficient decision procedure to determine if there exists a path satisfying a given curvature constraint R, with running time polynomial in |R-Rc|/R.

Abstract: We have developed a simple multi function vision system for 3D data acquisition for a wide range of applications in robotics and automation. The system uses one CCD video camera and an active directed laser light source based on a direct drive spherical pointing motor (SPM). The anatomy of the system and algorithms used are described. System calibration methods and measurements of accuracy of the outputs are presented. A list of applications is shown.

Abstract: In this paper we describe our experience with a prototype system capable of synthesizing "Supervisor Controller Programs" based largely on the theory of discrete event systems (DES) first proposed by Ramadge and Wonham. We augment the theory by also allowing continuous time trajectories modeling transitions between events. We illustrate our approach by an example, - the discrete control of a walking machine - which poses some challenges on the applicability of the theory and finally, discuss some possible solutions.

Notes: Appeared in IEEE Proceedings of the Fourth International Conference on Computer Integrated Manufacturing and Automation Technology, Troy, NY, Oct. 1994

Abstract: In this paper, we address the problem of the synthesis of controller programs for a variety of robotics and manufacturing tasks. The problem we choose for test and illustrative purposes is the standard ``Walking Machine Problem,'' a representative instance of a real "hybrid" problem with both logical/discrete and continuous properties and strong mutual influence without any reasonable separation. We aim to produce a ``compiler technology'' for this class of problems in a manner analogous to the development of the so-called ``Silicon Compilers'' for the VLSI technology. To cope with the difficulties inherent to the problem, we resort to a novel approach that combines many key ideas from a variety of disciplines: namely, ``Discrete Event Supervisory Systems'', Petri Nets approaches and ``Temporal Logic''.

Notes: Will appear in the 1995 IEEE International Conference on Robotics and Automation, Nagoya, Japan

Abstract: Iterative methods are considered for a class of saddle point problems with a penalty term arising from finite element discretizations of certain elliptic problems. An optimal preconditioner which is independent of the and the penalty parameter is constructed. This approach is then used to design an iterative method with a convergence rate independent of the Lam\'{e} parameters occuring in the equations of linear elasticity.

Please see revised version tr683.

Abstract: The followiing estimate for the Rayleigh--Ritz method is proved: $$ | \tilde \lambda - \lambda | |( \tilde u , u )| \le { \| A \tilde u - \tilde \lambda \tilde u \| } \sin \angle \{ u ; \tilde U \}, \ \| u \| =1. $$ Here $A$ is a bounded self-adjoint operator in a real Hilbert/euclidian space, $\{ \lambda, u \}$ one of its eigenpairs, $\tilde U$ a trial subspace for the Rayleigh--Ritz method, and $\{ \tilde \lambda, \tilde u \}$ a Ritz pair. %$\| u \| = \| \tilde u \| = 1.$ This inequality makes it possible to analyze the fine structure of the error of the Rayleigh--Ritz method, in particular, it shows that $ |( \tilde u , u )| \le C \epsilon^2, $ if an eigenvector $u$ is close to the trial subspace with accuracy $\epsilon$ and a Ritz vector $\tilde u$ is an $\epsilon$ approximation to another eigenvector, with a different eigenvalue. Generalizations of the estimate to the cases of eigenspaces and invariant subspaces are suggested, and estimates of approximation of eigenspaces and invariant subspaces are proved.

Abstract: Beginning with digitized volumetric data, we wish to rapidly and efficiently extract and represent surfaces defined as isosurfaces in the interpolated data. The Marching Cubes algorithm is a standard approach to this problem. We instead perform a decomposition of each 8-cell associated with a voxel into five tetrahedra, as in the Payne-Toga algorithm. Following the ideas of Kalvin, Thirion et al., and using essentially the same algorithm as Doi and Koide, we guarantee the resulting surface representation to be closed and oriented, defined by a valid triangulation of the surface of the body, which in turn is presented as a collection of tetrahedra, some of which are only partly filled. The surface is extracted as a collection of closed triangles, where each triangle is an oriented closed curve contained within a single tetrahedron. The entire surface is ``wrapped'' by the collection of triangles, using especially efficient data structures. The representation is similar to the homology theory that uses simplices embedded in a manifold to define a closed curve within each tetrahedron.

From the triangles that comprise the wrapping of the surface, we give methods to evaluate surface curvatures and principal directions at each vertex, whenever these quantities are defined. We further present a fast method for rendering and approximating the surface. The triangles form a graph structure, which is very efficiently encoded, whose nodes are the triangles and whose edges are the common edges joining adjacent triangles. We can thus identify each surface using a connected component labelling algorithm applied to the graph.

This provides a highly parallelizable approach to boundary surface representation, providing an efficiently and compact surface representation. The wrapper algorithm has been used to extract surfaces of the cranium from CT-scans and cortical surfaces from MR-scans at full resolution.

Key words: B-rep, boundary representation, Marching Cubes, tetrahedral decomposition, homology theory, surface curvature.

Abstract: This is a summary of the NSF Workshop on Manufacturing and Computational Geometry, held at the Courant Institute of Mathematical Sciences, New York University, on April 1-2, 1994. The meeting brought together about 30 participants from both the manufacturing and the computational geometry communities for the purposes of discussing current trends in the two communities, identifying areas of mutual interest, and proposing future joint activities.

Abstract: In this paper, we discuss and compare various metrics for goodness of a grasp. We study the relations and trade-offs among the goodness of a grasp, geometry of the grasped object, number of fingers and the computational complexity of the grasp-synthesis algorithms. The results here employ the techniques from convexity theory first introduced by the author and his colleagues.

Abstract: Primal and dual nondegeneracy conditions are defined for semidefinite programming. Given the existence of primal and dual solutions, it is shown that primal nondegeneracy implies a unique dual solution and that dual nondegeneracy implies a unique primal solution. The converses hold if strict complementarity is assumed. Primal and dual nondegeneracy assumptions do not imply strict complementarity, as they do in LP. The primal and dual nondegeneracy assumptions imply a range of possible ranks for primal and dual solutions $X$ and $Z$. This is in contrast with LP where nondegeneracy assumptions exactly determine the number of variables which are zero. It is shown that primal and dual nondegeneracy and strict complementarity all hold generically. Numerical experiments suggest probability distributions for the ranks of $X$ and $Z$ which are consistent with the nondegeneracy conditions.

Abstract: We consider the problem of computing the collapse state in limit analysis for a solid with a quadratic yield condition, such as, for example, the Mises condition. After discretization with the finite element method, using divergence-free elements for the plastic flow, the kinematic formulation turns into the problem of minimizing a sum of Euclidean vector norms, subject to a single linear constraint. This is a nonsmooth minimization problem, since many of the norms in the sum may vanish at the optimal point. However, efficient solution algorithms for this particular convex optimization problem have recently been developed.

The method is applied to test problems in limit analysis in two different plane models: plane strain and plates. In the first case more than 80 percent of the terms in the sum are zero in the optimal solution, causing severe ill-conditioning. In the last case all terms are nonzero. In both cases the algorithm works very well, and we solve problems which are larger by at least an order of magnitude than previously reported. The relative accuracy for the discrete problems, measured by duality gap and feasibility, is typically of the order 1.0E-8. The discretization error, due to the finite grid, depends on the nature of the solution. In the applications reported here it ranges from 1.0E-5 to 1.0E-2.

Keywords: Limit analysis, plasticity, finite element method, nonsmooth optimization.

Abstract: Iterative methods are considered for saddle point problems with penalty term. A positive definite preconditioner is constructed and it is proved that the condition number of the preconditioned system can be made independent of the discretization and the penalty parameters. Examples include the pure displacement problem in linear elasticity, the Timoshenko beam, and the Mindlin-Reissner plate.

Key words: Saddle point problems, penalty term, nearly incompressible materials, Timoshenko, Mindlin-Reissner, preconditioned conjugate residual method, multilevel, domain decomposition.

Please note: This report is a revised version of tr676.

Abstract: A family of functions F that map [0,n]->[0,n], is said to be h-wise independent if any h points in [0,n] have an image, for randomly selected f in F, that is uniformly distributed. This paper gives both probabilistic and explicit randomized constructions of (n**epsilon)-wise independent functions, for epsilon<1, that can be evaluated in constant time for the standard random access model of computation. Simple extensions give comparable behavior for larger domains. As a consequence, many probabilistic algorithms can for the first time be shown to achieve their expected asymptotic performance for a feasible model of computation.

This paper also establishes a tight tradeoff in the number of random seeds that must be precomputed for a random function that runs in time T and is h-wise independent.

Abstract: Let X be a sum of real valued random variables and have a bounded mean E[X]. The generic Chernoff-Hoeffding estimate for large deviations of X is: P{X-E[X]>=a}<=min_{y>=0}exp(-y(a+E[X]))E[exp(y X)], which applies with a>=0 to random variables with very small tails. At issue is how to use this method to attain sharp and useful estimates. We present a number of Chernoff-Hoeffding bounds for sums of random variables that may have a variety of dependent relationships and that may be heterogeneously distributed.

Abstract: Universal hash functions that exhibit (c log n)-wise independence are shown to give a performance in double hashing and virtually any reasonable generalization of double hashing that has an expected probe count of 1/(1-alpha) + epsilon for the insertion of the (alpha n)-th item into a table of size n, for any fixed alpha < 1 and epsilon > 0. This performance is within epsilon of optimal. These results are derived from a novel formulation that overestimates the expected probe count by underestimating the presence of partial items already inserted into the hash table, and from a sharp analysis of the underlying stochastic structures formed by colliding items.

Abstract: Universal hash functions that exhibit (c log n)-wise independence are shown to give a performance in double hashing, uniform hashing and virtually anyreasonable generalization of double hashing that has an expected probe count of 1/(1-alpha)+O(1/n) for the insertion of the (alpha n)-th item into a table of size n, for any fixed alpha < 1. This performance is optimal. These results are derived from a novel formulation that overestimates the expected probe count by underestimating the presence of local items already inserted into the hash table, and from a very sharp analysis of the underlying stochasticstructures formed by colliding items.

Analogous bounds are attained for the expected r-th moment of the probe count, or any fixed r, and linear probing is also shown to achieve a performance with universal hash functions that is equivalent to the fully random case.

Abstract: We note first that the first degree entailment of {\L}ukasiewicz's 3-valued logic and a 3-valued logic that is extracted of Belnap's 4-valued logic is the same. Then, we give an axiomatization of that entailment as the calculus E_{fde} + A&-A->B\/-B, where E_{fde} is the first degree entailment of Anderson-Belnap's logic E of relevance and necessity.

Abstract: This is a continuing attempt in a series of papers [ knowledge.ps, inform.ps, frame.ps ] to show how computer-represented knowledge can be arranged as elements of an effectively represented semantic domain in the sense of [C.A.Gunter and D.S.Scott, Semantic Domains, in: J. van Leeuwen (ed.), Handbook of Theoretical Computer Science, Vol. B, pp. 635--674]. We present a direct deductive description of the domain, which was defined semantically in [ knowledge.ps ], via the Scott's notion of information system. Also, the internal structure of the continuous ampliative operations coordinated with the domain's effective basis is established. Though we always remain in the paradigm of the toleration of contradictory information described in [N.D.Belnap, A Useful Four-Valued Logic: How a Computer Should Think, in: A.R.Anderson, N.D.Belnap, and J.M.Dunn, Entailment: the Logic of Relevance and Necessity, Vol. 2, Princeton Univ. Press, 1992], the approach in question could be extended to include domains for consistency knowledge bases.

Abstract: The paper continues the theme of [ knowledge.ps ]. We differentiate between our approach to knowledge representation and that of others by expressing the following Working Hypothesis: Knowledge is a data type, and knowledge revision is accomplished by continuous operations on it, which are coordinated with its effective basis. Staying in the limits of Belnap's paradigm of the admittance of contradictory information into the computer's memory, our purpose in this paper is to reduce as much as possible all the computable processes needed for modifing the current state of the computer's knowledge and describe conditions for possible maneuvering. In particular, we solve some problems of decidability concerning operations on the minimal states, which are regarded as natural knowledge transformers. We show, also, how to express those operations in lattice theory terms that leads to the simplification of their computation on the lattice of minimal states. The problem of backtracking in the presented context is considered as well.

Abstract: The goal of this paper is twofold. First, it is to present a general scheme within which information is supposed to turn into the computer-represented knowledge and, second, to define two natural kinds of transfomers of this knowledge which this scheme thrusts us into considering.

Abstract: We treat knowledge from a computer-oriented point of view, considering it as a kind of data type. An axiomatic approach to the notion of data type undertaken by Dana Scott in [D.S.Scott, Outline of a Mathematical Theory of Computation, in: Proceedings of Princeton Conference on Information Science, 1971, pp. 169--176], is explored to find entities suitable for representation techniques. At the same time, we stay in Belnap's paradigm of the toleration of inconsistency. We propose a representation of knowledge (possible with contradictions) in simple propositional language, and we show how such knowledge can be maintained and how it should be transformed on receipt of new information. In this transformation, the key role is played by Scott's continuity rather than consistency.

Abstract: We propose to provide new mathematical foundations for the design of knowledge-based systems. The underlying idea is that the knowledge which the computer (``artificial agent'') operates with is considered as a kind of abstract data type. In this context, a relation of approximation arises in a natural way when one imagines the computer as operating in a changing information environment (``information flow''). This notion of pproximation can be studied using the techniques that have been developed for domain theory in the context of denotational semantics of programming languages.

Abstract: We share with some philosophers the view that a state of knowledge, being a part of the real world, can bring contradiction into it. Such an ontological reading of knowledge is very important when one deals with information knowledge, which arises as the content of the computer's memory when the computer is placed into changeable information environment ("information flow"), and "must" be able to tolerate any (not excluding contradictions) from the computer's users. Continuing research begun in [KM 93], we consider in length one kind of Scott-continuous operations introduced there. Each such operation [A->B](x), where A and B are formulas in a propositional language, called a rule, moves the computer to a "minimal" state of knowledge, in which B is true, if in a current state A is true. Note that the notion of rule is used here in an information-transforming sense, rather than in the ordinary truth-sound sense. We distinguish between global and local rules and show that these notions are decidable. Also, we define a modal epistemic logic as a tool for the prediction of possible evolution of the system's knowledge and establish decidability of this logic.

Abstract: We consider the Dirichlet problem for the Schrodinger operator in a half-space with boundary data having an arbitrary growth at infinity. A solution is constructed as the generalized Poisson integral. Uniqueness of the solution is investigated too.

Abstract: The goal of this paper is twofold. First, it is to present a general scheme within which information is supposed to turn into the computer-represented knowledge and, second, to define two natural kinds of transfomers of this knowledge which this scheme thrusts us into considering.

Abstract: We propose to provide new mathematical foundations for the design of knowledge-based systems. The underlying idea is that the knowledge which the computer (``artificial agent'') operates with is considered as a kind of abstract data type. In this context, a relation of approximation arises in a natural way when one imagines the computer as operating in a changing information environment (``information flow''). This notion of approximation can be studied using the techniques that have been developed for domain theory in the context of denotational semantics of programming languages.

Abstract: In this paper, we describe a novel framework for expressing an optimization problem that simultaneously addresses instruction scheduling and register allocation, referred to as CRISP. By modeling spill-costs directly in the scheduling problem, CRISP permits the design of algorithms whose objective is to exploit the available instruction level parallelism --- the traditional goal of instruction scheduling --- while lowering the cost of register spilling at the same time. Currently, these optimizations are performed in separate phases and interact in ways that are not characterized very well, leading to phase-ordering problems. We also develop a fast heuristic in this paper for solving this combined optimization in the context of basic-blocks; our algorithm runs in time O ( E N) where the basic block of N instructions has E edges; this time includes all preprocessing costs. In comparison to conventional phase-ordered approaches, our combined heuristic performed well in experimental evaluations on basic-blocks with sizes in the range 5 to 100. We also present rigorous characterizations of the inherent difficulty of solving optimization problems in our CRISP framework, as well as in classical frameworks. A surprising outcome of this work is that problems expressed in CRISP are provably easier to solve, than say graph coloring --- graph coloring is the classical approach to expressing just one of the two phases of interest, namely register allocation. By eliminating the phase-ordering problem, CRISP lowers the overall complexity of the software engineering effort involved. This is because optimizers designed based on our unified approach will be relatively ``lightweight,'' when compared to those that have to cope with phase-ordering. This has positive implications both for the duration of the design cycles, as well as the concomitant costs of designing low-level optimizations in modern compilers.

Abstract: The RISC revolution has spurred the development of processors with increasing levels of instruction level parallelism (ILP). In order to realize the full potential of these processors, multiple instructions must be issued and executed in a single cycle. Consequently, instruction scheduling plays a crucial role as an optimization in this context. While early attempts at instruction scheduling were limited to compile-time approaches, the recent trend is to provide dynamic support in hardware. In this paper, we present the results of a detailed comparative study of the performance advantages to be derived by the spectrum of instruction scheduling approaches: from limited basic-block schedulers in the compiler, to novel and aggressive run-time schedulers in hardware. A significant portion of our experimental study via simulations, is devoted to understanding the performance advantages of run-time scheduling. Our results indicate it to be effective in extracting the ILP inherent to the program trace being scheduled, over a wide range of machine and program parameters. Furthermore, we also show that this effectiveness can be further enhanced by a simple basic-block scheduler in the compiler, which optimizes for the presence of the run-time scheduler in the target; current basic-block schedulers are not designed to take advantage of this feature. We demonstrate this fact by presenting a novel enhanced basic-block scheduler in this paper. Finally, we outline a simple analytical characterization of the performance advantage, that run-time schedulers have to offer.

Key words: Compile-time Optimizations, Dynamic Schedulers, Instruction Scheduling, Program Traces, Scope, Superscalar Processors

Abstract: A special case of the Dynamic Finger Conjecture is proved; this special case introduces a number of useful techniques.

Abstract: The following result is shown: On an n-node splay tree, the amortized cost of an access at distance d from the preceding access is O(log (d+1)). In addition, there is an O(n) initialization cost. The accesses include searches, insertions and deletions.

Abstract: We consider the issue of shape approximation in kinematic mechanical systems; that is, systems of rigid solid objects whose behavior can be characterized entirely in terms of the constraints that each object moves rigidly and that no two objects overlap, without considering masses or forces. The general question we address is the following: Suppose we have calculated the behavior of some kinematic system using ideal descriptions of the shapes of the objects involved. Does it then follow that a real mechanism, in which the shape of the objects approximates this ideal will have a similar behavior? In addressing this question, we present various possible definitions of what it means (a) for one shape to approximate another and (b) for the behavior of one mechanism to be similarto the behavior of another. We characterize the behavioral properties of a kinematic system in terms of its configuration space; that is, the set of physically feasible positions and orientations of the objects. We prove several existential theorems that guarantee that a sufficiently precise approximation of shape preserves significant properties of configuration space. In particular, we show that It is often possible to guarantee that the configuration space of system A is close to that of system B in terms of metric criteria by requiring that the shapes of A closely approximate those of B in terms of the dual-Hausdorff distance. It is often possible to guarantee further that the configuration space of A is topologically similar to that of B by requiring that the surface normals are close at corresponding boundary points of A and B.

Abstract: Domain decomposition preconditioners for high-order Galerkin methods in two dimensions are often built from modules associated with the decomposition of the discrete space into subspaces of functions related to the interior of elements, individual edges, and vertices. The restriction of the original bilinear form to a particular subspace gives rise to a diagonal block of the preconditioner, and the action of its inverse on a vector has to be evaluated in each iteration. Each block can be replaced by a preconditioner in order to decrease the cost. Knowledge of the quality of this local preconditioner can be used directly in a study of the convergence rate of the overall iterative process.

The Schur complement of an edge with respect to the variables interior to two adjacent elements is considered. The assembly and factorization of this block matrix are potentially expensive, especially for polynomials of high degree. It is demonstrated that the diagonal preconditioner of one such block has a condition number that increases approximately linearly with the degree of the polynomials. Numerical results demonstrate that the actual condition numbers are relatively small.

Abstract: The spectral element method is used to discretize self-adjoint elliptic equations in three dimensional domains. The domain is decomposed into hexahedral elements, and in each of the elements the discretization space is the set of polynomials of degree $N$ in each variable. A conforming Galerkin formulation is used, the corresponding integrals are computed approximately with Gauss-Lobatto-Legendre (GLL) quadrature rules of order $N$, and a Lagrange interpolation basis associated with the GLL nodes is used. Fast methods are developed for solving the resulting linear system by the preconditioned conjugate gradient method. The conforming {\it finite element} space on the GLL mesh, consisting of piecewise $Q_{1}$ or $P_1$ functions, produces a stiffness matrix $K_h$ that is known to be spectrally equivalent to the spectral element stiffness matrix $K_N$. $K_h$ is replaced by a preconditioner $\tilde{K}_h$ which is well adapted to parallel computer architectures. The preconditioned operator is then $\tilde{K}_h^{-1} K_N$.

Our techniques for non-regular meshes make it possible to estimate the condition number of $\tilde{K}_h^{-1} K_N$, where $\tilde{K}_h$ is a standard finite element preconditioner of $K_h$, based on the GLL mesh. The analysis of two finite element based preconditioners: the wirebasket method of Smith, and the overlapping Schwarz algorithm for the spectral element method, are given as examples of the use of these tools. Numerical experiments performed by Pahl are briefly discussed to illustrate the efficiency of these methods in two dimensions.

Abstract: Physical reasoning often involves approximating or abstracting the situation or the theory at hand. This paper studies the nature of approximation and abstraction as applied to the kinematic theory of rigid solid objects.

Five categories of approximation are considered: 1. Geometric approximation. 2. Abstraction of a complex kinematic structure by a simpler kinematic structure. For example, the abstraction of a collection of tightly-linked objects as a single object. 3. Abstraction of a kinematic structure by a simpler theory. For example, the abstraction by a connectivity graph in configuration space. 4. Approximation of a complex kinematic structure by a simpler structure in a more complex theory. For example, the approximation of a chain by a string. 5. Approximation of a more complex theory by a kinematic theory. For example, the approximation of solid object dynamics by kinematics.

We discuss how some of these types of approximation can be implemented and combined. We conclude that abstraction and approximation are open-ended styles of reasoning, rather than neatly categorizable meta-relationships.

Abstract: The problem of restricting a finite state model (a Kripke structure) in order to satisfy a set of unrestricted CTL formulae is named the ``Unrestricted CTL Supervisor Synthesis Problem''. The finite state model has the characteristics described in \cite{ramadge-wonham87}, that is, its transitions are partitioned between "controllable" and "uncontrollable" ones. The set of CTL formulae represents a specification of the "desired behavior" of the system, which may be achieved through a "control action". This note shows the problem to be NP-complete.

Abstract: Current APIs for multiprocessor multi-disk file systems are not easy to use in developing out-of-core algorithms that choreograph parallel data accesses. Consequently, the efficiency of these algorithms is hard to achieve in practice. We address this deficiency by specifying an API that includes data-access primitives for data choreography. With our API, the programmer can easily access specific blocks from each disk in a single operation, thereby fully utilizing the parallelism of the underlying storage system.

Our API supports the development of libraries of commonly-used higher-level routines such as matrix-matrix addition, matrix-matrix multiplication, and BMMC (bit-matrix-multiply/complement) permutations. We illustrate our API in implementations of these three high-level routines to demonstrate how easy it is to use.

Abstract: In this paper, we study various algorithmic questions regarding the computation of an optimal three finger planar grasp. We present a novel O(n^2 log n)-time algorithm to compute such an optimal grasp for an arbitrary simple n-gon. This algorithm can be used for finding ``good'' immobilizing sets. We also discuss several variations on the problem and many intriguing open questions in the area that remain unsolved.

Abstract: Let $A$ be a complex matrix with arbitrary Jordan structure, and $\lambda$ an eigenvalue of $A$ whose largest Jordan block has size $n$. We review previous results due to Lidskii, showing that the splitting of $\lambda$ under a small perturbation of $A$ of order $\epsilon$, is, generically, of order $\epsilon^{1/n}$. Explicit formulas for the leading coefficients are obtained, involving the perturbation matrix and the eigenvectors of $A$. We also present an alternative proof of Lidskii's main theorem, based on the use of the Newton diagram. This approach clarifies certain difficulties which arise in the nongeneric case, and leads, in some situations, to the extension of Lidskii's results. These results suggest a new notion of Holder condition number for multiple eigenvalues, depending only on the conditioning of the associated eigenvectors, not the conditioning of the Jordan vectors.

Abstract: An approach to the problem of developing provably correct programs has been to enrich a theorem prover for Hoare logic with decision procedures for a number of decidable sublanguages of set theory (EMLS, MLS, and extensions) and arithmetic (FPILP) (See [Schwartz, 1977]). Citing results of Goldberg (refer to [Goldberg, 79]) on average case behavior of algorithms for SAT, it was hoped that these decision procedures would perform well on average.

So far, it has been fairly difficult to prove average case NP-hardness under the various definitions (see [Levin, 86], [Ben-David et al, 89], [Blass & Gurevich, 91], [Gurevich, 91], [Venkatesan & Rajagopalan, 92], [Schuler & Yamakami, 92] and [Reischuk & Schindelhauer, 93]). We should note that the definitions in the literature haven't yet been standardized. We survey some of the results of the average case analysis of NP-complete problems, and compare the results of Goldberg with more pessimistic results. We prove that FPILP, EMLS and related fragments of set theory are NP-average complete, and show that there are simple distributions that will frustrate any algorithm for these decision problems.

Abstract: Mortar elements form a family of nonconforming finite element methods that are more flexible than conforming finite elements and are known to be as accurate as their conforming counterparts. A fast iterative method is developed for linear, second order elliptic equations in the plane. Our algorithm is modeled on a hierarchical basis preconditioner previously analyzed and tested, for conforming case, by Barry Smith and the second author. A complete analysis and results of numerical experiments are given for lower order mortar elements and geometrically conforming decompositions of the region into subregions.

Abstract: Computational real algebraic geometry studies various algorithmic questions dealing with the real solutions of a system of equalities, inequalities, and inequations of polynomials over the real numbers. This emerging field is largely motivated by the power and elegance with which it solves a broad and general class of problems arising in robotics, vision, computer aided design, geometric theorem proving, etc.

The following survey paper discusses the underlying concepts, algorithms and a series of representative applications. This paper will appear as a chapter in the "Handbook of Discrete and Computational Geometry" (Edited by J.E. Goodman and J. O'Rourke), CRC Series in Discrete and Combinatorial Mathematics.

Abstract: AI applications require the representation and manipulation of partial spatial knowledge of many different kinds. This paper argues that a representation rich in primitives but fairly restricted in logical form will suffice for many of these purposes. We present and discuss one such representation language. We demonstrate that the language is expressive enough to capture exactly or closely approximate many of the representations that have been used in the AI literature. It also contains some original constructs for dealing with collections of regions of unknown cardinality.

Abstract: In applications ranging from video reception to telecommunications and packet communication to aircraft control, tasks enter periodically and have fixed response time constraints, but missing a deadline is acceptable, provided most deadlines are met. We call such tasks ``occasionally skippable''. We look at the problem of uniprocessor scheduling of occasionally skippable periodic tasks in an environment having periodic tasks. We show that making optimal use of skips is NP-hard. We then look at two algorithms called Skip-Over Algorithms (one a variant of earliest deadline first and one of rate monotonic scheduling) that exploit skips. We give schedulability bounds for both.

Abstract: Two different preconditioners for symmetric saddle point problems with a penalty term are analyzed. The saddle point problems are discretized by mixed finite elements. The preconditioners are applied in combination with Krylov space methods. It is shown that both methods yield convergence rates that are independent from both, the discretization and the penalty parameters. The first method is based on a symmetric positive definite block-diagonal preconditioner and the second one uses a non-symmetric and indefinite block-triangular preconditioner. Numerical results are presented for a problem of linear elasticity. The preconditioners in our experiments are based on domain decomposition and multilevel techniques. It is further shown that the analysis of the symmetric positive definite preconditioner can also be applied to construct preconditioners for symmetric indefinite problems arising from second-order elliptic equations. Numerical results are presented for the Helmholtz equation.

Abstract: The spectral element method has been used extensively for the simulation of fluid flows. The resulting linear systems are often not amenable to direct methods of solution, and are especially ill-conditioned. Domain decomposition preconditioners, well adapted to the solution on parallel computers, are proposed and analyzed; both two and three space dimensions are considered.

Second-order elliptic equations are considered first, and the now well-developed theory of domain decomposition methods for finite elements is fully extended to spectral elements. This includes an analysis of exotic coarse spaces, which have proven necessary for the efficient solution of elliptic problems with large discontinuities in the coefficients, as well as a study of overlapping methods. Estimates of the condition numbers of the Schur complement restricted to an edge (in two dimensions) or to a face (in three dimensions) are also given; in particular, a fast method is designed and studied in full detail for problems with many subregions.

The Stokes problem, when restricted to the space of discrete divergence free velocities, is symmetric positive definite. A number of preconditioners are proposed, which are based on previous results for the scalar elliptic case, and new global models. The construction of a basis for the constrained velocity space is not required,and the resulting condition numbers grow only weakly with the degree $N$ and are independent of the number of subdomains.

We also consider the stationary Navier-Stokes equations, solved with Newton's method. In each iteration, a non-symmetric indefinite problem is solved using a Schwarz preconditioner. A new coarse space is proposed which satisfies the usual properties required by the elliptic theory, and also a specific $H^1$-approximation property. The rate of convergence of the algorithm grows only weakly with $N$, and does not depend on the number of subdomains, or the Newton step.

Finally, a hierarchical basis preconditioner for the mortar finite element method in two dimensions is proposed and analyzed. It is also further shown that the analysis of the symmetric positive definite preconditioner can also be applied to construct preconditioners for symmetric indefinite problems arising from second-order elliptic equations. Numerical results are presented for the Helmholtz equation.

Abstract: The information revolution that we are in the midst of has led to the use of computers controlling applications ranging from automobiles and games, to video-pumps in the information highway. These applications are distinguished by the fact that they use programs with special timing relationships between their constituent elements. For example, a program running in the microprocessor controlling an ABS system in a modern automobile must sense and react to the friction coefficient between the brake pads and the wheel at well-defined intervals of time; failure to do so will result in a systemic failure of the brakes. Referred to typically as embedded systems, these applications constitute a significant portion of the potential growth in the computer industry. However, this growth opportunity is being hampered by a lack of adequate support via software development tools, to aid the easy, rapid and correct prototyping of embedded applications.

In this report, we outline CoRReT, a COnstraint based environment for the Rapid prototyping of REal Time programs. The report outlines the overall system architecture as well as the key modules in this environment that are being currently developed. CoRReT is a scheduling centric system in that a suite of algorithms for instruction scheduling programs instrumented with real-time constraints, are at its core. These algorithms are an integral part of an (optimizing) compiler which will compile these programs automaticallywhile attempting to ensure that the timing constraints are met; when the constraints are met, the resulting schedule for the instructions is referred to be feasible. If a feasible schedule is found, it will be fed automatically into a code-generator in the back-end of the compiler. Our envisioned scheduler can --- in addition to traditional control- and data-dependence constraints in the source program --- also cope with a variety of timing constraints specified by the programmer.

Our focus is on computational platforms that embody parallelism at two levels of granularity. At the highest level, we envision a tightly-coupled parallel machine offering large-scale parallelism. In this setting, a single embedded application can be distributed across the individual processors of the cluster. Furthermore, each processor in this parallel machine can embody Instruction Level Parallelism (ILP) at a fine-grained level.

Unfortunately, due to a lack of automatic tools and technology that can provide compilation support for real-time constraints ubiquitous to embedded applications, parallel computing platforms have not proliferated in this setting. Considering the fine-grained case first, RISC processors with ILP have not yet found a niche in this domain; currently, developers of embedded systems are reluctant to embrace ILP technologies due to the onerous task of ensuring timing relationships in the program by hand --- a difficulty compounded by parallelism (at a fine-grained level) in the processor. Clearly, providing support through automation that frees the programmer of these difficulties, is a means of overcoming this challenge.

Our response to this challenge via CoRReT is to develop scheduling methodologies and tools for automatically harnessing very high performance from these platforms, in the context of embedded systems. In the absence of time-constraints, major progress has been achieved in this direction at the coarse-grained level. The situation is even better at the fine-grained level where scheduling technology is being used routinely in product-quality compilers for RISC processors.

The methodology on which CoRReT is based is independent of any particular target processor, and is applicable to third and fourth generation languages. Furthermore, we propose to use the same scheduling engines during the static analysis of the program as well as during compilation. We anticipate this ``confluence'' in the scheduling algorithms to aid in shorter prototyping cycles, since identical schedules will be used by the analysis tools and the back-end of the compiler to generate code. We envision that the algorithms and tools that go into CoRReT will naturally form an integral part of a full-fledged programming environment for prototyping real-time programs on parallel platforms.

Abstract: We consider the problem of grasping an unknown polygonal flat object using a parallel jaw gripper. Our design equips a standard gripper with several light-beam sensors (close to each jaw) and employs a control scheme based on a reactive grasping algorithm. This is done by probing the object to locate a good grasp position, and then grasping, without moving the object significantly. The goal is to do as little motion as possible to find a grasp. In this paper, we discuss an implementation of this device using NYU's MOSAIC robot, following a quick overview of the underlying reactivity principle.

Abstract: We consider the case where a pool of DNA molecules clones both, flipped and not-flipped, have been cut by restriction enzymes. Ideally, each clone is cut in the same positions, although in practice due to errors, this does not always happen. The computational problem is to determine where the cuts have occurred.

This is a key problem in determining the structure of the original DNA molecule.

A single molecule is represented by a string of 1's and 0's, with cuts represented by $1's$. A set of molecules clones (with errors) is observed, but the orientation/parity of each molecule is unknown. Clear is that the location of the observed cuts of one molecule are dependent on the parity: flipping the molecule would result in the cuts location, as observed, being ``flipped'' .

We propose a Bayesian approach to generate a posterior distribution on the cuts and parity, given the data. We first present an approximate algorithm where we attempt to divide the problem into subproblems, but it is not guaranteed to solve the problem. Then, we propose another approximate method based on a statistical framework and a mean field annealing algorithm. It computes the maximum posterior marginal (MPM estimator) and maximum aposteriori estimate (MAP estimator).

We also provide evidence that the exact solution of the problem is intractable.

Abstract: Primal-dual interior-point path-following methods for semidefinite programming (SDP) are considered. Several variants are discussed, based on Newton's method applied to three equations: primal feasibility, dual feasibility, and some form of centering condition.

The focus is on three such algorithms, called respectively the XZ, XZ+ZX and Q methods. For the XZ+ZX and Q algorithms, the Newton system is well-defined and its Jabobian is nonsingular at the solution, under nondegeneracy assumptions. The associated Schur complement matrix has an unbounded condition number on the central path, under the nondegeneracy assumptions and an additional rank assumption.

Practical aspects are discussed, including Mehrotra predictor-corrector variants and issues of numerical stability. Compared to the other methods considered, the XZ+ZX method is more robust with respect to its ability to step close to the boundary, converges more rapidly, and achieves higher accuracy.

Abstract: Many superscalar processors designed today are able to dynamically schedule instructions. Dynamic scheduling means that a processor is able to analyze a portion of the instruction stream ``on the fly'', and has the capability of issuing an instruction other than the next one available in the input, in order to avoid stalling. Such an instruction is said to be executed out of order.

Scheduling algorithms for machines with in-order execution are used in most compilers today. However, schedules which are optimal for machines with in-order execution may be sub-optimal for a machine with out-of-order execution. Here, we are presenting an algorithm which produces a local schedule for a trace of basic blocks, such that the completion time is minimized for a processor with a depth of the pipeline k=2 and dynamic scheduling ability with scope size s=2. The algorithm runs in polynomial time. A generalization of the algorithm to work for machines with larger scopes is straightforward.

Abstract: Enabled by RISC technologies, low-cost commodity microprocessors are performing at ever increasing levels, significantly via instruction level parallelism (ILP). This in turn increases the opportunities for their use in a variety of day-to-day applications ranging from the simple control of appliances such as microwave ovens, to sophisticated systems for cabin control in modern aircraft. Indeed, ``embedded'' applications such as these represent segments in the computer industry with great potential for growth. However, this growth is currently impeded by the lack of robust optimizing compiler technologies that support the assured, rapid and inexpensive prototyping of real-time software in the context of microprocessors with ILP. In this paper, we will present fast (polynomial-time) algorithms for compile-time instruction scheduling, of programs instrumented with timing-constraints, on processors with ILP. Our algorithms can be distinguished from earlier work in that they are guaranteed to find feasible schedules --- those satisfying the timing-constraints --- whenever such schedules exist, in cases of practical interest. Consequently, they can serve as powerful engines and can simultaneously support the ``analysis'' of the program prior to compilation, as well as during compilation once a feasible schedule is identified via analysis. We will also describe a novel notation, Time_tract, for specifying timing-constraints in programs, independent of the base language being used to develop the embedded application; Time_tract specifications are language independent and can be instrumented into imperative and object-oriented languages non-intrusively. As we will show, the instruction scheduling questions that arise out of Time_tract specifications are always ``tractable''. In contrast, a range of specification mechanisms proposed earlier yield substantially intractable instruction scheduling questions, thereby limiting their potential utility. We will sketch a formal and precise comparison of the tractability and related expressive power issues between Time_tract and some of the extant mechanisms for specifying properties of timed programs; this will be done using the canonical framework of timed-automata.

Abstract: The Steiner tree problem is to find the tree with minimal Euclidean length spanning a set of fixed points in the plane, given the ability to add points (Steiner points). The problem is NP-hard, so polynomial-time heuristics are desired. We present two such heuristics, both of which utilize an efficient method for optimizing a tree with a given topology. The first systematically inserts Steiner points between edges of the minimal spanning tree meeting at angles less than 120 degrees, performing a local optimization at the end. The second begins by finding the Steiner tree for three of the fixed points. Then, at each iteration, it introduces a new fixed point to the tree, connecting it to each possible edge by inserting a Steiner point, and minimizes over all connections, performing a local optimizations for each. We present a variety of test cases that demonstrate the strengths and weaknesses of both algorithms.

Abstract: Compared to other games, particularly chess, the research in computer bridge is immature, and the best bridge-playing programs are mediocre. In this paper we address the problem of designing a fast double-dummy bridge game (i.e., a simplified bridge game with perfect information) solver. Although th size of the game tree we generated for searching the best line of play is huge (about on the order of $13! \cdot 2^{39} \approx 10^{21}$, even if we assume the average branching factor for players to follow suit is just 2), we show that, through varieties of searching techniques and some efficient moves ordering and pruning heuristics, most double-dummy bridge hands can be solved within a reasonable amount of time. In this paper we first give a brief introduction to computer bridge and previous work on the card-playing phase of bridge. Next, we describe the top-level architecture of our double-dummy solver (dds), followed by a number of implementation techniques we employed in our dds. Finally we present experimental results, draw our conclusion and describe some future work toward automating card-playing in real bridge.

Abstract: We present some numerical results for a two-level additive overlapping Schwarz method applied to the 2-D Maxwell's equations. Nedelec finite elements defined on rectangles are employed. Numerical examples show the convergence properties of the method, when varying the mesh size of the coarse and fine problems, the overlap and the time step of the implicit finite difference scheme employed.

Abstract: The paradigmatic view of data in decision support consists of a set of dimensions (e.g., location, product, time period, ...), each encoding a hierarchy (e.g., location has hemisphere, country, state/province, ..., block). Typical queries consist of aggregates over a quantifiable attribute (e.g., sales) as a function of at most one attribute in each dimension of this ``data cube.'' For example, find the sum of all sales of blue polo shirts in Palm Beach during the last quarter. In this paper, we introduce an index structure for storing and indexing aggregates, called ``cube forests,'' to support such cube queries efficiently --- one index search is usually enough.

In their most general form, cube forests require a lot of space. So, we present an optimized structure, called ``hierarchically split cube forests'' that exploit the hierarchical nature of the data to save space. We then present a model and algorithms to arrive at designs that further reduce update time, but suffer an increase in query time. Our experiments bear out the model and show that the structure has promise for decision support applications in read-intensive environments.

Abstract: We develop an abstract model of memory management in distributed systems. The model is low-level enough so we can express communication, allocation and garbage collection, but otherwise hide many of the lower-level details of an actual implementation.

Recently, such formal models have been developed for memory management in a functional, sequential setting by Morrisett, Felleisen, and Harper. The models are rewriting systems whose terms are programs. Programs have both the "code" (control string) and the "store" syntactically apparent. Evaluation is expressed as conditional rewriting and includes store operations. Garbage collection becomes a rewriting relation that removes part of the store without affecting the behavior of the program.

By using techniques developed for communicating and concurrent systems such as Milner's CCS, we extend the model for a distributed environment. Sending and receiving messages is also made apparent at the syntactic level. A very general garbage collection rule based on reachability is introduced and proved correct. Now, proving correct a specific collection strategy is reduced to showing that the relation between programs defined by the strategy is a subrelation of the general relation. Any actual implementation which is capable of providing the transitions (including their atomicity constraints) specified by the strategy is therefore correct.

Abstract: Persistent Linda (PLinda) is a programming environment for writing fault-tolerant distributed/parallel programs that may be run on networks of workstations. PLinda is a set of extensions to the Linda parallel programming model and PLinda/C++ (and Fortran77 respectively) is an implementation combined with the sequential language C++.

The PLinda User Manual introduces the PLinda model, mechanics of the PLinda operations, and programming in PLinda/C++ and PLinda/Fortran77.

Abstract: The Maximum Agreement Subtree problem is the following:

Given two trees whose leaves are drawn from the same set of items (e.g., species), find the largest subset of these items so that the portions of the two trees restricted to these items are isomorphic. We consider the case which occurs frequently in practice, i.e., the case when the trees are binary, and give an O(n log n) time algorithm for this problem.

Abstract: The main goal of this paper is to give an efficient algorithm for the Tree Pattern Matching problem. We also introduce and give an efficient algorithm for the Subset Matching problem.

The Subset Matching problem is to find all occurrences of a pattern string p of length m in a text string t of length n, where each pattern and text location is a set of characters drawn from some alphabet. The pattern is said to occur at text position i if the set p[j] is a subset of the set t[i+j-1], for all j, 1 <= j <= m. We give an O((s+n)\log^3 m) randomized algorithm for this problem, where s denotes the sum of the sizes of all the sets.

Then we reduce the Tree Pattern Matching problem to a number of instances of the Subset Matching problem. This reduction takes linear time and the sum of the sizes of the Subset Matching problems obtained is also linear. Coupled with our first result, this implies an O(nlog^3 m) time randomized of metric spaces are also considered.

Abstract: We survey the research performed during the last few years on the on-line $k$-server problem over metric spaces. A variety of algorithms are presented \mbox{--- both} deterministic and \mbox{randomized ---} and their performance is studied in the framework of competitive analysis. Restrictions of the problem to special cases of metric spaces are also considered.

Abstract: In molecular dynamics applications there is a growing interest in so-called mixed quantum-classical models. These models describe most atoms of the molecular system by the means of classical mechanics but an important, small portion of the system by the means of quantum mechanics. A particularly extensively used model, the QCMD model, consists of a singularly perturbed Schr\"odinger equation nonlinearly coupled to a classical Newtonian equation of motion.

This paper studies the singular limit of the QCMD model for finite dimensional Hilbert spaces. The main result states that this limit is given by the time-dependent Born-Oppenheimer model of quantum theory ---provided the Hamiltonian under consideration has a smooth spectral decomposition. This result is strongly related to the quantum adiabatic theorem. The proof uses the method of weak convergence by directly discussing the density matrix instead of the wave functions. This technique avoids the discussion of highly oscillatory phases.

On the other hand, the limit of the QCMD model is of a different nature if the spectral decomposition of the Hamiltonian happens not to be smooth. We will present a generic example for which the limit set is not a unique trajectory of a limit dynamical system but rather a funnel consisting of infinitely many trajectories.

Abstract: This report describes SDPpack, a package of Matlab files designed to solve semidefinite programs (SDP). SDP is a generalization of linear programming to the space of block diagonal, symmetric, positive semidefinite matrices. The main routine implements a primal--dual Mehrotra predictor--corrector scheme based on the XZ+ZX search direction. We also provide certain specialized routines, one to solve SDP's with only diagonal constraints, and one to compute the Lov\'asz $\theta$ function of a graph, using the XZ search direction. Routines are also provided to determine whether an SDP is primal or dual degenerate, and to compute the condition number of an SDP. The code optionally uses MEX files for improved performance; binaries are available for several platforms. Benchmarks show that the codes provide highly accurate solutions to a wide variety of problems.

Abstract: The mortar methods are based on domain decomposition and they allow for the coupling of different variational approximations in different subdomains. The resulting methods are nonconforming but still yield optimal approximations. In this paper, we will discuss iterative substructuring algorithms for the algebraic systems arising from the discretization of symmetric, second order, elliptic equations in two dimensions. Both spectral and finite element methods, for geometrically conforming as well as nonconforming domain decompositions, are studied. In each case, we obtain a polylogarithmic bound on the condition number of the preconditioned matrix.

Abstract: Two-level overlapping Schwarz methods are considered for finite element problems of 3D Maxwell's equations. Nedelec elements built on tetrahedra and hexahedra are considered. Once the relative overlap is fixed, the condition number of the additive Schwarz method is bounded, independently of the diameter of the triangulation and the number of subregions. A similar result is obtained for a multiplicative method. These bounds are obtained for quasi-uniform triangulations. In addition, for the Dirichlet problem, the convexity of the domain has to be assumed. Our work generalizes well-known results for conforming finite elements for second order elliptic scalar equations.

Abstract: This report describes SDPpack Version 0.9 Beta for Matlab 5.0. This version extends the previous release for semidefinite programming (SDP) to mixed semidefinite--quadratic--linear programs (SQLP), i.e.\ linear optimization problems over a product of semidefinite cones, quadratic cones and the nonnegative orthant. Together, these cones make up all possible homogeneous self-dual cones over the reals.

The main routine implements a primal--dual Mehrotra predictor--corrector scheme based on the XZ+ZX search direction for SDP. More specialized routines are also available, one to solve SDP's with diagonal constraints only, and one to compute the Lov\'asz $\theta$ function of a graph, both using the XZ search direction. Routines are also provided to determine whether an SQLP is primal or dual degenerate at its solution and whether strict complementarity holds there. Primal nondegeneracy is associated with dual uniqueness and dual nondegeneracy with primal uniqueness, though these conditions are not equivalent if strict complementarity fails to hold.

A routine is also provided to compute the condition number of an SQLP. The Matlab code calls mex files for improved performance; binaries are available for several platforms. Benchmarks show that the codes provide highly accurate solutions to a wide variety of problems.

Abstract: The ``Naive Physics Manifesto'' of Pat Hayes [1978] proposes a large-scale project of developing a formal theory encompassing the entire knowledge of physics of naive reasoners, expressed in a declarative symbolic form. The theory is organized in clusters of closely interconnected concepts and axioms. More recent work in the representation of commonsense physical knowledge has followed a somewhat different methodology. The goal has been to develop a competence theory powerful enough to justify commonsense physical inferences, and the research is organized in microworlds, each microworld covering a small range of physical phenomena. In this paper we compare the advantages and disadvantages of the two approaches. We also discuss some difficult key issues in automating commonsense physical reasoning.

Abstract: Optical Mapping is an emerging technology for constructing ordered restriction maps of DNA molecules. The underlying computational problems for this technology have been studied and several cost functions have been proposed in recent literature. Most of these propose combinatorial models; one of them also presents a probabilistic approach. However, it is not {\em a priori} clear as to how these cost functions relate to one another and to the underlying problem. We present a uniform framework for the restriction map problems where each of these various models is a specific instance of the basic framework. We achieve this by identifying the following approaches to the ordered restriction map problem: (1) using data consensus or agreement, and, (2) optimizing a characteristic function of the data. Our framework also opens up the possibility of exploring other cost functions. An additional feature is that we not only integrate the combinatorial models but also analyze the probabilistic model within the same framework. %Finally, for completeness, we include i brief survey of %the best known complexity results for these problems. Finally, we indicate the open problems by including a survey of the best known complexity results for these problems.

Abstract: Optical Mapping is an emerging technology for constructing ordered restriction maps of DNA molecules. The study of the complexity of the problems arising in Optical Mapping has generated considerable interest amongst computer science researchers. In this paper we examine the complexity of these problems.

Optical Mapping leads to various computational problems such as the Binary Flip Cut (BFC) problem, the Weighted Flip Cut (WFC) problem the Exclusive Binary Flip Cut (EBFC) problem \cite{parida1, parida2}, the Binary Shift Cut (BSC) problem, the Binary Partition Cut (BPC) problem and others. The complexity and the hardness of the BFC problem, the WFC problem were not known. Using the technique of {\em gap-preserving} reduction of the max-cut problem, we show that BFC and WFC problems are MAX SNP-hard and achieving an approximation ratio $1-\Upsilon/7$ for these problems is NP-hard, where $\Upsilon$ denotes the upper bound on the polynomial time approximation factor of the well-known max cut problem. A slight variation of BFC, BFC$_{\max K}$, had been shown to be NP-hard; we improve the result to show that BFC$_{\max K}$ is MAX SNP-hard and achieving an approximation ratio $(1-\Upsilon/7)\frac{p_{max}}{p_{min}}$ for BFC$_{\max K}$ is NP-hard, where $p_{\min}$ and $p_{\max}$ are the minimum and maximum of the digestion rates in the given problem. The EBFC problem was shown to be NP-Complete; improve this result to show that EBFC is MAX SNP-hard and achieving an approximation ratio $1-\Upsilon/7$ for EBFC is NP-hard. However, a dense instance of the EBFC problem does have a PTAS.

The Binary Partition Cut (modeling spurious molecules) problem has been shown to be NP-Complete: we show, in this paper, that a (reasonable) unrestrained version of it has an efficient polynomial time algorithm. A variation of the Binary Shift Cut (modeling missing fragments) BSC$_{\max K}$, had been shown to be NP-hard \cite{Tom}; we show both the versions of this problem to be MAX SNP-hard and achieving an approximation ratio $1-\Upsilon/6$ for BSC and a ratio $(1-\Upsilon/6)\frac{p_{max}}{p_{min}}$ for BSC$_{\max K}$ is NP-hard. In addition, we show that $d$-wise Match ($d$M) problem is MAX SNP-hard and achieving an approximation ratio $1-\Upsilon$ is NP-hard.

Abstract: Junctions are important features for image analysis and form a critical aspect of image understanding tasks such as object recognition. We present a unified approach to detecting (location of the center of the junction), classifying (by the number of wedges -- lines, corners, $3$-junctions such as $T$ or $Y$ junctions, or $4$-junctions such as $X$-junctions) and reconstructing junctions (in terms of radius size, the angles of each wedge and the intensity in each of the wedges) in images. Our main contribution is a modeling of the junction which is complex enough to handle all these issues and yet simple enough to admit an effective dynamic programming solution. Broadly, we use a template deformation framework along with a gradient criterium to detect radial partitions of the template. We use the Minimum Description Length (MDL) principle to obtain the optimal number of partitions that best describes the junction.

Kona is an implementation of this model. We (quantitatively) demonstrate the stability and robustness of the detector by analyzing its behavior in the presence of noise, using synthetic/controlled apparatus. We also present a qualitative study of its behavior on real images.

Abstract: Discovering frequent itemsets is a key problem in important data mining applications, such as the discovery of association rules, strong rules, episodes, and minimal keys. Typical algorithms for solving this problem operate in a bottom-up breadth-first search direction. The computation starts from frequent 1-itemsets (minimal length frequent itemsets) and continues until all maximal (length) frequent itemsets are found. During the execution, every frequent itemset is explicitly considered. Such algorithms perform reasonably well when all maximal frequent itemsets are short. However, performance drastically decreases when some of the maximal frequent itemsets are relatively long. We present a new algorithm which combines both the bottom-up and top-down directions. The main search direction is still bottom-up but a restricted search is conducted in the top-down direction. This search is used only for maintaining and updating a new data structure we designed, the maximum frequent candidate set. It is used to prune candidates in the bottom-up search. As a very important characteristic of the algorithm, it is not necessary to explicitly examine every frequent itemset. Therefore it performs well even when some maximal frequent itemsets are long. As its output, the algorithm produces the maximum frequent set, i.e., the set containing all maximal frequent itemsets, which therefore specifies immediately all frequent itemsets. We evaluate the performance of the algorithm using a well-known benchmark database. The improvements can be up to several orders of magnitude, compared to the best current algorithms.

Abstract: In this thesis, we study iterative substructuring methods for linear elliptic problems approximated by the $p$-version finite element method. They form a class of nonoverlapping domain decomposition methods, for which the information exchange between neighboring subdomains is limited to the variables directly associated with the interface, i.e. those common to more than one subregion. Our objective is to design algorithms in $3D$ for which we can find an upper bound for the {\it condition number} $\kappa$ of the preconditioned linear system, which is independent of the number of subdomains and grows slowly with $p$.

Iterative substructuring methods for the $h-$version finite element, and spectral elements have previously been developed and analysed by several authors. However, some very real difficulties remained when the extension of these methods and their analysis to the $p-$version finite element method were attempted, such as a lack extension theorems for polynomials. The corresponding results are well known for Sobolev spaces, but their extension to finite element spaces is quite intricate. In our technical work, we use and further develop extension theorems for polynomials in order to prove bounds on the condition numbers of several algorithms.

We have also made many numerical tests. We can use our programs for several purposes. Not only can we compute the condition numbers and study the rate of convergence for a variety of the algorithms that we have developed, but we can also compute the bounds on these condition numbers, as given by the theory. This is useful because the theory predicts the order of magnitude actual condition numbers.

Abstract: An iterative substructuring method for the system of linear elasticity in three dimensions is introduced and analyzed. The pure displacement formulation for compressible materials is discretized with the spectral element method. The resulting stiffness matrix is symmetric and positive definite.

The method proposed provides a domain decomposition preconditioner constructed from local solvers for the interior of each element, and for each face of the elements and a coarse, global solver related to the wire basket of the elements. As in the scalar case, the condition number of the preconditioned operator is independent of the number of spectral elements and grows as the square of the logarithm of the spectral degree.

Abstract: We propose a new natural memory consistency model, Smile consistency. Not only does Smile provide an intuitive memory consistency model but also a paradigm in which users can define their own synchronization primitives, called synchronization classes. Programmers can use the synchronization class to ease the programming work related to basic synchronization operations. Therefore, in addition to shared memory, threads can also communicate with each other via synchronization objects, instances of synchronization classes. Programs with high-level synchronization objects may also outperform those with only basic synchronization primitives.

Abstract: This paper is intended as a survey of current results on algorithmic and theoretical aspects of overlapping Schwarz methods for discrete $\Hcurl$ and $\Hdiv$--elliptic problems set in suitable finite element spaces. The emphasis is on a unified framework for the motivation and theoretical study of the various approaches developed in recent years.

Generalized Helmholtz decompositions -- orthogonal decompositions into the null space of the relevant differential operator and its complement -- are crucial in our considerations. It turns out that the decompositions the Schwarz methods are based upon have to be designed separately for both components. In the case of the null space, the construction has to rely on liftings into spaces of discrete potentials.

Taking the cue from well-known Schwarz schemes for second order elliptic problems, we devise uniformly stable splittings of both parts of the Helmholtz decomposition. They immediately give rise to powerful preconditioners and iterative solvers.

Abstract: Spectral element methods are considered for symmetric elliptic systems of second-order partial differential equations, such as the linear elasticity and the Stokes systems in three dimensions. The resulting discrete problems can be positive definite, as in the case of compressible elasticity in pure displacement form, or saddle point problems, as in the case of almost incompressible elasticity in mixed form and Stokes equations. Iterative substructuring algorithms are developed for both cases. They are domain decomposition preconditioners constructed from local solvers for the interior of each element and for each face of the elements and a coarse, global solver related to the wire basket of the elements. In the positive definite case, the condition number of the resulting preconditioned operator is independent of the number of spectral elements and grows at most in proportion to the square of the logarithm of the spectral degree. For saddle point problems, there is an additional factor in the estimate of the condition number, namely, the inverse of the discrete inf-sup constant of the problem.

Abstract: While Java and applets have created a new perspective for Web applications, some problems are still unsolved. Among these are the question of how Java applets can find other members of the collaboration session, how to deal with the restrictions imposed by the Java security model, and how to overcome the inability of applets to communicate directly, even if they belong to the same distributed application. KnittingFactory addresses the problem of finding other members of a collaboration session by providing a distributed registry system where the search is performed within a Web browser without violating its security model; the problem of arbitrary placement of applications by providing the core functionality for downloading applets from an arbitrary node; and finally the problem of direct applet-applet communication by using the Java Remote Method Invocation mechanisms to give applets information on how their fellow applets can be reached. Two example applications validate this concept and demonstrate the ease of use of KnittingFactory.

Abstract: Hierarchical a posteriori error estimators are introduced and analyzed for mortar finite element methods. A weak continuity condition at the interfaces is enforced by means of Lagrange multipliers. The two proposed error estimators are based on a defect correction in higher order finite element spaces and an adequate hierarchical two-level splitting. The first provides upper and lower bounds for the discrete energy norm of the mortar finite element solution whereas the second also estimates the error for the Lagrange multiplier. It is shown that an appropriate measure for the nonconformity of the mortar finite element solution is the weighted $L^2$-norm of the jumps across the interfaces.

Abstract: In this paper, we introduce and analyze a special mortar finite element method. We restrict ourselves to the case of two disjoint subdomains, and use Raviart-Thomas finite elements in one subdomain and conforming finite elements in the other. In particular, this might be interesting for the coupling of different models and materials. Because of the different role of Dirichlet and Neumann boundary conditions a variational formulation without a Lagrange multiplier can be presented. It can be shown that no matching conditions for the discrete finite element spaces are necessary at the interface. Using static condensation, a coupling of conforming finite elements and enriched nonconforming Crouzeix-Raviart elements satisfying Dirichlet boundary conditions at the interface is obtained. The Dirichlet problem is then extended to a variational problem on the whole nonconforming ansatz space. It can be shown that this is equivalent to a standard mortar coupling between conforming and Crouzeix-Raviart finite elements where the Lagrange multiplier lives on the side of the Crouzeix-Raviart elements. We note that the Lagrange multiplier represents an approximation of the Neumann boundary condition at the interface. Finally, we present some numerical results and sketch the ideas of the algorithm. The arising saddle point problems is be solved by multigrid techniques with transforming smoothers.

Abstract: In this paper, we consider efficient multilevel based iterative solvers and efficient and reliable a posteriori error estimators for mixed hybrid and macro-hybrid finite element discretizations of elliptic boundary value problems. We give an overview concerning the state-of-the-art techniques for these nonconforming approaches and illustrate the performance of the adaptivity concepts realized by some selected numerical examples.

Abstract: With the rapid growth of the World Wide Web, clients attempting to access some popular web sites are experiencing slow response times due to server load and network congestion. Replacing the single server machine with a set of replicated servers is a cost-effective solution to partition server load which also allows incremental scalability and fault transparency. Distributing these replicated servers geographically can reduce network congestion and increase availability. However, distributed web sites are faced with the issue of allocating servers: how do clients find out about the replicas and how do they decide which one to contact? Popular web sites have well publicized server names and require a transparent mapping of the public server name to replicated servers.

Unlike most traditional approaches, we propose a technique which pushes the server allocation functionality onto the client. We argue that this approach scales well and results in increased performance in many cases. Building on theoretical work based on game theory, we show that the usage of individual replicas can be effectively controlled with cost functions even when the clients are noncooperative. We present the design and implementation of WebSeAl, our prototype system realizing these techniques. WebSeAl does not require any changes to existing client and server code, conforms to all standards, and does not generate any control messages. Preliminary experiments utilizing servers on six continents and in controlled settings indicate that WebSeal improves performance significantly while imposing little overhead.

Abstract: In this paper, prepared for the proceedings of the international conference on domain decomposition held in Boulder, CO in August 1997, we give a progress report on the development of a new family of domain decomposition methods for the solution of Helmholtz's equation.

We present three algorithms based on overlapping Schwarz methods; in our favorite method we proceed to the continuous finite element approximation of the Helmholtz's equation through a sequence of discontinuous iterates. While this is, quite possibly, a new type of overlapping Schwarz methods, we have been inspired to develop this idea by the thesis of Bruno Despr\'{e}s.

Abstract: Order of magnitude reasoning --- reasoning by rough comparisons of the sizes of quantities --- is often called ``back of the envelope calculation", with the implication that the calculations are quick though approximate. This paper exhibits an interesting class of constraint sets in which order of magnitude reasoning is demonstrably much faster than ordinary quantitative reasoning. Specifically, we present a polynomial-time algorithm that can solve a set of constraints of the form ``Points a and b are much closer together than points c and d.'' We prove that this algorithm can be applied if ``much closer together'' is interpreted either as referring to an infinite difference in scale or as referring to a finite difference in scale, as long as the difference in scale is greater than the number of variables in the constraint set. We also prove the first-order theory over such constraints is decidable.

Abstract: Iterative substructuring methods are introduced and analyzed for saddle point problems with a penalty term. Two examples of saddle point problems are considered: the mixed formulation of the linear elasticity system and the generalized Stokes system in three dimensions. These problems are discretized with mixed spectral element methods. The resulting stiffness matrices are symmetric and indefinite. The unknowns interior to each element are first implicitly eliminated by using exact local solvers. The resulting saddle point Schur complement is solved with a Krylov space method with block preconditioners. The velocity block can be approximated by a domain decomposition method, e.g., of wire basket type, which is constructed from local solvers for each face of the elements, and a coarse solver related to the wire basket of the elements. The condition number of the preconditioned operator is independent of the number of spectral elements and is bounded from above by the product of the square of the logarithm of the spectral degree and the inverse of the discrete inf-sup constant of the problem.

Abstract: The aim of this paper is twofold: firstly, to present a survey of the theory of games, and secondly, to contrast deductive and inductive reasoning in game theory. This report begins with an overview of the classical theory of strategic form games of complete information. This theory is based on the traditional economic assumption of rationality, common knowledge of which yields Nash equilibrium as a deductive solution to games in this class. In the second half of this paper, modern game-theoretic ideas are introduced. In particular, learning and repeated games are analyzed using an inductive model, in the absence of common knowledge. In general, inductive reasoning does not gives rise to the Nash equilbrium when learning is deterministic, unless initial beliefs are somehow fortuitously chosen. However, computer simulations show that in the presence of a small random component, repeated play does indeed converge to Nash equilibrium. This research is of interest to computer scientists because modern game theory is a natural framework in which to formally study multi-agent systems and distributed computing.

Abstract: This report includes a modern account of welfare economics and competitive equilibrium theory. In particular, competitive, or Walrasian, equilibrium is defined. Moreover, existence, optimality, and uniqueness are demonstrated. However, no reliable mechanism for computing equilibrium prices is suggested. At this stage, the problem shifts from the realm of economics to an algorithmic problem in computer science.

Abstract: The idea of learning to play equilibrium strategies in repeated games is an active area of research in the game-theoretic community. Game theorists are primarily concerned with the equilibrium outcomes of learning algorithms in the limit: i.e., over an infinite amount of time. One of the goals of this research is to apply computer science ideology to learning theory. In particular, this thesis will consider imposing restrictions on traditional game-theoretic learning algorithms such that players learn to play approximations to equilibrium strategies in bounded amounts of time. The idea of such bounded learning algorithms is to quickly learn to exploit the obvious, while ignoring any subtleties.

The idea of bounded learning is applicable to network games, in which players learn to utilize networks during times of minimal congestion. These games are atypical as compared with traditional games described in the game-theoretic literature, since their underlying structure is not commonly understood by the players, and moreover, common knowledge of rationality is not a valid assumption. As such, this class of repeated games does not naturally lend itself to belief-based learning algorithms. Rather, this thesis will investigate learning algorithms for network games that are analyzed on the basis of performance, without requiring that players maintain prior beliefs about expected network congestion. In sum, the initial focus of this thesis is to explore an application of computer science ideology to learning algorithms in game theory; secondly, bounded game-theoretic learning will be applied to routing and congestion problems in network environments.

Abstract: In this paper, we extend an algorithmic approach to constructing ordered restriction maps from images of a population of individual DNA molecules (clones) digested by restriction enzymes. The original algorithm was capable of producing high-resolution, high-accuracy maps rapidly and in a scalable manner given a certain class of data errors, including contamination, sizing errors, false and missing restriction sites and unknown orientation. Here we extend this set of errors to include possibly broken molecules where the amount of breakage is not known beforehand, which is necessary for handling larger clones. In an earlier paper~\cite{optmapII}, we had shown that the problem of making maps from molecules with end fragments missing as the only source of error is NP-complete. We also show how to handle multiple reliability levels in the input data when calling restriction sites, where the actual reliability levels are not known and must be infered from the data.

Abstract: In this paper, we describe our algorithmic approach to constructing an alignment (Contig) of a set of optical maps created from the images of individual genomic DNA molecules digested by restriction enzymes. Generally, these DNA segments are sized in the range of 1--4 Mb. The problem of assembling clone contig maps is a simpler special case of this contig problem and is handled by our algorithms. The goal is to devise contiging algorithms capable of producing high-quality composite maps rapidly and in a scalable manner. The resulting software is a key component of our physical mapping automation tools and has been used routinely to create composite maps of various microorganisms (E.coli, P.falciparum and D.radioduran). The experimental results appear highly promising.

Abstract: We present a simple technique for predicting the probability that an idle workstation will continue to be idle for $i$ minutes, given that it has been idle for $x$ minutes (i.e., find the {\em remaining idle period probability} $P(i;x)$). By idle we mean that the workstation owner is not interactively using the workstation or executing other tasks on it. The results are particularly applicable to the scheduling of tasks in systems that harvest cycles from idle-only workstations. Our Remaining Idle Period Probability Predictor (RIPPP) uses the distribution of the lengths of idle periods on the managed workstations. Collecting, storing, and processing these distributions (in the form of histograms) is a small overhead on modern workstations (a few kilobytes of storage per workstation).

We investigated the behavior of our RIPPP with usage traces of 31 workstations collected over a five month period, and discovered the following six results. (1) The distribution of one month of idle periods predicts the remaining idle period probability in the next month for most workstations. (2) Different workstations tend to have significantly different idle period length distributions. (3) The average length of an idle period does not necessarily correlate well with the probability of being able to find long idle periods, contrary to intuition and previous scheduling heuristics. (4) A workstation that has been idle a long time does not necessarily have a high probability of remaining idle for a long time. (5) Using the time of day can improve predictions. (6) The length of the previous and the current idle periods are positively correlated, but the length of the previous idle period is not strongly correlated with finding long remaining idle periods.

Based on these studies, we conclude that an effective way to find idle workstations is to collect their idle period length distribution and use it to compute $P(i;x)$. We believe our analysis will be applicable to predicting the length of busy periods, which is useful for deciding whether to migrate or suspend tasks when a workstation becomes busy (the owner reclaims it).

From our results, we have developed a remaining idle period probability toolkit which includes a statistics collector and a prediction library in C. This will be available from our project homepage.

Abstract: This paper presents the design and the implementation of a resource management system for monitoring computing resources on a network and for dynamically allocating them to concurrently executing jobs. In particular, it is designed to support adaptive parallel computations---computations that benefit from addition of new machines, and can tolerate removal of machines while executing. The challenge for such a resource manager is to communicate the availability of resources to running programs even when the programs were not developed to work with external resource managers. Our main contribution is a novel mechanism addressing this issue, built on low-level features common to popular parallel programming systems.

Existing resource management systems for adaptive computations either require tight integration with the operating system (DRMS), or require an integration with a programming system that is aware of external resource managers (e.g. Condor/CARMI, MPVM, Piranha). Thus in each case, their support is limited to a single type of programming system. In contrast, our resource management system is unique in supporting several unmodified parallel programming systems. Furthermore, the system runs with user-level privilege, and thus can not compromise the security of the network.

The underlying mechanism and the overall system have been validated on a dynamically changing mix of jobs, some sequential, some PVM, some MPI, and some Calypso computations. We demonstrate the feasibility and the usefulness of our approach, thus showing how to construct a middleware resource management system to enhance the utilizations of distributed systems.

Abstract: We consider the well-known problem of avoiding unnecessary costly copying that arises in languages with copy/value semantics and large aggregate structures such as arrays, sets, or files. The origins of many recent studies focusing on avoiding copies of flat arrays in functional languages may be traced back to SETL copy optimization [Schwartz 75]. The problem is hard, and progress is slow, but a successful solution is crucial to achieving a pointer-free style of programming envisioned by [Hoare 75].

We give a new solution to copy optimization that uses dynamic reference counts and lazy copying to implement updates efficiently in an imperative language with arbitrarily nested finite sets and maps (which can easily model arrays, records and other aggregate datatypes). Big step operational semantics and abstract interpretations are used to prove the soundness of the analysis and the correctness of the transformation. An efficient algorithm to implement the analysis is presented. The approach is supported by realistic empirical evidence.

Our solution anticipates the introduction of arbitrarily nested polymorphic sets and maps into JAVA. It may also provide a new efficient strategy for implementing object cloning in Java and object assigment in C++. We illustrate how our methods might improve the recent approach of [Wand and Clinger 98] to avoid copies of flat arrays in a language of first-order recursion equations.

Abstract: Enabled by RISC technologies, low-cost commodity microprocessors are performing at ever increasing levels, significantly via instruction level parallelism (ILP). This in turn increases the opportunities for their use in a variety of day-to-day applications ranging from the simple control of appliances such as microwave ovens, to sophisticated systems for cabin control in modern aircraft. Indeed, ``embedded'' applications such as these represent segments in the computer industry with great potential for growth. However, this growth is currently impeded by the lack of robust optimizing compiler technologies that support the assured, rapid and inexpensive prototyping of real-time software in the context of microprocessors with ILP. In this paper we describe a novel notation, TimeC, for specifying timing constraints in programs, {\em independent} of the base language being used to develop the embedded application; TimeC specifications are language independent and can be instrumented into imperative and object-oriented languages non-intrusively. As we will show, the program synthesis problem that arise out of time_tract specifications, a subset of TimeC, are always ``tractable''. In contrast, a range of specification mechanisms proposed earlier yield substantially intractable synthesis questions, thereby limiting their potential utility. We will compare the tractability and related expressive power issues between timeC and some of the extant mechanisms for specifying properties of timed programs.

Keywords: instruction scheduling, compiler optimizations, embedded systems, real-time systems, timing constraints.

Abstract: In this paper, we build a class of overlapping Schwarz preconditioners for a finite element approximation of the Helmholtz equation in two dimensions. Perfectly Matched Layers are employed to build the local problems and two kinds of boundary conditions are employed to match the local solutions. Numerical results are presented to compare the different preconditioners.

Abstract: This article explores the extension of Morgenthaler's Virtual Control Flow technique, which derives control flow semantics directly from the Abstract Syntax Tree, from the relatively coarse granularity of syntactic C expressions to the finer granularity of basic block expressions, that is, expressions without embedded control flow. We explain why this is a better level of abstraction for program analysis, and discuss the elements of an efficient and elegant solution, motivating the presentation by appealing to a more explicit intermediate form. We present our algorithm, and conclude with remarks about the suitability of Morgenthaler's version of Virtual Control Flow for customary exhaustive data-flow analysis.

Abstract: It is previously known that the one dimensional mortar finite element projection is stable in the $L^2$ norm, provided that the ratio of any two neighboring mesh intervals is uniformly bounded, but with the constant in the bound depending on the maximum value of that ratio. In this paper, we show that this projection is stable in the $L^2$ norm, independently of the properties of the nonmortar mesh. The 1D trace of the mortar space considered here is a piecewise polynomial space of arbitrary degree; therefore, our result can be used for both the $h$ and the $hp$ version of the mortar finite element.

Abstract:

Iterative substructuring methods, also known as Schur complement methods, form an important family of domain decomposition algorithms. They are preconditioned conjugate gradient methods where solvers on local subregions and a solver on a coarse mesh are used to construct the preconditioner. For conforming finite element approximations of $H^1$, it is known that the number of conjugate gradient steps required to reduce the residual norm by a fixed factor is independent of the number of substructures and that it grows only as the logarithm of the dimension of the local problem associated with an individual substructure. In this paper, the same result is established for similar iterative methods for low--order N{\'e}d{\'e}lec finite elements, which approximate $\Hcurl$ in two dimensions. Results of numerical experiments are also provided.

Abstract:

The problem of minimizing a sum of Euclidean norms dates from the 17th century and may be the earliest example of duality in the mathematical programming literature. This nonsmooth optimization problem arises in many different kinds of modern scientific applications. We derive a primal-dual interior-point algorithm for the problem, by applying Newton's method directly to a system of nonlinear equations characterizing primal and dual feasibility and a perturbed complementarity condition. The main work at each step consists of solving a system of linear equations (the Schur complement equations). This Schur complement matrix is not symmetric, unlike in linear programming. We incorporate a Mehrotra-type predictor-corrector scheme and present some experimental results comparing several variations of the algorithm, including, as one option, explicit symmetrization of the Schur complement with a skew corrector term. We also present results obtained from a code implemented to solve large sparse problems, using a symmetrized Schur complement. This has been applied to problems arising in plastic collapse analysis, with hundreds of thousands of variables and millions of nonzeros in the constraint matrix. The algorithm typically finds accurate solutions in less than 50 iterations and determines physically meaningful solutions previously unobtainable.

Abstract:

We study several simple models for optical mapping and explore their power and limitations when applied to the construction of maps of clones (e.g., lambdas, cosmids, BACs and YACs). We provide precise lower and upper bounds on the number of clone molecules needed to create the correct map of the clone. Our probabilistic analysis shows that as the number of clone molecules is increased in the optical mapping data, the probability of successful computation of the map jumps from 0 to 1 for fairly small number of molecules (for typical values of the parameterS, the transition point is around 70 molecules). These observations have been independently verified with extensive tests, with both in vitro and in silico data.

In addition, we compare our results with those derived by Karp and Shamir in a recent paper. We hope that this paper clarifies certain misconceptions and explains why the model proposed in Anantharaman et al. (1997) has proven so powerful.

Abstract:

Parallel computing is becoming increasing central and mainstream, driven both by the widespread availability of commodity SMP and high-performance cluster platforms, as well as the growing use of parallelism in general-purpose applications such as image recognition, virtual reality, and media processing. In addition to performance requirements, the latter computations impose soft real-time constraints, necessitating em efficient, predictable parallel resource management. Unfortunately, traditional resource management approaches in both parallel and real-time systems are inadequate for meeting this objective; the parallel approaches focus primarily on improving application performance and/or system utilization at the cost of arbitrarily delaying a given application, while the real-time approaches are overly conservative sacrificing system utilization in order to meet application deadlines. In this paper, we propose a novel approach for increasing parallel system utilization while meeting application soft real-time deadlines. Our approach exploits the application tunability found in several general-purpose computations. Tunability refers to an application's ability to trade off resource requirements over time, while maintaining a desired level of output quality. In other words, a large allocation of resources in one stage of the computation's lifetime may compensate, in a parameterizable manner, for a smaller allocation in another stage. We first describe language extensions to support tunability in the Calypso programming system, a component of the MILAN metacomputing project, and evaluate their expressiveness using an image processing application. We then characterize the performance benefits of tunability, using a synthetic task system to systematically identify its benefits and shortcomings. Our results are very encouraging: application tunability is convenient to express, and can significantly improve parallel system utilization for computations with predictability requirements.

Abstract:

Today there is renewed interest in reliable, rapid, and unintrusive means for automatically recognizing the identity of individuals. The key technologies behind some of the current cost-effective, reliable, and highly accurate identity based systems rely on biometrics. New developments in micro-electronics and advances in pattern matching, neural networks, and dynamic programming are enabling biometric technologies to perform more accurately at continually lower costs. In recent years, biometric technology has attained a very high degree of sophistication, and accuracy has been achieved at levels which surpass many other forms of identification. In this paper, we survey the theoretical foundations and design methodologies behind some practical identification technologies being deployed today.

Abstract:

The FETI method is an iterative substructuring method using Lagrange multipliers. It is actively used in industrial--size parallel codes for solving difficult computational mechanics problems, for example the system ANSYS. Mortar finite elements are nonconforming finite elements that also allow for a geometrically nonconforming decomposition of the computational domain and for the optimal coupling of different variational approximations in different subdomains. We present a numerical study of three different FETI preconditioners for two dimensional, self-adjoint, elliptic equations discretized by mortar finite elements.

Abstract:

Mortar finite elements are nonconforming finite elements that allow for a geometrically nonconforming decomposition of the computational domain and, at the same time, for the optimal coupling of different variational approximations in different subregions. Poincare and Friedrichs inequalities for mortar finite elements are derived. Using these inequalities, it is shown that the condition number for self-adjoint elliptic problems discretized using mortars is comparable to that of the conforming finite element case. Geometrically non-conforming mortars of the second generation are considered, i.e. no continuity conditions are imposed at the vertices of the subregions.

Abstract:

The iterative substructuring methods, also known as Schur complement methods, form one of two important families of domain decomposition algorithms. They are based on a partitioning of a given region, on which the partial differential equation is defined, into non-overlapping substructures. The preconditioners of these conjugate gradient methods are then defined in terms of local problems defined on individual substructures and pairs of substructures, and, in addition, a global problem of low dimension. An iterative method of this kind is introduced for the lowest order Raviart-Thomas finite elements in three dimensions and it is shown that the condition number of the relevant operator is independent of the number of substructures and grows only as the square of the logarithm of the number of unknowns associated with an individual substructure. The theoretical bounds are confirmed by a series of numerical experiments.

Abstract:

The long-time quest of the parallelizing compiler community for effective aggregate summarization techniques has led to increasingly sophisticated array section representations. In this paper, we show how the latest of these can be used for nested C structure summarization. We then show how this summarization notation can be used to make Shape Analysis precise on arbitrarily low-level code. Combining these techniques, we show that an appropriate generalization of Memory Classification Analysis, originally presented for Fortran programs, provides a flow dependence summarization technique for C code as well, while avoiding code normalization compared with previous techniques. In so doing, we break down perhaps the final conceptual barriers in the construction of practical programmer-friendly C parallelizing compilers.

Abstract:

Hind et al.~\cite({Hind99}) use a standard data flow framework \cite{Rosen79, Tarjan81} to formulate an intra-procedural may-alias computation. The intra-procedural aliasing information is computed by applying well-known iterative techniques to the Sparse Evaluation Graph (SEG) (\cite{Choi91}). The computation requires a transfer function for each node that causes a potential pointer assignment (relating the data flow information flowing into and out of the node), and a set of aliases holding at the entry node of the SEG. The intra-procedural analysis assumes that precomputed information in the form of summary functions is available for all function-call sites in the procedure being analyzed. The time complexity of the intra-procedural may-alias computation for the algorithm presented by Hind et al.~(\cite{Hind99}) is $O(N^6)$ in the worst case (where $N$ is the size of the SEG). In this paper we present a worst case $O(N^3)$ time algorithm to compute the same may-alias information.

Abstract:

This paper presents visualizations of binary search trees and splay trees. The visualizations comprise sequences of figures or frames, called comic strips. Consecutive frames are viewed two at a time to facilitate user (viewer) understanding of the algorithm steps. The visualizations are implemented in Java to facilitate their wide use. This paper explores several other considerations in the design of instructional visualizations.

Abstract:

This article proposes a novel architectural model which augments the latest developments in automatic program parallelization and distributed systems to achieve a level of practicality as yet unknown to either field. Today's premier automatic parallelization model is well suited to implementation on a network of commodity workstations (NOW) using only a very thin layer of software support. We describe a parallelizing compiler framework which greatly simplifies the parallelization of even highly complex sequential applications while producing extremely effective parallelizations for the NOW. We further show how our model greatly enhances programmer productivity through the use of minimally invasive C++ transformation techniques, aiding both debugging and portability.

Abstract:

A new domain decomposition method with Lagrange multipliers for elliptic problems is introduced. It is based on a reformulation of the well--known FETI method as a saddle point problem with both primal and dual variables as unknowns. The resulting linear system is solved with block--structured preconditioners combined with a suitable Krylov subspace method. This approach allows the use of inexact subdomain solvers for the positive definite subproblems. It is shown that the condition number of the preconditioned saddle point problem is bounded independently of the number of subregions and depends only polylogarithmically on the number of degrees of freedom of individual local subproblems. Numerical results are presented for a plane stress cantilever membrane problem.

Abstract:

In this paper we introduce improved rules for Catmull-Clark and Loop subdivision that overcome several problems with the original schemes (lack of smoothness at extraordinary boundary vertices, folds near concave corners). In addition, our approach to rule modification allows generation of surfaces with prescribed normals, both on the boundary and in the interior, which considerably improves control of the shape of surfaces.

Abstract:

This article presents a novel Sparse Constant Propagation technique which provides a heretofore unknown level of practicality. Unlike other techniques which are based on data flow, it is based on the execution-order summarization sweep employed in Memory Classification Analysis (MCA), a technique originally developed for array dependence analysis. This methodology achieves a precise description of memory reference activity within a summary representation that grows only linearly with program size. Because of this, the collected sparse constant information need not be artificially limited to satisfy classical data flow lattice requirements, which constrain other algorithms to discard information in the interests of efficient termination. Sparse Constant Propagation is not only more effective within the MCA framework, but it in fact generalizes the framework. Original MCA provids the means to break only simple induction and reduction types of flow-dependences. The integrated framework provides the means to also break flow-dependences for which array values can be propagated.

Abstract:

We propose an interactive framework for reconstructing an arbitrary 3D scene consistent with a set of images, for use in example-based image synthesis. Previous research has used human input to specify feature matches, which are then processed off-line; however, it is very difficult to correctly match images without feedback. The central idea of this paper is to perform and display 3D reconstruction during user modification. By allowing the user to interactively manipulate the image correspondence and the resulting 3D reconstruction, we can exploit both the user's intuitive image understanding and the computer's processing power.

Abstract:

We present a set of very low bandwidth techniques for navigating remote environments. In a typical setup using our system, a virtual environment resides on a server machine, and one or more users explore the environment from client machines. Each client uses previous views of the environment to predict the next view, using the known camera motion and image-based rendering techniques. The server performs the same prediction, and sends only the difference between the predicted and actual view. Compressed difference images require significantly less bandwidth than the compressed images of each frame, and thus can yield much higher frame rates. To request a view, the client simply sends the coordinates of the desired view and of the previous view to the server. This avoids the overhead of maintaining connections between the server and each client.

No restrictions are placed on the scene or the camera motions; the view compression technique may be used with arbitrarily complex 3D scenes or dynamically changing views from a web camera or a digital television broadcast. A lossy compression scheme is presented in which the client estimates the cumulative error in each frame, and requests a comprete refresh before errors become noticable.

This work is applicable to remote exploration of virtual worlds such as on head-mounted displays, Digital Television, or over the Internet.

Abstract:

Finite element approximation of vector equations gives rise to very large, sparse linear systems. In this dissertation, we study some domain decomposition methods for finite element approximations of vector--valued problems, involving the curl and the divergence operators. Edge and Raviart--Thomas finite element are employed. Problems involving the curl operator arise, for instance, when approximating Maxwell's equations and the stream function--vorticity formulation of Stokes' problem, while mixed approximations of second order elliptic equations and stabilized mixed formulations of the Stoke' problem give rise to problems involving the divergence operator.

We first consider Maxwell's equations in three dimensional conductive media using implicit time--stepping. We prove that the condition number of a two-level overlapping algorithm is bounded independently of the number of unknowns, the number of subregions, and the time step.

For the same equation in two dimensions, we consider two new iterative substructuring methods. The first one is based on individual edges, while the second one is a Neumann-Neumann method. We show that the condition numbers of the corresponding methods increase slowly with the number of unknowns in each substructure, but are independent of the time step and even large jumps of the coefficients. We also analyze similar preconditioners for a three--dimensional vector problem involving the divergence operator, and prove that the preconditioners are quasi--optimal and scalable in this case as well.

For each method, we provide a series of numerical experiments, that confirm our theoretical analysis.

This work generalizes well--known results for scalar second order elliptic equations and has required the development of several new technical tools.

Abstract:

In this paper, we study some Schwarz methods of Neumann-Neumann type for some vector field problems, discretized with the lowest order Raviart-Thomas and Nedelec finite elements. We consider a hybrid Schwarz peconditioner consisting of a coarse component, which involves the solution of the original problem on a coarse mesh, and local ones, which involve the solution of Neumann problems on the elements of the coarse triangulation, also called substructures. We show that the condition number of the corresponding method is independent of the number of substructures and grows logarithmically with the number of unknowns associated with an individual substructure. It is also independent of the jumps of both the coefficients of the original problem. The numerical results presented validate our theoretical bound.

Abstract:

This paper addresses the problem of recovering 3D non-rigid shape models from image sequences. For example, given a video recording of a talking person, we would like to estimate a 3D model of the lips and the full head and its internal modes of variation. Many solutions that recover 3D shape from 2D image sequences have been proposed; these so-called structure-from-motion techniques usually assume that the 3D object is rigid. For example, Tomasi and Kanade's factorization technique is based on a rigid shape matrix, which produces a tracking matrix of rank 3 under orthographic projection. We propose a novel technique based on a non-rigid model, where the 3D shape in each frame is a linear combination of a set of basis shapes. Under this model, the tracking matrix is of higher rank, and can be factored in a three step process to yield to pose, configuration and shape. We demonstrate this simple but effective algorithm on video sequences of speaking people. We were able to recover 3D non-rigid facial models with high accuracy.

Abstract:

A class of FETI methods for the edge element approximation of vector field problems in two dimensions is introduced and analyzed. First, an abstract framework is presented for the analysis of a class of FETI methods where a natural coarse problem, associated with the substructures, is lacking. Then, a family of FETI methods for edge element approximations is proposed. It is shown that the condition number of the corresponding method is independent of the number of substructures and grows only polylogarithmically with the number of unknowns associated with individual substructures. The estimate is also independent of the jumps of both of the coefficients of the original problem. Numerical results validating the theoretical bounds are given. The method and its analysis can be easily generalized to Raviart-Thomas element approximations in two and three dimensions.

Abstract:

We give a randomized algorithm for the {\em Pattern Matching with Swaps} problem which runs in $O(m \log m \log |\Sigma| )$ time on a text of length $2m-1$ and a pattern of length $m$ drawn from an alphabet set of size $|\Sigma|$. This algorithm gives the correct answer with probability at least $1-\frac{1}{m}$ and does not miss a match. The best deterministic algorithm known for this problem takes $O(m^{4/3} \mbox{polylog}(m))$ time.

Abstract:

We consider spectral functions f o lambda, where f is any permutation-invariant mapping from C^n to R, and lambda is the eigenvalue map from C^{n X n} to C^n, ordering the eigenvalues lexicographically. For example, if f is the function "maximum real part", then f o lambda is the spectral abscissa, while if f is "maximum modulus", then f o lambda is the spectral radius. Both these spectral functions are continuous, but they are neither convex nor Lipschitz. For our analysis, we use the notion of subgradient extensively analyzed in Variational Analysis, R.T. Rockafellar and R. J.-B. Wets (Springer, 1998), which is particularly well suited to the variational analysis of non-Lipschitz spectral functions. We derive a number of necessary conditions for subgradients of spectral functions. For the spectral abscissa, we give both necessary and sufficient conditions for subgradients, and precisely identify the case where subdifferential regularity holds. We conclude by introducing the notion of semistable programming: minimizing a linear function of a matrix subject to linear constraints, together with the constraint that the eigenvalues of the matrix all lie in the right half-plane or on the imaginary axis. This is a generalization of semidefinite programming for non-Hermitian matrices. Using our analysis, we derive a necessary condition for a local minimizer of a semistable program, and give a generalization of the complementarity condition familiar from semidefinite programming.

Abstract:

Given an affine subspace of square matrices, we consider the problem of minimizing the spectral abscissa (the largest real part of an eigenvalue). We give an example whose optimal solution has Jordan form consisting of a single Jordan block, and we show, using non-lipschitz variational analysis, that this behaviour persists under arbitrary small perturbations to the example. Thus although matrices with nontrivial Jordan structure are rare in the space of all matrices, they appear naturally in spectral abscissa minimization.

Abstract:

Let $V$, $E$, and $D$ denote the cardinality of the vertex set, the cardinality of the edge set, and the maximum degree of a bipartite multigraph $G$. We show that a minimal edge-coloring of $G$ can be computed in $O(E\log D)$ time.

Abstract:

Current technology trends point towards both an increased heterogeneity in hardware platforms and an increase in the mechanisms available to applications for controlling how these platforms are utilized. These trends motivate the design of resource-aware distributed applications, which proactively monitor and control utilization of the underlying platform, ensuring a desired performance level by adapting their behavior to changing resource characteristics.

This paper describes a general framework for enabling application adaptation on distributed platforms. The framework combines programmer specification of alternate execution behaviors (configurations) with automatic support for deciding when and how to adapt, relying extensively on two components: (1) profile-based modeling of application behavior, automatically generated by measuring application performance in a virtual execution environment with controllable resource consumption, and (2)application-specific continuous monitoring of current resource characteristics. The latter detects when application configurations need to change while the former guides the selection of a new configuration.

We evaluate these framework components using an interactive image visualization application. Our results demonstrate that starting from a natural specification of alternate application behaviors and an automatically generated performance database, our framework permits the application to both configure itself in diverse distributed environments and adapt itself to run-time changes in resource characteristics so as to satisfy user preferences of output quality.

Abstract:

mprovements in microprocessor and networking performance have made networks of workstations a very attractive platform for high-end parallel and distributed computing. However, the effective deployment of such environments requires addressing two problems not associated with dedicated parallel machines: heterogeneous resource capabilities and dynamic availability. Achieving good performance requires that application components be able to migrate between cluster resources and efficiently adapt to the underlying resource capabilities. An important component of the required support is maintaining network connectivity, which directly impacts on the transparency of migration to the application and its performance after migration. Unfortunately, existing approaches rely on either extensive operating system modifications or new APIs to maintain network connectivity, both of which limits their wider applicability.

This paper presents the design, implementation, and performance of a transparent network connectivity layer for dynamic cluster environments. Our design uses the techniques of API interception and virtualization to construct a transparent layer in user space; use of the layer requires no modification either to the application or the underlying operating system and messaging layers. Our layer enables the migration of application components without breaking network connections, and additionally permits adaptation to the characteristics of the underlying networking substrate. Experiments with supporting a persistent socket interface in two environments---an Ethernet LAN on top of TCP/IP, and a Myrinet LAN on top of Fast Messages---show that our approach incurs minimal overheads and can effectively select the best substrate for implementing application communication requirements.