We derive normal approximation results for a class of stabilizing functionals of binomial or Poisson point processes that are not necessarily expressible as sums of certain score functions. Our approach is based on a flexible notion of the add-one cost operator, which makes it possible to handle the second-order cost operator via suitably chosen first-order operators. We combine this flexible notion with the theory of strong stabilization to establish our results. We illustrate the applicability of our results by establishing normal approximation results for certain geometric and topological statistics that arise frequently in practice. Several existing results also emerge as special cases of our approach.
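For context, the first- and second-order cost operators standard in this literature are the difference operators
\[
D_x f(\mathcal{P}) := f(\mathcal{P} \cup \{x\}) - f(\mathcal{P}), \qquad
D^2_{x,y} f(\mathcal{P}) := f(\mathcal{P} \cup \{x,y\}) - f(\mathcal{P} \cup \{x\}) - f(\mathcal{P} \cup \{y\}) + f(\mathcal{P}),
\]
so that $D^2_{x,y} = D_y \circ D_x$; the "flexible" notion of the abstract replaces $D_x$ by first-order operators adapted to the functional at hand, so that the second-order operator can be controlled through first-order quantities.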
Most common Optimal Transport (OT) solvers are currently based on an approximation of the underlying measures by discrete measures. However, it is sometimes relevant to work only with moments of measures instead of the measures themselves, and many common OT problems can be formulated as moment problems (the most relevant examples being $L^p$-Wasserstein distances, barycenters, and Gromov-Wasserstein discrepancies on Euclidean spaces). We leverage this fact to develop a generalized moment formulation that covers these classes of OT problems. The transport plan is represented through its moments on a given basis, and the marginal constraints are expressed in terms of moment constraints. A practical computation then consists in truncating the involved moment sequences at a certain order and using the polynomial sums-of-squares hierarchy for measures supported on semi-algebraic sets. We prove that this strategy converges to the solution of the OT problem as the order increases. We also show how to approximate linear quantities of interest and how to estimate the support of the optimal transport map from the computed moments using Christoffel-Darboux kernels. Numerical experiments illustrate the good behavior of the approach.
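To make the idea concrete, here is a minimal sketch of one level of a truncated moment (Lasserre/sums-of-squares) relaxation for the quadratic-cost OT problem between two measures on $[0,1]$, assuming cvxpy with an SDP solver is available; the order, the monomial basis, and the sampled marginals are illustrative choices, not the paper's implementation.

```python
# Sketch: order-d moment relaxation of W2^2 between two measures on [0, 1].
import numpy as np
import cvxpy as cp

d = 3                                        # relaxation order
rng = np.random.default_rng(0)
mu, nu = rng.random(500), rng.random(500)    # samples of the two marginals

# One variable per joint moment y[i, j] ~ integral of x^i y^j dpi, i + j <= 2d.
y = {(i, j): cp.Variable() for i in range(2*d + 1)
     for j in range(2*d + 1) if i + j <= 2*d}

def pseudo_moment_matrix(shift=(0, 0), deg=d):
    """Moment matrix (shift=(0,0)) or localizing matrix (with a monomial shift)."""
    mons = [(a, b) for a in range(deg + 1) for b in range(deg + 1) if a + b <= deg]
    return cp.bmat([[y[a + c + shift[0], b + e + shift[1]] for (c, e) in mons]
                    for (a, b) in mons])

cons = [y[0, 0] == 1, pseudo_moment_matrix() >> 0]
# Marginal constraints become moment constraints (empirical moments here).
for i in range(1, 2*d + 1):
    cons += [y[i, 0] == np.mean(mu**i), y[0, i] == np.mean(nu**i)]
# Localizing matrices encode support in [0,1]^2 via x - x^2 >= 0, y - y^2 >= 0.
cons += [pseudo_moment_matrix((1, 0), d-1) - pseudo_moment_matrix((2, 0), d-1) >> 0,
         pseudo_moment_matrix((0, 1), d-1) - pseudo_moment_matrix((0, 2), d-1) >> 0]

# Cost integral of (x - y)^2 dpi written directly in moments.
prob = cp.Problem(cp.Minimize(y[2, 0] - 2*y[1, 1] + y[0, 2]), cons)
prob.solve(solver=cp.SCS)
exact = np.mean((np.sort(mu) - np.sort(nu))**2)  # closed form for 1-D marginals
print(prob.value, exact)                         # relaxation lower bound vs. exact
```

Increasing `d` tightens the lower bound toward the exact value, which is the convergence behavior the abstract establishes.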
This paper presents an immersed, isogeometric finite element framework to predict the response of multi-material, multi-physics problems with complex geometries using locally refined discretizations. To circumvent the need to generate conformal meshes, this work uses an eXtended Finite Element Method (XFEM) to discretize the governing equations on non-conforming, embedding meshes. A flexible approach to create truncated hierarchical B-spline discretizations is presented. This approach enables the refinement of each state variable field individually to meet field-specific accuracy requirements. To obtain an immersed geometry representation that is consistent across all hierarchically refined B-spline discretizations, the geometry is immersed into a single mesh, the XFEM background mesh, which is constructed from the union of all hierarchical B-spline meshes. An extraction operator is introduced to represent the truncated hierarchical B-spline bases in terms of Lagrange shape functions on the XFEM background mesh without loss of accuracy. The truncated hierarchical B-spline bases are enriched using a generalized Heaviside enrichment strategy to accommodate small geometric features and multi-material problems. The governing equations are augmented by a formulation of the face-oriented ghost stabilization enhanced for locally refined B-spline bases. We present examples for two- and three-dimensional linear elastic and thermo-elastic problems. The numerical results validate the accuracy of our framework. The results also demonstrate the applicability of the proposed framework to large, geometrically complex problems.
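The extraction idea can be illustrated in one dimension on a single element: since a degree-$p$ B-spline restricted to one knot span is a polynomial of degree $p$, its extraction coefficients with respect to degree-$p$ Lagrange shape functions are simply its values at the $p+1$ Lagrange nodes. The sketch below assumes SciPy $\geq$ 1.8 for `BSpline.design_matrix` and is not the paper's hierarchical, multi-dimensional implementation.

```python
# Sketch: element-level extraction of B-spline bases into Lagrange shape functions.
import numpy as np
from scipy.interpolate import BSpline

p = 2                                                      # B-spline degree
knots = np.array([0, 0, 0, 1, 2, 3, 3, 3], dtype=float)    # open knot vector
elem = (1.0, 2.0)                                          # one knot span = one element

# Degree-p Lagrange shape functions interpolate at p + 1 nodes, so the
# extraction operator C has entries C[i, j] = N_j(x_i) (B-spline values at nodes):
#   N_j(x) = sum_i N_j(x_i) L_i(x)   for x in the element, since N_j|elem is in P_p.
nodes = np.linspace(elem[0], elem[1], p + 1)
C = BSpline.design_matrix(nodes, knots, p).toarray()       # (p+1) x n_basis

# Sanity check at a point inside the element: both representations agree.
x = 1.37
lagrange = np.array([np.prod([(x - nodes[q]) / (nodes[i] - nodes[q])
                              for q in range(p + 1) if q != i])
                     for i in range(p + 1)])
direct = BSpline.design_matrix(np.array([x]), knots, p).toarray()[0]
assert np.allclose(lagrange @ C, direct)
```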
Many applications, such as system identification, classification of time series, direct and inverse problems in partial differential equations, and uncertainty quantification, lead to the question of approximating a non-linear operator between metric spaces $\mathfrak{X}$ and $\mathfrak{Y}$. We study the problem of determining the degree of approximation of such operators on a compact subset $K_\mathfrak{X}\subset \mathfrak{X}$ using a finite amount of information. If $\mathcal{F}: K_\mathfrak{X}\to K_\mathfrak{Y}$, a well-established strategy to approximate $\mathcal{F}(F)$ for some $F\in K_\mathfrak{X}$ is to encode $F$ (respectively, $\mathcal{F}(F)$) in terms of a finite number $d$ (respectively, $m$) of real numbers. Together with appropriate reconstruction algorithms (decoders), the problem reduces to the approximation of $m$ functions on a compact subset of a high-dimensional Euclidean space $\mathbb{R}^d$, equivalently, the unit sphere $\mathbb{S}^d$ embedded in $\mathbb{R}^{d+1}$. The problem is challenging because $d$ and $m$, as well as the complexity of the approximation on $\mathbb{S}^d$, are all large, and it is necessary to estimate the accuracy while keeping track of the inter-dependence of all the approximations involved. In this paper, we establish constructive methods to do this efficiently, i.e., with the constants involved in the estimates on the approximation on $\mathbb{S}^d$ being $\mathcal{O}(d^{1/6})$. We study different smoothness classes for the operators and also propose a method for the approximation of $\mathcal{F}(F)$ using only information in a small neighborhood of $F$, resulting in an effective reduction in the number of parameters involved.
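The encode-approximate-decode pipeline can be sketched end to end on a toy problem. Everything below is an illustrative assumption: the operator (a cumulative integral), the point-evaluation encoder, and the kernel regression on the sphere are stand-ins for the paper's constructions.

```python
# Sketch: encode F into R^{d+1} on the sphere, learn m coordinates of F(F).
import numpy as np

d, m, n_train = 16, 8, 400
grid_in = np.linspace(0, 1, d)            # encoder: d point evaluations of F
grid_out = np.linspace(0, 1, m)           # decoder: m point values of F(F)

def encode(F):
    v = np.append(F(grid_in), 1.0)        # lift to R^{d+1} ...
    return v / np.linalg.norm(v)          # ... and normalize onto the sphere S^d

def op(F):                                # toy operator: cumulative integral of F
    vals = np.cumsum(F(grid_in)) / d
    return lambda x: np.interp(x, grid_in, vals)

rng = np.random.default_rng(0)
def rand_input():                         # random trigonometric inputs
    a, b = rng.normal(size=2)
    return lambda x, a=a, b=b: a*np.sin(2*np.pi*x) + b*np.cos(2*np.pi*x)

train = [rand_input() for _ in range(n_train)]
X = np.stack([encode(F) for F in train])            # encoded inputs on S^d
Y = np.stack([op(F)(grid_out) for F in train])      # m target values per input

def gram(A, B):                           # positive-definite kernel on the sphere
    return np.exp(4.0 * (A @ B.T - 1.0))

alpha = np.linalg.solve(gram(X, X) + 1e-6*np.eye(n_train), Y)
F_new = rand_input()
pred = gram(encode(F_new)[None, :], X) @ alpha      # approximates F(F_new) on grid_out
print(np.max(np.abs(pred[0] - op(F_new)(grid_out))))
```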
Clustering with outliers is one of the most fundamental problems in Computer Science. Given a set $X$ of $n$ points and two integers $k$ and $m$, the clustering-with-outliers problem asks to exclude $m$ points from $X$ and partition the remaining points into $k$ clusters so as to minimize a certain cost function. In this paper, we give a general approach for solving clustering with outliers that yields a fixed-parameter tractable (FPT) algorithm in $k$ and $m$ whose approximation ratio almost matches that of the outlier-free counterpart. As a corollary, we obtain FPT approximation algorithms with optimal approximation ratios for $k$-Median and $k$-Means with outliers in general metrics. We also exhibit more applications of our approach to other variants of the problem that impose additional constraints on the clustering, such as fairness or matroid constraints.
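To illustrate the objective (not the paper's FPT algorithm), here is a minimal Lloyd-style heuristic for $k$-Means with $m$ outliers, in the spirit of the "k-means--" algorithm of Chawla and Gionis: in each iteration the $m$ farthest points are discarded before centers are recomputed.

```python
# Sketch: Lloyd-style heuristic for k-means with m discarded outliers.
import numpy as np

def kmeans_with_outliers(X, k, m, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)  # n x k distances
        nearest = d2.min(axis=1)
        keep = np.argsort(nearest)[: len(X) - m]   # drop the m farthest points
        labels = d2[keep].argmin(axis=1)
        for j in range(k):                         # Lloyd update on inliers only
            if np.any(labels == j):
                centers[j] = X[keep][labels == j].mean(axis=0)
    outliers = np.argsort(nearest)[len(X) - m:]
    return centers, outliers

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(size=(100, 2)), rng.normal(8, 1, size=(100, 2)),
               rng.uniform(-20, 20, size=(5, 2))])   # two clusters + 5 outliers
centers, outliers = kmeans_with_outliers(X, k=2, m=5)
print(centers, outliers)
```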
The performance of individual evolutionary optimization algorithms is mostly measured in terms of statistics such as the mean, median, and standard deviation, computed over the best solutions obtained in a few trials of the algorithm. To compare the performance of two algorithms, the values of these statistics are compared instead of comparing the solutions directly. Such a comparison lacks any direct comparison of the solutions obtained with different algorithms; for instance, comparing the best (or worst) solutions of two algorithms is simply not possible. Moreover, the ranking of algorithms is mostly done in terms of solution quality only, even though the convergence of an algorithm is also an important factor. In this paper, a direct comparison approach is proposed to analyze the performance of evolutionary optimization algorithms. A direct comparison matrix called the \emph{Prasatul Matrix} is prepared, which records the outcomes of direct comparisons between the best solutions obtained with two algorithms over a specific number of trials. Five performance measures are designed based on the Prasatul Matrix to evaluate the performance of algorithms in terms of the optimality and comparability of solutions. These scores are used to develop a score-driven approach for comparing the performance of multiple algorithms and for ranking them on the grounds of both solution quality and convergence. The proposed approach is evaluated with six evolutionary optimization algorithms on 25 benchmark functions. A non-parametric statistical analysis, namely the Wilcoxon signed-rank test, is also performed to verify the outcomes of the proposed direct comparison approach.
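One plausible reading of the pairwise direct comparison, sketched below for two minimizers A and B over paired trials; the exact definition of the Prasatul Matrix in the paper may differ, and the full matrix presumably collects such counts for each ordered pair of algorithms.

```python
# Sketch: win/tie/loss counts from paired best-of-run values of two minimizers.
import numpy as np

def direct_comparison(best_A, best_B, tol=1e-12):
    """best_A, best_B: best objective values from t paired trials (minimization)."""
    wins = np.sum(best_A < best_B - tol)            # A strictly better
    ties = np.sum(np.abs(best_A - best_B) <= tol)   # indistinguishable
    losses = np.sum(best_A > best_B + tol)          # B strictly better
    return np.array([wins, ties, losses])

A = np.array([0.01, 0.03, 0.02, 0.05])   # best-of-run values, algorithm A
B = np.array([0.02, 0.03, 0.04, 0.01])   # best-of-run values, algorithm B
print(direct_comparison(A, B))           # -> [2 1 1]
```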
We propose a preconditioner for the Helmholtz exterior problems on multi-screens. For this, we combine quotient-space BEM and operator preconditioning. For a class of multi-screens (which we dub \emph{type A} multi-screens), we show that this approach leads to block diagonal Calder\'on preconditioners and results in a spectral condition number that grows only logarithmically with $h$, just as in the case of simple screens. Since the resulting scheme contains many more DoFs than strictly required, we also present strategies to remove almost all redundancy without significant loss of effectiveness of the preconditioner. We verify these findings by providing representative numerical results. Further numerical experiments suggest that these results can be extended beyond type A multi-screens and that the numerical method introduced here can be applied to essentially all multi-screens encountered by the practitioner, leading to a significantly reduced simulation cost.
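For context, the operator-preconditioning principle that underlies the Calder\'on approach can be stated abstractly: if $\mathsf{A}_h$ and $\mathsf{B}_h$ are Galerkin matrices of continuous, inf-sup stable operators $\mathsf{A}\colon V \to V'$ and $\mathsf{B}\colon V' \to V$, and $\mathsf{M}_h$ is the duality (mass) matrix pairing the two discrete trial spaces, then
\[
\kappa\!\left(\mathsf{M}_h^{-1}\,\mathsf{B}_h\,\mathsf{M}_h^{-\top}\,\mathsf{A}_h\right) \;\le\; C,
\]
with $C$ independent of the mesh size and determined only by the continuity and inf-sup constants. For screen problems, the Calder\'on identities supply the complementary pair of boundary integral operators playing the roles of $\mathsf{A}$ and $\mathsf{B}$; the quotient-space treatment of multi-screens is the contribution of the paper and is not captured by this generic statement.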
We apply topological data analysis (TDA) to speech classification problems and to the introspection of a pretrained speech model, HuBERT. To this end, we introduce a number of topological and algebraic features derived from Transformer attention maps and embeddings. We show that a simple linear classifier built on top of such features outperforms a fine-tuned classification head. In particular, we achieve an improvement of about $9\%$ in accuracy and $5\%$ in EER on four common datasets; on CREMA-D, the proposed feature set reaches a new state-of-the-art performance with an accuracy of $80.155\%$. We also show that topological features are able to reveal the functional roles of speech Transformer heads; e.g., we find heads capable of distinguishing between pairs of sample sources (natural/synthetic) or voices without any downstream fine-tuning. Our results demonstrate that TDA is a promising new approach for speech analysis, especially for tasks that require structural prediction.
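A sketch of the general recipe, extracting persistence-based features from one head's attention map, assuming the gudhi library; the symmetrization, the attention-to-distance conversion, and the summary statistics are our illustrative choices, not necessarily those used for HuBERT in the paper.

```python
# Sketch: topological features of a Transformer attention map via a Rips filtration.
import numpy as np
import gudhi

def attention_tda_features(attn):
    """attn: (T, T) row-stochastic attention map of one head."""
    sim = (attn + attn.T) / 2.0                  # symmetrize the attention weights
    dist = 1.0 - sim / sim.max()                 # strong attention -> small distance
    np.fill_diagonal(dist, 0.0)
    rips = gudhi.RipsComplex(distance_matrix=dist, max_edge_length=1.0)
    st = rips.create_simplex_tree(max_dimension=1)
    st.persistence()
    bars = st.persistence_intervals_in_dimension(0)
    fin = np.isfinite(bars[:, 1])                # drop the one infinite H0 bar
    life = bars[fin, 1] - bars[fin, 0]
    return np.array([life.sum(), life.mean(), len(life)])

attn = np.random.default_rng(0).dirichlet(np.ones(32), size=32)  # stand-in head
print(attention_tda_features(attn))
```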
We introduce an integral representation of the Monge-Amp\`ere equation, which leads to a new finite difference method based upon numerical quadrature. The resulting scheme is monotone and fits immediately into existing convergence proofs for the Monge-Amp\`ere equation with either Dirichlet or optimal transport boundary conditions. The use of higher-order quadrature schemes allows for substantial reduction in the component of the error that depends on the angular resolution of the finite difference stencil. This, in turn, allows for significant improvements in both stencil width and formal truncation error. The resulting schemes can achieve a formal accuracy that is arbitrarily close to $\mathcal{O}(h^2)$, which is the optimal consistency order for monotone approximations of second-order operators. We present three different implementations of this method. The first two exploit the spectral accuracy of the trapezoid rule on uniform angular discretizations to allow for computation on a nearest-neighbors finite difference stencil over a large range of grid refinements. The third uses higher-order quadrature to produce superlinear convergence while simultaneously utilizing narrower stencils than other monotone methods. Computational results are presented in two dimensions for problems of various regularity.
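For the two-dimensional case, one integral representation of this flavor rests on the classical identity, for symmetric positive definite $M$,
\[
\int_0^{2\pi} \frac{d\theta}{v(\theta)^\top M\, v(\theta)} \;=\; \frac{2\pi}{\sqrt{\det M}},
\qquad v(\theta) = (\cos\theta, \sin\theta)^\top .
\]
Applying a quadrature rule with weights $w_k$ at directions $v_k$ to this integral with $M = D^2 u$, and discretizing the directional second derivatives by centered differences, yields schemes of the form
\[
\sqrt{\det D^2 u(x)} \;\approx\; 2\pi \Big( \sum_{k} \frac{w_k}{\Delta_{v_k} u(x)} \Big)^{-1},
\qquad
\Delta_{v} u(x) = \frac{u(x + h v) - 2u(x) + u(x - h v)}{h^2},
\]
which is monotone because the right-hand side is nondecreasing in each $\Delta_{v_k} u(x) > 0$. We state this as an illustration consistent with the abstract's description, not necessarily the paper's exact representation.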
In many unmanned aerial vehicle (UAV) applications for surveillance and data collection, it is not possible to reach all requested locations within the given maximum flight time. Hence, the requested locations must be prioritized, and the problem of selecting the most important locations is modeled as an Orienteering Problem (OP). To fully exploit the kinematic properties of the UAV in such scenarios, we combine the OP with the generation of time-optimal trajectories with bounds on velocity and acceleration. We define the resulting problem as the Kinematic Orienteering Problem (KOP) and propose an exact mixed-integer formulation together with a Large Neighborhood Search (LNS) as a heuristic solution method. We demonstrate the effectiveness of our approach on Orienteering instances from the literature and benchmark against optimal solutions of the Dubins Orienteering Problem (DOP) as the state of the art. Additionally, we show by simulation that the resulting solutions can be tracked precisely by a modern MPC-based flight controller. Since we demonstrate that the state-of-the-art method for generating time-optimal trajectories in multiple dimensions is not generally correct, we further present an improved analytical method for time-optimal trajectory generation.
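The classical building block of such trajectory generation is the rest-to-rest, single-axis, time-optimal profile under $|v| \le v_{\max}$ and $|a| \le a_{\max}$ (a bang-bang or trapezoidal velocity profile). The sketch below covers only this textbook case; the paper treats the general setting with nonzero boundary velocities and multi-axis synchronization.

```python
# Sketch: minimal travel time for a rest-to-rest 1-D move under velocity and
# acceleration bounds (triangular vs. trapezoidal velocity profile).
import math

def min_time_rest_to_rest(dist, vmax, amax):
    dist = abs(dist)
    if dist <= vmax**2 / amax:            # triangular profile: vmax never reached
        return 2.0 * math.sqrt(dist / amax)
    # trapezoidal profile: accelerate to vmax, cruise, decelerate
    return dist / vmax + vmax / amax

print(min_time_rest_to_rest(10.0, 2.0, 1.0))  # 2 s accel + 3 s cruise + 2 s decel = 7.0
```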
Understanding of the pathophysiology of obstructive lung disease (OLD) is limited by the available methods for examining the relationship between multi-omic molecular phenomena and clinical outcomes. Integrative factorization methods for multi-omic data can reveal latent patterns of variation describing important biological signal. However, most methods do not provide a framework for inference on the estimated factorization, do not simultaneously predict important disease phenotypes or clinical outcomes, and do not accommodate multiple imputation. To address these gaps, we propose Bayesian Simultaneous Factorization (BSF). We use conjugate normal priors and show that the posterior mode of this model can be estimated by solving a structured nuclear-norm-penalized objective that also achieves rank selection and motivates the choice of hyperparameters. We then extend BSF to simultaneously predict a continuous or binary response, termed Bayesian Simultaneous Factorization and Prediction (BSFP). BSF and BSFP accommodate concurrent imputation and full posterior inference for missing data, including "blockwise" missingness, and BSFP offers prediction of unobserved outcomes. We show via simulation that BSFP is competitive in recovering latent variation structure, and we demonstrate the importance of propagating uncertainty from the estimated factorization to prediction. We also study the imputation performance of BSF via simulation under missing-at-random and missing-not-at-random assumptions. Lastly, we use BSFP to predict lung function based on the bronchoalveolar lavage metabolome and proteome from a study of HIV-associated OLD. Our analysis reveals a distinct cluster of patients with OLD driven by shared metabolomic and proteomic expression patterns, as well as multi-omic patterns related to lung function decline. Software is freely available at https://github.com/sarahsamorodnitsky/BSFP .
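The computational building block of such nuclear-norm-penalized objectives is the proximal operator of the nuclear norm, i.e., singular value soft-thresholding, which also explains the rank-selection effect. The sketch below shows only this well-known ingredient; the paper's structured objective couples several such terms across joint and individual variation blocks.

```python
# Sketch: singular value soft-thresholding, the prox of the nuclear norm.
import numpy as np

def svt(X, lam):
    """argmin_B 0.5*||X - B||_F^2 + lam*||B||_*  via a soft-thresholded SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - lam, 0.0)) @ Vt

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 8)) @ rng.normal(size=(8, 40))  # rank-8 signal ...
X += 0.1 * rng.normal(size=X.shape)                      # ... plus noise
B = svt(X, lam=2.0)
print(np.linalg.matrix_rank(B, tol=1e-8))                # thresholding selects the rank
```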