Lawvere showed that generalised metric spaces are categories enriched over $[0, \infty]$, the quantale of the non-negative extended reals. The statement of enrichment is a quantitative analogue of being a preorder. In search of a logic for quantitative metric reasoning, we investigate three $[0,\infty]$-valued propositional logics over the Lawvere quantale. The basic logical connectives shared by all three logics are those that can be interpreted in any quantale, viz.\ finite conjunctions and disjunctions, tensor (addition for the Lawvere quantale), and linear implication (here a truncated subtraction); to these we add, in turn, the constant $1$ to express integer values, and scalar multiplication by a non-negative real to express general affine combinations. Quantitative equational logic can be interpreted in the third logic if we allow inference systems instead of axiomatic systems. For each of these logics we develop a natural deduction system which we prove to be decidably complete w.r.t.\ the quantale-valued semantics. The heart of the completeness proof makes use of the Motzkin transposition theorem. Consistency is also decidable; the proof makes use of Fourier--Motzkin elimination of linear inequalities. Strong completeness does not hold in general, even (as is known) for theories over finitely many propositional variables; indeed, even an approximate form of strong completeness in the sense of Pavelka or Ben Yaacov -- provability up to arbitrary precision -- does not hold. However, we can show it for theories axiomatized by a (not necessarily finite) set of judgements in normal form over a finite set of propositional variables, when we restrict to models that do not map variables to $\infty$; the proof uses Hurwicz's general form of Farkas' lemma.
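For orientation: under the usual convention for the Lawvere quantale that smaller values are "truer" (entailment is the order $\geq$), the four basic connectives receive the following standard interpretation on truth values $a, b \in [0,\infty]$ (a sketch of generic quantale semantics, not notation quoted from the paper):
\[
  a \wedge b = \max(a, b), \qquad
  a \vee b = \min(a, b), \qquad
  a \otimes b = a + b, \qquad
  a \multimap b = \max(b - a, 0).
\]
The last equation comes from the adjunction $a + x \geq b \iff x \geq \max(b - a, 0)$, which is why linear implication becomes truncated subtraction.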
Expected Improvement (EI) is arguably the most popular acquisition function in Bayesian optimization and has found countless successful applications, but its performance is often exceeded by that of more recent methods. Notably, EI and its variants, including those for the parallel and multi-objective settings, are challenging to optimize because their acquisition values vanish numerically in many regions. This difficulty generally increases as the number of observations, the dimensionality of the search space, or the number of constraints grows, resulting in performance that is inconsistent across the literature and most often sub-optimal. Herein, we propose LogEI, a new family of acquisition functions whose members have optima that are identical, or approximately equal, to those of their canonical counterparts, but are substantially easier to optimize numerically. We demonstrate that numerical pathologies manifest themselves in "classic" analytic EI, Expected Hypervolume Improvement (EHVI), as well as their constrained, noisy, and parallel variants, and propose corresponding reformulations that remedy these pathologies. Our empirical results show that members of the LogEI family of acquisition functions substantially improve on the optimization performance of their canonical counterparts and, surprisingly, are on par with or exceed the performance of recent state-of-the-art acquisition functions, highlighting the understated role of numerical optimization in the literature.
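To illustrate the fix in the simplest case: naive analytic EI, $\mathrm{EI}(x) = \sigma(x)\,\bigl(z\Phi(z) + \varphi(z)\bigr)$ with $z = (\mu(x) - f^\ast)/\sigma(x)$, underflows to exactly zero (with zero gradient) when $z \ll 0$. A minimal log-space sketch is below; the function name and tail handling are ours, and the authors' implementation treats very negative $z$ more carefully (e.g., via asymptotic expansions):
\begin{verbatim}
import numpy as np
from scipy.special import log_ndtr

def log_ei(mu, sigma, best_f):
    """log EI = log(sigma) + log(z*Phi(z) + phi(z)), z = (mu - best_f)/sigma.

    Computed in log space so that the value (and hence its gradient)
    stays informative where exp-space EI underflows to 0.
    """
    mu = np.atleast_1d(np.asarray(mu, float))
    sigma = np.atleast_1d(np.asarray(sigma, float))
    z = (mu - best_f) / sigma
    log_phi = -0.5 * z**2 - 0.5 * np.log(2.0 * np.pi)  # log N(z; 0, 1)
    log_Phi = log_ndtr(z)                              # log CDF, stable for z << 0
    out = np.empty_like(z)
    pos = z > 0
    # z > 0: both terms of h(z) = z*Phi(z) + phi(z) are positive -> log-sum-exp.
    out[pos] = np.logaddexp(np.log(z[pos]) + log_Phi[pos], log_phi[pos])
    # z <= 0: h(z) = phi(z) * (1 + z*Phi(z)/phi(z)), with the ratio in (-1, 0].
    r = z[~pos] * np.exp(log_Phi[~pos] - log_phi[~pos])
    out[~pos] = log_phi[~pos] + np.log1p(r)            # loses accuracy for z << 0
    return out + np.log(sigma)
\end{verbatim}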
Discrete particle simulations have become the standard in science and industrial applications exploring the properties of particulate systems. Most such simulations rely on the concept of interacting spherical particles to describe the properties of particulates, although the correct representation of nonspherical particle shape is crucial for a number of applications. In this work we describe the implementation of clumps, i.e.\ assemblies of rigidly connected spherical particles that can approximate given nonspherical shapes, within the \textit{MercuryDPM} particle dynamics code. The \textit{MercuryDPM} contact detection algorithm is particularly efficient for polydisperse particle systems, which is essential for multilevel clumps approximating complex surfaces. We employ the existing open-source \texttt{CLUMP} library to generate clump particles. We detail the pre-processing tools that provide the necessary initial data, as well as the necessary adjustments to the algorithms for contact detection, collision/migration, and numerical time integration. The capabilities of our implementation are illustrated for a variety of examples.
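As a concrete example of the pre-processing such clumps require, the aggregate mass, center of mass, and inertia tensor follow from standard rigid-body formulas (a generic sketch, not the \textit{MercuryDPM} or \texttt{CLUMP} API; we assume uniform density and ignore overlap volume between constituent spheres, which pre-processing tools typically correct for):
\begin{verbatim}
import numpy as np

def clump_properties(centers, radii, rho=1.0):
    """Mass, center of mass, and inertia tensor of a rigid clump of spheres."""
    centers = np.asarray(centers, float)   # (n, 3) sphere centers
    radii = np.asarray(radii, float)       # (n,) sphere radii
    m = rho * (4.0 / 3.0) * np.pi * radii**3        # constituent masses
    M = m.sum()
    com = (m[:, None] * centers).sum(axis=0) / M    # center of mass
    I = np.zeros((3, 3))
    for mi, ri, ci in zip(m, radii, centers):
        I += (2.0 / 5.0) * mi * ri**2 * np.eye(3)   # sphere about its own center
        d = ci - com                                # parallel-axis shift
        I += mi * (np.dot(d, d) * np.eye(3) - np.outer(d, d))
    return M, com, I
\end{verbatim}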
Record linkage is the task of combining records from multiple files that refer to overlapping sets of entities when there is no unique identifying field. In streaming record linkage, files arrive sequentially in time, and estimates of links are updated after the arrival of each file. This problem arises in settings such as longitudinal surveys, electronic health records, and online event databases, among others. The challenge in streaming record linkage is to efficiently update parameter estimates as new data arrive. We approach the problem from a Bayesian perspective, with estimates calculated from posterior samples of parameters, and present methods for updating link estimates after the arrival of a new file that are faster than refitting a joint model each time a new data file arrives. In this paper, we generalize a two-file Bayesian Fellegi--Sunter model to the multi-file case and propose two methods to perform streaming updates. We examine the effect of the prior distribution on the resulting linkage accuracy, as well as the computational trade-offs between the methods when compared to a Gibbs sampler, through simulated and real-world survey panel data. We achieve near-equivalent posterior inference at a small fraction of the compute time. Supplemental materials for this article are available online.
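For background, the Fellegi--Sunter framework underlying the model scores a candidate record pair by the log-likelihood ratio of its field-agreement pattern; below is a textbook sketch with illustrative $m$/$u$ probabilities, not the paper's multi-file streaming model:
\begin{verbatim}
import numpy as np

def match_weight(agree, m, u):
    """Fellegi-Sunter log-likelihood-ratio weight for one record pair.

    agree[k] is 1 if field k agrees between the two records, else 0;
    m[k] = P(agree on field k | true match), u[k] = P(agree | non-match).
    Pairs scoring above a threshold are declared links.
    """
    agree, m, u = (np.asarray(a, float) for a in (agree, m, u))
    return np.sum(agree * np.log(m / u)
                  + (1 - agree) * np.log((1 - m) / (1 - u)))

# Illustrative values: name and birth year agree, zip code disagrees.
w = match_weight([1, 1, 0], m=[0.95, 0.90, 0.80], u=[0.10, 0.05, 0.20])
\end{verbatim}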
Many successful methods to learn dynamical systems from data have recently been introduced. However, ensuring that the inferred dynamics preserve known constraints, such as conservation laws or restrictions on the allowed system states, remains challenging. We propose stabilized neural differential equations (SNDEs), a method to enforce arbitrary manifold constraints in neural differential equations (NDEs). Our approach is based on a stabilization term that, when added to the original dynamics, renders the constraint manifold provably asymptotically stable. Due to its simplicity, our method is compatible with all common NDE models and broadly applicable. In extensive empirical evaluations, we demonstrate that SNDEs outperform existing methods while broadening the types of constraints that can be incorporated into NDE training.
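Schematically, for a constraint $g(x) = 0$ with Jacobian $G(x)$, a stabilization term of this general flavor augments the learned vector field so that trajectories are attracted back onto the constraint manifold. The sketch below is our own simplification (pseudoinverse-based correction with gain $\gamma$), not the paper's exact construction:
\begin{verbatim}
import numpy as np

def stabilized_rhs(f, g, jac_g, gamma=1.0):
    """Wrap a learned vector field f with a manifold-stabilization term.

    For the constraint set {x : g(x) = 0}, the added term
    -gamma * G(x)^+ g(x) (G = Jacobian of g, ^+ = pseudoinverse) drives
    g(x(t)) toward 0, since along the flow d/dt g = G f - gamma G G^+ g.
    """
    def rhs(t, x):
        G = np.atleast_2d(jac_g(x))
        correction = np.linalg.pinv(G) @ np.atleast_1d(g(x))
        return f(t, x) - gamma * correction
    return rhs

# Example: keep trajectories on the unit circle g(x) = |x|^2 - 1 = 0.
f = lambda t, x: np.array([-x[1], x[0]])   # rotation, tangent to the circle
g = lambda x: x @ x - 1.0
jac_g = lambda x: 2.0 * x
rhs = stabilized_rhs(f, g, jac_g, gamma=5.0)
\end{verbatim}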
We consider the classical Shiryaev--Roberts martingale diffusion, $(R_t)_{t\ge0}$, restricted to the interval $[0,A]$, where $A>0$ is a preset absorbing boundary. We take yet another look at the well-known phenomenon of quasi-stationarity (time-invariant probabilistic behavior, conditional on no absorption hitherto) exhibited by the diffusion in the temporal limit, as $t\to+\infty$, for each $A>0$. We obtain new upper and lower bounds for the quasi-stationary distribution's probability density function (pdf), $q_{A}(x)$; the bounds vary in the trade-off between simplicity and tightness. The bounds directly imply the expected result that $q_{A}(x)$ converges to the pdf, $h(x)$, of the diffusion's stationary distribution, as $A\to+\infty$; the convergence is pointwise, for all $x\ge0$. The bounds also yield an explicit upper bound on the gap between $q_{A}(x)$ and $h(x)$ for fixed $x$. By integration, the bounds for the pdf $q_{A}(x)$ translate into new bounds for the corresponding cumulative distribution function (cdf), $Q_{A}(x)$. All of our results are established explicitly, using certain recently obtained monotonicity properties of the modified Bessel $K$ function involved in the exact closed-form formula for $q_{A}(x)$ derived by Polunchenko (2017). We conclude with a discussion of potential applications of our results in quickest change-point detection: our bounds allow for a very accurate performance analysis of the so-called randomized Shiryaev--Roberts--Pollak change-point detection procedure.
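For reference, in one common parameterization with drift parameter $\mu \neq 0$ (our recollection of the standard setup, not quoted from the paper), the unrestricted diffusion and its stationary pdf are
\[
  dR_t = dt + \mu R_t\, dB_t,
  \qquad
  h(x) = \frac{2}{\mu^2 x^2}\, e^{-2/(\mu^2 x)}, \quad x > 0,
\]
where $(B_t)_{t\ge0}$ is standard Brownian motion; one checks that $h$ solves the stationary Fokker--Planck equation $0 = -h'(x) + (\mu^2/2)\,\bigl(x^2 h(x)\bigr)''$.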
Clans are representations of generalized algebraic theories that contain more information than the finite-limit categories associated to the locally finitely presentable categories of models via Gabriel-Ulmer duality. Extending Gabriel-Ulmer duality to account for this additional information, we present a duality theory between clans and locally finitely presentable categories equipped with a weak factorization system of a certain kind.
Fr\'echet regression has received considerable attention for modeling metric-space-valued responses, i.e., complex non-Euclidean data such as probability distributions and vectors on the unit sphere. However, the existing Fr\'echet regression literature focuses on the classical setting where the predictor dimension is fixed and the sample size goes to infinity. This paper proposes sparse Fr\'echet sufficient dimension reduction with graphical structure among high-dimensional Euclidean predictors. In particular, we propose a convex optimization problem that leverages the graphical information among predictors and avoids inverting the high-dimensional covariance matrix. We also develop an Alternating Direction Method of Multipliers (ADMM) algorithm to solve the optimization problem. Theoretically, the proposed method achieves subspace estimation and variable selection consistency under suitable conditions. Extensive simulations and a real data analysis are carried out to illustrate the finite-sample performance of the proposed method.
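For readers unfamiliar with it, ADMM solves problems of the form $\min_{x,z} f(x) + g(z)$ subject to $Ax + Bz = c$ by alternating the updates below (the textbook scaled-dual form with penalty $\rho > 0$, not the paper's specific splitting):
\begin{align*}
  x^{k+1} &= \arg\min_x \; f(x) + \tfrac{\rho}{2}\,\|Ax + Bz^k - c + u^k\|_2^2,\\
  z^{k+1} &= \arg\min_z \; g(z) + \tfrac{\rho}{2}\,\|Ax^{k+1} + Bz - c + u^k\|_2^2,\\
  u^{k+1} &= u^k + Ax^{k+1} + Bz^{k+1} - c.
\end{align*}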
In the theory of lossy compression, the rate-distortion (R-D) function $R(D)$ describes how much a data source can be compressed (in bit-rate) at any given level of fidelity (distortion). Obtaining $R(D)$ for a given data source establishes the fundamental performance limit for all compression algorithms. We propose a new method to estimate $R(D)$ from the perspective of optimal transport. Unlike the classic Blahut--Arimoto algorithm, which fixes the support of the reproduction distribution in advance, our Wasserstein gradient descent algorithm learns the support of the optimal reproduction distribution by moving particles. We prove its local convergence and analyze the sample complexity of our R-D estimator based on a connection to entropic optimal transport. Experimentally, we obtain comparable or tighter bounds than state-of-the-art neural network methods on low-rate sources while requiring considerably less tuning and computational effort. We also highlight a connection to maximum-likelihood deconvolution and introduce a new class of sources that can be used as test cases with known solutions to the R-D problem.
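To make "moving particles" concrete, here is a minimal sketch of this idea for squared-error distortion: gradient descent on a Lagrangian form of the rate functional, $L(y) = -\frac{1}{N}\sum_x \log\bigl(\frac{1}{n}\sum_i e^{-\lambda \|x - y_i\|^2}\bigr)$, over particle locations $y_1,\dots,y_n$. The fixed multiplier $\lambda$ and the hand-derived gradient are our simplifications, not the paper's code; \texttt{x} is an $(N, d)$ array of source samples:
\begin{verbatim}
import numpy as np
from scipy.special import logsumexp

def wgd_support(x, n_particles=16, lam=1.0, lr=0.1, steps=500, seed=0):
    """Learn the reproduction-distribution support by moving particles."""
    rng = np.random.default_rng(seed)
    y = x[rng.choice(len(x), n_particles, replace=False)].copy()
    for _ in range(steps):
        d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)  # (N, n) sq. dists
        logw = -lam * d2
        # Softmax responsibilities of particles for each source sample:
        r = np.exp(logw - logsumexp(logw, axis=1, keepdims=True))
        # dL/dy_i = (2*lam/N) * sum_x r_i(x) * (y_i - x)
        grad = 2.0 * lam / len(x) * (r.sum(0)[:, None] * y - r.T @ x)
        y -= lr * grad
    return y
\end{verbatim}
By contrast, Blahut--Arimoto would keep the initial \texttt{y} fixed and only reweight it.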
Sharpness-Aware Minimization (SAM) is an optimizer that takes a descent step based on the gradient at a perturbation $y_t = x_t + \rho \frac{\nabla f(x_t)}{\lVert \nabla f(x_t) \rVert}$ of the current point $x_t$. Existing studies prove convergence of SAM for smooth functions, but they do so by assuming decaying perturbation size $\rho$ and/or no gradient normalization in $y_t$, which is detached from practice. To address this gap, we study deterministic/stochastic versions of SAM with practical configurations (i.e., constant $\rho$ and gradient normalization in $y_t$) and explore their convergence properties on smooth functions with (non)convexity assumptions. Perhaps surprisingly, in many scenarios, we find that SAM has limited capability to converge to global minima or stationary points. For smooth strongly convex functions, we show that while deterministic SAM enjoys tight global convergence rates of $\tilde \Theta(\frac{1}{T^2})$, the convergence bound of stochastic SAM suffers from an inevitable additive term $O(\rho^2)$, indicating convergence only up to neighborhoods of optima. In fact, such $O(\rho^2)$ factors arise for stochastic SAM in all the settings we consider, and also for deterministic SAM in nonconvex cases; importantly, we prove by examples that such terms are unavoidable. Our results highlight vastly different characteristics of SAM with vs. without decaying perturbation size or gradient normalization, and suggest that the intuitions gained from one version may not apply to the other.
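For reference, one step of the practical configuration studied here (constant $\rho$, normalized ascent direction) reads as follows; the small constant guarding against a zero gradient is our addition, and \texttt{f\_grad} stands for any gradient oracle:
\begin{verbatim}
import numpy as np

def sam_step(x, f_grad, rho=0.05, eta=0.1, eps=1e-12):
    """One SAM update with constant rho and gradient normalization."""
    g = f_grad(x)
    y = x + rho * g / (np.linalg.norm(g) + eps)  # normalized perturbation
    return x - eta * f_grad(y)                   # descend using grad at y
\end{verbatim}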
Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch leads to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc., and 2) the instance-level shift, such as object appearance, size, etc. We build our approach on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, at the image level and the instance level, to reduce the domain discrepancy. The two domain adaptation components are based on $\mathcal{H}$-divergence theory, and are implemented by learning a domain classifier in an adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets, including Cityscapes, KITTI, and SIM10K. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.
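A standard way to implement such adversarial domain classifiers is a gradient-reversal layer, as in the sketch below (generic PyTorch, not the authors' code; the 256-dimensional feature size is an arbitrary placeholder):
\begin{verbatim}
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; multiplies the gradient by -lambda."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lamb * grad_out, None

# Domain classifier on top of (image- or instance-level) features:
domain_head = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 2))

def domain_loss(features, domain_labels, lamb=1.0):
    # Minimizing this trains the head to tell domains apart, while the
    # reversed gradient pushes the backbone toward domain-invariant features.
    logits = domain_head(GradReverse.apply(features, lamb))
    return nn.functional.cross_entropy(logits, domain_labels)
\end{verbatim}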