The Fr\'echet mean is an important statistical summary and measure of centrality for data; it has been defined and studied for persistent homology captured by persistence diagrams. However, the complicated geometry of the space of persistence diagrams implies that the Fr\'echet mean of a given set of persistence diagrams is not necessarily unique, which precludes theoretical guarantees relating empirical means to population means. In this paper, we derive a variance expression for sets of persistence diagrams exhibiting a multi-matching between their persistence points, known as a grouping. Moreover, we propose a condition on groupings, which we refer to as flatness: sets of persistence diagrams that exhibit flat groupings give rise to unique Fr\'echet means. We derive a finite sample convergence result for general groupings, which yields convergence of Fr\'echet means when the groupings are flat. Finally, we interpret flat groupings in a recently proposed general framework of Fr\'echet means in Alexandrov geometry. Together with recent results from Alexandrov geometry, this allows for the first derivation of a finite sample convergence rate for sets of persistence diagrams and lays the groundwork for the viability of the Fr\'echet mean as a practical statistical summary of persistent homology.
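As a toy illustration of the grouping construction (a minimal sketch, not the paper's algorithm or its variance expression: it assumes a grouping is already given as arrays of matched points, one per diagram, and ignores matchings to the diagonal), the candidate Fr\'echet mean under a fixed grouping averages each group, and a natural variance sums squared deviations within groups:

```python
import numpy as np

def frechet_mean_candidate(groups):
    """Candidate Frechet mean under a fixed grouping: one mean point per group.

    `groups` is a list of (k, 2) arrays; group g stacks the matched
    (birth, death) points, one per diagram. Matchings to the diagonal
    are deliberately omitted in this toy sketch.
    """
    mean_points = []
    for pts in groups:
        m = pts.mean(axis=0)      # arithmetic mean of the matched points
        if m[1] > m[0]:           # keep only off-diagonal mean points
            mean_points.append(m)
    return np.array(mean_points)

def grouping_variance(groups):
    """Sum of squared L2 distances of matched points to their group mean."""
    return sum(((pts - pts.mean(axis=0)) ** 2).sum() for pts in groups)

# Toy example: three diagrams whose points form two groups.
g1 = np.array([[0.0, 1.0], [0.1, 1.1], [0.05, 0.9]])
g2 = np.array([[0.5, 2.0], [0.6, 1.8], [0.4, 2.1]])
print(frechet_mean_candidate([g1, g2]))
print(grouping_variance([g1, g2]))
```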
The entropy production rate is a central quantity in non-equilibrium statistical physics, scoring how far a stochastic process is from being time-reversible. In this paper, we compute the entropy production of diffusion processes at non-equilibrium steady state under the condition that the time-reversal of the diffusion remains a diffusion. We start by characterising the entropy production of both discrete- and continuous-time Markov processes. We investigate the time-reversal of time-homogeneous stationary diffusions and recall the most general known conditions under which the time-reversal of a diffusion remains a diffusion, allowing for hypoelliptic and degenerate diffusions and locally Lipschitz vector fields. We decompose the drift into its time-reversible and irreversible parts or, equivalently, the generator into symmetric and antisymmetric operators. We show the equivalence of this decomposition with a decomposition of the backward Kolmogorov equation considered in hypocoercivity theory, and with a decomposition of the Fokker-Planck equation in GENERIC form. The main result shows that when the time-irreversible part of the drift lies in the range of the volatility matrix (almost everywhere), the forward and time-reversed path space measures of the process are mutually equivalent, and it evaluates the entropy production explicitly. When this condition does not hold, the measures are mutually singular and the entropy production is infinite. We verify these results using exact numerical simulations of linear diffusions. We illustrate the discrepancy between the entropy production of non-linear diffusions and that of their numerical simulations in several examples, and show how the entropy production can be used to guide accurate numerical simulation. Finally, we discuss the relationship between time-irreversibility and sampling efficiency, and how the definition of entropy production can be modified to score how far a process is from being generalised reversible.
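For reference, the drift decomposition and the entropy production formula take the following standard form in common notation (a sketch; constants, sign conventions, and the choice of generalized inverse may differ from the paper's statement):

```latex
% Sketch in common notation: dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t with
% stationary density p and diffusion tensor D := \sigma\sigma^\top/2.
\begin{align*}
  b &= b_{\mathrm{rev}} + b_{\mathrm{irr}},
  \qquad b_{\mathrm{rev}} = D\,\nabla \log p + \nabla\cdot D,\\
  e_p &= \int b_{\mathrm{irr}}^\top D^{-}\, b_{\mathrm{irr}}\; p \,\mathrm{d}x
  \quad\text{when } b_{\mathrm{irr}} \in \operatorname{range}(\sigma)
  \text{ almost everywhere},
\end{align*}
% where D^{-} denotes a generalized inverse of D; when the range
% condition fails, the path space measures are mutually singular and
% e_p = +\infty.
```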
If machine learning models were to achieve superhuman abilities at various reasoning or decision-making tasks, how would we go about evaluating such models, given that humans would necessarily be poor proxies for ground truth? In this paper, we propose a framework for evaluating superhuman models via consistency checks. Our premise is that while the correctness of superhuman decisions may be impossible to evaluate, we can still surface mistakes if the model's decisions fail to satisfy certain logical, human-interpretable rules. We instantiate our framework on three tasks where correctness of decisions is hard to evaluate, due either to superhuman model abilities or to otherwise missing ground truth: evaluating chess positions, forecasting future events, and making legal judgments. We show that regardless of a model's (possibly superhuman) performance on these tasks, we can discover logical inconsistencies in its decision making. For example: a chess engine assigning opposing valuations to semantically identical boards; GPT-4 forecasting that sports records will evolve non-monotonically over time; or an AI judge assigning bail to a defendant only after we add a felony to their criminal record.
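A minimal sketch of one such consistency check, the monotonicity of forecasted sports records (the data and "model outputs" below are toy stand-ins; the point is that any increase in a best-time record across future dates is logically inconsistent regardless of forecast accuracy):

```python
def monotonicity_violations(forecasts):
    """Forecasts of a sports record (e.g., seconds for the 100m dash) at
    increasing future dates. A record can only improve over time, so any
    increase between consecutive dates is a logical inconsistency, no
    matter how accurate each individual forecast might be."""
    return [(y0, y1)
            for (y0, f0), (y1, f1) in zip(forecasts, forecasts[1:])
            if f1 > f0]

# Toy example standing in for a hypothetical model's outputs.
preds = [(2025, 9.58), (2030, 9.55), (2035, 9.57)]  # 2035 regresses
print(monotonicity_violations(preds))               # -> [(2030, 2035)]
```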
In this study, we examine numerical approximations of second-order linear and nonlinear differential equations with diverse boundary conditions, followed by residual corrections of the first approximations. We first obtain numerical results using the Galerkin weighted residual approach with Bernstein polynomials. Residuals arise because the first approximation is computed numerically. To minimize these residuals, we use a fourth-order compact finite difference scheme to solve the error differential equations subject to the error boundary conditions. We also introduce a formulation of the fourth-order compact finite difference method for nonlinear BVPs. The improved approximations are produced by adding the error values, derived from the approximations of the error differential equation, to the weighted residual values. Numerical results are compared to the exact solutions and to solutions available in the published literature to validate the proposed scheme, and high accuracy is achieved in all cases.
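A minimal sketch of the first stage only (Galerkin weighted residuals with a Bernstein basis) on the toy problem $-u'' = \pi^2 \sin(\pi x)$ with homogeneous Dirichlet conditions; the residual-correction stage via compact finite differences is not shown, and the degree $n = 8$ is an arbitrary choice:

```python
import numpy as np
from math import comb
from scipy.integrate import quad

# Toy BVP: -u'' = f on (0,1), u(0) = u(1) = 0, exact solution sin(pi x).
n = 8
f = lambda x: np.pi**2 * np.sin(np.pi * x)

def B(i, x):    # Bernstein basis polynomial of degree n
    return comb(n, i) * x**i * (1 - x)**(n - i)

def B1(i, x):   # Bernstein basis polynomial of degree n - 1
    if i < 0 or i > n - 1:
        return 0.0
    return comb(n - 1, i) * x**i * (1 - x)**(n - 1 - i)

def dB(i, x):   # d/dx B_{i,n} = n (B_{i-1,n-1} - B_{i,n-1})
    return n * (B1(i - 1, x) - B1(i, x))

# Interior basis functions (i = 1..n-1) vanish at both endpoints,
# so the Dirichlet conditions are satisfied by construction.
idx = list(range(1, n))
A = np.array([[quad(lambda x: dB(i, x) * dB(j, x), 0, 1)[0] for j in idx]
              for i in idx])
rhs = np.array([quad(lambda x: f(x) * B(i, x), 0, 1)[0] for i in idx])
c = np.linalg.solve(A, rhs)

u = lambda x: sum(ci * B(i, x) for ci, i in zip(c, idx))
print(abs(u(0.5) - 1.0))  # error at x = 0.5 (exact value sin(pi/2) = 1)
```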
Precision and Recall are two prominent metrics of generative performance, proposed to separately measure the fidelity and diversity of generative models. Given their central role in comparing and improving generative models, understanding their limitations is crucially important. To that end, in this work, we identify a critical flaw in the common approximation of these metrics using k-nearest-neighbors: the very interpretations of fidelity and diversity assigned to Precision and Recall can fail in high dimensions, leading to misleading conclusions. Specifically, we show empirically and theoretically that as the number of dimensions grows, two model distributions whose supports lie at equal point-wise distance from the support of the real distribution can have vastly different Precision and Recall regardless of their respective distributions, revealing an emergent asymmetry in high dimensions. Based on our theoretical insights, we then provide simple yet effective modifications to these metrics that make them symmetric regardless of the number of dimensions. Finally, we provide experiments on real-world datasets illustrating that the identified flaw is not merely a pathological case, and that our proposed metrics are effective in alleviating its impact.
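A minimal sketch of the k-nearest-neighbor approximation in question, in the style of the common manifold-membership estimator (k, the sample sizes, and the Gaussian toy data are arbitrary choices, not the paper's setup):

```python
import numpy as np

def knn_radii(X, k):
    """k-th nearest-neighbor distance per point (row 0 of the sorted
    distances is the self-distance, so index k gives the k-th NN)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    return np.sort(d, axis=1)[:, k]

def coverage(A, B, rB):
    """Fraction of points in A lying inside B's estimated manifold,
    i.e. within the k-NN radius of at least one point of B."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return float(np.mean((d <= rB[None, :]).any(axis=1)))

rng = np.random.default_rng(0)
real = rng.normal(size=(500, 64))
fake = rng.normal(size=(500, 64))
r_real, r_fake = knn_radii(real, k=3), knn_radii(fake, k=3)
precision = coverage(fake, real, r_real)  # fidelity proxy
recall    = coverage(real, fake, r_fake)  # diversity proxy
print(precision, recall)
```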
In this paper, we use the optimization formulation of nonlinear Kalman filtering and smoothing problems to develop second-order variants of iterated Kalman smoother (IKS) methods. We show that Newton's method corresponds to a recursion over affine smoothing problems on a modified state-space model augmented by a pseudo-measurement. The first and second derivatives required in this approach can be efficiently computed with widely available automatic differentiation tools. Furthermore, we show how to incorporate line-search and trust-region strategies into the proposed second-order IKS algorithm in order to regularize updates between iterations. Finally, we provide numerical examples to demonstrate the method's efficiency in terms of runtime compared to its batch counterpart.
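A minimal sketch of the automatic differentiation ingredient (the observation function `h` below is a toy stand-in, not one of the paper's models): the first derivatives used by a Gauss-Newton-style IKS and the second derivatives needed for the Newton correction are each one call in JAX:

```python
import jax
import jax.numpy as jnp

# Toy nonlinear observation model; in an IKS this would be the
# measurement function of the state-space model.
def h(x):
    return jnp.array([jnp.sin(x[0]) * x[1]])

x = jnp.array([0.3, 1.2])
J = jax.jacobian(h)(x)                   # first derivative (Gauss-Newton/IKS)
H = jax.hessian(lambda x: h(x)[0])(x)    # second derivative (Newton term)
print(J, H)
```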
Position-based dynamics (PBD) is a powerful technique for simulating a variety of materials. Its primary strength is its robustness when run with a limited computational budget. We develop a novel approach that addresses the problems PBD exhibits for quasistatic hyperelastic materials. Even though PBD is based on the projection of static constraints, it is best suited to dynamic simulations. This is particularly relevant since the efficient creation of large data sets of plausible, but not necessarily accurate, elastic equilibria is of increasing importance with the emergence of quasistatic neural networks. Furthermore, PBD projects one constraint at a time; we show that ignoring the effects of neighboring constraints limits its convergence and stability properties. Recent works have shown that PBD can be related to the Gauss-Seidel approximation of a Lagrange multiplier formulation of backward Euler time stepping, where each constraint is solved/projected independently of the others in an iterative fashion. We show that a position-based, rather than constraint-based, nonlinear Gauss-Seidel approach resolves these problems. Our approach retains the essential PBD feature of stable behavior with constrained computational budgets, but also allows for convergent behavior with expanded budgets. We demonstrate the efficacy of our method on a variety of representative hyperelastic problems and show that both successive over-relaxation (SOR) and Chebyshev acceleration can be easily applied.
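A minimal sketch of the position-based update order on a toy problem (a 1D chain of unit-stiffness springs with pinned endpoints; all values are arbitrary): each Gauss-Seidel sweep visits one node at a time and minimizes the local energy over that node's position with its neighbors held fixed. For linear springs the local problem is quadratic, so the node-wise minimizer is exact; for a hyperelastic energy one would take a local Newton step instead.

```python
import numpy as np

# 1D chain of unit-stiffness springs with rest length L, endpoints pinned
# at x = 0 and x = 5; start from a deliberately non-uniform configuration.
n = 10
x = 5.0 * np.linspace(0.0, 1.0, n) ** 2

for sweep in range(200):
    for i in range(1, n - 1):            # endpoints stay pinned
        # Local energy 0.5*(x[i]-x[i-1]-L)^2 + 0.5*(x[i+1]-x[i]-L)^2 is
        # quadratic in x[i]; its exact node-wise minimizer (L cancels):
        x[i] = 0.5 * (x[i - 1] + x[i + 1])

print(np.diff(x))   # spring lengths relax toward a uniform equilibrium
```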
The recently proposed Generalized Time-domain Velocity Vector (GTVV) is a generalization of the relative room impulse response in the spherical harmonic (aka Ambisonic) domain that allows for blind estimation of early-echo parameters: the directions and relative delays of individual reflections. However, the derived closed-form expression of GTVV requires a few assumptions to hold, the most important being that the impulse response of the reference signal is a minimum-phase filter. In practice, the reference is obtained by spatial filtering towards the Direction-of-Arrival of the source, and the aforementioned condition is limited by the performance of the applied beamformer (and thus by the Ambisonic array order). In the present work, we propose to circumvent this problem by properly modelling the GTVV time series, which permits not only relaxing the initial assumptions, but also extracting the information therein in a more consistent and efficient manner, entering the realm of blind system identification. Experiments using measured room impulse responses confirm the effectiveness of the proposed approach.
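A heavily simplified sketch of how such a time series can be formed, assuming first-order Ambisonic responses: the velocity-vector channels are deconvolved by a reference channel in the frequency domain (here the omnidirectional W channel stands in for the beamformer output that GTVV would use, the "responses" are random placeholders rather than measured RIRs, and the regularizer `eps` is an ad hoc choice):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1024
# Placeholder first-order Ambisonic (B-format) impulse responses.
W = rng.normal(size=n)
X, Y, Z = (rng.normal(size=n) for _ in range(3))

Wf = np.fft.rfft(W)
eps = 1e-8                                   # guard against near-zero bins
gtvv = np.stack([np.fft.irfft(np.fft.rfft(c) / (Wf + eps), n)
                 for c in (X, Y, Z)])        # (3, n) velocity-vector series
print(gtvv.shape)
```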
For sets $\mathcal Q$ and $\mathcal Y$, the generalized Fr\'echet mean $m \in \mathcal Q$ of a random variable $Y$, which takes values in $\mathcal Y$, is any minimizer of $q\mapsto \mathbb E[\mathfrak c(q,Y)]$, where $\mathfrak c \colon \mathcal Q \times \mathcal Y \to \mathbb R$ is a cost function. Few restrictions are placed on $\mathcal Q$ and $\mathcal Y$; in particular, $\mathcal Q$ can be a non-Euclidean metric space. We provide convergence rates for the empirical generalized Fr\'echet mean, giving conditions for rates in probability and rates in expectation. In contrast to previous results on Fr\'echet means, we do not require $\mathcal Q$ or $\mathcal Y$ to have finite diameter. Instead, we assume an inequality, which we call the quadruple inequality; it generalizes an otherwise common Lipschitz condition on the cost function. This quadruple inequality is known to hold in Hadamard spaces. We show that it also holds in a suitable form for certain powers of the metric of a Hadamard space.
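For reference, the quadruple inequality in the special case $\mathcal Q = \mathcal Y$ with squared-distance cost, where it is known to hold in Hadamard spaces (a sketch in common notation, cf. Sturm's four-point inequality; the general formulation replaces the squared distances by a cost $\mathfrak c$):

```latex
% Quadruple inequality for c(q, y) = d(q, y)^2 on a Hadamard space (Q, d):
\begin{equation*}
  d(q,y)^2 - d(p,y)^2 - d(q,z)^2 + d(p,z)^2
  \;\le\; 2\, d(q,p)\, d(y,z)
  \qquad \text{for all } q, p, y, z \in Q.
\end{equation*}
```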
We consider the problem of computing a grevlex Gr\"obner basis for the set $F_r(M)$ of minors of size $r$ of an $n\times n$ matrix $M$ of generic linear forms over a field of characteristic zero or large enough. Such sets are not regular sequences; in fact, the ideal $\langle F_r(M) \rangle$ cannot be generated by a regular sequence. As such, when using the general-purpose algorithm $F_5$ to find the sought Gr\"obner basis, some computing time is wasted on reductions to zero. We use known results about the first syzygy module of $F_r(M)$ to refine the $F_5$ algorithm in order to detect more reductions to zero. In practice, our approach avoids a significant number of reductions to zero. In particular, in the case $r=n-2$, we prove that our new algorithm avoids all reductions to zero, and we provide a corresponding complexity analysis which improves upon the previously known estimates.
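A minimal sketch of the object of study, not of the refined $F_5$ algorithm itself (random integer coefficients stand in for "generic", and the instance is kept tiny so that sympy's general-purpose engine can handle it):

```python
import random
from itertools import combinations
import sympy as sp

rng = random.Random(0)
xs = sp.symbols('x0 x1 x2')

# 3x3 matrix of "generic" linear forms: each entry gets fresh random
# coefficients, standing in for generic ones.
M = sp.Matrix(3, 3, lambda i, j: sum(rng.randint(1, 9) * v for v in xs))

# All 2x2 minors: choose two rows and two columns.
minors = [M.extract(list(rs), list(cs)).det()
          for rs in combinations(range(3), 2)
          for cs in combinations(range(3), 2)]

# Grevlex Groebner basis of the ideal generated by the minors.
G = sp.groebner(minors, *xs, order='grevlex')
print(len(G.exprs))
```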
The goal of explainable Artificial Intelligence (XAI) is to generate human-interpretable explanations, but there are no computationally precise theories of how humans interpret AI-generated explanations. This lack of theory means that validation of XAI must be done empirically, on a case-by-case basis, which prevents systematic theory-building in XAI. We propose a psychological theory of how humans draw conclusions from saliency maps, the most common form of XAI explanation, which for the first time allows for precise prediction of explainee inferences conditioned on explanation. Our theory posits that, absent an explanation, humans expect the AI to make decisions similar to their own, and that they interpret an explanation by comparing it to the explanations they themselves would give. Comparison is formalized via Shepard's universal law of generalization in a similarity space, a classic theory from cognitive science. A pre-registered user study on AI image classifications with saliency map explanations demonstrates that our theory quantitatively matches participants' predictions of the AI.
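A minimal sketch of the comparison step via Shepard's universal law of generalization, under which similarity decays exponentially with distance in a psychological similarity space (the two-dimensional embeddings and the sensitivity parameter below are toy stand-ins; in the paper's setting the similarity space would be estimated from human data):

```python
import numpy as np

def shepard_similarity(a, b, sensitivity=1.0):
    """Shepard's law: similarity = exp(-sensitivity * distance) in a
    psychological similarity space."""
    return np.exp(-sensitivity * np.linalg.norm(a - b))

# Toy embeddings standing in for explanation representations
# (e.g., featurized saliency maps).
human_explanation = np.array([0.2, 0.7])  # the explainee's own explanation
ai_explanation    = np.array([0.3, 0.5])  # the AI's saliency map, embedded

# Predicted degree to which the explainee infers the AI decided as they would:
print(shepard_similarity(human_explanation, ai_explanation))
```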