We apply random matrix theory to study the impact of measurement uncertainty on dynamic mode decomposition. Specifically, when the measurements follow a normal probability density function, we show how the moments of that density propagate through the dynamic mode decomposition. While we focus on the first and second moments, the analytical expressions we derive are general and can be extended to higher-order moments. Further, the proposed numerical method to propagate uncertainty is agnostic of specific dynamic mode decomposition formulations. Of particular relevance, the estimated second moments provide confidence bounds that may be used as a metric of trustworthiness, that is, how much one can rely on a finite-dimensional linear operator to represent an underlying dynamical system. We perform numerical experiments on two canonical systems and verify the estimated confidence levels by comparing the moments to those obtained from Monte Carlo simulations.
We propose a novel score-based particle method for solving the Landau equation in plasmas, that seamlessly integrates learning with structure-preserving particle methods [arXiv:1910.03080]. Building upon the Lagrangian viewpoint of the Landau equation, a central challenge stems from the nonlinear dependence of the velocity field on the density. Our primary innovation lies in recognizing that this nonlinearity is in the form of the score function, which can be approximated dynamically via techniques from score-matching. The resulting method inherits the conservation properties of the deterministic particle method while sidestepping the necessity for kernel density estimation in [arXiv:1910.03080]. This streamlines computation and enhances scalability with dimensionality. Furthermore, we provide a theoretical estimate by demonstrating that the KL divergence between our approximation and the true solution can be effectively controlled by the score-matching loss. Additionally, by adopting the flow map viewpoint, we derive an update formula for exact density computation. Extensive examples have been provided to show the efficiency of the method, including a physically relevant case of Coulomb interaction.
We derive optimal rates of convergence in the supremum norm for estimating the H\"older-smooth mean function of a stochastic process which is repeatedly and discretely observed with additional errors at fixed, multivariate, synchronous design points, the typical scenario for machine recorded functional data. Similarly to the optimal rates in $L_2$ obtained in \citet{cai2011optimal}, for sparse design a discretization term dominates, while in the dense case the parametric $\sqrt n$ rate can be achieved as if the $n$ processes were continuously observed without errors. The supremum norm is of practical interest since it corresponds to the visualization of the estimation error, and forms the basis for the construction uniform confidence bands. We show that in contrast to the analysis in $L_2$, there is an intermediate regime between the sparse and dense cases dominated by the contribution of the observation errors. Furthermore, under the supremum norm interpolation estimators which suffice in $L_2$ turn out to be sub-optimal in the dense setting, which helps to explain their poor empirical performance. In contrast to previous contributions involving the supremum norm, we discuss optimality even in the multivariate setting, and for dense design obtain the $\sqrt n$ rate of convergence without additional logarithmic factors. We also obtain a central limit theorem in the supremum norm, and provide simulations and real data applications to illustrate our results.
In this article, a posteriori error analysis of the elliptic obstacle problem is addressed using hybrid high-order methods. The method involve cell unknowns represented by degree-$r$ polynomials and face unknowns represented by degree-$s$ polynomials, where $r=0$ and $s$ is either $0$ or $1$. The discrete obstacle constraints are specifically applied to the cell unknowns. The analysis hinges on the construction of a suitable Lagrange multiplier, a residual functional and a linear averaging map. The reliability and the efficiency of the proposed a posteriori error estimator is discussed, and the study is concluded by numerical experiments supporting the theoretical results.
Semi-algebraic priors are ubiquitous in signal processing and machine learning. Prevalent examples include a) linear models where the signal lies in a low-dimensional subspace; b) sparse models where the signal can be represented by only a few coefficients under a suitable basis; and c) a large family of neural network generative models. In this paper, we prove a transversality theorem for semi-algebraic sets in orthogonal or unitary representations of groups: with a suitable dimension bound, a generic translate of any semi-algebraic set is transverse to the orbits of the group action. This, in turn, implies that if a signal lies in a low-dimensional semi-algebraic set, then it can be recovered uniquely from measurements that separate orbits. As an application, we consider the implications of the transversality theorem to the problem of recovering signals that are translated by random group actions from their second moment. As a special case, we discuss cryo-EM: a leading technology to constitute the spatial structure of biological molecules, which serves as our prime motivation. In particular, we derive explicit bounds for recovering a molecular structure from the second moment under a semi-algebraic prior and deduce information-theoretic implications. We also obtain information-theoretic bounds for three additional applications: factoring Gram matrices, multi-reference alignment, and phase retrieval. Finally, we deduce bounds for designing permutation invariant separators in machine learning.
Discretization of the uniform norm of functions from a given finite dimensional subspace of continuous functions is studied. Previous known results show that for any $N$-dimensional subspace of the space of continuous functions it is sufficient to use $e^{CN}$ sample points for an accurate upper bound for the uniform norm by the discrete norm and that one cannot improve on the exponential growth of the number of sampling points for a good discretization theorem in the uniform norm. In this paper we focus on two types of results, which allow us to obtain good discretization of the uniform norm with polynomial in $N$ number of points. In the first way we weaken the discretization inequality by allowing a bound of the uniform norm by the discrete norm multiplied by an extra factor, which may depend on $N$. In the second way we impose restrictions on the finite dimensional subspace under consideration. In particular, we prove a general result, which connects the upper bound on the number of sampling points in the discretization theorem for the uniform norm with the best $m$-term bilinear approximation of the Dirichlet kernel associated with the given subspace.
We describe a novel algorithm for solving general parametric (nonlinear) eigenvalue problems. Our method has two steps: first, high-accuracy solutions of non-parametric versions of the problem are gathered at some values of the parameters; these are then combined to obtain global approximations of the parametric eigenvalues. To gather the non-parametric data, we use non-intrusive contour-integration-based methods, which, however, cannot track eigenvalues that migrate into/out of the contour as the parameter changes. Special strategies are described for performing the combination-over-parameter step despite having only partial information on such migrating eigenvalues. Moreover, we dedicate a special focus to the approximation of eigenvalues that undergo bifurcations. Finally, we propose an adaptive strategy that allows one to effectively apply our method even without any a priori information on the behavior of the sought-after eigenvalues. Numerical tests are performed, showing that our algorithm can achieve remarkably high approximation accuracy.
The property of reversibility is quite meaningful for the classic theoretical computer science model, cellular automata. For the reversibility problem for a CA under null boundary conditions, while linear rules have been studied a lot, the non-linear rules remain unexplored at present. The paper investigates the reversibility problem of general one-dimensional CA on a finite field $\mathbb{Z}_p$, and proposes an approach to optimize the Amoroso's infinite CA surjectivity detection algorithm. This paper proposes algorithms for deciding the reversibility of one-dimensional CA under null boundary conditions. We propose a method to decide the strict reversibility of one-dimensional CA under null boundary conditions. We also provide a bucket chain based algorithm for calculating the reversibility function of one-dimensional CA under null boundary conditions. These decision algorithms work for not only linear rules but also non-linear rules. In addition, it has been confirmed that the reversibility function always has a period, and its periodicity is related to the periodicity of the corresponding bucket chain. Some of our experiment results of reversible CA are presented in the paper, complementing and validating the theoretical aspects, and thereby further supporting the research conclusions of this paper.
In reinsurance, Poisson and Negative binomial distributions are employed for modeling frequency. However, the incomplete data regarding reported incurred claims above a priority level presents challenges in estimation. This paper focuses on frequency estimation using Schnieper's framework for claim numbering. We demonstrate that Schnieper's model is consistent with a Poisson distribution for the total number of claims above a priority at each year of development, providing a robust basis for parameter estimation. Additionally, we explain how to build an alternative assumption based on a Negative binomial distribution, which yields similar results. The study includes a bootstrap procedure to manage uncertainty in parameter estimation and a case study comparing assumptions and evaluating the impact of the bootstrap approach.
We consider the computation of statistical moments to operator equations with stochastic data. We remark that application of PINNs -- referred to as TPINNs -- allows to solve the induced tensor operator equations under minimal changes of existing PINNs code, and enabling handling of non-linear and time-dependent operators. We propose two types of architectures, referred to as vanilla and multi-output TPINNs, and investigate their benefits and limitations. Exhaustive numerical experiments are performed; demonstrating applicability and performance; raising a variety of new promising research avenues.
The log-rank conjecture, a longstanding problem in communication complexity, has persistently eluded resolution for decades. Consequently, some recent efforts have focused on potential approaches for establishing the conjecture in the special case of XOR functions, where the communication matrix is lifted from a boolean function, and the rank of the matrix equals the Fourier sparsity of the function, which is the number of its nonzero Fourier coefficients. In this note, we refute two conjectures. The first has origins in Montanaro and Osborne (arXiv'09) and is considered in Tsang et al. (FOCS'13), and the second one is due to Mande and Sanyal (FSTTCS'20). These conjectures were proposed in order to improve the best-known bound of Lovett (STOC'14) regarding the log-rank conjecture in the special case of XOR functions. Both conjectures speculate that the set of nonzero Fourier coefficients of the boolean function has some strong additive structure. We refute these conjectures by constructing two specific boolean functions tailored to each.