Quadratic programming over orthogonal matrices encompasses a broad class of hard optimization problems that do not have an efficient quantum representation. Such problems are instances of the little noncommutative Grothendieck problem (LNCG), a generalization of binary quadratic programs to continuous, noncommutative variables. In this work, we establish a natural embedding of this class of LNCG problems into a fermionic Hamiltonian, thereby enabling the study of this classical problem with the tools of quantum information. The embedding is accomplished by identifying the orthogonal group with its double cover, which can be represented as fermionic quantum states. Correspondingly, the embedded LNCG Hamiltonian is a two-body fermion model. Determining extremal states of this Hamiltonian provides an outer approximation to the original problem, a quantum analogue of classical semidefinite relaxations. In particular, when optimizing over the special orthogonal group, our quantum relaxation obeys additional, powerful constraints based on the convex hull of rotation matrices. The classical size of this convex-hull representation is exponential in the matrix dimension, whereas our quantum representation requires only a linear number of qubits. Finally, to project the relaxed solution back into the feasible space, we propose rounding procedures that return orthogonal matrices from appropriate measurements of the quantum state. Through numerical experiments, we provide evidence that this rounded quantum relaxation can produce high-quality approximations.
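For concreteness, one common way to write such a problem (our notation; the paper's precise formulation may differ in details) is
$$
\max_{O_1,\dots,O_n \,\in\, \mathrm{O}(d)} \; \sum_{i,j=1}^{n} \operatorname{tr}\!\left( C_{ij}\, O_i O_j^{\mathsf{T}} \right),
$$
where the $C_{ij}$ are given $d \times d$ cost matrices. Setting $d=1$ recovers binary quadratic programming over $\mathrm{O}(1)=\{\pm 1\}$, and restricting the variables to $\mathrm{SO}(d)$ gives the rotation-matrix variant discussed above.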
Kinetic approaches are generally accurate for microscale plasma physics problems but are computationally expensive for large-scale or multiscale systems. One of the long-standing problems in plasma physics is the integration of kinetic physics into fluid models, which is often achieved through sophisticated analytical closure terms. In this paper, we construct a multi-moment fluid model whose fluid closure is learned implicitly by a neural network. The multi-moment fluid model is trained with a small fraction of sparsely sampled data from kinetic simulations of Landau damping, using the physics-informed neural network (PINN) and the gradient-enhanced physics-informed neural network (gPINN). The multi-moment fluid model constructed using either PINN or gPINN reproduces the time evolution of the electric field energy, including its damping rate, and the plasma dynamics of the kinetic simulations. In addition, we introduce a variant of the gPINN architecture, gPINN$p$, to capture the Landau damping process. Instead of including the gradients of all the equation residuals, gPINN$p$ adds only the gradient of the pressure-equation residual as one additional constraint. Among the three approaches, the gPINN$p$-constructed multi-moment fluid model offers the most accurate results. This work sheds light on the accurate and efficient modeling of large-scale systems and can be extended to complex multiscale laboratory, space, and astrophysical plasma physics problems.
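To make the gPINN$p$ idea concrete, here is a minimal PyTorch sketch of such a loss, assuming a network that maps $(x,t)$ to three fluid moments. The residuals shown are schematic one-dimensional moment equations, not the paper's exact closure, and names such as `gpinn_p_loss` and the weight `0.1` are placeholders.

```python
import torch

def grads(f, x):
    """First derivatives of the scalar field f (shape [N]) w.r.t. x (shape [N, 2])."""
    return torch.autograd.grad(f.sum(), x, create_graph=True)[0]

# Network mapping (x, t) to three moments: density n, velocity u, pressure p.
net = torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.Tanh(),
                          torch.nn.Linear(64, 64), torch.nn.Tanh(),
                          torch.nn.Linear(64, 3))

def gpinn_p_loss(xt, data_xt, data_vals):
    xt = xt.clone().requires_grad_(True)
    n, u, p = net(xt).unbind(dim=-1)
    dn, du, dp = grads(n, xt), grads(u, xt), grads(p, xt)  # columns: (d/dx, d/dt)
    # Schematic moment-equation residuals (placeholders for the model's closure):
    r_cont = dn[:, 1] + n * du[:, 0] + u * dn[:, 0]        # n_t + (n u)_x
    r_pres = dp[:, 1] + u * dp[:, 0] + 3.0 * p * du[:, 0]  # p_t + u p_x + 3 p u_x
    loss = (r_cont**2).mean() + (r_pres**2).mean()
    # gPINN_p: additionally penalize the gradient of the *pressure* residual only.
    loss = loss + 0.1 * (grads(r_pres, xt)**2).mean()
    # Data term on sparsely sampled kinetic-simulation moments.
    return loss + ((net(data_xt) - data_vals)**2).mean()

loss = gpinn_p_loss(torch.rand(1024, 2), torch.rand(64, 2), torch.rand(64, 3))
loss.backward()  # gradients w.r.t. the network parameters
```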
The power of Clifford, or geometric, algebra lies in its ability to represent geometric operations in a concise and elegant manner. Clifford algebras provide natural generalizations of complex numbers, dual numbers, and quaternions into non-commutative multivectors. This paper demonstrates an algorithm for computing the inverses of such multivectors in a non-degenerate Clifford algebra of arbitrary dimension. The algorithm is a variation of the Faddeev-LeVerrier-Souriau algorithm and is implemented in the open-source computer algebra system Maxima. Symbolic and numerical examples in different Clifford algebras are presented.
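The classical matrix form of the Faddeev-LeVerrier-Souriau recursion is easy to state; a small NumPy sketch follows (the paper's contribution is the adaptation of this recursion to Clifford multivectors in Maxima, which is not reproduced here):

```python
import numpy as np

def fls_inverse(A):
    """Inverse of an n x n matrix A via the Faddeev-LeVerrier-Souriau recursion:
    M_k = A M_{k-1} + c_{n-k+1} I,  c_{n-k} = -tr(A M_k)/k,  A^{-1} = -M_n / c_0."""
    n = A.shape[0]
    M, c = np.zeros_like(A, dtype=float), 1.0
    for k in range(1, n + 1):
        M = A @ M + c * np.eye(n)
        c = -np.trace(A @ M) / k
    if abs(c) < 1e-12:
        raise np.linalg.LinAlgError("matrix is (numerically) singular")
    return -M / c

A = np.array([[2.0, 1.0], [1.0, 3.0]])
assert np.allclose(fls_inverse(A), np.linalg.inv(A))
```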
In the estimation of a normal mean matrix under the matrix quadratic loss, we develop a general formula for the matrix quadratic risk of orthogonally invariant estimators. The derivation is based on several formulas for matrix derivatives of orthogonally invariant functions of matrices. As an application, we calculate the matrix quadratic risk of a singular value shrinkage estimator motivated by Stein's proposal, made 50 years ago, for improving on the Efron--Morris estimator.
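For orientation (our notation, following the standard setup in this literature rather than anything stated in the abstract): for an observation $X \sim N_{p \times q}(M, I_p \otimes I_q)$, the matrix quadratic loss is the $q \times q$ matrix
$$
L(M, \hat M) = (\hat M - M)^{\top} (\hat M - M),
$$
with estimators compared in the Löwner order, and the Efron--Morris estimator is $\hat M_{\mathrm{EM}} = X \left( I_q - (p - q - 1)(X^{\top} X)^{-1} \right)$.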
We discuss applications of exact structures and relative homological algebra to the study of invariants of multiparameter persistence modules. This paper is mostly expository, but does contain a pair of novel results. Over finite posets, classical arguments about the relative projective modules of an exact structure make use of Auslander-Reiten theory. One of our results establishes a new adjunction which allows us to "lift" these arguments to certain infinite posets over which Auslander-Reiten theory is not available. We give several examples of this lifting, in particular highlighting the non-existence and existence of resolutions by upsets when working with finitely presentable representations of the plane and of the closure of the positive quadrant, respectively. We then restrict our attention to finite posets. In this setting, we discuss the relationship between the global dimension of an exact structure and the representation dimension of the incidence algebra of the poset. We conclude with our second novel contribution. This is an explicit description of the irreducible morphisms between relative projective modules for several exact structures which have appeared previously in the literature.
We consider finite element approximations to the optimal constant for the Hardy inequality with exponent $p=2$ in bounded domains of dimension $n=1$ or $n\geq 3$. For finite element spaces of piecewise linear and continuous functions on a mesh of size $h$, we prove that the approximate Hardy constant, $S_h^n$, converges to the optimal Hardy constant $S^n$ no slower than $O(1/\vert \log h \vert)$. We also show that the convergence is no faster than $O(1/\vert \log h \vert^2)$ if $n=1$, or if $n\geq 3$ and the domain is the unit ball and the finite element discretization exploits the rotational symmetry of the problem. Our estimates are compared to exact values for $S_h^n$ obtained computationally.
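In its classical form (our notation; the paper's precise setting may differ), the quantities involved are
$$
S^n \;=\; \inf_{0 \neq u \in H^1_0(\Omega)} \frac{\int_\Omega |\nabla u|^2 \, dx}{\int_\Omega u^2 / |x|^2 \, dx},
$$
whose value is $\left((n-2)/2\right)^2$ for $n \geq 3$ and $1/4$ for $n=1$, the infimum not being attained; $S_h^n$ denotes the same Rayleigh quotient minimized over the piecewise linear finite element space.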
We propose a new discrete choice model, called the generalized stochastic preference (GSP) model, that incorporates non-rationality into the stochastic preference (SP) choice model, also known as the rank-based choice model. Our model can explain several choice phenomena that cannot be represented by any SP model, such as the compromise and attraction effects, while still subsuming the SP model class. The GSP model is defined as a distribution over consumer types, where each type extends the choice behavior of the rational types in the SP model. We build on existing methods for estimating the SP model and propose an iterative estimation algorithm for the GSP model that finds new types by solving an integer linear program in each iteration. We further show that our proposed notion of non-rationality can be incorporated into other choice models, such as the random utility maximization (RUM) model class and any of its subclasses. As a concrete example, we introduce the non-rational extension of the classical MNL model, which we term the generalized MNL (GMNL) model, and present an efficient expectation-maximization (EM) algorithm for estimating it. Numerical evaluation on real choice data shows that the GMNL and GSP models can outperform their rational counterparts in out-of-sample prediction accuracy.
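The key modelling ingredient can be illustrated in a few lines of Python: where an SP (rank-based) type always picks its most preferred available product, a GSP type is, roughly, a ranking together with a position $k$, and picks the $k$-th most preferred available product (a sketch in our notation; the paper's exact definition of types may include further details such as a no-purchase option).

```python
def gsp_choice(ranking, k, offer_set):
    """Choice of a GSP type (ranking, k): the k-th most preferred available
    product; k = 1 recovers a rational (SP) type."""
    available = [prod for prod in ranking if prod in offer_set]
    return available[min(k, len(available)) - 1]

ranking = [3, 1, 4, 2]                          # most to least preferred
assert gsp_choice(ranking, 1, {1, 2, 4}) == 1   # rational behaviour
assert gsp_choice(ranking, 2, {1, 2, 4}) == 4   # non-rational behaviour
```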
Algorithms for solving the linear classification problem have a long history, dating back at least to 1936 with linear discriminant analysis. For linearly separable data, many algorithms can obtain the exact solution to the corresponding 0-1 loss classification problem efficiently, but for data which is not linearly separable, this problem has been shown to be NP-hard in full generality. Alternative approaches all involve approximations of some kind, including the use of surrogates for the 0-1 loss (for example, the hinge or logistic loss) or approximate combinatorial search, none of which can be guaranteed to solve the problem exactly. Finding an efficient algorithm that obtains an exact, i.e., globally optimal, solution of the 0-1 loss linear classification problem in fixed dimension has remained an open problem. In the research reported here, we detail the rigorous construction of a new algorithm, incremental cell enumeration (ICE), that solves the 0-1 loss classification problem exactly in polynomial time. We prove correctness using concepts from the theory of hyperplane arrangements and oriented matroids. We demonstrate the effectiveness of the algorithm on synthetic and real-world datasets, showing optimal accuracy both in and out of sample, in practical computational time. We also demonstrate empirically how the use of an approximate upper bound yields substantial run-time improvements whilst retaining the algorithm's exactness. To our knowledge, this is the first rigorously proven, polynomial-time, practical algorithm for this long-standing problem.
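For intuition about exact search over the cells of a hyperplane arrangement, here is a brute-force baseline in two dimensions (our illustration, not the ICE algorithm itself): in general position, an optimal hyperplane can be perturbed onto one passing through two data points, so enumerating those hyperplanes finds a global optimum in $O(n^3)$ time.

```python
import numpy as np
from itertools import combinations

def exact_01_loss_2d(X, y):
    """Globally optimal 0-1 loss linear classifier in 2-D by brute force.
    X: (n, 2) float array; y: labels in {-1, +1}."""
    best_err, best_wb = len(y) + 1, None
    for i, j in combinations(range(len(y)), 2):
        w = np.array([X[j, 1] - X[i, 1], X[i, 0] - X[j, 0]])  # normal of line ij
        for eps in (1e-9, -1e-9):          # nudge off the line to break ties
            b = -(w @ X[i]) + eps
            pred = np.where(X @ w + b > 0, 1, -1)
            for s in (1, -1):              # both orientations of the hyperplane
                err = int(np.sum(s * pred != y))
                if err < best_err:
                    best_err, best_wb = err, (s * w, s * b)
    return best_err, best_wb
```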
We provide a framework for the numerical approximation of distributed optimal control problems, based on least-squares finite element methods. Our proposed method simultaneously solves the state and adjoint equations and is inf-sup stable for any choice of conforming discretization spaces. A reliable and efficient a posteriori error estimator is derived for problems where box constraints are imposed on the control. It can be localized and therefore used to steer an adaptive algorithm. For unconstrained optimal control problems, i.e., when the set of admissible controls is a Hilbert space, we obtain a coercive least-squares method and, in particular, quasi-optimality for any choice of discrete approximation space. For constrained problems, we derive and analyze a variational inequality where the PDE part is tackled by least-squares finite element methods. We show that the abstract framework applies to a wide range of problems, including scalar second-order PDEs, the Stokes problem, and parabolic problems on space-time domains. Numerical examples for selected problems are presented.
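A prototypical instance (our notation, not taken verbatim from the paper) is the tracking-type problem
$$
\min_{u,f} \; \tfrac12 \| u - u_d \|_{L^2(\Omega)}^2 + \tfrac{\alpha}{2} \| f \|_{L^2(\Omega)}^2 \quad \text{subject to} \quad -\Delta u = f \ \text{in } \Omega, \quad u = 0 \ \text{on } \partial\Omega,
$$
whose first-order optimality system couples the state $u$ and adjoint $p$: $-\Delta u = f$ and $-\Delta p = u - u_d$ (both with homogeneous Dirichlet conditions), together with $p + \alpha f = 0$ in the unconstrained case. The least-squares approach minimizes the squared residuals of this coupled system over conforming discrete spaces.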
High-dimensional partial differential equations (PDEs) are a popular mathematical modelling tool, with applications ranging from finance to computational chemistry. However, standard numerical techniques for solving these PDEs are typically affected by the curse of dimensionality. In this work, we tackle this challenge while focusing on stationary diffusion equations defined over a high-dimensional domain with periodic boundary conditions. Inspired by recent progress in sparse function approximation in high dimensions, we propose a new method called compressive Fourier collocation. Combining ideas from compressive sensing and spectral collocation, our method replaces structured collocation grids with Monte Carlo sampling and employs sparse recovery techniques, such as orthogonal matching pursuit and $\ell^1$ minimization, to approximate the Fourier coefficients of the PDE solution. We conduct a rigorous theoretical analysis showing that the approximation error of the proposed method is comparable to that of the best $s$-term approximation (with respect to the Fourier basis) of the solution. Using the recently introduced framework of random sampling in bounded Riesz systems, our analysis shows that compressive Fourier collocation mitigates the curse of dimensionality with respect to the number of collocation points, under sufficient conditions on the regularity of the diffusion coefficient. We also present numerical experiments illustrating the accuracy and stability of the method for the approximation of sparse and compressible solutions.
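A self-contained NumPy sketch of the idea for the constant-coefficient case $-\Delta u = f$ on the 2-torus follows (the paper handles general periodic diffusion coefficients and genuinely high dimensions; all names, the frequency cut-off, and the sample sizes below are our choices for illustration):

```python
import numpy as np

def omp(A, b, s):
    """Orthogonal matching pursuit: greedy s-sparse solution of A c ~ b."""
    support, residual = [], b.copy()
    for _ in range(s):
        support.append(int(np.argmax(np.abs(A.conj().T @ residual))))
        coef, *_ = np.linalg.lstsq(A[:, support], b, rcond=None)
        residual = b - A[:, support] @ coef
    c = np.zeros(A.shape[1], dtype=complex)
    c[support] = coef
    return c

rng = np.random.default_rng(0)
K, d, m, s = 8, 2, 200, 5
freqs = np.array([(k1, k2) for k1 in range(-K, K + 1) for k2 in range(-K, K + 1)])
sym = 4 * np.pi**2 * np.sum(freqs**2, axis=1)   # symbol of -Laplacian per frequency
c_true = np.zeros(len(freqs), dtype=complex)    # planted s-sparse Fourier solution
c_true[rng.choice(np.flatnonzero(sym > 0), size=s, replace=False)] = rng.standard_normal(s)
x = rng.random((m, d))                          # Monte Carlo collocation points
Phi = np.exp(2j * np.pi * (x @ freqs.T))        # Fourier basis at the samples
A, b = Phi * sym, Phi @ (sym * c_true)          # collocation system and data f(x)
print(np.max(np.abs(omp(A, b, s) - c_true)))    # should be near machine precision
```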
Deep learning is usually described as an experiment-driven field and is under continual criticism for lacking theoretical foundations. This problem has been partially addressed by a large volume of literature, which has so far not been well organized. This paper reviews and organizes recent advances in deep learning theory. The literature is categorized into six groups: (1) complexity- and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic differential equations and their dynamical systems for modelling stochastic gradient descent and its variants, which characterize the optimization and generalization of deep learning, partially inspired by Bayesian inference; (3) the geometrical structures of the loss landscape that drive the trajectories of the dynamical systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intensive concerns regarding ethics and security and their relationship with generalizability.