The {\em tensor power method} generalizes the matrix power method to higher-order arrays, or tensors. As in the matrix case, the fixed points of the tensor power method are the eigenvectors of the tensor. While every real symmetric matrix has an eigendecomposition, the vectors generating a symmetric decomposition of a real symmetric tensor are not always eigenvectors of the tensor. In this paper we show that whenever an eigenvector {\em is} a generator of the symmetric decomposition of a symmetric tensor, then (if the order of the tensor is sufficiently high) this eigenvector is {\em robust}, i.e., it is an attracting fixed point of the tensor power method. We exhibit new classes of symmetric tensors whose symmetric decomposition consists of eigenvectors. Generalizing orthogonally decomposable tensors, we consider {\em equiangular tight frame decomposable} and {\em equiangular set decomposable} tensors. Our main result implies that such tensors can be decomposed using the tensor power method.
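For concreteness, the iteration itself is simple; below is a minimal NumPy sketch of the symmetric power iteration for an order-3 tensor (function and variable names are ours, chosen for illustration). Its fixed points are the tensor eigenvectors referred to above, and a robust eigenvector is one that attracts nearby initializations.
\begin{verbatim}
import numpy as np

def tensor_power_iteration(T, x0, iters=100, tol=1e-10):
    """Symmetric tensor power method for an order-3 tensor T.

    Iterates x <- T(I, x, x) / ||T(I, x, x)||; fixed points are unit
    eigenvectors of T in the sense T(I, x, x) = lambda * x.
    """
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        y = np.einsum('ijk,j,k->i', T, x, x)   # contract T with x in two modes
        nrm = np.linalg.norm(y)
        if nrm < tol:                          # x is numerically in the kernel
            break
        x_new = y / nrm
        if np.linalg.norm(x_new - x) < tol:    # converged to a fixed point
            break
        x = x_new
    lam = np.einsum('ijk,i,j,k->', T, x, x, x)  # eigenvalue (Rayleigh-quotient analogue)
    return x, lam
\end{verbatim}
In practice the iteration is run from several random initializations $x_0$; robustness of an eigenvector means that runs started nearby converge to it.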
We study nonparametric Bayesian models for reversible multi-dimensional diffusions with periodic drift. For continuous observation paths, reversibility is exploited to prove a general posterior contraction rate theorem for the drift gradient vector field under approximation-theoretic conditions on the induced prior for the invariant measure. The general theorem is applied to Gaussian priors and $p$-exponential priors, which are shown to converge to the truth at the minimax optimal rate over Sobolev smoothness classes in any dimension.
An $n$-person game is specified by $n$ tensors of the same format. We view its equilibria as points in that tensor space. Dependency equilibria are defined by linear constraints on conditional probabilities, and thus by determinantal quadrics in the tensor entries. These equations cut out the Spohn variety, named after the philosopher who introduced dependency equilibria. The Nash equilibria among these are the tensors of rank one. We study the real algebraic geometry of the Spohn variety. This variety is rational, except for $2 \times 2$ games, when it is an elliptic curve. For $3 \times 2$ games, it is a del Pezzo surface of degree two. We characterize the payoff regions and their boundaries using oriented matroids, and we develop the connection to Bayesian networks in statistics.
In this paper, we propose a new trace finite element method for the {Laplace-Beltrami} eigenvalue problem. The method is formulated directly on a smooth manifold that is implicitly given by a level-set function and requires high-order numerical quadrature on the surface. A comprehensive analysis of the method is provided. We show that the eigenvalues of the discrete Laplace-Beltrami operator coincide with only part of the eigenvalues of an embedded problem, which in turn correspond to the finite eigenvalues of a singular generalized algebraic eigenvalue problem. The finite eigenvalues can be computed efficiently by the rank-completing perturbation algorithm of {\it Hochstenbach et al., SIAM J. Matrix Anal. Appl., 2019} \cite{hochstenbach2019solving}. We prove that the method has an optimal convergence rate. Numerical experiments verify the theoretical analysis and show that geometric consistency can improve the numerical accuracy significantly.
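For orientation, the continuous problem being discretized is the Laplace-Beltrami eigenvalue problem on the surface $\Gamma$, whose weak form reads: find $(u,\lambda)$ with $u \neq 0$ such that
\[
\int_\Gamma \nabla_\Gamma u \cdot \nabla_\Gamma v \,\mathrm{d}s \;=\; \lambda \int_\Gamma u\, v \,\mathrm{d}s \qquad \text{for all test functions } v,
\]
where $\nabla_\Gamma$ denotes the tangential (surface) gradient; the trace finite element discretization and the rank-completing perturbation step are as summarized above.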
In this work, we study a variant of nonnegative matrix factorization in which we wish to find a symmetric factorization of a given input matrix into a sparse, Boolean matrix. Formally speaking, given $\mathbf{M}\in\mathbb{Z}^{m\times m}$, we want to find $\mathbf{W}\in\{0,1\}^{m\times r}$ such that $\| \mathbf{M} - \mathbf{W}\mathbf{W}^\top \|_0$ is minimized among all $\mathbf{W}$ for which each row is $k$-sparse. This question turns out to be closely related to a number of problems, such as recovering a hypergraph from its line graph and reconstruction attacks on private neural network training. As this problem is hard in the worst case, we study a natural average-case variant that arises in the context of these reconstruction attacks: $\mathbf{M} = \mathbf{W}\mathbf{W}^{\top}$ for $\mathbf{W}$ a random Boolean matrix with $k$-sparse rows, and the goal is to recover $\mathbf{W}$ up to column permutation. Equivalently, this can be thought of as recovering a uniformly random $k$-uniform hypergraph from its line graph. Our main result is a polynomial-time algorithm for this problem based on bootstrapping higher-order information about $\mathbf{W}$ and then decomposing an appropriate tensor. The key ingredient in our analysis, which may be of independent interest, is to show that such a matrix $\mathbf{W}$ has full column rank with high probability as soon as $m = \widetilde{\Omega}(r)$, which we do using tools from Littlewood-Offord theory and estimates for binary Krawtchouk polynomials.
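A minimal sketch of the average-case model is given below (function and parameter names are ours); it only generates an instance and states the recovery goal, and does not reproduce the tensor-based recovery algorithm itself.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

def sample_instance(m, r, k):
    """Sample W in {0,1}^{m x r} with exactly k ones per row; return (W, M = W W^T).

    Each row of W is a hyperedge of size k on r vertices, and M records, for each
    pair of hyperedges, the size of their intersection (the line graph of the
    k-uniform hypergraph, with multiplicities)."""
    W = np.zeros((m, r), dtype=int)
    for i in range(m):
        W[i, rng.choice(r, size=k, replace=False)] = 1
    return W, W @ W.T

def same_up_to_column_permutation(W, W_hat):
    """Recovery goal: W_hat should equal W after some permutation of its columns."""
    return sorted(map(tuple, W.T.tolist())) == sorted(map(tuple, W_hat.T.tolist()))

W, M = sample_instance(m=200, r=50, k=3)
assert same_up_to_column_permutation(W, W.copy())
\end{verbatim}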
While many works exploiting an existing Lie group structure have been proposed for state estimation, in particular the Invariant Extended Kalman Filter (IEKF), few papers address the construction of a group structure that allows casting a given system into the IEKF framework, namely making the dynamics group affine and the observations invariant. In this paper we introduce a large class of systems encompassing most problems involving a navigating vehicle encountered in practice. For those systems we introduce a novel methodology that systematically provides a group structure for the state space, including vectors of the body frame such as biases. We use it to derive observers having properties akin to those of linear observers or filters. The proposed unifying and versatile framework encompasses all systems where the IEKF has proved successful, improves on the state-of-the-art "imperfect" IEKF for inertial navigation with sensor biases, and allows addressing novel examples, such as GNSS antenna lever-arm estimation.
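For the reader's convenience, we recall the group-affine property as it is usually stated in the invariant filtering literature: a dynamics $\chi \mapsto f_u(\chi)$ on a matrix Lie group $G$ is group affine if
\[
f_u(\chi_1 \chi_2) \;=\; f_u(\chi_1)\,\chi_2 \;+\; \chi_1\, f_u(\chi_2) \;-\; \chi_1\, f_u(\mathrm{Id})\, \chi_2 \qquad \text{for all } \chi_1, \chi_2 \in G,
\]
a condition under which the IEKF error equation becomes independent of the estimated trajectory; constructing a group structure making a given system satisfy it is precisely the goal described above.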
Due to the multi-linearity of tensors, most algorithms for tensor optimization problems are designed based on the block coordinate descent method. Such algorithms are widely employed by practitioners for their implementability and effectiveness. However, these algorithms usually lack theoretical guarantees of global convergence and an analysis of the convergence rate. In this paper, we propose a block coordinate descent type algorithm for the low-rank partially orthogonal tensor approximation problem and analyse its convergence behaviour. To achieve this, we carefully investigate the variety of low-rank partially orthogonal tensors and its geometric properties related to the parameter space, which enable us to locate KKT points of the concerned optimization problem. With the aid of these geometric properties, we prove without any assumption that: (1) Our algorithm converges globally to a KKT point; (2) For any given tensor, the algorithm exhibits an overall sublinear convergence with an explicit rate which is sharper than the usual $O(1/k)$ for first-order methods in nonconvex optimization; (3) For a generic tensor, our algorithm converges $R$-linearly.
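As background only, the sketch below shows a generic block coordinate descent scheme (alternating least squares) for a rank-$R$ CP approximation of an order-3 tensor, with each factor matrix treated as one block of variables; this is not the partially orthogonal variant analysed in the paper, and all names are ours.
\begin{verbatim}
import numpy as np

def khatri_rao(B, C):
    """Column-wise Kronecker product of B (J x R) and C (K x R) -> (J*K x R)."""
    return np.einsum('jr,kr->jkr', B, C).reshape(B.shape[0] * C.shape[0], -1)

def cp_als(T, R, iters=50, seed=0):
    """Block coordinate descent for a rank-R CP approximation of an order-3 tensor T:
    each factor matrix is updated in turn by solving a linear least-squares problem."""
    rng = np.random.default_rng(seed)
    I, J, K = T.shape
    A, B, C = (rng.standard_normal((n, R)) for n in (I, J, K))
    T1 = T.reshape(I, J * K)                      # mode-1 unfolding
    T2 = np.moveaxis(T, 1, 0).reshape(J, I * K)   # mode-2 unfolding
    T3 = np.moveaxis(T, 2, 0).reshape(K, I * J)   # mode-3 unfolding
    for _ in range(iters):
        A = T1 @ np.linalg.pinv(khatri_rao(B, C)).T   # update block A
        B = T2 @ np.linalg.pinv(khatri_rao(A, C)).T   # update block B
        C = T3 @ np.linalg.pinv(khatri_rao(A, B)).T   # update block C
    return A, B, C
\end{verbatim}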
This paper uses the concept of algorithmic efficiency to present a unified theory of intelligence. Intelligence is defined informally, formally, and computationally. I introduce the concept of Dimensional complexity in algorithmic efficiency and deduce that an optimally efficient algorithm has zero Time complexity, zero Space complexity, and an infinite Dimensional complexity. This algorithm is then used to generate the number line.
Graph Convolutional Networks (GCNs) have recently become the primary choice for learning from graph-structured data, superseding hash fingerprints in representing chemical compounds. However, GCNs lack the ability to take into account the ordering of node neighbors, even when there is a geometric interpretation of the graph vertices that provides an order based on their spatial positions. To remedy this issue, we propose the Geometric Graph Convolutional Network (geo-GCN), which uses spatial features to efficiently learn from graphs that can be naturally located in space. Our contribution is threefold: we propose a GCN-inspired architecture which (i) leverages node positions, (ii) is a proper generalisation of both GCNs and Convolutional Neural Networks (CNNs), and (iii) benefits from augmentation, which further improves performance and ensures invariance with respect to the desired properties. Empirically, geo-GCN outperforms state-of-the-art graph-based methods on image classification and chemical tasks.
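As a schematic illustration only (not the authors' exact geo-GCN layer), the following NumPy sketch shows one way spatial coordinates of neighbours can modulate a GCN-style update, which is the kind of position awareness described above; all names and shapes are hypothetical.
\begin{verbatim}
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def spatial_graph_conv(X, P, adj, W_feat, W_pos, b):
    """Schematic position-aware graph convolution (illustrative only).

    X   : (n, d)  node features
    P   : (n, 2)  node coordinates (e.g. atom or pixel positions)
    adj : (n, n)  binary adjacency matrix, self-loops assumed added
    Each node aggregates neighbour messages gated by a learned function of the
    relative position p_j - p_i, so the geometry of neighbours matters."""
    n, _ = X.shape
    H = np.zeros((n, W_feat.shape[1]))
    for i in range(n):
        for j in np.nonzero(adj[i])[0]:
            gate = relu((P[j] - P[i]) @ W_pos + b)   # weights from relative position
            H[i] += gate * (X[j] @ W_feat)           # position-modulated message
        H[i] /= max(adj[i].sum(), 1)                 # mean aggregation
    return relu(H)
\end{verbatim}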
Deep learning is the mainstream technique for many machine learning tasks, including image recognition, machine translation, and speech recognition. It has outperformed conventional methods in various fields and achieved great success. Unfortunately, our understanding of how it works remains unclear, and laying down a theoretical foundation for deep learning is of central importance. In this work, we give a geometric view of deep learning: we argue that the fundamental principle behind its success is the manifold structure of data, namely that natural high-dimensional data concentrate close to a low-dimensional manifold, and that deep learning learns this manifold and the probability distribution on it. We further introduce the rectified linear complexity of a deep neural network, which measures its learning capability, and the rectified linear complexity of an embedded manifold, which describes how difficult the manifold is to learn. We then show that for any deep neural network with fixed architecture, there exists a manifold that cannot be learned by the network. Finally, we propose to apply optimal mass transportation theory to control the probability distribution in the latent space.
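A ReLU network restricts to a single affine map on each activation pattern, so the number of patterns it realises is a crude, sample-based proxy for the network-side rectified linear complexity mentioned above. The sketch below (network, names, and sizes are hypothetical) counts such patterns over random inputs; it is a loose illustration, not the paper's definition.
\begin{verbatim}
import numpy as np

def count_activation_patterns(weights, biases, samples):
    """Lower-bound the number of linear pieces of a fully connected ReLU network
    by counting distinct activation patterns over a set of sample inputs."""
    patterns = set()
    for x in samples:
        h, pattern = x, []
        for W, b in zip(weights, biases):
            z = W @ h + b
            pattern.append(tuple(z > 0))   # which ReLUs fire at this layer
            h = np.maximum(z, 0.0)
        patterns.add(tuple(pattern))
    return len(patterns)

rng = np.random.default_rng(0)
weights = [rng.standard_normal((8, 2)), rng.standard_normal((8, 8))]
biases  = [rng.standard_normal(8), rng.standard_normal(8)]
samples = rng.uniform(-1, 1, size=(5000, 2))
print(count_activation_patterns(weights, biases, samples))
\end{verbatim}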
Large margin nearest neighbor (LMNN) is a metric learner which optimizes the performance of the popular $k$NN classifier. However, its resulting metric relies on pre-selected target neighbors. In this paper, we address the feasibility of LMNN's optimization constraints regarding these target points, and introduce a mathematical measure to evaluate the size of the feasible region of the optimization problem. We enhance the optimization framework of LMNN with a weighting scheme that prefers data triplets yielding a larger feasible region. This increases the chances of obtaining a good metric as the solution of LMNN's problem. We evaluate the performance of the resulting feasibility-based LMNN algorithm on synthetic and real datasets. The empirical results show improved accuracy for different types of datasets in comparison to regular LMNN.
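The sketch below shows a weighted variant of the standard LMNN objective in which each impostor triplet carries a nonnegative weight; the weights stand in for the feasibility-based weighting described above (whose exact formula is not reproduced here), and all names are ours.
\begin{verbatim}
import numpy as np

def weighted_lmnn_loss(L, X, triplets, weights, mu=0.5):
    """Schematic weighted LMNN objective (illustrative only).

    L        : (d, d) linear map; the learned Mahalanobis metric is M = L^T L
    triplets : list of (i, j, l) with j a target neighbour of i and l an impostor
    weights  : one nonnegative weight per triplet
    """
    def dist2(a, b):
        v = L @ (a - b)
        return v @ v
    pull = sum(dist2(X[i], X[j]) for i, j, _ in triplets)            # pull targets close
    push = sum(w * max(0.0, 1.0 + dist2(X[i], X[j]) - dist2(X[i], X[l]))
               for (i, j, l), w in zip(triplets, weights))            # weighted margin hinge
    return (1.0 - mu) * pull + mu * push
\end{verbatim}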