Recovery of sparse vectors and low-rank matrices from a small number of linear measurements is well-known to be possible under various model assumptions on the measurements. The key requirement on the measurement matrices is typically the restricted isometry property, that is, approximate orthonormality when acting on the subspace to be recovered. Among the most widely used random matrix measurement models are (a) independent sub-gaussian models and (b) randomized Fourier-based models, allowing for the efficient computation of the measurements. For the now ubiquitous tensor data, direct application of the known recovery algorithms to the vectorized or matricized tensor is awkward and memory-heavy because of the huge measurement matrices to be constructed and stored. In this paper, we propose modewise measurement schemes based on sub-gaussian and randomized Fourier measurements. These modewise operators act separately on pairs or other small subsets of the tensor modes. They require significantly less memory than measurements that act on the vectorized tensor, provably satisfy the tensor restricted isometry property, and in experiments recover the tensor data from fewer measurements without requiring impractical storage.
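The memory argument can be illustrated on a two-mode tensor (a matrix). The sketch below is only a toy under stated assumptions: small fixed integer matrices stand in for the random sub-gaussian or Fourier-based maps, and the helper names (`modewise_measure`, `kron`, `vectorize`) are hypothetical, not taken from the paper. It checks that applying one small matrix per mode gives the same measurements as one large Kronecker-structured matrix applied to the vectorized tensor, while storing far fewer entries.

```python
def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def kron(A, B):
    """Kronecker product A (x) B of two lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def vectorize(X):
    """Row-major flattening of a matrix into a column vector (list of 1-element rows)."""
    return [[x] for row in X for x in row]


def modewise_measure(A, X, B):
    """Apply A along mode 1 and B along mode 2: A X B^T."""
    return matmul(matmul(A, X), transpose(B))


# Stand-in "measurement" maps (fixed integers here; random in practice).
A = [[1, 0, 2], [0, 1, -1]]               # 2 x 3, acts on mode 1
B = [[1, 1, 0, 0], [0, 2, 1, -1]]         # 2 x 4, acts on mode 2
X = [[1, 2, 3, 4], [0, 1, 0, 1], [2, 0, 2, 0]]  # 3 x 4 data tensor

modewise = vectorize(modewise_measure(A, X, B))
vectorized = matmul(kron(A, B), vectorize(X))

# Entries that must be stored in each scheme.
modewise_storage = len(A) * len(A[0]) + len(B) * len(B[0])
kron_storage = len(kron(A, B)) * len(kron(A, B)[0])
```

With row-major vectorization the identity vec(A X B^T) = (A ⊗ B) vec(X) holds exactly, so the two measurement vectors agree; the gap between `modewise_storage` and `kron_storage` grows quickly with the number and size of the modes.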
The metric dimension $dim(G)$ of a graph $G$ is the minimum cardinality of a subset $S$ of vertices of $G$ such that each vertex of $G$ is uniquely determined by its distances to $S$. It is well-known that the metric dimension of a graph can be drastically increased by the modification of a single edge. Our main result consists in proving that the increase of the metric dimension caused by an edge addition can be amortized, in the sense that if the graph consists of a spanning tree $T$ plus $c$ edges, then the metric dimension of $G$ is at most the metric dimension of $T$ plus $6c$. We then use this result to prove a weakening of a conjecture of Eroh et al. The zero forcing number $Z(G)$ of $G$ is the minimum cardinality of a subset $S$ of black vertices (whereas the other vertices are colored white) of $G$ such that all the vertices will be turned black after applying finitely many times the following rule: a white vertex is turned black if it is the only white neighbor of a black vertex. Eroh et al. conjectured that, for any graph $G$, $dim(G)\leq Z(G) + c(G)$, where $c(G)$ is the number of edges that have to be removed from $G$ to get a forest. They proved the conjecture is true for trees and unicyclic graphs. We prove a weaker version of the conjecture: $dim(G)\leq Z(G)+6c(G)$ holds for any graph. We also prove that the conjecture is true for graphs with edge-disjoint cycles, widely generalizing the unicyclic result of Eroh et al.
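The three quantities in the conjecture can be computed by brute force on very small graphs. The following is a minimal sketch, assuming a connected graph given as an adjacency dictionary; the function names are hypothetical and the exhaustive search is exponential, so it only serves to make the definitions concrete, not to reflect the proof techniques of the paper.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def metric_dimension(adj):
    """Smallest |S| such that the distance vectors to S are pairwise distinct."""
    vertices = sorted(adj)
    dist = {v: bfs_distances(adj, v) for v in vertices}
    for k in range(len(vertices) + 1):
        for S in combinations(vertices, k):
            signatures = {tuple(dist[s][v] for s in S) for v in vertices}
            if len(signatures) == len(vertices):
                return k


def forces_all(adj, start):
    """Apply the rule: a black vertex with exactly one white neighbor forces it black."""
    black = set(start)
    changed = True
    while changed:
        changed = False
        for u in list(black):
            white = [v for v in adj[u] if v not in black]
            if len(white) == 1:
                black.add(white[0])
                changed = True
    return len(black) == len(adj)


def zero_forcing_number(adj):
    """Smallest initial black set that eventually turns every vertex black."""
    vertices = sorted(adj)
    for k in range(len(vertices) + 1):
        for S in combinations(vertices, k):
            if forces_all(adj, S):
                return k


def cycle_rank(adj):
    """c(G) for a connected graph: number of edges minus (number of vertices - 1)."""
    edges = sum(len(neighbors) for neighbors in adj.values()) // 2
    return edges - (len(adj) - 1)


# Example: the 4-cycle, a unicyclic graph.
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
# Example: the path on 4 vertices, a tree.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
```

On both examples the bound $dim(G)\leq Z(G)+c(G)$ can be checked directly by calling the three functions, which is consistent with the tree and unicyclic cases already settled by Eroh et al.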
We investigate the equilibrium behavior for the decentralized cheap talk problem for real random variables and quadratic cost criteria in which an encoder and a decoder have misaligned objective functions. In prior work, it has been shown that the number of bins in any equilibrium has to be countable, generalizing a classical result due to Crawford and Sobel who considered sources with density supported on $[0,1]$. In this paper, we first refine this result in the context of log-concave sources. For sources with two-sided unbounded support, we prove that, for any finite number of bins, there exists a unique equilibrium. In contrast, for sources with semi-unbounded support, there may be a finite upper bound on the number of bins in equilibrium depending on certain conditions stated explicitly. Moreover, we prove that for log-concave sources, the expected costs of the encoder and the decoder in equilibrium decrease as the number of bins increases. Furthermore, for strictly log-concave sources with two-sided unbounded support, we prove convergence to the unique equilibrium under best response dynamics which starts with a given number of bins, making a connection with the classical theory of optimal quantization and convergence results of Lloyd's method. In addition, we consider more general sources which satisfy certain assumptions on the tail(s) of the distribution and we show that there exist equilibria with infinitely many bins for sources with two-sided unbounded support. Further explicit characterizations are provided for sources with exponential, Gaussian, and compactly-supported probability distributions.
We design an algorithm for computing connectivity in hypergraphs which runs in time $\hat O_r(p + \min\{\lambda^{\frac{r-3}{r-1}} n^2, n^r/\lambda^{r/(r-1)}\})$ (the $\hat O_r(\cdot)$ hides the terms subpolynomial in the main parameter and terms that depend only on $r$) where $p$ is the size, $n$ is the number of vertices, and $r$ is the rank of the hypergraph. Our algorithm is faster than existing algorithms when the rank is constant and the connectivity $\lambda$ is $\omega(1)$. At the heart of our algorithm is a structural result regarding min-cuts in simple hypergraphs. We show a trade-off between the number of hyperedges taking part in all min-cuts and the size of the smaller side of the min-cut. This structural result can be viewed as a generalization of a well-known structural theorem for simple graphs [Kawarabayashi-Thorup, JACM 19]. We extend the framework of expander decomposition to simple hypergraphs in order to prove this structural result. We also make the proof of the structural result constructive to obtain our faster hypergraph connectivity algorithm.
The CP decomposition for high dimensional non-orthogonal spiked tensors is an important problem with broad applications across many disciplines. However, previous works with theoretical guarantees typically assume restrictive incoherence conditions on the basis vectors for the CP components. In this paper, we propose new computationally efficient composite PCA and concurrent orthogonalization algorithms for tensor CP decomposition with theoretical guarantees under mild incoherence conditions. The composite PCA applies the principal component or singular value decompositions twice, first to a matrix unfolding of the tensor data to obtain singular vectors and then to the matrix folding of the singular vectors obtained in the first step. It can be used as an initialization for any iterative optimization scheme for the tensor CP decomposition. The concurrent orthogonalization algorithm iteratively estimates the basis vector in each mode of the tensor by simultaneously applying projections to the orthogonal complements of the spaces generated by the other CP components in other modes. It is designed to improve the alternating least squares estimator and other forms of the high order orthogonal iteration for tensors with low or moderately high CP ranks, and it is guaranteed to converge rapidly when the error of any given initial estimator is bounded by a small constant. Our theoretical investigation provides estimation accuracy and convergence rates for the two proposed algorithms. Our implementations on synthetic data demonstrate significant practical superiority of our approach over existing methods.
Very often, in the course of uncertainty quantification tasks or data analysis, one has to deal with high-dimensional random variables (RVs). A high-dimensional RV can be described by its probability density (pdf) and/or by the corresponding probability characteristic functions (pcf), or by a polynomial chaos (PCE) or similar expansion. Here the interest is mainly to compute characterisations like the entropy, or relations between two distributions, like their Kullback-Leibler divergence. These are all computed from the pdf, which is often not available directly, and it is a computational challenge to even represent it in a numerically feasible fashion when the dimension $d$ is even moderately large. In this regard, we propose to represent the density by a high order tensor product, and approximate this in a low-rank format. We show how to go from the pcf or functional representation to the pdf. This allows us to reduce the computational complexity and storage cost from exponential to linear. Characterisations such as the entropy or the $f$-divergences require computing point-wise functions of the pdf. This normally rather trivial task becomes more difficult when the pdf is approximated in a low-rank tensor format, as the point values are not directly accessible any more. The data is considered as an element of a high order tensor space. The considered algorithms are independent of the representation of the data as a tensor. All that we require is that the data can be considered as an element of an associative, commutative algebra with an inner product. Such an algebra is isomorphic to a commutative sub-algebra of the usual matrix algebra, allowing the use of matrix algorithms to accomplish the mentioned tasks.
Recently, the low-rank property of different components extracted from the image has been considered in many hyperspectral image denoising methods. However, these methods usually unfold the 3D tensor into a 2D matrix or 1D vector to exploit prior information, such as nonlocal spatial self-similarity (NSS) and global spectral correlation (GSC), which breaks the intrinsic structural correlation of the hyperspectral image (HSI) and thus leads to poor restoration quality. In addition, most of them suffer from a heavy computational burden due to singular value decomposition operations on matrices and tensors in the original high-dimensional space of the HSI. We incorporate subspace representation and weighted low-rank tensor regularization (SWLRTR) into the model to remove the mixed noise in the hyperspectral image. Specifically, to exploit the GSC among spectral bands, the noisy HSI is projected into a low-dimensional subspace, which simplifies the calculation. After that, a weighted low-rank tensor regularization term is introduced to characterize the priors in the reduced image subspace. Moreover, we design an algorithm based on alternating minimization to solve the nonconvex problem. Experiments on simulated and real datasets demonstrate that the SWLRTR method performs better than other hyperspectral denoising methods quantitatively and visually.
This paper studies the expressive power of artificial neural networks (NNs) with rectified linear units. To study them as a model of real-valued computation, we introduce the concept of Max-Affine Arithmetic Programs and show equivalence between them and NNs concerning natural complexity measures. We then use this result to show that two fundamental combinatorial optimization problems can be solved with polynomial-size NNs, which is equivalent to the existence of very special strongly polynomial time algorithms. First, we show that for any undirected graph with $n$ nodes, there is an NN of size $\mathcal{O}(n^3)$ that takes the edge weights as input and computes the value of a minimum spanning tree of the graph. Second, we show that for any directed graph with $n$ nodes and $m$ arcs, there is an NN of size $\mathcal{O}(m^2n^2)$ that takes the arc capacities as input and computes a maximum flow. These results imply in particular that the solutions of the corresponding parametric optimization problems where all edge weights or arc capacities are free parameters can be encoded in polynomial space and evaluated in polynomial time, and that such an encoding is provided by an NN.
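The link between rectified linear units and combinatorial optimization can be made concrete on the smallest nontrivial case. The sketch below is a toy illustration, not the $\mathcal{O}(n^3)$ construction of the paper: it uses the identity $\max(x, y) = y + \mathrm{relu}(x - y)$ and the fact that a minimum spanning tree of a triangle drops its heaviest edge. The function names are hypothetical.

```python
def relu(x):
    """Rectified linear unit."""
    return x if x > 0 else 0


def relu_max(x, y):
    """max(x, y) written with one ReLU and affine operations only."""
    return y + relu(x - y)


def mst_value_triangle(w_ab, w_bc, w_ca):
    """MST value of a triangle: total weight minus the heaviest edge.

    Only additions, subtractions and ReLUs of the edge weights are used,
    so the function is a small ReLU network in the edge weights.
    """
    heaviest = relu_max(relu_max(w_ab, w_bc), w_ca)
    return w_ab + w_bc + w_ca - heaviest
```

For instance, `mst_value_triangle(1, 2, 3)` evaluates to `3`, the total weight of the two lightest edges. The same function works for every choice of weights, which is the sense in which the network encodes the parametric problem rather than a single instance.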
Many well-known matrices $Z$ are associated to fast transforms corresponding to factorizations of the form $Z = X^J \ldots X^1$, where each factor $X^\ell$ is sparse and possibly structured. This paper investigates essential uniqueness of such factorizations. Our first main contribution is to prove that any $N \times N$ matrix having the so-called butterfly structure admits a unique factorization into $J$ butterfly factors (where $N = 2^J$), and that the factors can be recovered by a hierarchical factorization method. This contrasts with existing approaches which fit the product of the butterfly factors to a given matrix via gradient descent. The proposed method can be applied in particular to retrieve the factorizations of the Hadamard or the Discrete Fourier Transform matrices of size $2^J$. Computing such factorizations costs $\mathcal{O}(N^2)$, which is of the order of dense matrix-vector multiplication, while the obtained factorizations enable fast $\mathcal{O}(N \log N)$ matrix-vector multiplications. This hierarchical identifiability property relies on a simple identifiability condition in the two-layer and fixed-support setting that was recently established. While the butterfly structure corresponds to a fixed prescribed support for each factor, our second contribution is to obtain identifiability results with more general families of allowed sparsity patterns, taking into account permutation ambiguities when they are unavoidable. Typically, we show through the hierarchical paradigm that the butterfly factorization of the Discrete Fourier Transform matrix of size $2^J$ admits a unique sparse factorization into $J$ factors, when enforcing only $2$-sparsity by column and a block-diagonal structure on each factor.
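As a small illustration of the kind of factorization discussed here, the sketch below builds the (unnormalized) Hadamard matrix of size $2^J$ and $J$ sparse factors of the form $I \otimes H_2 \otimes I$, then checks that their product is the Hadamard matrix and that every factor has exactly two nonzeros per column. It only verifies a known factorization; it does not implement the hierarchical recovery method, and the helper names are hypothetical.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def kron(A, B):
    """Kronecker product A (x) B of two lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


H2 = [[1, 1], [1, -1]]


def hadamard(J):
    """Unnormalized Hadamard matrix of size 2^J (J >= 1), the J-fold Kronecker power of H2."""
    H = H2
    for _ in range(J - 1):
        H = kron(H, H2)
    return H


def butterfly_factors(J):
    """J sparse factors I_{2^(l-1)} (x) H2 (x) I_{2^(J-l)}, for l = 1..J."""
    return [
        kron(kron(identity(2 ** (l - 1)), H2), identity(2 ** (J - l)))
        for l in range(1, J + 1)
    ]


def product(factors):
    result = factors[0]
    for X in factors[1:]:
        result = matmul(result, X)
    return result


def nonzeros_per_column(X):
    return [sum(1 for value in col if value != 0) for col in zip(*X)]


J = 3
factors = butterfly_factors(J)
reconstructed = product(factors)
```

Each factor has $2N$ nonzeros for $N = 2^J$, so applying the $J$ factors in sequence costs on the order of $N \log N$ operations, compared with $N^2$ for the dense matrix.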
A key advantage of isogeometric discretizations is their accurate and well-behaved eigenfrequencies and eigenmodes. For degree two and higher, however, optical branches of spurious outlier frequencies and modes may appear due to boundaries or reduced continuity at patch interfaces. In this paper, we introduce a variational approach based on perturbed eigenvalue analysis that eliminates outlier frequencies without negatively affecting the accuracy in the remainder of the spectrum and modes. We then propose a pragmatic iterative procedure that estimates the perturbation parameters in such a way that the outlier frequencies are effectively reduced. We demonstrate that our approach allows for a much larger critical time-step size in explicit dynamics calculations. In addition, we show that the critical time-step size obtained with the proposed approach does not depend on the polynomial degree of spline basis functions.
Matter evolved under the influence of gravity from minuscule density fluctuations. Non-perturbative structure formed hierarchically over all scales, and developed non-Gaussian features in the Universe, known as the Cosmic Web. To fully understand the structure formation of the Universe is one of the holy grails of modern astrophysics. Astrophysicists survey large volumes of the Universe and employ a large ensemble of computer simulations to compare with the observed data in order to extract the full information of our own Universe. However, to evolve trillions of galaxies over billions of years even with the simplest physics is a daunting task. We build a deep neural network, the Deep Density Displacement Model (hereafter D$^3$M), to predict the non-linear structure formation of the Universe from simple linear perturbation theory. Our extensive analysis demonstrates that D$^3$M outperforms the second order perturbation theory (hereafter 2LPT), the commonly used fast approximate simulation method, in point-wise comparison, 2-point correlation, and 3-point correlation. We also show that D$^3$M is able to accurately extrapolate far beyond its training data, and predict structure formation for significantly different cosmological parameters. Our study proves, for the first time, that deep learning is a practical and accurate alternative to approximate simulations of the gravitational structure formation of the Universe.