Low-rank matrix completion consists of computing a matrix of minimal complexity that recovers a given set of observations as accurately as possible, and has numerous applications such as product recommendation. Unfortunately, existing methods for solving low-rank matrix completion are heuristics that, while highly scalable and often identifying high-quality solutions, do not possess any optimality guarantees. We reexamine matrix completion with an optimality-oriented eye, by reformulating low-rank problems as convex problems over the non-convex set of projection matrices and implementing a disjunctive branch-and-bound scheme that solves them to certifiable optimality. Further, we derive a novel and often tight class of convex relaxations by decomposing a low-rank matrix as a sum of rank-one matrices and incentivizing, via a Shor relaxation, that each two-by-two minor in each rank-one matrix has determinant zero. In numerical experiments, our new convex relaxations decrease the optimality gap by two orders of magnitude compared to existing attempts. Moreover, we showcase the performance of our disjunctive branch-and-bound scheme and demonstrate that it solves matrix completion problems over 150x150 matrices to certifiable optimality in hours, constituting an order of magnitude improvement on the state-of-the-art for certifiably optimal methods.
The theory of mixed finite element methods for solving different types of elliptic partial differential equations in saddle-point formulation is well established since many decades. However, this topic was mostly studied for variational formulations defined upon the same finite-element product spaces of both shape- and test-pairs of primal variable-multiplier. Whenever these two product spaces are different the saddle point problem is asymmetric. It turns out that the conditions to be satisfied by the finite elements product spaces stipulated in the few works on this case may be of limited use in practice. The purpose of this paper is to provide an in-depth analysis of the well-posedness and the uniform stability of asymmetric approximate saddle point problems, based on the theory of continuous linear operators on Hilbert spaces. Our approach leads to necessary and sufficient conditions for such properties to hold, expressed in a readily exploitable form with fine constants. In particular standard interpolation theory suffices to estimate the error of a conforming method.
In phase-only compressive sensing (PO-CS), our goal is to recover low-complexity signals (e.g., sparse signals, low-rank matrices) from the phase of complex linear measurements. While perfect recovery of signal direction in PO-CS was observed quite early, the exact reconstruction guarantee for a fixed, real signal was recently done by Jacques and Feuillen [IEEE Trans. Inf. Theory, 67 (2021), pp. 4150-4161]. However, two questions remain open: the uniform recovery guarantee and exact recovery of complex signal. In this paper, we almost completely address these two open questions. We prove that, all complex sparse signals or low-rank matrices can be uniformly, exactly recovered from a near optimal number of complex Gaussian measurement phases. By recasting PO-CS as a linear compressive sensing problem, the exact recovery follows from restricted isometry property (RIP). Our approach to uniform recovery guarantee is based on covering arguments that involve a delicate control of the (original linear) measurements with overly small magnitude. To work with complex signal, a different sign-product embedding property and a careful rescaling of the sensing matrix are employed. In addition, we show an extension that the uniform recovery is stable under moderate bounded noise. We also propose to add Gaussian dither before capturing the phases to achieve full reconstruction with norm information. Experimental results are reported to corroborate and demonstrate our theoretical results.
We study online convex optimization where the possible actions are trace-one elements in a symmetric cone, generalizing the extensively-studied experts setup and its quantum counterpart. Symmetric cones provide a unifying framework for some of the most important optimization models, including linear, second-order cone, and semidefinite optimization. Using tools from the field of Euclidean Jordan Algebras, we introduce the Symmetric-Cone Multiplicative Weights Update (SCMWU), a projection-free algorithm for online optimization over the trace-one slice of an arbitrary symmetric cone. We show that SCMWU is equivalent to Follow-the-Regularized-Leader and Online Mirror Descent with symmetric-cone negative entropy as regularizer. Using this structural result we show that SCMWU is a no-regret algorithm, and verify our theoretical results with extensive experiments. Our results unify and generalize the analysis for the Multiplicative Weights Update method over the probability simplex and the Matrix Multiplicative Weights Update method over the set of density matrices.
Recommendation algorithm plays an important role in recommendation system (RS), which predicts users' interests and preferences for some given items based on their known information. Recently, a recommendation algorithm based on the graph Laplacian regularization was proposed, which treats the prediction problem of the recommendation system as a reconstruction issue of small samples of the graph signal under the same graph model. Such a technique takes into account both known and unknown labeled samples information, thereby obtaining good prediction accuracy. However, when the data size is large, solving the reconstruction model is computationally expensive even with an approximate strategy. In this paper, we propose an equivalent reconstruction model that can be solved exactly with extremely low computational cost. Finally, a final prediction algorithm is proposed. We find in the experiments that the proposed method significantly reduces the computational cost while maintaining a good prediction accuracy.
This paper examines the approximation of log-determinant for large-scale symmetric positive definite matrices. Inspired by the variance reduction technique, we split the approximation of $\log\det(A)$ into two parts. The first to compute is the trace of the projection of $\log(A)$ onto a suboptimal subspace, while the second is the trace of the projection on the corresponding orthogonal complementary space. For these two approximations, the stochastic Lanczos quadrature method is used. Furthermore, in the construction of the suboptimal subspace, we utilize a projection-cost-preserving sketch to bound the size of the Gaussian random matrix and the dimension of the suboptimal subspace. We provide a rigorous error analysis for our proposed method and explicit lower bounds for its design parameters, offering guidance for practitioners. We conduct numerical experiments to demonstrate our method's effectiveness and illustrate the quality of the derived bounds.
Optimization tasks are crucial in statistical machine learning. Recently, there has been great interest in leveraging tools from dynamical systems to derive accelerated and robust optimization methods via suitable discretizations of continuous-time systems. However, these ideas have mostly been limited to Euclidean spaces and unconstrained settings, or to Riemannian gradient flows. In this work, we propose a dissipative extension of Dirac's theory of constrained Hamiltonian systems as a general framework for solving optimization problems over smooth manifolds, including problems with nonlinear constraints. We develop geometric/symplectic numerical integrators on manifolds that are "rate-matching," i.e., preserve the continuous-time rates of convergence. In particular, we introduce a dissipative RATTLE integrator able to achieve optimal convergence rate locally. Our class of (accelerated) algorithms are not only simple and efficient but also applicable to a broad range of contexts.
In this paper, we study the problems of detection and recovery of hidden submatrices with elevated means inside a large Gaussian random matrix. We consider two different structures for the planted submatrices. In the first model, the planted matrices are disjoint, and their row and column indices can be arbitrary. Inspired by scientific applications, the second model restricts the row and column indices to be consecutive. In the detection problem, under the null hypothesis, the observed matrix is a realization of independent and identically distributed standard normal entries. Under the alternative, there exists a set of hidden submatrices with elevated means inside the same standard normal matrix. Recovery refers to the task of locating the hidden submatrices. For both problems, and for both models, we characterize the statistical and computational barriers by deriving information-theoretic lower bounds, designing and analyzing algorithms matching those bounds, and proving computational lower bounds based on the low-degree polynomials conjecture. In particular, we show that the space of the model parameters (i.e., number of planted submatrices, their dimensions, and elevated mean) can be partitioned into three regions: the impossible regime, where all algorithms fail; the hard regime, where while detection or recovery are statistically possible, we give some evidence that polynomial-time algorithm do not exist; and finally the easy regime, where polynomial-time algorithms exist.
We consider the optimization of a smooth and strongly convex objective using constant step-size stochastic gradient descent (SGD) and study its properties through the prism of Markov chains. We show that, for unbiased gradient estimates with mildly controlled variance, the iteration converges to an invariant distribution in total variation distance. We also establish this convergence in Wasserstein-2 distance in a more general setting compared to previous work. Thanks to the invariance property of the limit distribution, our analysis shows that the latter inherits sub-Gaussian or sub-exponential concentration properties when these hold true for the gradient. This allows the derivation of high-confidence bounds for the final estimate. Finally, under such conditions in the linear case, we obtain a dimension-free deviation bound for the Polyak-Ruppert average of a tail sequence. All our results are non-asymptotic and their consequences are discussed through a few applications.
The accurate and efficient simulation of Partial Differential Equations (PDEs) in and around arbitrarily defined geometries is critical for many application domains. Immersed boundary methods (IBMs) alleviate the usually laborious and time-consuming process of creating body-fitted meshes around complex geometry models (described by CAD or other representations, e.g., STL, point clouds), especially when high levels of mesh adaptivity are required. In this work, we advance the field of IBM in the context of the recently developed Shifted Boundary Method (SBM). In the SBM, the location where boundary conditions are enforced is shifted from the actual boundary of the immersed object to a nearby surrogate boundary, and boundary conditions are corrected utilizing Taylor expansions. This approach allows choosing surrogate boundaries that conform to a Cartesian mesh without losing accuracy or stability. Our contributions in this work are as follows: (a) we show that the SBM numerical error can be greatly reduced by an optimal choice of the surrogate boundary, (b) we mathematically prove the optimal convergence of the SBM for this optimal choice of the surrogate boundary, (c) we deploy the SBM on massively parallel octree meshes, including algorithmic advances to handle incomplete octrees, and (d) we showcase the applicability of these approaches with a wide variety of simulations involving complex shapes, sharp corners, and different topologies. Specific emphasis is given to Poisson's equation and the linear elasticity equations.
In this paper, we consider estimating spot/instantaneous volatility matrices of high-frequency data collected for a large number of assets. We first combine classic nonparametric kernel-based smoothing with a generalised shrinkage technique in the matrix estimation for noise-free data under a uniform sparsity assumption, a natural extension of the approximate sparsity commonly used in the literature. The uniform consistency property is derived for the proposed spot volatility matrix estimator with convergence rates comparable to the optimal minimax one. For the high-frequency data contaminated by microstructure noise, we introduce a localised pre-averaging estimation method that reduces the effective magnitude of the noise. We then use the estimation tool developed in the noise-free scenario, and derive the uniform convergence rates for the developed spot volatility matrix estimator. We further combine the kernel smoothing with the shrinkage technique to estimate the time-varying volatility matrix of the high-dimensional noise vector. In addition, we consider large spot volatility matrix estimation in time-varying factor models with observable risk factors and derive the uniform convergence property. We provide numerical studies including simulation and empirical application to examine the performance of the proposed estimation methods in finite samples.