Sparse binary matrices are of great interest in the field of compressed sensing. This class of matrices make possible to perform signal recovery with lower storage costs and faster decoding algorithms. In particular, random matrices formed by i.i.d Bernoulli $p$ random variables are of practical relevance in the context of nonnegative sparse recovery. In this work, we investigate the robust nullspace property of sparse Bernoulli $p$ matrices. Previous results in the literature establish that such matrices can accurately recover $n$-dimensional $s$-sparse vectors with $m=O\left (\frac{s}{c(p)}\log\frac{en}{s}\right )$ measurements, where $c(p) \le p$ is a constant that only depends on the parameter $p$. These results suggest that, when $p$ vanishes, the sparse Bernoulli matrix requires considerably more measurements than the minimal necessary achieved by the standard isotropic subgaussian designs. We show that this is not true. Our main result characterizes, for a wide range of levels sparsity $s$, the smallest $p$ such that it is possible to perform sparse recovery with the minimal number of measurements. We also provide matching lower bounds to establish the optimality of our results.
The hard thresholding technique plays a vital role in the development of algorithms for sparse signal recovery. By merging this technique and heavy-ball acceleration method which is a multi-step extension of the traditional gradient descent method, we propose the so-called heavy-ball-based hard thresholding (HBHT) and heavy-ball-based hard thresholding pursuit (HBHTP) algorithms for signal recovery. It turns out that the HBHT and HBHTP can successfully recover a $k$-sparse signal if the restricted isometry constant of the measurement matrix satisfies $\delta_{3k}<0.618 $ and $\delta_{3k}<0.577,$ respectively. The guaranteed success of HBHT and HBHTP is also shown under the conditions $\delta_{2k}<0.356$ and $\delta_{2k}<0.377,$ respectively. Moreover, the finite convergence and stability of the two algorithms are also established in this paper. Simulations on random problem instances are performed to compare the performance of the proposed algorithms and several existing ones. Empirical results indicate that the HBHTP performs very comparably to a few existing algorithms and it takes less average time to achieve the signal recovery than these existing methods.
We consider the question of adaptive data analysis within the framework of convex optimization. We ask how many samples are needed in order to compute $\epsilon$-accurate estimates of $O(1/\epsilon^2)$ gradients queried by gradient descent, and we provide two intermediate answers to this question. First, we show that for a general analyst (not necessarily gradient descent) $\Omega(1/\epsilon^3)$ samples are required. This rules out the possibility of a foolproof mechanism. Our construction builds upon a new lower bound (that may be of interest of its own right) for an analyst that may ask several non adaptive questions in a batch of fixed and known $T$ rounds of adaptivity and requires a fraction of true discoveries. We show that for such an analyst $\Omega (\sqrt{T}/\epsilon^2)$ samples are necessary. Second, we show that, under certain assumptions on the oracle, in an interaction with gradient descent $\tilde \Omega(1/\epsilon^{2.5})$ samples are necessary. Our assumptions are that the oracle has only \emph{first order access} and is \emph{post-hoc generalizing}. First order access means that it can only compute the gradients of the sampled function at points queried by the algorithm. Our assumption of \emph{post-hoc generalization} follows from existing lower bounds for statistical queries. More generally then, we provide a generic reduction from the standard setting of statistical queries to the problem of estimating gradients queried by gradient descent. These results are in contrast with classical bounds that show that with $O(1/\epsilon^2)$ samples one can optimize the population risk to accuracy of $O(\epsilon)$ but, as it turns out, with spurious gradients.
This paper addresses the color image completion problem in accordance with low-rank quatenrion matrix optimization that is characterized by sparse regularization in a transformed domain. This research was inspired by an appreciation of the fact that different signal types, including audio formats and images, possess structures that are inherently sparse in respect of their respective bases. Since color images can be processed as a whole in the quaternion domain, we depicted the sparsity of the color image in the quaternion discrete cosine transform (QDCT) domain. In addition, the representation of a low-rank structure that is intrinsic to the color image is a vital issue in the quaternion matrix completion problem. To achieve a more superior low-rank approximation, the quatenrion-based truncated nuclear norm (QTNN) is employed in the proposed model. Moreover, this model is facilitated by a competent alternating direction method of multipliers (ADMM) based on the algorithm. Extensive experimental results demonstrate that the proposed method can yield vastly superior completion performance in comparison with the state-of-the-art low-rank matrix/quaternion matrix approximation methods tested on color image recovery.
In the interdependent values (IDV) model introduced by Milgrom and Weber [1982], agents have private signals that capture their information about different social alternatives, and the valuation of every agent is a function of all agent signals. While interdependence has been mainly studied for auctions, it is extremely relevant for a large variety of social choice settings, including the canonical setting of public projects. The IDV model is very challenging relative to standard independent private values, and welfare guarantees have been achieved through two alternative conditions known as {\em single-crossing} and {\em submodularity over signals (SOS)}. In either case, the existing theory falls short of solving the public projects setting. Our contribution is twofold: (i) We give a workable characterization of truthfulness for IDV public projects for the largest class of valuations for which such a characterization exists, and term this class \emph{decomposable valuations}; (ii) We provide possibility and impossibility results for welfare approximation in public projects with SOS valuations. Our main impossibility result is that, in contrast to auctions, no universally truthful mechanism performs better for public projects with SOS valuations than choosing a project at random. Our main positive result applies to {\em excludable} public projects with SOS, for which we establish a constant factor approximation similar to auctions. Our results suggest that exclusion may be a key tool for achieving welfare guarantees in the IDV model.
How to recover a probability measure with sparse support from particular moments? This problem has been the focus of research in theoretical computer science and neural computing. However, there is no polynomial-time algorithm for the recovery. The best algorithm for the recovery requires $O(2^{\text{poly}(1/\epsilon)})$ for $\epsilon$-accurate recovery. We propose the first poly-time recovery method from carefully designed moments that only requires $O(\log(1/\epsilon)/\epsilon^2)$ computations for an $\epsilon$-accurate recovery. This method relies on the recovery of a planted two-layer neural network with two-dimensional inputs, a finite width, and zero-one activation. For such networks, we establish the first global convergence of gradient descent and demonstrate its application in sparse measure recovery.
We propose a new fast algorithm to estimate any sparse generalized linear model with convex or non-convex separable penalties. Our algorithm is able to solve problems with millions of samples and features in seconds, by relying on coordinate descent, working sets and Anderson acceleration. It handles previously unaddressed models, and is extensively shown to improve state-of-art algorithms. We provide a flexible, scikit-learn compatible package, which easily handles customized datafits and penalties.
A High-dimensional and sparse (HiDS) matrix is frequently encountered in a big data-related application like an e-commerce system or a social network services system. To perform highly accurate representation learning on it is of great significance owing to the great desire of extracting latent knowledge and patterns from it. Latent factor analysis (LFA), which represents an HiDS matrix by learning the low-rank embeddings based on its observed entries only, is one of the most effective and efficient approaches to this issue. However, most existing LFA-based models perform such embeddings on a HiDS matrix directly without exploiting its hidden graph structures, thereby resulting in accuracy loss. To address this issue, this paper proposes a graph-incorporated latent factor analysis (GLFA) model. It adopts two-fold ideas: 1) a graph is constructed for identifying the hidden high-order interaction (HOI) among nodes described by an HiDS matrix, and 2) a recurrent LFA structure is carefully designed with the incorporation of HOI, thereby improving the representa-tion learning ability of a resultant model. Experimental results on three real-world datasets demonstrate that GLFA outperforms six state-of-the-art models in predicting the missing data of an HiDS matrix, which evidently supports its strong representation learning ability to HiDS data.
Leveraging line features to improve localization accuracy of point-based visual-inertial SLAM (VINS) is gaining interest as they provide additional constraints on scene structure. However, real-time performance when incorporating line features in VINS has not been addressed. This paper presents PL-VINS, a real-time optimization-based monocular VINS method with point and line features, developed based on the state-of-the-art point-based VINS-Mono \cite{vins}. We observe that current works use the LSD \cite{lsd} algorithm to extract line features; however, LSD is designed for scene shape representation instead of the pose estimation problem, which becomes the bottleneck for the real-time performance due to its high computational cost. In this paper, a modified LSD algorithm is presented by studying a hidden parameter tuning and length rejection strategy. The modified LSD can run at least three times as fast as LSD. Further, by representing space lines with the Pl\"{u}cker coordinates, the residual error in line estimation is modeled in terms of the point-to-line distance, which is then minimized by iteratively updating the minimum four-parameter orthonormal representation of the Pl\"{u}cker coordinates. Experiments in a public benchmark dataset show that the localization error of our method is 12-16\% less than that of VINS-Mono at the same pose update frequency. %For the benefit of the community, The source code of our method is available at: //github.com/cnqiangfu/PL-VINS.
In this paper we propose a novel sparse optical flow (SOF)-based line feature tracking method for the camera pose estimation problem. This method is inspired by the point-based SOF algorithm and developed based on an observation that two adjacent images in time-varying image sequences satisfy brightness invariant. Based on this observation, we re-define the goal of line feature tracking: track two endpoints of a line feature instead of the entire line based on gray value matching instead of descriptor matching. To achieve this goal, an efficient two endpoint tracking (TET) method is presented: first, describe a given line feature with its two endpoints; next, track the two endpoints based on SOF to obtain two new tracked endpoints by minimizing a pixel-level grayscale residual function; finally, connect the two tracked endpoints to generate a new line feature. The correspondence is established between the given and the new line feature. Compared with current descriptor-based methods, our TET method needs not to compute descriptors and detect line features repeatedly. Naturally, it has an obvious advantage over computation. Experiments in several public benchmark datasets show our method yields highly competitive accuracy with an obvious advantage over speed.
Sparse decision tree optimization has been one of the most fundamental problems in AI since its inception and is a challenge at the core of interpretable machine learning. Sparse decision tree optimization is computationally hard, and despite steady effort since the 1960's, breakthroughs have only been made on the problem within the past few years, primarily on the problem of finding optimal sparse decision trees. However, current state-of-the-art algorithms often require impractical amounts of computation time and memory to find optimal or near-optimal trees for some real-world datasets, particularly those having several continuous-valued features. Given that the search spaces of these decision tree optimization problems are massive, can we practically hope to find a sparse decision tree that competes in accuracy with a black box machine learning model? We address this problem via smart guessing strategies that can be applied to any optimal branch-and-bound-based decision tree algorithm. We show that by using these guesses, we can reduce the run time by multiple orders of magnitude, while providing bounds on how far the resulting trees can deviate from the black box's accuracy and expressive power. Our approach enables guesses about how to bin continuous features, the size of the tree, and lower bounds on the error for the optimal decision tree. Our experiments show that in many cases we can rapidly construct sparse decision trees that match the accuracy of black box models. To summarize: when you are having trouble optimizing, just guess.