This paper presents a randomized quaternion singular value decomposition (QSVD) algorithm for low-rank matrix approximation problems, which arise widely in color face recognition, video compression, and signal processing. Using quaternion-normal-distribution-based random sampling, the randomized QSVD algorithm projects the high-dimensional data onto a low-dimensional subspace and then identifies an approximate range subspace of the quaternion matrix. Key statistical properties of the quaternion Wishart distribution are established and used to carry out the approximation error analysis of the algorithm. Theoretical results show that the randomized QSVD algorithm captures the dominant singular triplets of a quaternion matrix with acceptable accuracy. Numerical experiments corroborate the theory. Applied to color face recognition problems, the randomized QSVD algorithm attains higher recognition accuracy and is more efficient than the known Lanczos-based partial QSVD and a quaternion version of the fast frequent directions algorithm.
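For intuition, a minimal real-valued sketch of the randomized range-finder plus small-SVD pattern described above is given below; the quaternion arithmetic and the quaternion Gaussian test matrix of the actual algorithm are replaced here by their real counterparts, and all names are ours.

```python
import numpy as np

def randomized_svd(A, k, oversample=10, rng=np.random.default_rng(0)):
    """Real-valued analogue of the randomized QSVD pattern:
    Gaussian sampling -> approximate range -> small SVD."""
    m, n = A.shape
    Omega = rng.standard_normal((n, k + oversample))   # Gaussian test matrix
    Q, _ = np.linalg.qr(A @ Omega)                     # orthonormal basis for an approximate range of A
    Uh, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ Uh)[:, :k], s[:k], Vt[:k]              # approximate leading singular triplets
```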
This paper addresses the color image completion problem via low-rank quaternion matrix optimization with sparse regularization in a transformed domain. The work is motivated by the observation that many signal types, including audio and images, have structures that are inherently sparse with respect to suitable bases. Since color images can be processed as a whole in the quaternion domain, we characterize the sparsity of a color image in the quaternion discrete cosine transform (QDCT) domain. In addition, representing the low-rank structure intrinsic to a color image is a key issue in the quaternion matrix completion problem. To achieve a better low-rank approximation, the quaternion-based truncated nuclear norm (QTNN) is employed in the proposed model. The model is solved by an efficient algorithm based on the alternating direction method of multipliers (ADMM). Extensive experimental results demonstrate that the proposed method yields superior completion performance compared with state-of-the-art low-rank matrix and quaternion matrix approximation methods on color image recovery.
Covariance estimation for matrix-valued data has received increasing interest in applications. Unlike previous works that rely heavily on the matrix normal distribution assumption and a fixed matrix size, we propose a class of distribution-free regularized covariance estimation methods for high-dimensional matrix data under a separability condition and a bandable covariance structure. Under these conditions, the original covariance matrix is decomposed into a Kronecker product of two bandable small covariance matrices representing the variability over the row and column directions. We formulate a unified framework for estimating bandable covariances and introduce an efficient algorithm based on rank-one unconstrained Kronecker product approximation. The convergence rates of the proposed estimators are established, and the derived minimax lower bound shows that our proposed estimator is rate-optimal under certain divergence regimes of matrix size. We further introduce a class of robust covariance estimators and provide theoretical guarantees for heavy-tailed data. We demonstrate the superior finite-sample performance of our methods using simulations and real applications to a gridded temperature anomalies dataset and an S&P 500 stock data analysis.
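As an illustration of the rank-one Kronecker product approximation step mentioned above (and only that step; the banding, separability and robustness components of the proposed estimators are not reproduced), the classical Van Loan-Pitsianis rearrangement reduces it to a rank-one SVD. The code below is a NumPy sketch with hypothetical names.

```python
import numpy as np

def nearest_kronecker(S, p, q):
    """Find A (p x p) and B (q x q) minimizing ||S - kron(A, B)||_F
    for a (p*q) x (p*q) matrix S, via rearrangement + rank-one SVD."""
    R = S.reshape(p, q, p, q).transpose(0, 2, 1, 3).reshape(p * p, q * q)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    A = np.sqrt(s[0]) * U[:, 0].reshape(p, p)
    B = np.sqrt(s[0]) * Vt[0].reshape(q, q)
    return A, B
```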
The eigenvector-eigenvalue identity, formally named by Peter B. Denton, Stephen J. Parker, Terence Tao, and Xining Zhang [Bull. Amer. Math. Soc., 2021], is a basic and important identity in commutative linear algebra. In this paper, we extend the eigenvector-eigenvalue identity to the quaternion division ring, which is non-commutative. A version of the eigenvector-eigenvalue identity for quaternion matrices is established. Furthermore, we give a new method and algorithm to compute the eigenvectors from the right eigenvalues of a quaternion Hermitian matrix, and a program is designed to realize the algorithm. Examples show the good performance of the algorithm and the program, and an open problem ends the paper.
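For reference, the classical complex Hermitian form of the identity relates $|v_{i,j}|^2$ to the eigenvalues of $A$ and of its $j$-th principal submatrix $M_j$; the short NumPy check below verifies that form numerically (the quaternion extension itself is not shown).

```python
import numpy as np

# Check: |v_{i,j}|^2 * prod_{k != i} (lam_i - lam_k) = prod_k (lam_i - mu_k^{(j)}),
# where mu^{(j)} are the eigenvalues of the j-th principal submatrix M_j.
rng = np.random.default_rng(0)
n = 5
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (X + X.conj().T) / 2                       # Hermitian test matrix
lam, V = np.linalg.eigh(A)

i, j = 2, 3
Mj = np.delete(np.delete(A, j, axis=0), j, axis=1)
mu = np.linalg.eigvalsh(Mj)

lhs = abs(V[j, i]) ** 2 * np.prod(lam[i] - np.delete(lam, i))
rhs = np.prod(lam[i] - mu)
print(lhs, rhs)                                # agree up to round-off
```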
We investigate feature compression for high-dimensional ridge regression using optimal subsampling techniques. Specifically, based on the basic framework of random sampling over features for ridge regression and the A-optimal design criterion, we first obtain a set of optimal subsampling probabilities. Since the obtained probabilities are computationally uneconomical, we then propose nearly optimal ones. With these probabilities, a two-step iterative algorithm is established that has lower computational cost and higher accuracy. We provide theoretical analysis and numerical experiments to support the proposed methods. Numerical results demonstrate the good performance of our methods.
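A toy sketch of the general feature-subsampling idea follows; for concreteness it draws columns with probabilities proportional to squared column norms, which is our own generic proxy and not the paper's A-optimal or nearly optimal probabilities, and it omits the two-step iteration.

```python
import numpy as np

def subsampled_ridge(X, y, r, lam=1.0, rng=np.random.default_rng(0)):
    """Fit ridge regression on r subsampled feature columns."""
    n, p = X.shape
    probs = (X ** 2).sum(axis=0)
    probs = probs / probs.sum()             # sampling probabilities (illustrative proxy only)
    idx = rng.choice(p, size=r, replace=False, p=probs)
    Xs = X[:, idx]
    beta_s = np.linalg.solve(Xs.T @ Xs + lam * np.eye(r), Xs.T @ y)
    beta = np.zeros(p)
    beta[idx] = beta_s                      # embed back into the full coefficient vector
    return beta
```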
SVD (singular value decomposition) is one of the basic tools of machine learning, allowing one to optimize a basis for a given matrix. However, sometimes we have a set of matrices $\{A_k\}_k$ instead and would like to optimize a single common basis for them: find orthogonal matrices $U$, $V$ such that the set of matrices $\{U^T A_k V\}$ is in some sense simpler. For example, DCT-II is an orthonormal basis of functions commonly used in image/video compression; as discussed here, this kind of basis can be quickly and automatically optimized for a given dataset. While the gradient descent optimization also discussed here can be computationally costly, we propose CSVD (common SVD): a fast general approach based on SVD. Specifically, we choose $U$ as built of eigenvectors of $\sum_k (w_k)^q (A_k A_k^T)^p$ and $V$ of eigenvectors of $\sum_k (w_k)^q (A_k^T A_k)^p$, where $w_k$ are their weights and $p,q>0$ are some chosen powers, e.g. $1/2$, optionally with normalization, e.g. $A \to A - rc^T$ where $r_i=\sum_j A_{ij}$, $c_j =\sum_i A_{ij}$.
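The construction above translates almost directly into code; the following NumPy sketch implements it for given $p$, $q$ and weights (the optional normalization is omitted), with the matrix power of the symmetric positive semi-definite Gram matrices computed through their eigendecompositions.

```python
import numpy as np

def _psd_power(M, p):
    """M^p for a symmetric positive semi-definite matrix M."""
    lam, Q = np.linalg.eigh(M)
    lam = np.clip(lam, 0.0, None)            # guard against tiny negative round-off
    return (Q * lam ** p) @ Q.T

def csvd(As, ws=None, p=0.5, q=1.0):
    """Common SVD: U from eigenvectors of sum_k w_k^q (A_k A_k^T)^p,
    V from eigenvectors of sum_k w_k^q (A_k^T A_k)^p."""
    ws = np.ones(len(As)) if ws is None else np.asarray(ws, dtype=float)
    SU = sum(w ** q * _psd_power(A @ A.T, p) for w, A in zip(ws, As))
    SV = sum(w ** q * _psd_power(A.T @ A, p) for w, A in zip(ws, As))
    U = np.linalg.eigh(SU)[1][:, ::-1]       # dominant directions first
    V = np.linalg.eigh(SV)[1][:, ::-1]
    return U, V
```

Applying `U.T @ A_k @ V` to each matrix then plays the role of the shared-basis transform, concentrating energy in the top-left corner as an analogue of the per-matrix SVD.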
Low-rank matrix estimation under heavy-tailed noise is challenging, both computationally and statistically. Convex approaches have been proven statistically optimal but suffer from high computational costs, especially since robust loss functions are usually non-smooth. More recently, computationally fast non-convex approaches via sub-gradient descent have been proposed, which, unfortunately, fail to deliver a statistically consistent estimator even under sub-Gaussian noise. In this paper, we introduce a novel Riemannian sub-gradient (RsGrad) algorithm which is not only computationally efficient with linear convergence but also statistically optimal, whether the noise is Gaussian or heavy-tailed. Convergence theory is established for a general framework, and specific applications to the absolute loss, Huber loss, and quantile loss are investigated. Compared with existing non-convex methods, ours reveals a surprising phenomenon of dual-phase convergence. In phase one, RsGrad behaves as in typical non-smooth optimization, requiring gradually decaying stepsizes. However, phase one only delivers a statistically sub-optimal estimator, as already observed in the existing literature. Interestingly, during phase two, RsGrad converges linearly as if minimizing a smooth and strongly convex objective function, so a constant stepsize suffices. Underlying the phase-two convergence is the smoothing effect of random noise on the non-smooth robust losses in a region close, but not too close, to the truth. Lastly, RsGrad is applicable to low-rank tensor estimation under heavy-tailed noise, where a statistically optimal rate is attainable with the same dual-phase convergence phenomenon, and a novel shrinkage-based second-order moment method is guaranteed to deliver a warm initialization. Numerical simulations confirm our theoretical discovery and showcase the superiority of RsGrad over prior methods.
In this paper we propose a method for wavelet denoising of signals contaminated with Gaussian noise when prior information about the $L^2$-energy of the signal is available. Assuming the independence model, in which the wavelet coefficients are treated individually, we propose simple, level-dependent shrinkage rules that turn out to be $\Gamma$-minimax for a suitable class of priors. The proposed methodology is particularly well suited to denoising tasks in which the signal-to-noise ratio is low, as illustrated by simulations on a battery of standard test functions. A comparison with some standard wavelet shrinkage methods is provided.
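For orientation, a generic level-dependent soft-thresholding sketch using PyWavelets is shown below; the thresholds here grow linearly with the resolution level purely for illustration and are not the $\Gamma$-minimax rules derived in the paper.

```python
import numpy as np
import pywt

def level_dependent_denoise(signal, wavelet="db4", level=5, sigma=1.0, c=1.0):
    """Wavelet denoising with a (generic) level-dependent soft threshold."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    out = [coeffs[0]]                                   # keep approximation coefficients untouched
    for j, d in enumerate(coeffs[1:], start=1):         # detail coefficients, coarse -> fine
        thr = c * sigma * np.sqrt(2.0 * np.log(len(signal))) * j / level
        out.append(pywt.threshold(d, thr, mode="soft"))
    return pywt.waverec(out, wavelet)
```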
Bayesian model selection provides a powerful framework for objectively comparing models directly from observed data, without reference to ground truth data. However, Bayesian model selection requires the computation of the marginal likelihood (model evidence), which is computationally challenging, prohibiting its use in many high-dimensional Bayesian inverse problems. With Bayesian imaging applications in mind, in this work we present the proximal nested sampling methodology to objectively compare alternative Bayesian imaging models for applications that use images to inform decisions under uncertainty. The methodology is based on nested sampling, a Monte Carlo approach specialised for model comparison, and exploits proximal Markov chain Monte Carlo techniques to scale efficiently to large problems and to tackle models that are log-concave and not necessarily smooth (e.g., involving l_1 or total-variation priors). The proposed approach can be applied computationally to problems of dimension O(10^6) and beyond, making it suitable for high-dimensional inverse imaging problems. It is validated on large Gaussian models, for which the likelihood is available analytically, and subsequently illustrated on a range of imaging problems where it is used to analyse different choices of dictionary and measurement model.
We study the robust matrix completion problem for low-rank Hankel matrices, which aims to detect sparse corruptions caused by extreme outliers while recovering the original Hankel matrix from partial observations. In this paper, we exploit the convenient Hankel structure and propose a novel non-convex algorithm, coined Hankel Structured Gradient Descent (HSGD), for large-scale robust Hankel matrix completion problems. HSGD is highly compute- and sample-efficient compared with the state of the art. A recovery guarantee with a linear convergence rate is established for HSGD under mild assumptions. The empirical advantages of HSGD are verified on both synthetic datasets and real-world nuclear magnetic resonance signals.
The numerical solution of singular eigenvalue problems is complicated by the fact that small perturbations of the coefficients may have an arbitrarily bad effect on eigenvalue accuracy. However, it has been known for a long time that such perturbations are exceptional and standard eigenvalue solvers, such as the QZ algorithm, tend to yield good accuracy despite the inevitable presence of roundoff error. Recently, Lotz and Noferini quantified this phenomenon by introducing the concept of $\delta$-weak eigenvalue condition numbers. In this work, we consider singular quadratic eigenvalue problems and two popular linearizations. Our results show that a correctly chosen linearization increases $\delta$-weak eigenvalue condition numbers only marginally, justifying the use of these linearizations in numerical solvers also in the singular case. We propose a very simple but often effective algorithm for computing well-conditioned eigenvalues of a singular quadratic eigenvalue problem by adding small random perturbations to the coefficients. We prove that the eigenvalue condition number is, with high probability, a reliable criterion for detecting and excluding spurious eigenvalues created from the singular part.
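A minimal sketch of the perturb-then-solve step is given below, using the first companion linearization; the scaling of the random perturbation is chosen arbitrarily for illustration, and the paper's condition-number-based filtering of spurious eigenvalues is not reproduced here.

```python
import numpy as np
from scipy.linalg import eig

def perturbed_qep_eigs(M, C, K, eps=1e-10, rng=np.random.default_rng(0)):
    """Eigenvalues of the perturbed QEP (lam^2 M + lam C + K) x = 0,
    solved through the first companion linearization A z = lam B z."""
    n = M.shape[0]
    perturb = lambda X: X + eps * np.linalg.norm(X) * rng.standard_normal((n, n))
    Mp, Cp, Kp = perturb(M), perturb(C), perturb(K)
    A = np.block([[-Cp, -Kp], [np.eye(n), np.zeros((n, n))]])
    B = np.block([[Mp, np.zeros((n, n))], [np.zeros((n, n)), np.eye(n)]])
    return eig(A, B, right=False)                      # 2n eigenvalues (some may be infinite)
```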