An equivalent definition of hypermatrices is introduced. The matrix expression of hypermatrices is proposed. Using permutation matrices, the conversion of different matrix expressions is revealed. The various contracted products of hypermatrices are realized by semi-tensor products (STP) of matrices via matrix expressions of hypermatrices.
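As a small illustration of the semi-tensor product mentioned above, the sketch below assumes the standard left STP definition $A \ltimes B = (A \otimes I_{t/n})(B \otimes I_{t/p})$ for $A \in \mathbb{R}^{m\times n}$, $B \in \mathbb{R}^{p\times q}$ and $t = \mathrm{lcm}(n, p)$. The helper names (`identity`, `kron`, `matmul`, `stp`) are hypothetical and are not taken from the paper; matrices are plain nested lists.

```python
from math import gcd


def identity(n):
    """Return the n x n identity matrix as a nested list."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def kron(a, b):
    """Kronecker product: entry ((i,k),(j,l)) equals a[i][j] * b[k][l]."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    """Ordinary matrix product; inner dimensions must agree."""
    assert len(a[0]) == len(b), "inner dimensions must agree"
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def stp(a, b):
    """Left semi-tensor product of a (m x n) and b (p x q)."""
    n, p = len(a[0]), len(b)
    t = n * p // gcd(n, p)  # least common multiple of n and p
    return matmul(kron(a, identity(t // n)), kron(b, identity(t // p)))


# Row vector (1 x 4) times column vector (2 x 1): dimensions do not match
# for the ordinary product, but the STP is defined.
x = [[1, 2, 3, -1]]
y = [[1], [2]]
result = stp(x, y)  # [[7, 0]]
```

When the inner dimensions already match ($n = p$), both identity factors are $1 \times 1$ and `stp` reduces to the ordinary matrix product.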
In this paper, we prove that functional sliced inverse regression (FSIR) achieves the optimal (minimax) rate for estimating the central space in functional sufficient dimension reduction problems. First, we provide a concentration inequality for the FSIR estimator of the covariance of the conditional mean, i.e., $\var(\E[\boldsymbol{X}\mid Y])$. Based on this inequality, we establish the root-$n$ consistency of the FSIR estimator of the image of $\var(\E[\boldsymbol{X}\mid Y])$. Second, we apply the most widely used truncated scheme to estimate the inverse of the covariance operator and identify the truncation parameter which ensures that FSIR can achieve the optimal minimax convergence rate for estimating the central space. Finally, we conduct simulations to demonstrate the optimal choice of truncation parameter and the estimation efficiency of FSIR. To the best of our knowledge, this is the first paper to rigorously prove the minimax optimality of FSIR in estimating the central space for multiple-index models and general $Y$ (not necessarily discrete).
Contract design involves a principal who establishes contractual agreements about payments for outcomes that arise from the actions of an agent. In this paper, we initiate the study of deep learning for the automated design of optimal contracts. We formulate this as an offline learning problem, where a deep network is used to represent the principal's expected utility as a function of the design of a contract. We introduce a novel representation: the Discontinuous ReLU (DeLU) network, which models the principal's utility as a discontinuous piecewise affine function where each piece corresponds to the agent taking a particular action. DeLU networks implicitly learn closed-form expressions for the incentive compatibility constraints of the agent and the utility maximization objective of the principal, and support parallel inference on each piece through linear programming or interior-point methods that solve for optimal contracts. We provide empirical results that demonstrate success in approximating the principal's utility function with a small number of training samples and scaling to find approximately optimal contracts on problems with a large number of actions and outcomes.
In 1982, Papadimitriou and Yannakakis introduced the Exact Matching problem, in which, given a red and blue edge-colored graph $G$ and an integer $k$, one has to decide whether there exists a perfect matching in $G$ with exactly $k$ red edges. Even though a randomized polynomial-time algorithm for this problem was found only a few years later, it is still unknown today whether a deterministic polynomial-time algorithm exists. This makes the Exact Matching problem an important candidate to test the RP=P hypothesis. In this paper we focus on approximating Exact Matching. While there exists a simple algorithm that computes in deterministic polynomial time an almost perfect matching with exactly $k$ red edges, little work focuses on computing perfect matchings with almost $k$ red edges. In fact, such an algorithm for bipartite graphs running in deterministic polynomial time was published only recently (STACS'23). It outputs a perfect matching with $k'$ red edges with the guarantee that $0.5k \leq k' \leq 1.5k$. In the present paper we aim at approximating the number of red edges without exceeding the limit of $k$ red edges. We construct a deterministic polynomial-time algorithm, which on bipartite graphs computes a perfect matching with $k'$ red edges such that $k/3 \leq k' \leq k$.
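To make the problem statement concrete, here is a brute-force sketch of Exact Matching on a tiny bipartite graph. It enumerates all perfect matchings and therefore runs in exponential time; it is not the algorithm of the paper, and the function names and the edge encoding are assumptions made for illustration only.

```python
from itertools import permutations


def red_counts_of_perfect_matchings(n, edges):
    """Return the set of red-edge counts over all perfect matchings.

    The graph is bipartite with left and right vertices 0..n-1, and
    `edges` maps a pair (left, right) to the string "red" or "blue".
    """
    counts = set()
    for perm in permutations(range(n)):
        pairs = [(u, perm[u]) for u in range(n)]
        if all(pair in edges for pair in pairs):  # every pair must be an edge
            counts.add(sum(edges[pair] == "red" for pair in pairs))
    return counts


def exact_matching(n, edges, k):
    """Decide whether some perfect matching has exactly k red edges."""
    return k in red_counts_of_perfect_matchings(n, edges)


def within_one_sided_bound(k_prime, k):
    """Check the one-sided guarantee k/3 <= k' <= k using integers only."""
    return k <= 3 * k_prime and k_prime <= k


# A 4-cycle whose two perfect matchings are all red and all blue.
example_edges = {
    (0, 0): "red", (1, 1): "red",
    (0, 1): "blue", (1, 0): "blue",
}
```

On `example_edges`, the only achievable red counts are 0 and 2, so the instance with $k = 1$ is a no-instance even though both neighbouring values of $k$ are feasible.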
This paper examines the approximation of the log-determinant for large-scale symmetric positive definite matrices. Inspired by the variance reduction technique, we split the approximation of $\log\det(A)$ into two parts. The first part is the trace of the projection of $\log(A)$ onto a suboptimal subspace, while the second is the trace of the projection onto the corresponding orthogonal complement. For these two approximations, the stochastic Lanczos quadrature method is used. Furthermore, in the construction of the suboptimal subspace, we utilize a projection-cost-preserving sketch to bound the size of the Gaussian random matrix and the dimension of the suboptimal subspace. We provide a rigorous error analysis for our proposed method and explicit lower bounds for its design parameters, offering guidance for practitioners. We conduct numerical experiments to demonstrate our method's effectiveness and illustrate the quality of the derived bounds.
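The two-part splitting rests on a standard trace identity. As a sketch, assume $Q$ is a matrix with orthonormal columns spanning the chosen subspace (the symbol $Q$ is our notation for illustration, not necessarily the paper's). Since $QQ^{\top}$ is an orthogonal projector, cyclicity of the trace gives:

```latex
\begin{align*}
\log\det(A) &= \operatorname{tr}\bigl(\log(A)\bigr) \\
            &= \operatorname{tr}\bigl(Q^{\top}\log(A)\,Q\bigr)
             + \operatorname{tr}\bigl((I - QQ^{\top})\,\log(A)\,(I - QQ^{\top})\bigr).
\end{align*}
```

The identity is exact for any such $Q$; approximation error enters only when each of the two traces is estimated.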
Synthetic time series are often used in practical applications to augment the historical time series dataset for better performance of machine learning algorithms, amplify the occurrence of rare events, and also create counterfactual scenarios described by the time series. Distributional-similarity (which we refer to as realism) as well as the satisfaction of certain numerical constraints are common requirements in counterfactual time series scenario generation requests. For instance, the US Federal Reserve publishes synthetic market stress scenarios given by the constrained time series for financial institutions to assess their performance in hypothetical recessions. Existing approaches for generating constrained time series usually penalize training loss to enforce constraints, and reject non-conforming samples. However, these approaches would require re-training if we change constraints, and rejection sampling can be computationally expensive, or impractical for complex constraints. In this paper, we propose a novel set of methods to tackle the constrained time series generation problem and provide efficient sampling while ensuring the realism of generated time series. In particular, we frame the problem using a constrained optimization framework and then we propose a set of generative methods including ``GuidedDiffTime'', a guided diffusion model to generate realistic time series. Empirically, we evaluate our work on several datasets for financial and energy data, where incorporating constraints is critical. We show that our approaches outperform existing work both qualitatively and quantitatively. Most importantly, we show that our ``GuidedDiffTime'' model is the only solution where re-training is not necessary for new constraints, resulting in a significant carbon footprint reduction.
In this paper, we study the problems of detection and recovery of hidden submatrices with elevated means inside a large Gaussian random matrix. We consider two different structures for the planted submatrices. In the first model, the planted matrices are disjoint, and their row and column indices can be arbitrary. Inspired by scientific applications, the second model restricts the row and column indices to be consecutive. In the detection problem, under the null hypothesis, the observed matrix is a realization of independent and identically distributed standard normal entries. Under the alternative, there exists a set of hidden submatrices with elevated means inside the same standard normal matrix. Recovery refers to the task of locating the hidden submatrices. For both problems, and for both models, we characterize the statistical and computational barriers by deriving information-theoretic lower bounds, designing and analyzing algorithms matching those bounds, and proving computational lower bounds based on the low-degree polynomials conjecture. In particular, we show that the space of the model parameters (i.e., number of planted submatrices, their dimensions, and elevated mean) can be partitioned into three regions: the impossible regime, where all algorithms fail; the hard regime, where detection or recovery is statistically possible but we give some evidence that polynomial-time algorithms do not exist; and finally the easy regime, where polynomial-time algorithms exist.
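For the consecutive-index model, one natural baseline is a scan statistic that finds the $k \times k$ contiguous block with the largest entry sum. The sketch below is a generic illustration under that assumption, not necessarily one of the algorithms analyzed in the paper; the function name `max_block_sum` is hypothetical. Under the null hypothesis each block sum is distributed as $N(0, k^2)$, so dividing the returned sum by $k$ gives a standardized score.

```python
def max_block_sum(x, k):
    """Return (best_sum, (row, col)) over all contiguous k x k blocks of x.

    `x` is a rectangular nested list of numbers. A 2-D prefix-sum table makes
    each block sum an O(1) lookup, so the scan costs O(rows * cols) overall.
    """
    n_rows, n_cols = len(x), len(x[0])
    # prefix[i][j] holds the sum of x[r][c] for r < i and c < j.
    prefix = [[0.0] * (n_cols + 1) for _ in range(n_rows + 1)]
    for i in range(n_rows):
        for j in range(n_cols):
            prefix[i + 1][j + 1] = (
                x[i][j] + prefix[i][j + 1] + prefix[i + 1][j] - prefix[i][j]
            )
    best, best_pos = None, None
    for i in range(n_rows - k + 1):
        for j in range(n_cols - k + 1):
            s = (
                prefix[i + k][j + k]
                - prefix[i][j + k]
                - prefix[i + k][j]
                + prefix[i][j]
            )
            if best is None or s > best:  # keep the first block attaining the max
                best, best_pos = s, (i, j)
    return best, best_pos
```

The returned position doubles as a simple recovery estimate for a single planted block, while comparing the standardized score against a threshold gives a detection test.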
In this paper, we devise a scheme for kernelizing, in sublinear space and polynomial time, various problems on planar graphs. The scheme exploits planarity to ensure that the resulting algorithms run in polynomial time and use O((sqrt(n) + k) log n) bits of space, where n is the number of vertices in the input instance and k is the intended solution size. As examples, we apply the scheme to Dominating Set and Vertex Cover. For Dominating Set, we also show that a well-known kernelization algorithm due to Alber et al. (JACM 2004) can be carried out in polynomial time and space O(k log n). Along the way, we devise restricted-memory procedures for computing region decompositions and approximating the aforementioned problems, which might be of independent interest.
We present two conjectures regarding the running time of computing symmetric factorizations for a Hankel matrix $\mathbf{H}$ and its inverse $\mathbf{H}^{-1}$ as $\mathbf{B}\mathbf{B}^*$ under fixed-point arithmetic. If proved, these conjectures would yield a faster-than-matrix-multiplication algorithm for solving sparse poly-conditioned linear programming problems, a fundamental problem in optimization and theoretical computer science. To justify our proposed conjectures and running times, we show weaker results of computing decompositions of the form $\mathbf{B}\mathbf{B}^* - \mathbf{C}\mathbf{C}^*$ for Hankel matrices and their inverses with the same running time. In addition, to promote our conjectures further, we discuss the connections of Hankel matrices and their symmetric factorizations to sum-of-squares (SoS) decompositions of single-variable polynomials.
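The link to sum-of-squares decompositions can be sketched as follows. Assume an $n \times n$ Hankel matrix with entries $\mathbf{H}_{ij} = h_{i+j}$ for $0 \le i, j \le n-1$, a real variable $x$, and the monomial vector $\mathbf{v}(x) = (1, x, \dots, x^{n-1})^{\top}$; the symbols $h$, $\mathbf{v}$, $p$ and $\mathbf{b}_j$ (the $j$-th column of $\mathbf{B}$) are our notation for illustration.

```latex
\begin{align*}
p(x) &= \mathbf{v}(x)^{\top}\mathbf{H}\,\mathbf{v}(x)
      = \sum_{i=0}^{n-1}\sum_{j=0}^{n-1} h_{i+j}\,x^{i+j}, \\
\mathbf{H} = \mathbf{B}\mathbf{B}^{*}
  \;&\Longrightarrow\;
  p(x) = \bigl\lVert \mathbf{B}^{*}\mathbf{v}(x) \bigr\rVert_2^{2}
       = \sum_{j} \bigl\lvert \mathbf{b}_j^{*}\mathbf{v}(x) \bigr\rvert^{2}, \\
\mathbf{H} = \mathbf{B}\mathbf{B}^{*} - \mathbf{C}\mathbf{C}^{*}
  \;&\Longrightarrow\;
  p(x) = \bigl\lVert \mathbf{B}^{*}\mathbf{v}(x) \bigr\rVert_2^{2}
       - \bigl\lVert \mathbf{C}^{*}\mathbf{v}(x) \bigr\rVert_2^{2}.
\end{align*}
```

Each term $\mathbf{b}_j^{*}\mathbf{v}(x)$ is a polynomial in $x$, so a symmetric factorization of $\mathbf{H}$ writes $p$ as a sum of squared moduli of polynomials, while the weaker decomposition writes it as a difference of two such sums.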
Tensor ring (TR) decomposition is an efficient approach to discover the hidden low-rank patterns in higher-order tensors, and streaming tensors are becoming highly prevalent in real-world applications. In this paper, we investigate how to track TR decompositions of streaming tensors. An efficient algorithm is first proposed. Then, based on this algorithm and randomized techniques, we present a randomized streaming TR decomposition. The proposed algorithms make full use of the structure of TR decomposition, and the randomized version allows any sketching type. Theoretical results on sketch size are provided. In addition, the complexity analyses for the obtained algorithms are also given. We compare our proposals with the existing batch methods using both real and synthetic data. Numerical results show that our algorithms require less computing time while maintaining similar accuracy.
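For readers unfamiliar with the format, the usual elementwise definition of a TR decomposition is recalled below. The notation is an assumption on our part: an order-$N$ tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times \cdots \times I_N}$, core tensors $\mathcal{G}_n \in \mathbb{R}^{R_n \times I_n \times R_{n+1}}$ with $R_{N+1} = R_1$, and lateral slices $\mathbf{G}_n(i_n) = \mathcal{G}_n(:, i_n, :)$.

```latex
\[
\mathcal{X}(i_1, i_2, \dots, i_N)
  = \operatorname{tr}\bigl(\mathbf{G}_1(i_1)\,\mathbf{G}_2(i_2)\cdots\mathbf{G}_N(i_N)\bigr),
\qquad
\mathbf{G}_n(i_n) \in \mathbb{R}^{R_n \times R_{n+1}}.
\]
```

The trace closes the chain of cores into a ring, which is the structure a streaming update can exploit when only one mode of the tensor grows.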
In 1954, Alston S. Householder published Principles of Numerical Analysis, one of the first modern treatments of matrix decomposition that favored a (block) LU decomposition, i.e., the factorization of a matrix into the product of lower and upper triangular matrices. Today, matrix decomposition has become a core technology in machine learning, largely due to the development of the back propagation algorithm in fitting a neural network. The sole aim of this survey is to give a self-contained introduction to concepts and mathematical tools in numerical linear algebra and matrix analysis in order to seamlessly introduce matrix decomposition techniques and their applications in subsequent sections. However, we clearly realize our inability to cover all the useful and interesting results concerning matrix decomposition, given the limited scope of this discussion; for example, we omit the separate analysis of the Euclidean space, Hermitian space, Hilbert space, and the complex domain. We refer the reader to the literature in the field of linear algebra for a more detailed introduction to the related fields.