Consider a matroid in which every element is labeled by an element of $\mathbb{Z}_m$. We are interested in finding a base whose labels sum to $g \pmod m$. We show that this problem can be solved in $O(m^{2m} n r \log r)$ time for a matroid with $n$ elements and rank $r$, when $m$ is either the product of two primes or a prime power. The algorithm generalizes to all moduli and, in fact, to all abelian groups, if a classic additive combinatorics conjecture of Schrijver and Seymour is true. We also discuss the optimization version of the problem.
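As a concrete illustration of the problem statement (not the paper's algorithm), the sketch below decides feasibility for the uniform matroid $U_{r,n}$, whose bases are exactly the $r$-element subsets, using a straightforward dynamic program over residues; the function name and the toy instance are ours.

```python
# Toy illustration: congruent-sum base search in the uniform matroid U_{r,n},
# whose bases are exactly the r-element subsets. This brute-force DP is only
# meant to make the problem statement concrete; it is not the paper's algorithm.

def uniform_matroid_base_with_sum(labels, r, m, g):
    """Return True iff some r-subset of `labels` sums to g modulo m."""
    # reachable[k] = residues achievable by choosing k elements so far
    reachable = [set() for _ in range(r + 1)]
    reachable[0].add(0)
    for a in labels:
        # iterate k downwards so each element is used at most once
        for k in range(r - 1, -1, -1):
            for s in reachable[k]:
                reachable[k + 1].add((s + a) % m)
    return (g % m) in reachable[r]

print(uniform_matroid_base_with_sum([1, 2, 2, 4], r=2, m=5, g=1))  # True: {2, 4}
```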
This work concerns elementwise transformations of spiked matrices: $Y_n = n^{-1/2} f( \sqrt{n} X_n + Z_n)$. Here, $f$ is a function applied elementwise, $X_n$ is a low-rank signal matrix, and $Z_n$ is white noise. We find that principal component analysis remains powerful for recovering the signal even under highly nonlinear or discontinuous transformations. Specifically, in the high-dimensional setting where $Y_n$ is of size $n \times p$ with $n,p \rightarrow \infty$ and $p/n \rightarrow \gamma > 0$, we uncover a phase transition: for signal-to-noise ratios above a sharp threshold -- depending on $f$, the distribution of elements of $Z_n$, and the limiting aspect ratio $\gamma$ -- the principal components of $Y_n$ (partially) recover those of $X_n$. Below this threshold, the principal components of $Y_n$ are asymptotically orthogonal to the signal. In contrast, in the standard setting where $X_n + n^{-1/2}Z_n$ is observed directly, the analogous phase transition depends only on $\gamma$. A similar phenomenon occurs when $X_n$ is square and symmetric and $Z_n$ is a generalized Wigner matrix.
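A quick numerical sketch of the model, with arbitrary choices of $f$, dimensions, and signal strengths of our own (not the paper's code): plant a rank-one spike, apply the discontinuous entrywise transformation $f = \mathrm{sign}$, and measure the overlap between the top right singular vector of $Y_n$ and the planted direction.

```python
# Our own quick simulation of the transformed spiked model (not the paper's code);
# f = sign is an arbitrary choice of a discontinuous entrywise transformation.
import numpy as np

rng = np.random.default_rng(0)
n, p = 1000, 500                                  # gamma = p / n = 0.5
u = rng.standard_normal(n); u /= np.linalg.norm(u)
v = rng.standard_normal(p); v /= np.linalg.norm(v)

def overlap(theta, f=np.sign):
    X = theta * np.outer(u, v)                    # rank-one signal
    Z = rng.standard_normal((n, p))               # white noise
    Y = f(np.sqrt(n) * X + Z) / np.sqrt(n)        # Y_n = n^{-1/2} f(sqrt(n) X_n + Z_n)
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return abs(Vt[0] @ v)                         # |<top right singular vector, v>|

for theta in (0.3, 3.0):                          # small vs. large SNR (illustrative)
    print(theta, overlap(theta))
```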
We consider the performance of a least-squares regression model, as judged by out-of-sample $R^2$. Shapley values give a fair attribution of the performance of a model to its input features, taking into account interdependencies between features. Evaluating the Shapley values exactly requires solving a number of regression problems that is exponential in the number of features, so a Monte Carlo-type approximation is typically used. We focus on the special case of least-squares regression models, where several tricks can be used to compute and evaluate regression models efficiently. These tricks give a substantial speedup, allowing many more Monte Carlo samples to be evaluated and thus better accuracy to be achieved. We refer to our method as least-squares Shapley performance attribution (LS-SPA), and describe our open-source implementation.
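For concreteness, here is a naive Monte Carlo sketch of Shapley attribution of out-of-sample $R^2$ for least squares (our own illustration; every coalition is refit from scratch, so none of the LS-SPA speedups appear here, and all function names are ours).

```python
# Naive Monte Carlo Shapley attribution of out-of-sample R^2 for least squares.
# Minimal illustration only: every coalition is refit from scratch, so this does
# not include the computational tricks used by LS-SPA.
import numpy as np

def r2_of_subset(S, Xtr, ytr, Xte, yte):
    if len(S) == 0:
        pred = np.full_like(yte, ytr.mean())      # empty coalition: predict the mean
    else:
        beta, *_ = np.linalg.lstsq(Xtr[:, S], ytr, rcond=None)
        pred = Xte[:, S] @ beta
    return 1.0 - np.sum((yte - pred) ** 2) / np.sum((yte - yte.mean()) ** 2)

def shapley_r2(Xtr, ytr, Xte, yte, n_samples=200, seed=0):
    rng = np.random.default_rng(seed)
    d = Xtr.shape[1]
    phi = np.zeros(d)
    for _ in range(n_samples):
        perm = rng.permutation(d)                 # random feature ordering
        prev, S = r2_of_subset([], Xtr, ytr, Xte, yte), []
        for j in perm:
            S.append(j)
            cur = r2_of_subset(S, Xtr, ytr, Xte, yte)
            phi[j] += cur - prev                  # marginal contribution of feature j
            prev = cur
    return phi / n_samples                        # sums to R^2(all) - R^2(empty)

# tiny synthetic check
rng = np.random.default_rng(1)
X = rng.standard_normal((300, 5))
y = X @ np.array([1.0, 0.5, 0.0, 0.0, 2.0]) + 0.1 * rng.standard_normal(300)
print(shapley_r2(X[:200], y[:200], X[200:], y[200:], n_samples=50))
```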
We consider the time and space required for quantum computers to solve a wide variety of problems involving matrices, many of which have only been analyzed classically in prior work. Our main results show that for a range of linear algebra problems -- including matrix-vector product, matrix inversion, matrix multiplication and powering -- existing classical time-space tradeoffs, several of which are tight for every space bound, also apply to quantum algorithms. For example, for almost all matrices $A$, including the discrete Fourier transform (DFT) matrix, we prove that quantum circuits with at most $T$ input queries and $S$ qubits of memory require $T=\Omega(n^2/S)$ to compute matrix-vector product $Ax$ for $x \in \{0,1\}^n$. We similarly prove that matrix multiplication for $n\times n$ binary matrices requires $T=\Omega(n^3 / \sqrt{S})$. Because many of our lower bounds match deterministic algorithms with the same time and space complexity, we show that quantum computers cannot provide any asymptotic advantage for these problems with any space bound. We obtain matching lower bounds for the stronger notion of quantum cumulative memory complexity -- the sum of the space per layer of a circuit. We also consider Boolean (i.e. AND-OR) matrix multiplication and matrix-vector products, improving the previous quantum time-space tradeoff lower bounds for $n\times n$ Boolean matrix multiplication to $T=\Omega(n^{2.5}/S^{1/3})$ from $T=\Omega(n^{2.5}/S^{1/2})$. Our improved lower bound for Boolean matrix multiplication is based on a new coloring argument that extracts more from the strong direct product theorem used in prior work. Our tight lower bounds for linear algebra problems require adding a new bucketing method to the recording-query technique of Zhandry that lets us apply classical arguments to upper bound the success probability of quantum circuits.
Gaussian processes (GPs) are sophisticated distributions for modeling functional data. Whilst theoretically appealing, they are computationally cumbersome except for small datasets. We implement two methods for scaling GP inference in Stan: first, a general sparse approximation using a directed acyclic dependency graph; second, a fast, exact method for regularly spaced data modeled by GPs with stationary kernels using the fast Fourier transform. Based on benchmark experiments, we offer guidance for practitioners to decide between different methods and parameterizations. We consider two real-world examples to illustrate the package. The implementation follows Stan's design and exposes performant inference through a familiar interface. Full posterior inference for ten thousand data points is feasible on a laptop in less than 20 seconds. Details on how to get started using the popular interfaces cmdstanpy for Python and cmdstanr for R are provided.
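As context for why scaling matters, the sketch below shows the standard dense GP regression, with its cubic-cost likelihood, written directly in Stan and fit from Python with cmdstanpy; it is our own baseline illustration and does not show the package's interface, its sparse approximation, or its FFT method.

```python
# Context only: the standard dense (cubic-cost) GP regression written directly
# in Stan and fit from Python with cmdstanpy. This is our own baseline sketch;
# it does not use the package's sparse or FFT-based methods.
import numpy as np
from cmdstanpy import CmdStanModel

stan_code = """
data {
  int<lower=1> n;
  array[n] real x;
  vector[n] y;
}
parameters {
  real<lower=0> alpha;   // marginal standard deviation
  real<lower=0> rho;     // length scale
  real<lower=0> sigma;   // observation noise
}
model {
  matrix[n, n] K = gp_exp_quad_cov(x, alpha, rho)
                   + diag_matrix(rep_vector(square(sigma), n));
  alpha ~ normal(0, 1);
  rho ~ inv_gamma(5, 5);
  sigma ~ normal(0, 1);
  y ~ multi_normal_cholesky(rep_vector(0, n), cholesky_decompose(K));
}
"""
with open("gp_dense.stan", "w") as fh:
    fh.write(stan_code)

rng = np.random.default_rng(0)
n = 100
x = np.linspace(0.0, 1.0, n)
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(n)

model = CmdStanModel(stan_file="gp_dense.stan")   # compiles the Stan program
fit = model.sample(data={"n": n, "x": x.tolist(), "y": y.tolist()},
                   chains=4, iter_sampling=500)
print(fit.summary())
```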
Standard multidimensional scaling takes as input a dissimilarity matrix whose generic entry $\delta_{ij}$ is a single numerical value. In this paper the input is an interval $\delta_{ij}=[\underline{\delta_{ij}},\overline{\delta_{ij}}]$, where $\underline{\delta_{ij}}$ and $\overline{\delta_{ij}}$ are respectively the lower and upper bounds of the ``dissimilarity'' between the stimuli/objects $S_i$ and $S_j$. As output, instead of representing each stimulus/object by a point on a factorial plane, as other multidimensional scaling methods do, the proposed method visualizes each stimulus/object as a rectangle, in order to represent the variation of the dissimilarities. We generalize the classical scaling method, seeking a method that produces results similar to those obtained by Tops Principal Components Analysis. Two examples are presented to illustrate the effectiveness of the proposed method.
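For orientation only, the sketch below shows a naive baseline, not the proposed method: classical (Torgerson) scaling applied to the interval midpoints, with the half-ranges used to size a rectangle around each object; all choices here are ours.

```python
# Naive baseline for interval dissimilarities (NOT the paper's method): classical
# (Torgerson) scaling of the midpoints, with half-ranges giving crude rectangle sizes.
import numpy as np

def classical_mds(D, k=2):
    """Classical (Torgerson) scaling of a dissimilarity matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J                   # double-centered matrix
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:k]                 # top-k eigenpairs
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# Interval dissimilarities [lower, upper] for three objects
lower = np.array([[0, 2, 5], [2, 0, 3], [5, 3, 0]], float)
upper = np.array([[0, 4, 7], [4, 0, 5], [7, 5, 0]], float)

centers = classical_mds((lower + upper) / 2)      # one point per object
half_sides = (upper - lower).mean(axis=1) / 2     # crude rectangle half-sides
print(centers)
print(half_sides)
```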
Minimum sum vertex cover of an $n$-vertex graph $G$ is a bijection $\phi : V(G) \to [n]$ that minimizes the cost $\sum_{\{u,v\} \in E(G)} \min \{\phi(u), \phi(v) \}$. Finding a minimum sum vertex cover of a graph (the MSVC problem) is NP-hard. MSVC is well studied in the realm of approximation algorithms. The best-known approximation factor achievable in polynomial time for the problem is $16/9$ [Bansal, Batra, Farhadi, and Tetali, SODA 2021]. Recently, Stankovic [APPROX/RANDOM 2022] proved that achieving an approximation ratio better than $1.014$ for MSVC is NP-hard, assuming the Unique Games Conjecture. We study the MSVC problem from the perspective of parameterized algorithms. The parameters we consider are the size of a minimum vertex cover and the size of a minimum clique modulator of the input graph. We obtain the following results. 1. MSVC can be solved in $2^{2^{O(k)}} n^{O(1)}$ time, where $k$ is the size of a minimum vertex cover. 2. MSVC can be solved in $f(k)\cdot n^{O(1)}$ time for some computable function $f$, where $k$ is the size of a minimum clique modulator.
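To make the objective concrete, the following brute-force check evaluates the MSVC cost of every ordering of a tiny graph; it is exponential in $n$ and intended only as an illustration of the definition, with function names of our own.

```python
# Brute-force evaluation of the minimum sum vertex cover objective on a tiny
# graph; exponential in n and intended only to make the definition concrete.
from itertools import permutations

def msvc_cost(phi, edges):
    """Cost of an ordering phi (vertex -> position in 1..n)."""
    return sum(min(phi[u], phi[v]) for u, v in edges)

def msvc_brute_force(n, edges):
    best = None
    for perm in permutations(range(1, n + 1)):
        phi = dict(enumerate(perm))               # vertex i gets position perm[i]
        cost = msvc_cost(phi, edges)
        if best is None or cost < best[0]:
            best = (cost, phi)
    return best

# Path on 4 vertices: 0-1-2-3; the optimum places vertex 1 or 2 first (cost 4)
print(msvc_brute_force(4, [(0, 1), (1, 2), (2, 3)]))
```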
Deep equilibrium (DEQ) models have emerged as a promising class of implicit layer models, which abandon traditional depth by solving for the fixed points of a single nonlinear layer. Despite their success, the stability of these fixed points remains poorly understood. By viewing DEQ models as nonlinear dynamical systems, we propose a robust DEQ model, named LyaDEQ, whose stability is provably guaranteed via Lyapunov theory. The crux of our method is ensuring the Lyapunov stability of the DEQ model's fixed points, which enables the proposed model to resist minor initial perturbations. To avoid the poor adversarial defense that results when Lyapunov-stable fixed points are located near one another, we orthogonalize the layers after the Lyapunov stability module to separate the fixed points. We evaluate LyaDEQ models under well-known adversarial attacks, and experimental results demonstrate a significant improvement in robustness. Furthermore, we show that the LyaDEQ model can be combined with other defense methods, such as adversarial training, to achieve even better adversarial robustness.
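As background, here is a bare-bones DEQ forward pass (fixed-point iteration of a single nonlinear layer, written in PyTorch); this generic sketch is ours and does not include LyaDEQ's Lyapunov stability module or the orthogonalized layers that follow it.

```python
# Minimal DEQ-style forward pass: iterate a single nonlinear layer toward a fixed
# point z* = tanh(W z* + U x). Generic background only, not LyaDEQ itself; the
# Lyapunov stability module and orthogonalized layers are omitted.
import torch
import torch.nn as nn

class SimpleDEQ(nn.Module):
    def __init__(self, d_in, d_hidden, max_iter=50, tol=1e-4):
        super().__init__()
        self.W = nn.Linear(d_hidden, d_hidden, bias=False)
        self.U = nn.Linear(d_in, d_hidden)
        self.max_iter, self.tol = max_iter, tol

    def forward(self, x):
        z = torch.zeros(x.shape[0], self.W.in_features, device=x.device)
        for _ in range(self.max_iter):
            z_next = torch.tanh(self.W(z) + self.U(x))
            if (z_next - z).norm() < self.tol * (z.norm() + 1e-8):
                z = z_next
                break                              # fixed point reached (to tolerance)
            z = z_next
        return z

x = torch.randn(8, 16)
print(SimpleDEQ(16, 32)(x).shape)                  # torch.Size([8, 32])
```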
Feature bagging is a well-established ensembling method which aims to reduce prediction variance by combining predictions of many estimators trained on subsets or projections of the features. Here, we develop a theory of feature bagging in noisy least-squares ridge ensembles and simplify the resulting learning curves in the special case of equicorrelated data. Using these analytical learning curves, we demonstrate that subsampling shifts the double-descent peak of a linear predictor. This leads us to introduce heterogeneous feature ensembling, with estimators built on varying numbers of feature dimensions, as a computationally efficient method to mitigate double descent. We then compare the performance of a feature-subsampling ensemble to that of a single linear predictor, describing a trade-off between noise amplification due to subsampling and noise reduction due to ensembling. Our qualitative insights carry over to linear classifiers applied to image classification tasks with realistic datasets constructed using a state-of-the-art deep learning feature map.
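A minimal numpy sketch of a feature-subsampling ridge ensemble in the spirit of the setup above (the toy data and parameter values are our own): each member is ridge regression fit on a random subset of feature dimensions, and predictions are averaged.

```python
# Toy feature-subsampling ridge ensemble: each member sees a random subset of
# features; predictions are averaged across members. All parameters are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
n, d, k, n_members, lam = 200, 100, 40, 20, 1e-1   # samples, features, subset size, members, ridge

beta = rng.standard_normal(d) / np.sqrt(d)          # ground-truth linear signal
X, Xte = rng.standard_normal((n, d)), rng.standard_normal((500, d))
y = X @ beta + 0.5 * rng.standard_normal(n)
yte = Xte @ beta + 0.5 * rng.standard_normal(500)

preds = np.zeros(500)
for _ in range(n_members):
    S = rng.choice(d, size=k, replace=False)        # random feature subset
    A = X[:, S]
    w = np.linalg.solve(A.T @ A + lam * np.eye(k), A.T @ y)   # ridge fit on the subset
    preds += Xte[:, S] @ w
preds /= n_members                                  # average the members' predictions
print("ensemble test MSE:", np.mean((yte - preds) ** 2))
```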
We propose and analyze a class of meshfree, super-algebraically convergent methods for partial differential equations (PDEs) on surfaces, based on Fourier extensions that minimize a measure of non-smoothness (such as a Sobolev norm). Current spectral methods for surface PDEs are primarily limited to a small class of surfaces, such as subdomains of spheres. Other high-order methods for surface PDEs typically use radial basis functions (RBFs); many of these methods are not well understood analytically for surface PDEs and are highly ill-conditioned. Our methods work by extending a surface PDE into a box-shaped domain so that the differential operators of the extended function agree with the surface differential operators, as in the Closest Point Method. The methods can be proven to converge super-algebraically for certain well-posed linear PDEs, and spectral convergence to machine precision has been observed numerically for a variety of problems. Our approach works on arbitrary smooth surfaces (closed or non-closed) defined by point clouds, under minimal conditions.
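A one-dimensional toy of the Fourier-extension idea (our own construction, not the surface-PDE machinery): fit a Fourier series that is periodic on an enlarged interval to samples of a non-periodic function on a subinterval, with a Sobolev-type penalty selecting a smooth extension.

```python
# 1-D toy of a Fourier extension: represent a non-periodic function on [0, 1]
# by a Fourier series that is periodic on the larger interval [0, 2), choosing
# coefficients by least squares with a Sobolev-type smoothness penalty.
# This is our own illustration, not the surface-PDE solver of the paper.
import numpy as np

K, period, lam, s = 40, 2.0, 1e-10, 2               # arbitrary illustrative choices
x = np.linspace(0.0, 1.0, 201)                       # samples only on [0, 1]
f = np.exp(x) * np.sin(3 * x)                        # non-periodic target function

k = np.arange(-K, K + 1)
A = np.exp(2j * np.pi * np.outer(x, k) / period)     # Fourier basis, periodic on [0, period)
D = np.diag((1.0 + k.astype(float) ** 2) ** s)       # Sobolev-type penalty weights
c = np.linalg.solve(A.conj().T @ A + lam * D, A.conj().T @ f)

x_fine = np.linspace(0.0, 1.0, 1000)
approx = np.exp(2j * np.pi * np.outer(x_fine, k) / period) @ c
err = np.max(np.abs(approx.real - np.exp(x_fine) * np.sin(3 * x_fine)))
print("max error on [0, 1]:", err)
```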
Existing structural analysis methods may fail to find all hidden constraints for a system of differential-algebraic equations with parameters if the system is structurally unamenable for certain values of the parameters. In this paper, for polynomial systems of differential-algebraic equations, we give numerical methods that handle such cases using numerical real algebraic geometry. First, we propose an embedding method that, for a given real analytic system, constructs an equivalent system with a full-rank Jacobian matrix. Second, we introduce a witness point method, which helps detect degeneration on all components of the constraints of such systems. Third, the two methods above lead to a numerical global structural analysis method for structurally unamenable differential-algebraic equations on all components of the constraints.
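A tiny symbolic illustration of the kind of degeneration at issue (our own example, not the paper's methods): for a parameterized linear DAE, the Jacobian with respect to the highest-order derivatives loses rank exactly where its determinant vanishes in the parameter.

```python
# Symbolic toy example (not the paper's method): a parameterized linear DAE whose
# Jacobian with respect to the highest derivatives degenerates at p = 1, the kind
# of parameter value at which a purely structural analysis can miss hidden constraints.
import sympy as sp

t, p = sp.symbols("t p")
x, y = sp.Function("x")(t), sp.Function("y")(t)
eqs = [x.diff(t) + y.diff(t) - sp.sin(t),
       x.diff(t) + p * y.diff(t) - sp.cos(t)]

highest = [x.diff(t), y.diff(t)]                       # highest-order derivatives
J = sp.Matrix([[sp.diff(eq, v) for v in highest] for eq in eqs])
print(J)                                               # Matrix([[1, 1], [1, p]])
print(sp.solve(J.det(), p))                            # [1]  -> degeneration at p = 1
```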