The rigidity of a matrix $A$ for target rank $r$ is the minimum number of entries of $A$ that need to be changed in order to obtain a matrix of rank at most $r$. At MFCS'77, Valiant introduced matrix rigidity as a tool to prove circuit lower bounds for linear functions, and since then this notion has received much attention and found applications in other areas of complexity theory. The problem of constructing an explicit family of matrices that are sufficiently rigid for Valiant's reduction (Valiant-rigid) still remains open. Moreover, since 2017 most of the long-studied candidates have been shown not to be Valiant-rigid. Some of those former candidates for rigidity are Kronecker products of small matrices. In a recent paper (STOC'21), Alman gave a general non-rigidity result for such matrices: he showed that if an $n\times n$ matrix $A$ (over any field) is a Kronecker product of $d\times d$ matrices $M_1,\dots, M_k$ (so $n=d^k$, with $d\ge 2$), then by changing only $n^{1+\varepsilon}$ entries of $A$ one can reduce its rank to at most $n^{1-\gamma}$, where $1/\gamma$ is roughly $2^d/\varepsilon^2$. In this note we improve this result in two directions. First, we do not require the matrices $M_i$ to have equal size. Second, we reduce $1/\gamma$ from exponential in $d$ to roughly $d^{3/2}/\varepsilon^2$ (where $d$ is the maximum size of the matrices $M_i$), and to nearly linear (roughly $d/\varepsilon^2$) for matrices $M_i$ of sizes within a constant factor of each other. As an application of our results we significantly expand the class of Hadamard matrices that are known not to be Valiant-rigid; these now include the Kronecker products of Paley-Hadamard matrices and Hadamard matrices of bounded size.
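One basic fact behind such non-rigidity arguments is that rank is multiplicative under Kronecker products: $\mathrm{rank}(M_1\otimes M_2)=\mathrm{rank}(M_1)\,\mathrm{rank}(M_2)$, so low-rank structure in the small factors composes into low-rank structure of the big matrix. A minimal numpy sketch with a Hadamard factor (an illustration of the multiplicativity only, not Alman's construction or ours):

```python
import numpy as np

H = np.array([[1, 1], [1, -1]])           # 2x2 Hadamard factor
A = H
for _ in range(3):
    A = np.kron(A, H)                     # A = 4-fold Kronecker power, 16x16
assert np.linalg.matrix_rank(A) == 16     # full rank

# Dropping the rank of one small factor drops the rank of the whole
# Kronecker product multiplicatively:
H_low = np.array([[1, 1], [1, 1]])        # one entry of H changed; rank 1
A_low = np.kron(np.kron(np.kron(H_low, H), H), H)
assert np.linalg.matrix_rank(A_low) == 8  # 1 * 2 * 2 * 2
```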
For optimal control problems constrained by an initial-value parabolic PDE, one has to solve a large-scale saddle-point algebraic system that couples all the discrete space and time points. A popular strategy for handling such a system is a Krylov subspace method, for which an efficient preconditioner plays a crucial role. The matching Schur complement preconditioner has been extensively studied in the literature, and its implementation requires solving the underlying PDEs twice, sequentially in time. In this paper, we propose a new preconditioner for the Schur complement, which can be applied parallel-in-time (PinT) via the so-called diagonalization technique. We show that the eigenvalues of the preconditioned matrix are bounded below and above by positive constants independent of the matrix size and the regularization parameter. The uniform boundedness of the eigenvalues leads to an optimal linear convergence rate of the conjugate gradient solver for the preconditioned Schur complement system. To the best of our knowledge, this is the first optimal convergence analysis for a PinT preconditioning technique for the optimal control problem. Numerical results are reported to show that the performance of the proposed preconditioner is robust with respect to the discretization step-sizes and the regularization parameter.
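Why uniform eigenvalue bounds translate into a size-independent iteration count can be seen in a generic preconditioned CG sketch. Below, a 1D Laplacian stands in for the discretized operator and an exact inverse stands in for an ideal preconditioner (all eigenvalues of $M^{-1}A$ equal to 1); this is only schematic and is not the PinT preconditioner of the paper, and the function names are ours:

```python
import numpy as np

def pcg(A, b, M_inv, tol=1e-10, maxit=1000):
    """Preconditioned conjugate gradient; M_inv applies the preconditioner."""
    x = np.zeros_like(b)
    r = b.copy()
    z = M_inv(r)
    p = z.copy()
    rz = r @ z
    for k in range(maxit):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            return x, k + 1
        z = M_inv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, maxit

n = 100                                   # 1D Laplacian as a model operator
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.random.default_rng(0).standard_normal(n)

_, it_plain = pcg(A, b, lambda r: r)      # no preconditioner: many iterations
A_inv = np.linalg.inv(A)                  # ideal preconditioner: spectrum of
_, it_ideal = pcg(A, b, lambda r: A_inv @ r)  # M^{-1}A is clustered at 1
```

With the clustered spectrum, CG converges in a handful of iterations regardless of `n`; without preconditioning, the iteration count grows with the condition number.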
We study the mixing time of the Metropolis-adjusted Langevin algorithm (MALA) for sampling from a log-smooth and strongly log-concave distribution. We establish its optimal minimax mixing time under a warm start. Our main contribution is two-fold. First, for a $d$-dimensional log-concave density with condition number $\kappa$, we show that MALA with a warm start mixes in $\tilde O(\kappa \sqrt{d})$ iterations, up to logarithmic factors. This improves upon previous work in its dependency on either the condition number $\kappa$ or the dimension $d$. Our proof relies on comparing the leapfrog integrator with the continuous Hamiltonian dynamics, where we establish a new concentration bound for the acceptance rate. Second, we prove a spectral-gap-based mixing time lower bound for reversible MCMC algorithms on general state spaces. We apply this lower bound result to construct a hard distribution for which MALA requires at least $\tilde \Omega (\kappa \sqrt{d})$ steps to mix. The lower bound for MALA matches our upper bound in terms of the condition number and the dimension. Finally, numerical experiments are included to validate our theoretical results.
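A minimal, unoptimized MALA sketch for a Gaussian target may help fix notation: one Euler step of the Langevin diffusion proposes $y \sim \mathcal{N}(x + h\nabla\log\pi(x),\, 2hI)$, and a Metropolis correction accepts or rejects. The step size and target here are illustrative, not the tuned choices behind the $\tilde O(\kappa\sqrt{d})$ bound:

```python
import numpy as np

def mala(grad_logpi, logpi, x0, h, n_iter, rng):
    """Metropolis-adjusted Langevin: Langevin proposal + accept/reject."""
    x = np.array(x0, dtype=float)
    samples = []
    for _ in range(n_iter):
        # proposal y ~ N(x + h * grad log pi(x), 2h * I)
        y = x + h * grad_logpi(x) + np.sqrt(2 * h) * rng.standard_normal(x.shape)
        # log proposal densities q(y|x) and q(x|y), up to a common constant
        log_q_xy = -np.sum((y - x - h * grad_logpi(x)) ** 2) / (4 * h)
        log_q_yx = -np.sum((x - y - h * grad_logpi(y)) ** 2) / (4 * h)
        log_alpha = logpi(y) - logpi(x) + log_q_yx - log_q_xy
        if np.log(rng.uniform()) < log_alpha:
            x = y
        samples.append(x.copy())
    return np.array(samples)

# target: standard Gaussian in d dimensions (condition number kappa = 1)
d = 5
logpi = lambda x: -0.5 * x @ x
grad_logpi = lambda x: -x
rng = np.random.default_rng(0)
chain = mala(grad_logpi, logpi, np.full(d, 3.0), h=0.2, n_iter=5000, rng=rng)
print(chain[1000:].mean(axis=0))   # close to 0 after burn-in
```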
In this paper, we shed new light on the spectrum of relation algebra $32_{65}$. We show that 1024 is in the spectrum, and no number smaller than 26 is in the spectrum. In addition, we derive upper and lower bounds on the smallest member of the spectra of an infinite class of algebras derived from $32_{65}$ via splitting.
Insertion-deletion codes were introduced to correct synchronization errors. In this paper we prove several coordinate-ordering-free upper bounds on the insdel distances of linear codes, based on the generalized Hamming weights and the structure of minimum-Hamming-weight codewords. Our bounds are stronger than some previously known bounds. We apply these upper bounds to some cyclic codes and to an algebraic-geometric code, under any rearrangement of coordinate positions. Strong upper bounds on the insdel distances of Reed-Muller codes under special coordinate orderings are also given.
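For concreteness, the insdel distance between two words $u,v$ equals $|u|+|v|-2\,\mathrm{LCS}(u,v)$, where $\mathrm{LCS}$ is the longest common subsequence, and it can be computed by brute force for a small code. The $[7,4]$ Hamming code below is our illustrative choice, not one of the codes analyzed in the paper:

```python
import itertools

def lcs(u, v):
    """Length of the longest common subsequence (dynamic programming)."""
    m, n = len(u), len(v)
    T = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            T[i + 1][j + 1] = (T[i][j] + 1 if u[i] == v[j]
                               else max(T[i][j + 1], T[i + 1][j]))
    return T[m][n]

def insdel_dist(u, v):
    # minimum number of insertions/deletions turning u into v
    return len(u) + len(v) - 2 * lcs(u, v)

# generator matrix of the binary [7,4] Hamming code (systematic form)
G = [(1,0,0,0,0,1,1), (0,1,0,0,1,0,1), (0,0,1,0,1,1,0), (0,0,0,1,1,1,1)]
codewords = set()
for coeffs in itertools.product([0, 1], repeat=4):
    cw = tuple(sum(c * g for c, g in zip(coeffs, col)) % 2
               for col in zip(*G))
    codewords.add(cw)

d_insdel = min(insdel_dist(u, v)
               for u, v in itertools.combinations(codewords, 2))
d_hamming = min(sum(a != b for a, b in zip(u, v))
                for u, v in itertools.combinations(codewords, 2))
assert d_hamming == 3
assert d_insdel <= 2 * d_hamming   # a general upper bound for equal lengths
```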
In this work, we propose a reduced basis method for the efficient solution of parametric linear systems. The coefficient matrix is assumed to be a linear matrix-valued function that is symmetric and positive definite for admissible values of the parameter $\mathbf{\sigma}\in \mathbb{R}^s$. We propose a solution strategy where one first computes a basis for the appropriate compound Krylov subspace and then uses this basis to compute a subspace solution for multiple $\mathbf{\sigma}$. Three kinds of compound Krylov subspaces are discussed. Error estimates are given for the subspace solutions from each of these spaces. Theoretical results are demonstrated by numerical examples related to solving parameter dependent elliptic PDEs using the finite element method (FEM).
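A simplified sketch of the basis-then-project strategy, using an ordinary Krylov basis built at a single reference parameter rather than the compound Krylov subspaces studied in the paper (the matrix sizes and the one-parameter family below are illustrative):

```python
import numpy as np

def krylov_basis(A, b, m):
    """Orthonormal basis of span{b, Ab, ..., A^(m-1) b}
    (Arnoldi with modified Gram-Schmidt)."""
    Q = np.zeros((len(b), m))
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(1, m):
        w = A @ Q[:, j - 1]
        for i in range(j):
            w -= (Q[:, i] @ w) * Q[:, i]
        Q[:, j] = w / np.linalg.norm(w)
    return Q

rng = np.random.default_rng(1)
n, m = 200, 30
M = rng.standard_normal((n, n))
A0 = M @ M.T + n * np.eye(n)               # SPD parameter-independent part
A1 = np.diag(rng.uniform(0.5, 1.5, n))     # parameter-dependent part
b = rng.standard_normal(n)

Q = krylov_basis(A0 + 1.0 * A1, b, m)      # basis built once, at sigma = 1
errs = []
for sigma in [0.5, 1.0, 2.0]:              # A(sigma) is SPD for sigma >= 0
    A = A0 + sigma * A1
    y = np.linalg.solve(Q.T @ A @ Q, Q.T @ b)   # small projected system
    x_true = np.linalg.solve(A, b)
    errs.append(np.linalg.norm(Q @ y - x_true) / np.linalg.norm(x_true))
print([f"{e:.1e}" for e in errs])
```

The Galerkin solution of the small $m\times m$ system is cheap for each new parameter value; the error at the reference parameter is near machine precision, and stays small nearby.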
We obtain explicit $p$-Wasserstein distance error bounds between the distribution of the multi-parameter MLE and the multivariate normal distribution. Our general bounds, stated for possibly high-dimensional, independent and identically distributed random vectors, are of the optimal $\mathcal{O}(n^{-1/2})$ order. Explicit numerical constants are given when $p\in(1,2]$; for $p>2$ the bounds are explicit up to a constant factor that depends only on $p$. We apply our general bounds to derive Wasserstein distance error bounds for the multivariate normal approximation of the MLE in several settings: single-parameter exponential families, the normal distribution under canonical parametrisation, and the multivariate normal distribution under non-canonical parametrisation. In addition, we provide upper bounds with respect to the bounded Wasserstein distance when the MLE is implicitly defined.
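The $\mathcal{O}(n^{-1/2})$ rate can be observed in a small Monte Carlo experiment: standardize the MLE of an exponential rate and measure an empirical $1$-Wasserstein distance to a Gaussian reference sample. This is an illustration of the rate only, not the paper's explicit bounds:

```python
import numpy as np

rng = np.random.default_rng(0)
lam, m = 2.0, 10000      # true rate; number of Monte Carlo replications

def w1(a, b):
    """Empirical 1-Wasserstein distance between two equal-size samples."""
    return np.mean(np.abs(np.sort(a) - np.sort(b)))

z = rng.standard_normal(m)               # reference N(0,1) sample
dists = []
for n in [25, 100, 400]:
    x = rng.exponential(1 / lam, size=(m, n))
    mle = 1 / x.mean(axis=1)             # MLE of the rate is 1 / sample mean
    standardized = np.sqrt(n) * (mle - lam) / lam   # asymptotic variance lam^2
    dists.append(w1(standardized, z))
print(dists)   # shrinks roughly like n^(-1/2)
```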
We consider $n$ robots with limited visibility: each robot can observe other robots only up to a constant distance, called the viewing range. The robots operate in discrete rounds that are either fully synchronous (FSync) or semi-synchronous (SSync). Most previously studied formation problems in this setting seek to bring the robots closer together (e.g., Gathering or Chain-Formation). In this work, we introduce the Max-Line-Formation problem, which has a contrary goal: to arrange the robots on a straight line of maximal length. First, we prove that the problem cannot be solved by robots with a constant-sized circular viewing range. The impossibility holds under comparably strong assumptions: robots that agree on both axes of their local coordinate systems in FSync. On the positive side, we show that the problem is solvable by robots with a constant square viewing range, i.e., the robots can observe other robots that lie within a constant-sized square centered at their position. In this case, the robots need to agree on only one axis of their local coordinate systems. We derive two algorithms: the first algorithm considers oblivious robots and converges to the optimal configuration in time $\mathcal{O}(n^2 \cdot \log (n/\varepsilon))$ under the SSync scheduler. The other algorithm makes use of locally visible lights (LUMI). It is designed for the FSync scheduler and can solve the problem exactly in optimal time $\Theta(n)$. Afterward, we show that both the algorithmic and the analysis techniques can also be applied to the Gathering and Chain-Formation problems: we introduce an algorithm with a reduced viewing range for Gathering and give new and improved runtime bounds for the Chain-Formation problem.
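As a toy illustration of limited-visibility dynamics, here is the classic go-to-local-center heuristic for Gathering in FSync rounds (this is not one of the algorithms introduced in the paper; the viewing range, step factor, and initial distribution are illustrative). Since every move is a convex combination of visible positions, the convex hull, and hence the diameter, of the swarm never grows:

```python
import numpy as np

rng = np.random.default_rng(0)
n, view = 20, 2.0
pos = rng.uniform(0, 4, size=(n, 2))     # initial positions in a 4x4 square

def max_pairwise(p):
    return max(np.linalg.norm(a - b) for a in p for b in p)

init_spread = max_pairwise(pos)
for _ in range(200):                     # fully synchronous (FSync) rounds
    new = pos.copy()
    for i in range(n):
        d = np.linalg.norm(pos - pos[i], axis=1)
        nbrs = pos[d <= view]            # robots within the viewing range
        new[i] = pos[i] + 0.5 * (nbrs.mean(axis=0) - pos[i])
    pos = new
final_spread = max_pairwise(pos)

# moves are convex combinations of visible positions: diameter cannot grow
assert final_spread <= init_spread + 1e-9
print(init_spread, final_spread)
```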
In this paper, we consider the multi-armed bandit problem with high-dimensional features. First, we prove a minimax lower bound of order $(\log d)^{\frac{\alpha+1}{2}}T^{\frac{1-\alpha}{2}}+\log T$ on the cumulative regret, in terms of the horizon $T$, the dimension $d$, and a margin parameter $\alpha\in[0,1]$ that controls the separation between the optimal and the sub-optimal arms. This new lower bound unifies existing regret bounds, whose different dependencies on $T$ correspond to the different values of the margin parameter $\alpha$ implied by their assumptions. Second, we propose a simple and computationally efficient algorithm, inspired by the general Upper Confidence Bound (UCB) strategy, that achieves a regret upper bound matching the lower bound. The proposed algorithm uses a properly centered $\ell_1$-ball as the confidence set, in contrast to the commonly used ellipsoid confidence set. In addition, the algorithm does not require any forced-sampling step and is thereby adaptive to the practically unknown margin parameter. Simulations and a real data analysis are conducted to compare the proposed method with existing ones in the literature.
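One computational convenience of an $\ell_1$-ball confidence set is that maximizing a linear function over it has a closed form: the maximum of $\langle x,\theta\rangle$ over $\{\theta:\|\theta-\hat\theta\|_1\le r\}$ is $\langle x,\hat\theta\rangle + r\|x\|_\infty$, attained at a vertex, in contrast to the $\|x\|_{V^{-1}}$ bonus of an ellipsoid set. A minimal numerical check (not the paper's algorithm; all quantities illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 0.3
theta_hat = rng.standard_normal(d)   # center of the l1-ball confidence set
x = rng.standard_normal(d)           # feature vector of an arm

# closed form: l1-ball maximization yields an infinity-norm bonus
ucb_closed = x @ theta_hat + r * np.abs(x).max()

# check against explicit maximization over the ball's vertices
vertices = [theta_hat + s * r * e
            for e in np.eye(d) for s in (+1, -1)]
ucb_vertices = max(x @ v for v in vertices)
assert np.isclose(ucb_closed, ucb_vertices)
```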
Historically, to bound the mean for small sample sizes, practitioners have had to choose between using methods with unrealistic assumptions about the unknown distribution (e.g., Gaussianity) and methods like Hoeffding's inequality that use weaker assumptions but produce much looser (wider) intervals. Anderson (1969) proposed a mean confidence interval strictly better than or equal to Hoeffding's, whose only assumption is that the distribution's support is contained in an interval $[a,b]$. For the first time since then, we present a new family of bounds that compares favorably to Anderson's. We prove that each bound in the family has {\em guaranteed coverage}, i.e., it holds with probability at least $1-\alpha$ for all distributions on an interval $[a,b]$. Furthermore, one of the bounds is tighter than or equal to Anderson's for all samples. In simulations, we show that for many distributions, the gain over Anderson's bound is substantial.
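For reference, the Hoeffding baseline that both Anderson's bound and the new family improve upon is straightforward to compute: the two-sided interval has half-width $(b-a)\sqrt{\ln(2/\alpha)/(2n)}$. A sketch with an illustrative Beta sample (Anderson's order-statistics-based bound is not reproduced here):

```python
import numpy as np

def hoeffding_ci(x, a, b, alpha=0.05):
    """Two-sided Hoeffding confidence interval for the mean of a
    distribution supported on [a, b]."""
    n = len(x)
    half = (b - a) * np.sqrt(np.log(2 / alpha) / (2 * n))
    m = np.mean(x)
    return max(a, m - half), min(b, m + half)

rng = np.random.default_rng(0)
x = rng.beta(2, 5, size=50)          # true mean 2/7, support [0, 1]
lo, hi = hoeffding_ci(x, 0.0, 1.0)
assert lo <= 2 / 7 <= hi             # covers the true mean here
print(f"95% Hoeffding CI: [{lo:.3f}, {hi:.3f}]")
```

Note how wide the interval is at $n=50$ relative to the spread of a Beta$(2,5)$: this looseness is exactly what distribution-aware bounds like Anderson's tighten.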
We consider the exploration-exploitation trade-off in reinforcement learning and we show that an agent imbued with a risk-seeking utility function is able to explore efficiently, as measured by regret. The parameter that controls how risk-seeking the agent is can be optimized exactly, or annealed according to a schedule. We call the resulting algorithm K-learning and show that the corresponding K-values are optimistic for the expected Q-values at each state-action pair. The K-values induce a natural Boltzmann exploration policy for which the `temperature' parameter is equal to the risk-seeking parameter. This policy achieves an expected regret bound of $\tilde O(L^{3/2} \sqrt{S A T})$, where $L$ is the time horizon, $S$ is the number of states, $A$ is the number of actions, and $T$ is the total number of elapsed time-steps. This bound is only a factor of $L$ larger than the established lower bound. K-learning can be interpreted as mirror descent in the policy space, and it is similar to other well-known methods in the literature, including Q-learning, soft-Q-learning, and maximum entropy policy gradient, and is closely related to optimism and count-based exploration methods. K-learning is simple to implement, as it only requires adding a bonus to the reward at each state-action pair and then solving a Bellman equation. We conclude with a numerical example demonstrating that K-learning is competitive with other state-of-the-art algorithms in practice.
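The "bonus plus Bellman equation" recipe can be sketched in tabular form with a log-sum-exp (soft) backup, from which a Boltzmann policy at the risk-seeking temperature follows. The MDP, bonus schedule, and temperature below are illustrative stand-ins, not the paper's exact choices:

```python
import numpy as np

def k_values(P, R, bonus, tau, H):
    """H-step log-sum-exp ('soft') Bellman backup with an added optimism
    bonus at every state-action pair."""
    K = np.zeros_like(R)
    for _ in range(H):
        V = tau * np.log(np.exp(K / tau).sum(axis=1))   # soft max over actions
        K = R + bonus + np.einsum('sat,t->sa', P, V)    # expected next value
    return K

# tiny 2-state, 2-action MDP (all numbers illustrative)
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])   # P[s, a, s'] transition probabilities
R = np.array([[1.0, 0.0], [0.0, 0.5]])     # rewards
bonus = 0.1 * np.ones((2, 2))              # stand-in optimism bonus
tau = 0.5                                  # temperature = risk-seeking parameter
K = k_values(P, R, bonus, tau, H=10)
# Boltzmann exploration policy induced by the K-values
policy = np.exp(K / tau) / np.exp(K / tau).sum(axis=1, keepdims=True)
```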