We consider the following task: suppose an algorithm is given copies of an unknown $n$-qubit quantum state $|\psi\rangle$ promised $(i)$ $|\psi\rangle$ is $\varepsilon_1$-close to a stabilizer state in fidelity or $(ii)$ $|\psi\rangle$ is $\varepsilon_2$-far from all stabilizer states, decide which is the case. We give a $\textsf{poly}(1/\varepsilon_1)$-sample and $n\cdot \textsf{poly}(1/\varepsilon_1)$-time algorithm for this task for every $\varepsilon_1>0$ and $\varepsilon_2\leq 2^{-\textsf{poly}(1/\varepsilon_1)}$. Our proof includes a new definition of Gowers norm for quantum states, an inverse theorem for the Gowers-$3$ norm of states and new bounds on stabilizer covering for structured subsets of Paulis using results in additive combinatorics.
Image Edge detection (ED) is a base task in computer vision. While the performance of the ED algorithm has been improved greatly by introducing CNN-based models, current models still suffer from unsatisfactory precision rates especially when only a low error toleration distance is allowed. Therefore, model architecture for more precise predictions still needs an investigation. On the other hand, the unavoidable noise training data provided by humans would lead to unsatisfactory model predictions even when inputs are edge maps themselves, which also needs a solution. In this paper, more precise ED models are presented with cascaded skipping density blocks (CSDB). Our models obtain state-of-the-art(SOTA) predictions in several datasets, especially in average precision rate (AP), over a high-standard benchmark, which is confirmed by extensive experiments. Also, a novel modification on data augmentation for training is employed, which allows noiseless data to be employed in model training for the first time, and thus further improves the model performance. The relative Python codes can be found on //github.com/Hao-B-Shu/SDPED.
We prove that the constructive and intuitionistic variants of the modal logic $\mathsf{KB}$ coincide. This result contrasts with a recent result by Das and Marin, who showed that the constructive and intuitionistic variants of $\mathsf{K}$ do not prove the same diamond-free formulas.
The Increasing Population Covariance Matrix Adaptation Evolution Strategy (IPOP-CMA-ES) algorithm is a reference stochastic optimizer dedicated to blackbox optimization, where no prior knowledge about the underlying problem structure is available. This paper aims at accelerating IPOP-CMA-ES thanks to high performance computing and parallelism when solving large optimization problems. We first show how BLAS and LAPACK routines can be introduced in linear algebra operations, and we then propose two strategies for deploying IPOP-CMA-ES efficiently on large-scale parallel architectures with thousands of CPU cores. The first parallel strategy processes the multiple searches in the same ordering as the sequential IPOP-CMA-ES, while the second one processes concurrently these multiple searches. These strategies are implemented in MPI+OpenMP and compared on 6144 cores of the supercomputer Fugaku. We manage to obtain substantial speedups (up to several thousand) and even super-linear ones, and we provide an in-depth analysis of our results to understand precisely the superior performance of our second strategy.
We introduce the concept of an imprecise Markov semigroup $\mathbf{Q}$. It is a tool that allows to represent ambiguity around both the initial and the transition probabilities of a Markov process via a compact collection of plausible Markov semigroups, each associated with a (different, plausible) Markov process. We use techniques from geometry, functional analysis, and (high dimensional) probability to study the ergodic behavior of $\mathbf{Q}$. We show that, if the initial distribution of the Markov processes associated with the elements of $\mathbf{Q}$ is known and invariant, under some conditions that also involve the geometry of the state space, eventually the ambiguity around their transition probability fades. We call this property ergodicity of the imprecise Markov semigroup, and we relate it to the classical notion of ergodicity. We prove ergodicity both when the state space is Euclidean or a Riemannian manifold, and when it is an arbitrary measurable space. The importance of our findings for the fields of machine learning and computer vision is also discussed.
This paper studies the extreme singular values of non-harmonic Fourier matrices. Such a matrix of size $m\times s$ can be written as $\Phi=[ e^{-2\pi i j x_k}]_{j=0,1,\dots,m-1, k=1,2,\dots,s}$ for some set $\mathcal{X}=\{x_k\}_{k=1}^s$. Its condition number controls the stability of inversion, which is of great importance to super-resolution and nonuniform Fourier transforms. Under the assumption $m\geq 6s$ and without any restrictions on $\mathcal{X}$, the main theorems provide explicit lower bounds for the smallest singular value $\sigma_s(\Phi)$ in terms of distances between elements in $\mathcal{X}$. More specifically, distances exceeding an appropriate scale $\tau$ have modest influence on $\sigma_s(\Phi)$, while the product of distances that are less than $\tau$ dominates the behavior of $\sigma_s(\Phi)$. These estimates reveal how the multiscale structure of $\mathcal{X}$ affects the condition number of Fourier matrices. Theoretical and numerical comparisons indicate that the main theorems significantly improve upon classical bounds and recover the same rate for special cases but with relaxed assumptions.
A singularly perturbed reaction-diffusion problem posed on the unit square in $\mathbb{R}^2$ is solved numerically by a local discontinuous Galerkin (LDG) finite element method. Typical solutions of this class of 2D problems exhibit boundary layers along the sides of the domain; these layers generally cause difficulties for numerical methods. Our LDG method handles the boundary layers by using a Shishkin mesh and also introducing the new concept of a ``layer-upwind flux" -- a discrete flux whose values are chosen on the fine mesh (which lies inside the boundary layers) in the direction where the layer weakens. On the coarse mesh, one can use a standard central flux. No penalty terms are needed with these fluxes, unlike many other variants of the LDG method. Our choice of discrete flux makes it feasible to derive an optimal-order error analysis in a balanced norm; this norm is stronger than the usual energy norm and is a more appropriate measure for errors in computed solutions for singularly perturbed reaction-diffusion problems. It will be proved that the LDG method is usually convergent of order $O((N^{-1}\ln N)^{k+1})$ in the balanced norm, where $N$ is the number of mesh intervals in each coordinate direction and tensor-product piecewise polynomials of degree~$k$ in each coordinate variable are used in the LDG method. This result is the first of its kind for the LDG method applied to this class of problem and is optimal for convergence on a Shishkin mesh. Its sharpness is confirmed by numerical experiments.
We consider the problem of counting 4-cycles ($C_4$) in an undirected graph $G$ of $n$ vertices and $m$ edges (in bipartite graphs, 4-cycles are also often referred to as $\textit{butterflies}$). Most recently, Wang et al. (2019, 2022) developed algorithms for this problem based on hash tables and sorting the graph by degree. Their algorithm takes $O(m\bar\delta)$ expected time and $O(m)$ space, where $\bar \delta \leq O(\sqrt{m})$ is the $\textit{average degeneracy}$ parameter introduced by Burkhardt, Faber \& Harris (2020). We develop a streamlined version of this algorithm requiring $O(m\bar\delta)$ time and precisely $n$ words of space. It has several practical improvements and optimizations; for example, it is fully deterministic, does not require any auxiliary storage or sorting of the input graph, and uses only addition and array access in its inner loops. Our algorithm is very simple and easily adapted to count 4-cycles incident to each vertex and edge. Empirical tests demonstrate that our array-based approach is $4\times$ -- $7\times$ faster on average compared to popular hash table implementations.
We propose a randomized lattice algorithm for approximating multivariate periodic functions over the $d$-dimensional unit cube from the weighted Korobov space with mixed smoothness $\alpha > 1/2$ and product weights $\gamma_1,\gamma_2,\ldots\in [0,1]$. Building upon the deterministic lattice algorithm by Kuo, Sloan, and Wo\'{z}niakowski (2006), we incorporate a randomized quadrature rule by Dick, Goda, and Suzuki (2022) to accelerate the convergence rate. This randomization involves drawing the number of points for function evaluations randomly, and selecting a good generating vector for rank-1 lattice points using the randomized component-by-component algorithm. We prove that our randomized algorithm achieves a worst-case root mean squared $L_2$-approximation error of order $M^{-\alpha/2 - 1/8 + \varepsilon}$ for an arbitrarily small $\varepsilon > 0$, where $M$ denotes the maximum number of function evaluations, and that the error bound is independent of the dimension $d$ if the weights satisfy $\sum_{j=1}^\infty \gamma_j^{1/\alpha} < \infty$. Our upper bound converges faster than a lower bound on the worst-case $L_2$-approximation error for deterministic rank-1 lattice-based approximation proved by Byrenheid, K\"{a}mmerer, Ullrich, and Volkmer (2017). We also show a lower error bound of order $M^{-\alpha/2-1/2}$ for our randomized algorithm, leaving a slight gap between the upper and lower bounds open for future research.
We present a new Krylov subspace recycling method for solving a linear system of equations, or a sequence of slowly changing linear systems. Our approach is to reduce the computational overhead of recycling techniques while still benefiting from the acceleration afforded by such techniques. As such, this method augments an unprojected Krylov subspace. Furthermore, it combines randomized sketching and deflated restarting in a way that avoids orthogononalizing a full Krylov basis. We call this new method GMRES-SDR (sketched deflated restarting). With this new method, we provide new theory, which initially characterizes unaugmented sketched GMRES as a projection method for which the projectors involve the sketching operator. We demonstrate that sketched GMRES and its sibling method sketched FOM are an MR/OR pairing, just like GMRES and FOM. We furthermore obtain residual convergence estimates. Building on this, we characterize GMRES-SDR also in terms of sketching-based projectors. Compression of the augmented Krylov subspace for recycling is performed using a sketched version of harmonic Ritz vectors. We present results of numerical experiments demonstrating the effectiveness of GMRES-SDR over competitor methods such as GMRES-DR and GCRO-DR.
Given a finite set of matrices with integer entries, the matrix mortality problem asks if there exists a product of these matrices equal to the zero matrix. We consider a special case of this problem where all entries of the matrices are nonnegative. This case is equivalent to the NFA mortality problem, which, given an NFA, asks for a word $w$ such that the image of every state under $w$ is the empty set. The size of the alphabet of the NFA is then equal to the number of matrices in the set. We study the length of shortest such words depending on the size of the alphabet. We show that for an NFA with $n$ states this length can be at least $2^n - 1$ for an alphabet of size $n$, $2^{(n - 4)/2}$ for an alphabet of size $3$ and $2^{(n - 2)/3}$ for an alphabet of size $2$. We also discuss further open problems related to mortality of NFAs and DFAs.