We study the formula complexity of the word problem $\mathsf{Word}_{S_n,k} : \{0,1\}^{kn^2} \to \{0,1\}$: given $n$-by-$n$ permutation matrices $M_1,\dots,M_k$, compute the $(1,1)$-entry of the matrix product $M_1\cdots M_k$. An important feature of this function is that it is invariant under the action of $S_n^{k-1}$ given by \[ (\pi_1,\dots,\pi_{k-1})(M_1,\dots,M_k) = (M_1\pi_1^{-1},\pi_1M_2\pi_2^{-1},\dots,\pi_{k-2}M_{k-1}\pi_{k-1}^{-1},\pi_{k-1}M_k). \] This symmetry is also exhibited in the smallest known unbounded fan-in $\{\mathsf{AND},\mathsf{OR},\mathsf{NOT}\}$-formulas for $\mathsf{Word}_{S_n,k}$, which have size $n^{O(\log k)}$. In this paper we prove a matching $n^{\Omega(\log k)}$ lower bound for $S_n^{k-1}$-invariant formulas computing $\mathsf{Word}_{S_n,k}$. This result is motivated by the fact that a similar lower bound for unrestricted (non-invariant) formulas would separate the complexity classes $\mathsf{NC}^1$ and $\mathsf{Logspace}$. Our more general main theorem gives a nearly tight $n^{d(k^{1/d}-1)}$ lower bound on the $G^{k-1}$-invariant depth-$d$ $\{\mathsf{MAJ},\mathsf{AND},\mathsf{OR},\mathsf{NOT}\}$-formula size of $\mathsf{Word}_{G,k}$ for any finite simple group $G$ whose minimum permutation representation has degree~$n$. We also give nearly tight lower bounds on the $G^{k-1}$-invariant depth-$d$ $\{\mathsf{AND},\mathsf{OR},\mathsf{NOT}\}$-formula size in the case where $G$ is an abelian group.
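To fix ideas, here is a minimal Python sketch of the function $\mathsf{Word}_{S_n,k}$ itself (our illustration, not from the paper; arrays are 0-indexed, so the $(1,1)$-entry is `P[0, 0]`):

```python
import numpy as np

def perm_matrix(perm):
    """The n-by-n 0/1 permutation matrix M with M[i, perm[i]] = 1."""
    n = len(perm)
    M = np.zeros((n, n), dtype=int)
    M[np.arange(n), perm] = 1
    return M

def word_problem(matrices):
    """Word_{S_n,k}: the (1,1)-entry of the product M_1 * ... * M_k."""
    P = matrices[0]
    for M in matrices[1:]:
        P = P @ M
    return int(P[0, 0])

# The S_n^{k-1}-invariance amounts to inserting pi * pi^{-1} between
# consecutive factors, which leaves the product, and hence the output,
# unchanged.
M1 = perm_matrix([0, 1, 2])    # identity on 3 points
M2 = perm_matrix([1, 0, 2])    # transposition of the first two points
print(word_problem([M1, M2]))  # -> 0
```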
In this paper, we consider a general notion of convolution. Let $D$ be a finite domain and let $D^n$ be the set of length-$n$ vectors (tuples) over $D$. Let $f : D \times D \to D$ be a function and let $\oplus_f$ denote the coordinate-wise application of $f$. The $f$-Convolution of two functions $g,h : D^n \to \{-M,\ldots,M\}$ is $$(g \otimes_f h)(\textbf{v}) := \sum_{\substack{\textbf{v}_g,\textbf{v}_h \in D^n\\ \text{s.t. } \textbf{v}_g \oplus_f \textbf{v}_h = \textbf{v}}} g(\textbf{v}_g) \cdot h(\textbf{v}_h)$$ for every $\textbf{v} \in D^n$. This problem generalizes many fundamental convolutions, such as Subset Convolution, XOR Product, Covering Product, and Packing Product. For an arbitrary function $f$ and domain $D$ we can compute the $f$-Convolution via brute-force enumeration in $\widetilde{O}(|D|^{2n}\mathrm{polylog}(M))$ time. Our main result is an improvement over this naive algorithm. We show that the $f$-Convolution can be computed exactly in $\widetilde{O}((c \cdot |D|^2)^{n}\mathrm{polylog}(M))$ time for the constant $c := 3/4$ when $D$ has even cardinality. Our main observation is that a \emph{cyclic partition} of a function $f : D \times D \to D$ can be used to speed up the computation of $f$-Convolution, and we show that an appropriate cyclic partition exists for every $f$. Furthermore, we demonstrate that a single entry of the $f$-Convolution can be computed more efficiently. In this variant, we are given two functions $g,h : D^n \to \{-M,\ldots,M\}$ along with a vector $\textbf{v} \in D^n$, and the task of the $f$-Query problem is to compute the integer $(g \otimes_f h)(\textbf{v})$. This is a generalization of the well-known Orthogonal Vectors problem. We show that $f$-Query can be computed in $\widetilde{O}(|D|^{\frac{\omega}{2} n}\mathrm{polylog}(M))$ time, where $\omega \in [2,2.372)$ is the exponent of the currently fastest matrix multiplication algorithm.
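For concreteness, a brute-force reference implementation of the $f$-Convolution matching the naive $\widetilde{O}(|D|^{2n})$ bound above (our sketch; the test functions `g` and `h` are hypothetical):

```python
from itertools import product

def f_convolution(f, g, h, D, n):
    """Brute-force f-Convolution: for each pair (v_g, v_h) in D^n x D^n,
    add g(v_g) * h(v_h) to the entry at v = v_g (+)_f v_h, the
    coordinate-wise application of f.  Uses O(|D|^{2n}) evaluations."""
    out = {v: 0 for v in product(D, repeat=n)}
    for vg in product(D, repeat=n):
        for vh in product(D, repeat=n):
            v = tuple(f(a, b) for a, b in zip(vg, vh))
            out[v] += g(vg) * h(vh)
    return out

# With D = {0, 1} and f = XOR this specializes to the XOR Product.
D = (0, 1)
g = lambda v: 1        # hypothetical test functions
h = lambda v: sum(v)
print(f_convolution(lambda a, b: a ^ b, g, h, D, n=2))
```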
Gaussian boson sampling, a computational model that is widely believed to admit quantum supremacy, has already been experimentally demonstrated to surpass the classical simulation capabilities of even the most powerful supercomputers today. However, whether the current approach, limited by photon loss and noise in such experiments, prescribes a scalable path to quantum advantage is an open question. For example, random circuit sampling with constant noise per gate was recently shown not to be a scalable approach to achieving quantum supremacy, although simulating intermediate-scale systems is still difficult. To understand the effect of photon loss on the scalability of Gaussian boson sampling, we use a tensor network algorithm with $U(1)$ symmetry to examine the asymptotic operator entanglement entropy scaling, which relates to the simulation complexity. We develop a custom-built algorithm that significantly reduces the computational time with state-of-the-art hardware accelerators, enabling simulations of much larger systems. With this capability, we observe, for Gaussian boson sampling, the crucial $N_\text{out}\propto\sqrt{N}$ scaling of the number of surviving photons in terms of the number of input photons, which marks the boundary between efficient and inefficient classical simulation. We further show theoretically that this behavior should be general for other input states.
Distributed graph coloring is one of the most extensively studied problems in distributed computing. There is a canonical family of distributed graph coloring algorithms known as the locally-iterative coloring algorithms, first formalized in the seminal work of [Szegedy and Vishwanathan, STOC'93]. In such algorithms, every vertex iteratively updates its own color according to a predetermined function of the current coloring of its local neighborhood. Due to the simplicity and naturalness of this framework, locally-iterative coloring algorithms are of great significance in both theory and practice. In this paper, we give a locally-iterative $(\Delta+1)$-coloring algorithm with $O(\Delta^{3/4}\log\Delta)+\log^*n$ running time. This is the first locally-iterative $(\Delta+1)$-coloring algorithm with sublinear-in-$\Delta$ running time, and it answers the main open question raised in a recent breakthrough [Barenboim, Elkin, and Goldberg, JACM'21]. A key component of our algorithm is a locally-iterative procedure that transforms an $O(\Delta^2)$-coloring into a $(\Delta+O(\Delta^{3/4}\log\Delta))$-coloring in $o(\Delta)$ time. Inside this procedure we work on special proper colorings that encode (arb)defective colorings, and we reduce the number of used colors quadratically in a locally-iterative fashion. As a main application of our result, we also give a self-stabilizing distributed algorithm for $(\Delta+1)$-coloring with $O(\Delta^{3/4}\log\Delta)+\log^*n$ stabilization time. To the best of our knowledge, this is the first self-stabilizing algorithm for $(\Delta+1)$-coloring with sublinear-in-$\Delta$ stabilization time.
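The paper's procedure is more involved, but the locally-iterative framework itself is easy to illustrate: below is a sketch of the classical color-reduction routine in which each vertex's new color is a fixed function of its own color and its neighborhood's colors (our illustration, not the $O(\Delta^{3/4}\log\Delta)$ algorithm):

```python
def reduce_colors(adj, color, num_colors, delta):
    """Classical locally-iterative color reduction: in rounds
    q = num_colors-1, ..., delta+1, every vertex of color q simultaneously
    recolors to the smallest color in {0, ..., delta} absent from its
    neighborhood.  Each color class is an independent set, so simultaneous
    updates stay proper; after num_colors - (delta+1) rounds we have a
    (delta+1)-coloring."""
    for q in range(num_colors - 1, delta, -1):
        new_color = dict(color)
        for v, c in color.items():
            if c == q:  # update depends only on v's color and its neighbors'
                used = {color[u] for u in adj[v]}
                new_color[v] = min(x for x in range(delta + 1) if x not in used)
        color = new_color
    return color

# Example: a 4-cycle (delta = 2) starting from a proper 4-coloring.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(reduce_colors(adj, {0: 0, 1: 1, 2: 2, 3: 3}, num_colors=4, delta=2))
```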
Solving high-dimensional partial differential equations is a recurrent challenge in economics, science and engineering. In recent years, a great number of computational approaches have been developed, most of them relying on a combination of Monte Carlo sampling and deep-learning-based approximation. For elliptic and parabolic problems, existing methods can broadly be classified into those resting on reformulations in terms of $\textit{backward stochastic differential equations}$ (BSDEs) and those aiming to minimize a regression-type $L^2$-error ($\textit{physics-informed neural networks}$, PINNs). In this paper, we review the literature and suggest a methodology based on the novel $\textit{diffusion loss}$ that interpolates between BSDEs and PINNs. Our contribution opens the door towards a unified understanding of numerical approaches for high-dimensional PDEs, as well as for implementations that combine the strengths of BSDEs and PINNs. The diffusion loss furthermore bears close similarities to $\textit{(least squares) temporal difference}$ objectives found in reinforcement learning. We also discuss eigenvalue problems and perform extensive numerical studies, including calculations of the ground state for nonlinear Schr\"odinger operators and committor functions relevant in molecular dynamics.
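To make the PINN endpoint of this interpolation concrete, here is a minimal PyTorch sketch of the regression-type $L^2$-residual objective for a toy 1D Poisson problem (our illustration of the plain PINN loss, not the paper's diffusion loss):

```python
import torch

# Toy problem: -u''(x) = f(x) on (0,1) with u(0) = u(1) = 0,
# where f(x) = pi^2 sin(pi x), so the exact solution is sin(pi x).
net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 1))
f = lambda x: torch.pi ** 2 * torch.sin(torch.pi * x)

def pinn_loss(n_pts=128):
    x = torch.rand(n_pts, 1, requires_grad=True)   # Monte Carlo collocation
    u = x * (1 - x) * net(x)                       # hard-wires the BCs
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    return ((-d2u - f(x)) ** 2).mean()             # L^2 residual estimate

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(2000):
    opt.zero_grad()
    loss = pinn_loss()
    loss.backward()
    opt.step()
```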
This paper presents a new strategy to deal with the excessive diffusion that standard finite volume methods for the compressible Euler equations display in the limit of low Mach number. The strategy can be understood as using centered discretizations for the acoustic part of the Euler equations and stabilizing them with a leap-frog-type ("sequential explicit") time integration, yielding a fully explicit method. This time integration takes inspiration from time-explicit staggered-grid numerical methods. In this way, advantages of staggered methods carry over to collocated methods. The paper provides a number of new collocated schemes for the linear acoustic/Maxwell equations that are inspired by the Yee scheme. They are then extended to an all-speed method for the full Euler equations on Cartesian grids. By taking the opposite view and drawing inspiration from collocated methods, the paper also suggests a new way of staggering the variables which increases stability as compared to the traditional Yee scheme.
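As a schematic of the idea, consider the 1D linear acoustic system $p_t + c\,u_x = 0$, $u_t + c\,p_x = 0$ on a collocated periodic grid: centered differences in space are stabilized by updating the two variables sequentially, with $u$ advanced using the already updated $p$ (our toy sketch, not one of the paper's schemes):

```python
import numpy as np

N, c = 200, 1.0
dx = 1.0 / N
dt = 0.5 * dx / c                     # CFL-type restriction
x = np.arange(N) * dx
p = np.exp(-200 * (x - 0.5) ** 2)     # initial pressure pulse
u = np.zeros(N)

def ddx(q):
    """Centered difference on a periodic grid (collocated variables)."""
    return (np.roll(q, -1) - np.roll(q, 1)) / (2 * dx)

for _ in range(400):
    p = p - dt * c * ddx(u)           # step 1: advance p with the old u
    u = u - dt * c * ddx(p)           # step 2: advance u with the *new* p

print(np.max(np.abs(p)))  # stays O(1): the sequential update is stable,
                          # while a simultaneous (forward Euler) update
                          # with centered differences would blow up
```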
We say that $\Gamma$, the boundary of a bounded Lipschitz domain, is locally dilation invariant if, at each $x\in \Gamma$, $\Gamma$ is either locally $C^1$ or locally coincides (in some coordinate system centred at $x$) with a Lipschitz graph $\Gamma_x$ such that $\Gamma_x=\alpha_x\Gamma_x$, for some $\alpha_x\in (0,1)$. In this paper we study, for such $\Gamma$, the essential spectrum of $D_\Gamma$, the double-layer (or Neumann-Poincar\'e) operator of potential theory, on $L^2(\Gamma)$. We show, via localisation and Floquet-Bloch-type arguments, that this essential spectrum is the union of the spectra of related continuous families of operators $K_t$, for $t\in [-\pi,\pi]$; moreover, each $K_t$ is compact if $\Gamma$ is $C^1$ except at finitely many points. For the 2D case where, additionally, $\Gamma$ is piecewise analytic, we construct convergent sequences of approximations to the essential spectrum of $D_\Gamma$; each approximation is the union of the eigenvalues of finitely many finite matrices arising from Nystr\"om-method approximations to the operators $K_t$. Through error estimates with explicit constants, we also construct functionals that determine whether any particular locally-dilation-invariant piecewise-analytic $\Gamma$ satisfies the well-known spectral radius conjecture, that the essential spectral radius of $D_\Gamma$ on $L^2(\Gamma)$ is $<1/2$ for all Lipschitz $\Gamma$. We illustrate this theory with examples; for each we show that the essential spectral radius is $<1/2$, providing additional support for the conjecture. We also show, via new results on the invariance of the essential spectral radius under locally-conformal $C^{1,\beta}$ diffeomorphisms, that the spectral radius conjecture holds for all Lipschitz curvilinear polyhedra.
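To indicate the flavour of the Nyström step, here is a generic sketch that discretizes an integral operator with a smooth periodic kernel by trapezoidal quadrature and reads off matrix eigenvalues (the kernel here is hypothetical; the paper's operators $K_t$ and its explicit error estimates are, of course, more delicate):

```python
import numpy as np

# Nystrom discretization of (K phi)(s) = \int_0^{2 pi} k(s, t) phi(t) dt:
# quadrature nodes x_j with weight w, and the matrix A[i, j] = w * k(x_i, x_j),
# whose eigenvalues approximate the spectrum of K.
m = 64
x = 2 * np.pi * np.arange(m) / m          # trapezoidal nodes (periodic)
w = 2 * np.pi / m                         # uniform quadrature weight
k = lambda s, t: np.exp(np.cos(s - t))    # hypothetical smooth kernel
A = w * k(x[:, None], x[None, :])
eigs = np.linalg.eigvals(A)
print(np.max(np.abs(eigs)))               # spectral radius estimate
```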
The hybrid high-order method is a modern numerical framework for the approximation of elliptic PDEs. We present here an extension of the hybrid high-order method to meshes possessing curved edges/faces. Such an extension allows us to enforce boundary conditions exactly on curved domains, and to capture curved geometries that appear internally in the domain, e.g.\ discontinuities in a diffusion coefficient. The method makes use of non-polynomial functions on the curved faces and does not require any mappings between reference elements/faces. Such an approach does not require the faces to be polynomial, and it has a strict upper bound on the number of degrees of freedom on a curved face for a given polynomial degree. Moreover, this approach of enriching the space of unknowns on the curved faces with non-polynomial functions should extend naturally to other polytopal methods. We show the method to be stable and consistent on curved meshes and derive optimal error estimates in $L^2$ and energy norms. We present numerical examples of the method on a domain with curved boundary, and for a diffusion problem in which the diffusion tensor is discontinuous along a curved arc.
A dynamic graph algorithm is a data structure that answers queries about a property of the current graph while supporting graph modifications such as edge insertions and deletions. Prior work has shown strong conditional lower bounds for general dynamic graphs, yet graph families that arise in practice often exhibit structural properties that the existing lower bound constructions do not possess. We study three specific graph families that are ubiquitous, namely constant-degree graphs, power-law graphs, and expander graphs, and give the first conditional lower bounds for them. Our results show that, even when restricting attention to one of these graph classes, no algorithm for fundamental graph problems such as distance computation (or approximation) or maximum matching can simultaneously achieve a sub-polynomial update time and query time. For example, we show that the same lower bounds as for general graphs hold for maximum matching and ($s,t$)-distance in constant-degree graphs, power-law graphs, or expanders. Namely, in an $m$-edge graph there exists no dynamic algorithm with both $O(m^{1/2 - \epsilon})$ update time and $O(m^{1 -\epsilon})$ query time, for any small $\epsilon > 0$. Note that for ($s,t$)-distance the trivial dynamic algorithm achieves an almost matching upper bound of constant update time and $O(m)$ query time. We prove similar bounds for the other graph families and for other fundamental problems such as densest subgraph detection and perfect matching.
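For reference, the trivial dynamic algorithm mentioned above is easy to state: maintain the edge set in $O(1)$ time per update and run a breadth-first search in $O(m)$ time per query (a minimal sketch):

```python
from collections import defaultdict, deque

class TrivialDynamicDistance:
    """The 'trivial' dynamic (s,t)-distance algorithm: O(1) update time
    (maintain adjacency sets) and O(m) query time (BFS from scratch)."""
    def __init__(self):
        self.adj = defaultdict(set)

    def insert(self, u, v):            # O(1) update
        self.adj[u].add(v); self.adj[v].add(u)

    def delete(self, u, v):            # O(1) update
        self.adj[u].discard(v); self.adj[v].discard(u)

    def distance(self, s, t):          # O(m) query via breadth-first search
        dist, queue = {s: 0}, deque([s])
        while queue:
            u = queue.popleft()
            if u == t:
                return dist[u]
            for v in self.adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        return float('inf')

D = TrivialDynamicDistance()
D.insert(0, 1); D.insert(1, 2)
print(D.distance(0, 2))  # -> 2
```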
The geometric optimisation of crystal structures is a procedure widely used in chemistry that changes the geometrical placement of the particles inside a structure. It is called structural relaxation and constitutes a local minimization problem with a non-convex objective function whose domain complexity increases along with the number of particles involved. In this work we study the performance of the two most popular first-order optimisation methods, Gradient Descent and Conjugate Gradient, in structural relaxation. The respective pseudocodes can be found in Section 6. Although frequently employed, these methods have received little study in this context from an algorithmic point of view. In order to accurately define the problem, we provide a thorough derivation of all necessary formulae related to the crystal structure energy function and the function's differentiation. We run each algorithm in combination with a constant step size, which provides a benchmark for the methods' analysis and direct comparison. We also design dynamic step size rules and study how these improve the two algorithms' performance. Our results show that there is a trade-off between convergence rate and the likelihood that an experiment succeeds, hence we construct a function to assign utility to each method based on our respective preference. The function is built according to a recently introduced model of preference indication concerning algorithms with deadlines and their run time. Finally, building on all our insights from the experimental results, we provide algorithmic recipes that best correspond to each of the presented preferences and select one recipe as optimal for equally weighted preferences.
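For orientation, a minimal sketch of the constant-step-size Gradient Descent benchmark (our schematic, not the pseudocode of the paper's Section 6; `grad_E` stands in for the derived gradient of the crystal energy):

```python
import numpy as np

def gradient_descent(grad_E, x0, step, max_iters=10**4, tol=1e-8):
    """Constant-step gradient descent on a configuration vector x
    (flattened particle coordinates); grad_E returns the gradient of
    the energy function at x."""
    x = x0.copy()
    for _ in range(max_iters):
        g = grad_E(x)
        if np.linalg.norm(g) < tol:    # (approximate) stationary point
            break
        x = x - step * g               # constant step size
    return x

# Toy example: quadratic energy E(x) = ||x||^2 / 2, with gradient x.
print(gradient_descent(lambda x: x, np.array([1.0, -2.0]), step=0.1))
```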
Classic algorithms and machine learning systems like neural networks are both abundant in everyday life. While classic computer science algorithms are suitable for precise execution of exactly defined tasks such as finding the shortest path in a large graph, neural networks allow learning from data to predict the most likely answer in more complex tasks such as image classification, which cannot be reduced to an exact algorithm. To get the best of both worlds, this thesis explores combining both concepts, leading to more robust, better performing, more interpretable, more computationally efficient, and more data-efficient architectures. The thesis formalizes the idea of algorithmic supervision, which allows a neural network to learn from or in conjunction with an algorithm. When integrating an algorithm into a neural architecture, it is important that the algorithm is differentiable, so that the architecture can be trained end-to-end and gradients can be propagated back through the algorithm in a meaningful way. To make algorithms differentiable, this thesis proposes a general method for continuously relaxing algorithms by perturbing variables and approximating the expectation value in closed form, i.e., without sampling. In addition, this thesis proposes differentiable algorithms, such as differentiable sorting networks, differentiable renderers, and differentiable logic gate networks. Finally, this thesis presents alternative training strategies for learning with algorithms.
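As a small taste of the differentiable-algorithms theme, here is a sketch of a sorting network with relaxed comparators, where each conditional swap is replaced by sigmoid-weighted soft min/max so that gradients can flow through the sort (our PyTorch illustration; the thesis's differentiable sorting networks use more refined relaxations):

```python
import torch

def soft_cswap(a, b, beta=10.0):
    """Differentiable conditional swap: sigmoid-weighted relaxations of
    min and max.  As beta -> infinity this recovers the hard comparator."""
    s = torch.sigmoid(beta * (b - a))                  # soft indicator of a <= b
    return s * a + (1 - s) * b, s * b + (1 - s) * a    # (soft min, soft max)

def soft_sort(x, beta=10.0):
    """Odd-even transposition sorting network with relaxed comparators."""
    x = list(x)
    n = len(x)
    for rnd in range(n):
        for i in range(rnd % 2, n - 1, 2):
            x[i], x[i + 1] = soft_cswap(x[i], x[i + 1], beta)
    return torch.stack(x)

x = torch.tensor([3.0, 1.0, 2.0], requires_grad=True)
y = soft_sort(x)                                  # approximately [1, 2, 3]
loss = ((y - torch.tensor([1.0, 2.0, 3.0])) ** 2).sum()
loss.backward()                                   # gradients flow through
print(y, x.grad)                                  # the relaxed 'sort'
```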