Probabilistic numerical solvers for ordinary differential equations (ODEs) treat the numerical simulation of dynamical systems as a problem of Bayesian state estimation. Aside from producing posterior distributions over ODE solutions and thereby quantifying the numerical approximation error of the method itself, one less-often noted advantage of this formalism is the algorithmic flexibility gained by casting numerical simulation in the framework of Bayesian filtering and smoothing. In this paper, we leverage this flexibility and build on the time-parallel formulation of iterated extended Kalman smoothers to derive a parallel-in-time probabilistic numerical ODE solver. Instead of simulating the dynamical system sequentially in time, as is done by current probabilistic solvers, the proposed method processes all time steps in parallel and thereby reduces the span cost from linear to logarithmic in the number of time steps. We demonstrate the effectiveness of our approach on a variety of ODEs and compare it to a range of both classic and probabilistic numerical ODE solvers.
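To make the parallel-in-time idea concrete, here is a minimal numpy sketch of the key structural fact: composing linear-Gaussian transition models is associative, so prefixes over time steps can be combined in a balanced tree of logarithmic depth. The names (`combine`, `tree_reduce`) and the reduction itself are illustrative assumptions of ours; the actual method is an iterated extended Kalman smoother, not this simplified composition.

```python
import numpy as np

def combine(e1, e2):
    """Compose two linear-Gaussian transitions x' = A x + b + w, w ~ N(0, Q).

    Applying e1 first and e2 second yields another transition of the same
    form; this composition is associative, which is what enables a
    parallel prefix scan over time steps.
    """
    A1, b1, Q1 = e1
    A2, b2, Q2 = e2
    return (A2 @ A1, A2 @ b1 + b2, A2 @ Q1 @ A2.T + Q2)

def tree_reduce(elems):
    """Combine transition elements in a balanced tree: O(log N) levels,
    each of which could be evaluated in parallel, versus O(N) dependent
    steps for sequential composition."""
    while len(elems) > 1:
        nxt = [combine(elems[i], elems[i + 1])
               for i in range(0, len(elems) - 1, 2)]
        if len(elems) % 2:
            nxt.append(elems[-1])  # carry the unpaired element over
        elems = nxt
    return elems[0]

# Toy usage: 8 Euler-discretized transitions of a 2-d linear ODE prior.
dt = 0.1
F = np.array([[0.0, 1.0], [-1.0, 0.0]])
elem = (np.eye(2) + dt * F, np.zeros(2), 1e-3 * np.eye(2))
A, b, Q = tree_reduce([elem] * 8)
```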
In recent years, theoretical, asymptotic worst-case bounds have raised strong expectations for the power of quantum computing on difficult optimization problems. Can we expect this to have consequences for Linear and Integer Programming when solving instances of practically relevant size, a fundamental goal of Mathematical Programming, Operations Research, and Algorithm Engineering? Answering this question faces a crucial impediment: the lack of sufficiently large quantum platforms prevents real-world tests for comparison with classical methods. In this paper, we present a quantum analog of classical runtime analysis for solving real-world instances of important optimization problems. To this end, we measure the expected practical performance of quantum computers by analyzing the expected gate complexity of a quantum algorithm. The lack of practical quantum platforms for experimental comparison is addressed by hybrid benchmarking, in which the algorithm is performed on a classical system while logging the expected cost of the various subroutines that would be employed by the quantum version. In particular, we provide an analysis of quantum methods for Linear Programming, for which recent work has obtained an asymptotic speedup through quantum subroutines for the Simplex method. We show that a practical quantum advantage for realistic problem sizes would require quantum gate operation times considerably below current physical limitations.
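A minimal sketch of the hybrid-benchmarking idea, under an assumed gate-cost model of our own: run the classical algorithm as usual, but wrap each subroutine that would be replaced by a quantum primitive in a logger that accumulates its estimated gate count. All names here (`GateLedger`, `log_gates`, the square-root cost formula) are illustrative, not the paper's actual cost model.

```python
import functools
import math

class GateLedger:
    """Accumulates estimated quantum gate counts per subroutine."""
    def __init__(self):
        self.counts = {}

    def add(self, name, gates):
        self.counts[name] = self.counts.get(name, 0) + gates

def log_gates(ledger, name, cost_fn):
    """Decorator: execute the classical subroutine, while logging the
    gate count its quantum counterpart would need under the assumed
    cost model."""
    def deco(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            ledger.add(name, cost_fn(*args, **kwargs))
            return fn(*args, **kwargs)
        return wrapper
    return deco

ledger = GateLedger()

# Hypothetical cost model: a quantum minimum-finding subroutine over n
# entries costing ~ sqrt(n) * log(n) gates (constants set to 1 here).
@log_gates(ledger, "qmin",
           lambda xs: int(math.sqrt(len(xs)) * math.log2(len(xs) + 1)))
def find_min(xs):
    return min(xs)  # classical stand-in for the quantum subroutine

find_min([5, 3, 8, 1])
find_min(list(range(1000)))
print(ledger.counts)  # accumulated gate estimates per subroutine
```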
We give near-optimal algorithms for computing an ellipsoidal rounding of a convex polytope whose vertices are given in a stream. The approximation factor is linear in the dimension (as in John's theorem) and loses only an additional logarithmic factor in the aspect ratio of the polytope. Our algorithms are nearly optimal in two senses: first, their runtimes nearly match those of the most efficient known algorithms for the offline version of the problem; second, their approximation factors nearly match a lower bound we show against a natural class of geometric streaming algorithms. In contrast to existing work in the streaming setting, which computes ellipsoidal roundings only for centrally symmetric convex polytopes, our algorithms apply to general convex polytopes. We also show how to use our algorithms to construct coresets from a stream of points that approximately preserve both the ellipsoidal rounding and the convex hull of the original point set.
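As a point of reference, the sketch below implements a naive streaming baseline under assumptions of our own, not the paper's near-optimal algorithm: keep an incoming vertex only if it escapes the current ellipsoid inflated by a slack factor, and refit a minimum-volume enclosing ellipsoid (Khachiyan's algorithm) on the kept points, which then serve as a crude coreset.

```python
import numpy as np

def mvee(P, tol=1e-3):
    """Khachiyan's algorithm for the minimum-volume enclosing ellipsoid
    {x : (x - c)^T A (x - c) <= 1} of the rows of P."""
    n, d = P.shape
    Q = np.column_stack([P, np.ones(n)]).T        # lift to (d+1) x n
    u = np.full(n, 1.0 / n)
    while True:
        X = Q @ np.diag(u) @ Q.T
        M = np.einsum('ji,jk,ki->i', Q, np.linalg.inv(X), Q)
        j = int(np.argmax(M))
        step = (M[j] - d - 1.0) / ((d + 1.0) * (M[j] - 1.0))
        u_new = (1.0 - step) * u
        u_new[j] += step
        done = np.linalg.norm(u_new - u) < tol
        u = u_new
        if done:
            break
    c = P.T @ u
    A = np.linalg.inv(P.T @ np.diag(u) @ P - np.outer(c, c)) / d
    return A, c

def stream_coreset(points, slack=1.5):
    """Naive streaming filter: keep a point only if it lies outside the
    current ellipsoid inflated by `slack`, then refit on the kept points.
    A baseline sketch only; worst-case cost and quality are poor."""
    points = iter(points)
    kept = []
    for x in points:                   # buffer until a fit is possible
        kept.append(np.asarray(x, float))
        if len(kept) > len(kept[0]):
            break
    A, c = mvee(np.array(kept))
    for x in points:
        x = np.asarray(x, float)
        if (x - c) @ A @ (x - c) > slack ** 2:
            kept.append(x)
            A, c = mvee(np.array(kept))
        # points inside the inflated ellipsoid are discarded
    return np.array(kept), (A, c)

# Toy usage: 500 streamed points drawn from a stretched Gaussian.
rng = np.random.default_rng(0)
pts = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0], [0.0, 0.3]])
core, (A, c) = stream_coreset(pts)
print(len(core), "of", len(pts), "points kept")
```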
We propose an efficient semi-Lagrangian characteristic mapping method for solving the (1+1)-dimensional Vlasov-Poisson equations with high precision on a coarse grid. The flow map is evolved numerically, and exponential resolution in linear time is obtained. Global third-order convergence in space and time is shown, and conservation properties are assessed. For benchmarking, we consider linear and nonlinear Landau damping and the two-stream instability, and we compare the results with a Fourier pseudo-spectral method. The method's extreme fine-scale resolution is illustrated, demonstrating its capability to efficiently treat filamentation in fusion plasma simulations.
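The core building block of any semi-Lagrangian scheme can be shown in a few lines: trace the characteristic through each grid point backward in time and interpolate the old solution at the departure point. The toy below does this for constant 1-d advection on a periodic grid; it is a sketch of the general principle only, not the paper's characteristic mapping method (no flow-map composition, no Poisson coupling).

```python
import numpy as np

def semi_lagrangian_step(f, v, dt, x, L):
    """One semi-Lagrangian step for the advection equation f_t + v f_x = 0
    on a periodic domain of length L: follow the characteristic through
    each grid point backward in time and interpolate the old solution
    at the departure point."""
    x_dep = x - v * dt                     # departure points
    return np.interp(x_dep, x, f, period=L)

# Toy usage: advect a Gaussian bump once around the periodic domain.
L = 2.0 * np.pi
x = np.linspace(0.0, L, 128, endpoint=False)
f = np.exp(-10.0 * (x - np.pi) ** 2)
for _ in range(200):
    f = semi_lagrangian_step(f, v=1.0, dt=L / 200.0, x=x, L=L)
```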
The recently introduced graphical continuous Lyapunov models provide a new approach to statistical modeling of correlated multivariate data. The models view each observation as a one-time cross-sectional snapshot of a multivariate dynamic process in equilibrium. The covariance matrix for the data is obtained by solving a continuous Lyapunov equation that is parametrized by the drift matrix of the dynamic process. In this context, different statistical models postulate different sparsity patterns in the drift matrix, and it becomes a crucial problem to clarify whether a given sparsity assumption allows one to uniquely recover the drift matrix parameters from the covariance matrix of the data. We study this identifiability problem by representing sparsity patterns by directed graphs. Our main result proves that the drift matrix is globally identifiable if and only if the graph for the sparsity pattern is simple (i.e., does not contain directed two-cycles). Moreover, we present a necessary condition for generic identifiability and provide a computational classification of small graphs with up to 5 nodes.
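Both the model's covariance map and the simplicity criterion are easy to state in code. The sketch below, with variable names of our choosing, uses scipy.linalg.solve_continuous_lyapunov to map a drift matrix M (with noise matrix C) to the equilibrium covariance solving M S + S M^T + C = 0, and checks whether a sparsity pattern, viewed as a directed graph, is simple (contains no directed two-cycle).

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def equilibrium_cov(M, C):
    """Equilibrium covariance of the process with drift M and noise C:
    the solution S of M @ S + S @ M.T + C = 0."""
    return solve_continuous_lyapunov(M, -C)

def is_simple(pattern):
    """A sparsity pattern (boolean adjacency matrix) is simple iff it
    contains no directed two-cycle i -> j -> i; diagonal entries are
    irrelevant for this check."""
    P = np.asarray(pattern, bool).copy()
    np.fill_diagonal(P, False)
    return not np.any(P & P.T)

# Toy usage: a stable drift matrix with a simple sparsity pattern.
M = np.array([[-1.0, 0.0, 0.5],
              [0.3, -1.0, 0.0],
              [0.0, 0.0, -1.0]])
S = equilibrium_cov(M, C=np.eye(3))
print(np.allclose(M @ S + S @ M.T + np.eye(3), 0))  # True
print(is_simple(M != 0))                            # True: no two-cycles
```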
We consider the problem of variational Bayesian inference in a latent variable model where a (possibly complex) observed stochastic process is governed by the solution of a latent stochastic differential equation (SDE). Motivated by the challenges, such as efficient gradient computation, that arise when trying to learn an (almost arbitrary) latent neural SDE from large-scale data, we take a step back and study a specific subclass instead. In our case, the SDE evolves on a homogeneous latent space and is induced by stochastic dynamics of the corresponding (matrix) Lie group. In learning problems, SDEs on the unit $n$-sphere are arguably the most relevant incarnation of this setup. Notably, for variational inference, the sphere not only facilitates using a truly uninformative prior SDE, but also yields a particularly simple and intuitive expression for the Kullback-Leibler divergence between the approximate posterior and the prior process in the evidence lower bound. Experiments demonstrate that a latent SDE of the proposed type can be learned efficiently by means of an existing one-step geometric Euler-Maruyama scheme. Despite restricting ourselves to a less diverse class of SDEs, we achieve competitive or even state-of-the-art performance on various time series interpolation and classification benchmarks.
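For intuition, here is a minimal sketch of a one-step geometric scheme on the unit sphere: take an Euler-Maruyama step in the tangent space at the current point, then retract back onto the sphere by normalization. This projection/retraction variant is a common geometric integrator and is our simplification; the paper's exact Lie-group scheme may differ.

```python
import numpy as np

def tangent_projection(x, v):
    """Project v onto the tangent space of the unit sphere at x."""
    return v - np.dot(x, v) * x

def geometric_em_step(x, drift, sigma, dt, rng):
    """One step for an SDE on the unit n-sphere: Euler-Maruyama in the
    tangent space at x, followed by retraction (normalization) back
    onto the sphere."""
    noise = sigma * np.sqrt(dt) * rng.standard_normal(x.shape)
    step = tangent_projection(x, drift(x) * dt + noise)
    y = x + step
    return y / np.linalg.norm(y)

# Toy usage: diffuse on S^2 with a drift pulling toward the north pole.
rng = np.random.default_rng(0)
north = np.array([0.0, 0.0, 1.0])
drift = lambda x: tangent_projection(x, north - x)
x = np.array([1.0, 0.0, 0.0])
for _ in range(1000):
    x = geometric_em_step(x, drift, sigma=0.3, dt=1e-2, rng=rng)
print(np.linalg.norm(x))  # the iterate stays on the sphere: 1.0
```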
Spectral bounds are a powerful tool for estimating the minimum distances of quasi-cyclic codes. They generalize the defining-set bounds of cyclic codes to quasi-cyclic codes. Based on the eigenvalues of quasi-cyclic codes and the corresponding eigenspaces, we provide an improved spectral bound for quasi-cyclic codes. Numerical results verify that the improved bound outperforms the Jensen bound in almost all cases. Building on the improved bound, we propose a general construction of quasi-cyclic codes with excellent designed minimum distances. For the quasi-cyclic codes produced by this general construction, the improved spectral bound is always sharper than the Jensen bound.
Probabilistic couplings are the foundation for many probabilistic relational program logics and arise when relating random sampling statements across two programs. In relational program logics, this manifests as dedicated coupling rules that, e.g., say we may reason as if two sampling statements return the same value. However, this approach fundamentally requires aligning or "synchronizing" the sampling statements of the two programs, which is not always possible. In this paper, we develop Clutch, a higher-order probabilistic relational separation logic that addresses this issue by supporting asynchronous probabilistic couplings. We use Clutch to develop a logical step-indexed logical relation for reasoning about contextual refinement and equivalence of higher-order programs written in a rich language with higher-order local state and impredicative polymorphism. Finally, we demonstrate the usefulness of our approach on a number of case studies. All the results that appear in the paper have been formalized in the Coq proof assistant using the Coquelicot library and the Iris separation logic framework.
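The kind of equivalence that asynchronous couplings target can be conveyed by a classic example: an eagerly sampled random bit versus a lazily sampled, memoized one. A context interacts with the two thunks at different times, so the two sampling statements can never be syntactically aligned. The Python analogue below only conveys the shape of the example; the logic itself targets a higher-order language with local state.

```python
import random

def eager():
    """Flip the coin now; return a thunk that always yields that flip."""
    b = random.randint(0, 1)
    return lambda: b

def lazy():
    """Return a thunk that flips the coin on first call and memoizes it
    in local state; later calls return the stored flip."""
    cell = []                     # local state: empty until first call
    def thunk():
        if not cell:
            cell.append(random.randint(0, 1))
        return cell[0]
    return thunk

# Both thunks are observationally equivalent: every call returns the
# same uniformly distributed bit. Proving this requires coupling two
# sampling statements that execute at different times (asynchronously).
f, g = eager(), lazy()
assert f() == f() and g() == g()
```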
Geometric deep learning (GDL), which is based on neural network architectures that incorporate and process symmetry information, has recently emerged as a paradigm in artificial intelligence. GDL bears particular promise in molecular modeling applications, in which various molecular representations with different symmetry properties and levels of abstraction exist. This review provides a structured and harmonized overview of molecular GDL, highlighting its applications in drug discovery, chemical synthesis prediction, and quantum chemistry. Emphasis is placed on the relevance of the learned molecular features and their complementarity to well-established molecular descriptors. The review concludes with an overview of current challenges and opportunities and a forecast of the future of GDL for the molecular sciences.
Humans perceive the world by concurrently processing and fusing high-dimensional inputs from multiple modalities such as vision and audio. Machine perception models, in stark contrast, are typically modality-specific and optimised for unimodal benchmarks; hence, late-stage fusion of final representations or predictions from each modality (`late fusion') is still the dominant paradigm for multimodal video classification. Instead, we introduce a novel transformer-based architecture that uses `fusion bottlenecks' for modality fusion at multiple layers. Compared to traditional pairwise self-attention, our model forces information between different modalities to pass through a small number of bottleneck latents, requiring the model to collate and condense the most relevant information in each modality and share only what is necessary. We find that such a strategy improves fusion performance while at the same time reducing computational cost. We conduct thorough ablation studies and achieve state-of-the-art results on multiple audio-visual classification benchmarks, including Audioset, Epic-Kitchens, and VGGSound. All code and models will be released.
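A single fusion layer of this kind can be sketched in plain numpy: each modality's tokens attend only over themselves plus a few shared bottleneck tokens, and the bottleneck update is averaged across modalities, so all cross-modal information must squeeze through it. The shapes, the absence of learned projections, and the averaging rule are our simplifications of the architecture.

```python
import numpy as np

def attention(q, kv):
    """Single-head scaled dot-product attention (no learned projections)."""
    scores = q @ kv.T / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ kv

def bottleneck_fusion_layer(audio, video, bottleneck):
    """One fusion layer: each modality attends over [own tokens;
    bottleneck], never directly over the other modality. Averaging the
    per-modality bottleneck updates makes the bottleneck the only
    channel through which cross-modal information can flow."""
    new_audio = attention(audio, np.vstack([audio, bottleneck]))
    new_video = attention(video, np.vstack([video, bottleneck]))
    bn_a = attention(bottleneck, np.vstack([audio, bottleneck]))
    bn_v = attention(bottleneck, np.vstack([video, bottleneck]))
    return new_audio, new_video, (bn_a + bn_v) / 2.0

# Toy usage: 16 audio tokens, 32 video tokens, 4 bottleneck latents.
rng = np.random.default_rng(0)
a = rng.normal(size=(16, 64))
v = rng.normal(size=(32, 64))
b = rng.normal(size=(4, 64))
for _ in range(3):                      # stack a few fusion layers
    a, v, b = bottleneck_fusion_layer(a, v, b)
```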
We introduce a generic framework that reduces the computational cost of object detection while retaining accuracy for scenarios where objects of varied sizes appear in high-resolution images. Detection progresses in a coarse-to-fine manner, first on a down-sampled version of the image and then on a sequence of higher-resolution regions identified as likely to improve detection accuracy. Built upon reinforcement learning, our approach consists of a model (R-net) that uses coarse detection results to predict the potential accuracy gain of analyzing a region at a higher resolution, and another model (Q-net) that sequentially selects regions to zoom in on. Experiments on the Caltech Pedestrians dataset show that our approach reduces the number of processed pixels by over 50% without a drop in detection accuracy. The merits of our approach become more significant on a high-resolution test set collected from the YFCC100M dataset, where our approach maintains high detection performance while reducing the number of processed pixels by about 70% and the detection time by over 50%.
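The control loop of such a coarse-to-fine detector can be sketched as follows: run the coarse detector on a downsampled image, ask an accuracy-gain model (standing in for R-net) where zooming would help, and let a selection policy (standing in for Q-net) pick regions until a budget is exhausted. Every callable below is an illustrative placeholder, not the paper's trained networks.

```python
def coarse_to_fine_detect(image, budget, coarse_detector, fine_detector,
                          gain_model, policy, downsample, crop):
    """Sketch of the coarse-to-fine detection loop. `gain_model` plays
    the role of R-net and `policy` the role of Q-net; all callables are
    placeholders supplied by the caller."""
    detections = coarse_detector(downsample(image))   # cheap pass first
    spent = 0
    while spent < budget:
        gains = gain_model(image, detections)  # R-net: gain per candidate region
        region, cost = policy(gains)           # Q-net: next zoom region + its cost
        if region is None or spent + cost > budget:
            break                              # zooming no longer pays off
        detections += fine_detector(crop(image, region))
        spent += cost
    return detections

# Toy wiring with trivial stand-ins (lists of boxes as "detections"):
dets = coarse_to_fine_detect(
    image="img", budget=2,
    coarse_detector=lambda im: [("coarse_box", 0.6)],
    fine_detector=lambda im: [("fine_box", 0.9)],
    gain_model=lambda im, d: [((0, 0, 64, 64), 0.3)],
    policy=lambda gains: (gains[0][0], 1) if gains else (None, 0),
    downsample=lambda im: im, crop=lambda im, r: im)
```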