Combinatorial optimization - a field of research addressing problems that feature strongly in a wealth of scientific and industrial contexts - has been identified as one of the core potential fields of applicability of quantum computers. It is still unclear, however, to what extent quantum algorithms can actually outperform classical algorithms for this type of problems. In this work, by resorting to computational learning theory and cryptographic notions, we prove that quantum computers feature an in-principle super-polynomial advantage over classical computers in approximating solutions to combinatorial optimization problems. Specifically, building on seminal work by Kearns and Valiant and introducing a new reduction, we identify special types of problems that are hard for classical computers to approximate up to polynomial factors. At the same time, we give a quantum algorithm that can efficiently approximate the optimal solution within a polynomial factor. The core of the quantum advantage discovered in this work is ultimately borrowed from Shor's quantum algorithm for factoring. Concretely, we prove a super-polynomial advantage for approximating special instances of the so-called integer programming problem. In doing so, we provide an explicit end-to-end construction for advantage bearing instances. This result shows that quantum devices have, in principle, the power to approximate combinatorial optimization solutions beyond the reach of classical efficient algorithms. Our results also give clear guidance on how to construct such advantage-bearing problem instances.
Conformal inference is a fundamental and versatile tool that provides distribution-free guarantees for many machine learning tasks. We consider the transductive setting, where decisions are made on a test sample of $m$ new points, giving rise to $m$ conformal $p$-values. {While classical results only concern their marginal distribution, we show that their joint distribution follows a P\'olya urn model, and establish a concentration inequality for their empirical distribution function.} The results hold for arbitrary exchangeable scores, including {\it adaptive} ones that can use the covariates of the test+calibration samples at training stage for increased accuracy. We demonstrate the usefulness of these theoretical results through uniform, in-probability guarantees for two machine learning tasks of current interest: interval prediction for transductive transfer learning and novelty detection based on two-class classification.
The fundamental computational issues in Bayesian inverse problems (BIPs) governed by partial differential equations (PDEs) stem from the requirement of repeated forward model evaluations. A popular strategy to reduce such cost is to replace expensive model simulations by computationally efficient approximations using operator learning, motivated by recent progresses in deep learning. However, using the approximated model directly may introduce a modeling error, exacerbating the already ill-posedness of inverse problems. Thus, balancing between accuracy and efficiency is essential for the effective implementation of such approaches. To this end, we develop an adaptive operator learning framework that can reduce modeling error gradually by forcing the surrogate to be accurate in local areas. This is accomplished by fine-tuning the pre-trained approximate model during the inversion process with adaptive points selected by a greedy algorithm, which requires only a few forward model evaluations. To validate our approach, we adopt DeepOnet to construct the surrogate and use unscented Kalman inversion (UKI) to approximate the solution of BIPs, respectively. Furthermore, we present rigorous convergence guarantee in the linear case using the framework of UKI. We test the approach on several benchmarks, including the Darcy flow, the heat source inversion problem, and the reaction diffusion problems. Numerical results demonstrate that our method can significantly reduce computational costs while maintaining inversion accuracy.
For problems of time-harmonic scattering by rational polygonal obstacles, embedding formulae express the far-field pattern induced by any incident plane wave in terms of the far-field patterns for a relatively small (frequency-independent) set of canonical incident angles. Although these remarkable formulae are exact in theory, here we demonstrate that: (i) they are highly sensitive to numerical errors in practice, and; (ii) direct calculation of the coefficients in these formulae may be impossible for particular sets of canonical incident angles, even in exact arithmetic. Only by overcoming these practical issues can embedding formulae provide a highly efficient approach to computing the far-field pattern induced by a large number of incident angles. Here we propose solutions for problems (i) and (ii), backed up by theory and numerical experiments. Problem (i) is solved using techniques from computational complex analysis: we reformulate the embedding formula as a complex contour integral and prove that this is much less sensitive to numerical errors. In practice, this contour integral can be efficiently evaluated by residue calculus. Problem (ii) is addressed using techniques from numerical linear algebra: we oversample, considering more canonical incident angles than are necessary, thus expanding the space of valid coefficients vectors. The coefficients vectors can then be selected using either a least squares approach or column subset selection.
Generative diffusion models have achieved spectacular performance in many areas of generative modeling. While the fundamental ideas behind these models come from non-equilibrium physics, in this paper we show that many aspects of these models can be understood using the tools of equilibrium statistical mechanics. Using this reformulation, we show that generative diffusion models undergo second-order phase transitions corresponding to symmetry breaking phenomena. We argue that this lead to a form of instability that lies at the heart of their generative capabilities and that can be described by a set of mean field critical exponents. We conclude by analyzing recent work connecting diffusion models and associative memory networks in view of the thermodynamic formulations.
An increasingly common viewpoint is that protein dynamics data sets reside in a non-linear subspace of low conformational energy. Ideal data analysis tools for such data sets should therefore account for such non-linear geometry. The Riemannian geometry setting can be suitable for a variety of reasons. First, it comes with a rich structure to account for a wide range of geometries that can be modelled after an energy landscape. Second, many standard data analysis tools initially developed for data in Euclidean space can also be generalised to data on a Riemannian manifold. In the context of protein dynamics, a conceptual challenge comes from the lack of a suitable smooth manifold and the lack of guidelines for constructing a smooth Riemannian structure based on an energy landscape. In addition, computational feasibility in computing geodesics and related mappings poses a major challenge. This work considers these challenges. The first part of the paper develops a novel local approximation technique for computing geodesics and related mappings on Riemannian manifolds in a computationally feasible manner. The second part constructs a smooth manifold of point clouds modulo rigid body group actions and a Riemannian structure that is based on an energy landscape for protein conformations. The resulting Riemannian geometry is tested on several data analysis tasks relevant for protein dynamics data. It performs exceptionally well on coarse-grained molecular dynamics simulated data. In particular, the geodesics with given start- and end-points approximately recover corresponding molecular dynamics trajectories for proteins that undergo relatively ordered transitions with medium sized deformations. The Riemannian protein geometry also gives physically realistic summary statistics and retrieves the underlying dimension even for large-sized deformations within seconds on a laptop.
The evaluation of clustering algorithms can involve running them on a variety of benchmark problems, and comparing their outputs to the reference, ground-truth groupings provided by experts. Unfortunately, many research papers and graduate theses consider only a small number of datasets. Also, the fact that there can be many equally valid ways to cluster a given problem set is rarely taken into account. In order to overcome these limitations, we have developed a framework whose aim is to introduce a consistent methodology for testing clustering algorithms. Furthermore, we have aggregated, polished, and standardised many clustering benchmark dataset collections referred to across the machine learning and data mining literature, and included new datasets of different dimensionalities, sizes, and cluster types. An interactive datasets explorer, the documentation of the Python API, a description of the ways to interact with the framework from other programming languages such as R or MATLAB, and other details are all provided at <//clustering-benchmarks.gagolewski.com>.
We propose a method to detect model misspecifications in nonlinear causal additive and potentially heteroscedastic noise models. We aim to identify predictor variables for which we can infer the causal effect even in cases of such misspecification. We develop a general framework based on knowledge of the multivariate observational data distribution and we then propose an algorithm for finite sample data, discuss its asymptotic properties, and illustrate its performance on simulated and real data.
We develop a novel discontinuous Galerkin method for solving the rotating thermal shallow water equations (TRSW) on a curvilinear mesh. Our method is provably entropy stable, conserves mass, buoyancy and vorticity, while also semi-discretely conserving energy. This is achieved by using novel numerical fluxes and splitting the pressure and convection operators. We implement our method on a cubed sphere mesh and numerically verify our theoretical results. Our experiments demonstrate the robustness of the method for a regime of well developed turbulence, where it can be run stably without any dissipation. The entropy stable fluxes are sufficient to control the grid scale noise generated by geostrophic turbulence, eliminating the need for artificial stabilization.
Classical inequality curves and inequality measures are defined for distributions with finite mean value. Moreover, their empirical counterparts are not resistant to outliers. For these reasons, quantile versions of known inequality curves such as the Lorenz, Bonferroni, Zenga and $D$ curves, and quantile versions of inequality measures such as the Gini, Bonferroni, Zenga and $D$ indices have been proposed in the literature. We propose various nonparametric estimators of quantile versions of inequality curves and inequality measures, prove their consistency, and compare their accuracy in a~simulation study. We also give examples of the use of quantile versions of inequality measures in real data analysis.
We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.