The dielectric response function and its inverse are crucial physical quantities in materials science. We propose an accurate and efficient strategy to invert the dielectric function matrix. The GW approximation, a powerful approach to accurately describe many-body excited states, is taken as an application to demonstrate accuracy and efficiency. We incorporate the interpolative separable density fitting (ISDF) algorithm with Sherman--Morrison--Woodbury (SMW) formula to accelerate the inversion process by exploiting low-rank properties of dielectric function in plane-wave GW calculations. Our ISDF--SMW strategy produces accurate quasiparticle energies with $O(N_{\mathrm{r}}N_{\mathrm{e}}^2)$ computational cost $(N_{\mathrm{e}}$ is the number of electrons and $N_{\mathrm{r}}=100$--$1000N_{\mathrm{e}}$ is the number of grid points) with negligible small error of $0.03$~eV for both complex molecules and solids. This new strategy for inverting the dielectric matrix can be \(50\times\) faster than the current state-of-the-art implementation in BerkeleyGW, resulting in two orders of magnitude speedup for total GW calculations.
Overdamped Langevin dynamics are reversible stochastic differential equations which are commonly used to sample probability measures in high-dimensional spaces, such as the ones appearing in computational statistical physics and Bayesian inference. By varying the diffusion coefficient, there are in fact infinitely many overdamped Langevin dynamics which are reversible with respect to the target probability measure at hand. This suggests to optimize the diffusion coefficient in order to increase the convergence rate of the dynamics, as measured by the spectral gap of the generator associated with the stochastic differential equation. We analytically study this problem here, obtaining in particular necessary conditions on the optimal diffusion coefficient. We also derive an explicit expression of the optimal diffusion in some appropriate homogenized limit. Numerical results, both relying on discretizations of the spectral gap problem and Monte Carlo simulations of the stochastic dynamics, demonstrate the increased quality of the sampling arising from an appropriate choice of the diffusion coefficient.
Capturing the extremal behaviour of data often requires bespoke marginal and dependence models which are grounded in rigorous asymptotic theory, and hence provide reliable extrapolation into the upper tails of the data-generating distribution. We present a toolbox of four methodological frameworks, motivated by modern extreme value theory, that can be used to accurately estimate extreme exceedance probabilities or the corresponding level in either a univariate or multivariate setting. Our frameworks were used to facilitate the winning contribution of Team Yalla to the EVA (2023) Conference Data Challenge, which was organised for the 13$^\text{th}$ International Conference on Extreme Value Analysis. This competition comprised seven teams competing across four separate sub-challenges, with each requiring the modelling of data simulated from known, yet highly complex, statistical distributions, and extrapolation far beyond the range of the available samples in order to predict probabilities of extreme events. Data were constructed to be representative of real environmental data, sampled from the fantasy country of "Utopia"
The article introduces a method to learn dynamical systems that are governed by Euler--Lagrange equations from data. The method is based on Gaussian process regression and identifies continuous or discrete Lagrangians and is, therefore, structure preserving by design. A rigorous proof of convergence as the distance between observation data points converges to zero is provided. Next to convergence guarantees, the method allows for quantification of model uncertainty, which can provide a basis of adaptive sampling techniques. We provide efficient uncertainty quantification of any observable that is linear in the Lagrangian, including of Hamiltonian functions (energy) and symplectic structures, which is of interest in the context of system identification. The article overcomes major practical and theoretical difficulties related to the ill-posedness of the identification task of (discrete) Lagrangians through a careful design of geometric regularisation strategies and through an exploit of a relation to convex minimisation problems in reproducing kernel Hilbert spaces.
In this article, an efficient numerical method for computing finite-horizon controllability Gramians in Cholesky-factored form is proposed. The method is applicable to general dense matrices of moderate size and produces a Cholesky factor of the Gramian without computing the full product. In contrast to other methods applicable to this task, the proposed method is a generalization of the scaling-and-squaring approach for approximating the matrix exponential. It exploits a similar doubling formula for the Gramian, and thereby keeps the required computational effort modest. Most importantly, a rigorous backward error analysis is provided, which guarantees that the approximation is accurate to the round-off error level in double precision. This accuracy is illustrated in practice on a large number of standard test examples. The method has been implemented in the Julia package FiniteHorizonGramians.jl, which is available online under the MIT license. Code for reproducing the experimental results is included in this package, as well as code for determining the optimal method parameters. The analysis can thus easily be adapted to a different finite-precision arithmetic.
In this paper, to address the optimization problem on a compact matrix manifold, we introduce a novel algorithmic framework called the Transformed Gradient Projection (TGP) algorithm, using the projection onto this compact matrix manifold. Compared with the existing algorithms, the key innovation in our approach lies in the utilization of a new class of search directions and various stepsizes, including the Armijo, nonmonotone Armijo, and fixed stepsizes, to guide the selection of the next iterate. Our framework offers flexibility by encompassing the classical gradient projection algorithms as special cases, and intersecting the retraction-based line-search algorithms. Notably, our focus is on the Stiefel or Grassmann manifold, revealing that many existing algorithms in the literature can be seen as specific instances within our proposed framework, and this algorithmic framework also induces several new special cases. Then, we conduct a thorough exploration of the convergence properties of these algorithms, considering various search directions and stepsizes. To achieve this, we extensively analyze the geometric properties of the projection onto compact matrix manifolds, allowing us to extend classical inequalities related to retractions from the literature. Building upon these insights, we establish the weak convergence, convergence rate, and global convergence of TGP algorithms under three distinct stepsizes. In cases where the compact matrix manifold is the Stiefel or Grassmann manifold, our convergence results either encompass or surpass those found in the literature. Finally, through a series of numerical experiments, we observe that the TGP algorithms, owing to their increased flexibility in choosing search directions, outperform classical gradient projection and retraction-based line-search algorithms in several scenarios.
Realizing computationally complex quantum circuits in the presence of noise and imperfections is a challenging task. While fault-tolerant quantum computing provides a route to reducing noise, it requires a large overhead for generic algorithms. Here, we develop and analyze a hardware-efficient, fault-tolerant approach to realizing complex sampling circuits. We co-design the circuits with the appropriate quantum error correcting codes for efficient implementation in a reconfigurable neutral atom array architecture, constituting what we call a fault-tolerant compilation of the sampling algorithm. Specifically, we consider a family of $[[2^D , D, 2]]$ quantum error detecting codes whose transversal and permutation gate set can realize arbitrary degree-$D$ instantaneous quantum polynomial (IQP) circuits. Using native operations of the code and the atom array hardware, we compile a fault-tolerant and fast-scrambling family of such IQP circuits in a hypercube geometry, realized recently in the experiments by Bluvstein et al. [Nature 626, 7997 (2024)]. We develop a theory of second-moment properties of degree-$D$ IQP circuits for analyzing hardness and verification of random sampling by mapping to a statistical mechanics model. We provide evidence that sampling from hypercube IQP circuits is classically hard to simulate and analyze the linear cross-entropy benchmark (XEB) in comparison to the average fidelity. To realize a fully scalable approach, we first show that Bell sampling from degree-$4$ IQP circuits is classically intractable and can be efficiently validated. We further devise new families of $[[O(d^D),D,d]]$ color codes of increasing distance $d$, permitting exponential error suppression for transversal IQP sampling. Our results highlight fault-tolerant compiling as a powerful tool in co-designing algorithms with specific error-correcting codes and realistic hardware.
We analyze a bilinear optimal control problem for the Stokes--Brinkman equations: the control variable enters the state equations as a coefficient. In two- and three-dimensional Lipschitz domains, we perform a complete continuous analysis that includes the existence of solutions and first- and second-order optimality conditions. We also develop two finite element methods that differ fundamentally in whether the admissible control set is discretized or not. For each of the proposed methods, we perform a convergence analysis and derive a priori error estimates; the latter under the assumption that the domain is convex. Finally, assuming that the domain is Lipschitz, we develop an a posteriori error estimator for each discretization scheme and obtain a global reliability bound.
Model averaging (MA), a technique for combining estimators from a set of candidate models, has attracted increasing attention in machine learning and statistics. In the existing literature, there is an implicit understanding that MA can be viewed as a form of shrinkage estimation that draws the response vector towards the subspaces spanned by the candidate models. This paper explores this perspective by establishing connections between MA and shrinkage in a linear regression setting with multiple nested models. We first demonstrate that the optimal MA estimator is the best linear estimator with monotonically non-increasing weights in a Gaussian sequence model. The Mallows MA (MMA), which estimates weights by minimizing the Mallows' $C_p$ over the unit simplex, can be viewed as a variation of the sum of a set of positive-part Stein estimators. Indeed, the latter estimator differs from the MMA only in that its optimization of Mallows' $C_p$ is within a suitably relaxed weight set. Motivated by these connections, we develop a novel MA procedure based on a blockwise Stein estimation. The resulting Stein-type MA estimator is asymptotically optimal across a broad parameter space when the variance is known. Numerical results support our theoretical findings. The connections established in this paper may open up new avenues for investigating MA from different perspectives. A discussion on some topics for future research concludes the paper.
The e-BH procedure is an e-value-based multiple testing procedure that provably controls the false discovery rate (FDR) under any dependence structure between the e-values. Despite this appealing theoretical FDR control guarantee, the e-BH procedure often suffers from low power in practice. In this paper, we propose a general framework that boosts the power of e-BH without sacrificing its FDR control under arbitrary dependence. This is achieved by the technique of conditional calibration, where we take as input the e-values and calibrate them to be a set of "boosted e-values" that are guaranteed to be no less -- and are often more -- powerful than the original ones. Our general framework is explicitly instantiated in three classes of multiple testing problems: (1) testing under parametric models, (2) conditional independence testing under the model-X setting, and (3) model-free conformalized selection. Extensive numerical experiments show that our proposed method significantly improves the power of e-BH while continuing to control the FDR. We also demonstrate the effectiveness of our method through an application to an observational study dataset for identifying individuals whose counterfactuals satisfy certain properties.
Logistic regression is widely used in many areas of knowledge. Several works compare the performance of lasso and maximum likelihood estimation in logistic regression. However, part of these works do not perform simulation studies and the remaining ones do not consider scenarios in which the ratio of the number of covariates to sample size is high. In this work, we compare the discrimination performance of lasso and maximum likelihood estimation in logistic regression using simulation studies and applications. Variable selection is done both by lasso and by stepwise when maximum likelihood estimation is used. We consider a wide range of values for the ratio of the number of covariates to sample size. The main conclusion of the work is that lasso has a better discrimination performance than maximum likelihood estimation when the ratio of the number of covariates to sample size is high.