This paper explores how and when to use common random number (CRN) simulation to evaluate MCMC convergence rates. We discuss how CRN simulation is closely related to theoretical convergence rate techniques such as one-shot coupling and coupling from the past. We present conditions under which the CRN technique generates an unbiased estimate of the Wasserstein distance between two random variables. We also discuss how unbiasedness of the Wasserstein distance between two Markov chains over a single iteration does not extend to unbiasedness over multiple iterations. We provide an upper bound on the Wasserstein distance of a Markov chain to its stationary distribution after $N$ steps in terms of averages over CRN simulations. We apply our result to a Bayesian regression Gibbs sampler.
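To make the CRN idea concrete, the following sketch (illustrative only, not the paper's estimator) couples two copies of a toy AR(1) chain by driving them with the same innovations and averages the terminal distances over replications; since $W_1$ is an infimum over couplings, this average upper-bounds the Wasserstein-1 distance between the two chains' laws at iteration $N$.

```python
import numpy as np

def crn_distance(x0, y0, n_iters, n_reps, rho=0.5, seed=0):
    """Average |X_N - Y_N| over replications of two chains driven by
    common random numbers; any coupling upper-bounds W_1, so this
    average is an upper estimate of the Wasserstein-1 distance."""
    rng = np.random.default_rng(seed)
    dists = np.empty(n_reps)
    for r in range(n_reps):
        x, y = x0, y0
        for _ in range(n_iters):
            z = rng.standard_normal()  # the SAME innovation drives both chains
            x, y = rho * x + z, rho * y + z
        dists[r] = abs(x - y)
    return dists.mean()

# For this linear chain the coupling contracts deterministically,
# |X_N - Y_N| = rho**N * |x0 - y0|, so the estimate is exact here.
print(crn_distance(x0=10.0, y0=0.0, n_iters=20, n_reps=100))
```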
We extend a Discrete Time Random Walk (DTRW) numerical scheme to simulate the anomalous diffusion of financial market orders in a simulated order book. We use random walks with Sibuya waiting times to include a time-dependent stochastic forcing function, with non-uniformly sampled times between order book events, in the setting of fractional diffusion. This models the fluid limit of an order book through the continuous arrival, cancellation and diffusion of orders in the presence of information shocks. We study the impulse response and stylised facts of orders undergoing anomalous diffusion for different forcing functions and model parameters. Concretely, we demonstrate the price impact of flash limit orders and market orders and show how the numerical method generates kinks in the price impact. We use cubic spline interpolation to generate smoothed price impact curves. The work promotes non-uniform sampling in the presence of diffusive dynamics as the preferred simulation method.
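For illustration, one standard way to draw the heavy-tailed Sibuya($\beta$) waiting times used in such fractional-diffusion DTRWs is inverse transform sampling with the recursive probability ratio $p_{k+1}/p_k = (k-\beta)/(k+1)$; the sketch below is generic, not the authors' implementation.

```python
import numpy as np

def sample_sibuya(beta, rng):
    """Draw one Sibuya(beta) variate, 0 < beta < 1, by inverse transform.
    P(K = 1) = beta and P(K = k+1) = P(K = k) * (k - beta) / (k + 1),
    giving heavy-tailed integer waiting times with tail index beta."""
    u = rng.random()
    k, p, cdf = 1, beta, beta
    while u > cdf:
        p *= (k - beta) / (k + 1)
        k += 1
        cdf += p
    return k

rng = np.random.default_rng(42)
print([sample_sibuya(0.7, rng) for _ in range(5)])  # a few heavy-tailed waits
```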
This paper studies two hybrid discontinuous Galerkin (HDG) discretizations for the velocity-density formulation of the compressible Stokes equations with respect to several desirable structural properties, namely provable convergence, the preservation of non-negativity and mass constraints for the density, and gradient-robustness. The latter property dramatically enhances the accuracy in well-balanced situations, such as the hydrostatic balance where the pressure gradient balances the gravity force. One of the studied schemes employs an H(div)-conforming velocity ansatz space, which ensures all of the mentioned properties, while a fully discontinuous method is shown to satisfy all properties except gradient-robustness. Higher-order schemes for both variants are also presented and compared in three numerical benchmark problems. The final example demonstrates the importance of gradient-robustness also for non-hydrostatic well-balanced states of the compressible Navier-Stokes equations.
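For reference, the hydrostatic balance referred to above is the well-balanced state with vanishing velocity in which the pressure gradient exactly compensates the gravitational force; with a barotropic pressure law $p = p(\varrho)$ (an illustrative assumption), it reads
\[
  u = 0, \qquad \nabla p(\varrho) = \varrho\, g,
\]
and a gradient-robust scheme preserves such states because gradient forces in the momentum balance are absorbed by the pressure alone and do not pollute the discrete velocity.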
We investigate the use of multilevel Monte Carlo (MLMC) methods for estimating the expectation of discretized random fields. Specifically, we consider a setting in which the input and output vectors of the numerical simulators have inconsistent dimensions across the multilevel hierarchy. This requires the introduction of grid transfer operators borrowed from multigrid methods. Starting from a simple 1D illustration, we demonstrate numerically that the resulting MLMC estimator degrades the estimation of high-frequency components of the discretized expectation field compared to a Monte Carlo (MC) estimator. By adapting mathematical tools initially developed for multigrid methods, we perform a theoretical spectral analysis of the MLMC estimator of the expectation of discretized random fields, in the specific case of linear, symmetric and circulant simulators. This analysis provides a spectral decomposition of the variance into contributions associated with each scale component of the discretized field. We then propose improved MLMC estimators using a filtering mechanism similar to the smoothing process of multigrid methods. The filtering operators improve the estimation of both the small- and large-scale components of the variance, resulting in a reduction of the total variance of the estimator. These improvements are quantified for the specific class of simulators considered in our spectral analysis. The resulting filtered MLMC (F-MLMC) estimator is applied to the problem of estimating the discretized variance field of a diffusion-based covariance operator, which amounts to estimating the expectation of a discretized random field. The numerical experiments support the conclusions of the theoretical analysis even with non-linear simulators, and demonstrate the improvements brought by the proposed F-MLMC estimator compared to both a crude MC and an unfiltered MLMC estimator.
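A minimal sketch of the setting, assuming a two-level 1D hierarchy with linear-interpolation prolongation as the grid transfer operator; the simulators, the coupling by shared seeds, and the sample sizes below are placeholders, not the paper's estimator.

```python
import numpy as np

def prolong(v):
    """Linear-interpolation grid transfer from a coarse 1D grid to a fine
    grid with twice as many intervals (a standard multigrid prolongation)."""
    fine = np.empty(2 * len(v) - 1)
    fine[0::2] = v
    fine[1::2] = 0.5 * (v[:-1] + v[1:])
    return fine

def mlmc_mean(fine_sim, coarse_sim, n0, n1, rng):
    """Two-level MLMC estimator of an expectation field on the fine grid:
    E[P f_coarse] + E[f_fine - P f_coarse], with coupled samples on level 1."""
    level0 = np.mean([prolong(coarse_sim(rng)) for _ in range(n0)], axis=0)
    corr = []
    for _ in range(n1):
        seed = rng.integers(1 << 31)  # couple fine and coarse samples
        rf, rc = np.random.default_rng(seed), np.random.default_rng(seed)
        corr.append(fine_sim(rf) - prolong(coarse_sim(rc)))
    return level0 + np.mean(corr, axis=0)
```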
Solutions to many important partial differential equations satisfy bounds constraints, but approximations computed by finite element or finite difference methods typically fail to respect the same conditions. Chang and Nakshatrala enforce such bounds in finite element methods through the solution of variational inequalities rather than linear variational problems. Here, we provide a theoretical justification for this method, including higher-order discretizations. We prove an abstract best approximation result for the linear variational inequality and estimates showing that bounds-constrained polynomials provide comparable approximation power to standard spaces. For any unconstrained approximation to a function, there exists a constrained approximation which is comparable in the $W^{1,p}$ norm. In practice, one cannot efficiently represent and manipulate the entire family of bounds-constrained polynomials, but applying bounds constraints to the coefficients of a polynomial in the Bernstein basis guarantees those constraints on the polynomial. Although our theoretical results do not guarantee high accuracy for this subset of bounds-constrained polynomials, numerical results indicate optimal orders of accuracy for smooth solutions and sharp resolution of features in convection-diffusion problems, all subject to bounds constraints.
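The Bernstein-basis observation is elementary: a Bernstein polynomial is a convex combination of its coefficients (the basis functions are non-negative and sum to one on $[0,1]$), so coefficient bounds imply pointwise bounds. A small self-contained check, illustrative only:

```python
import numpy as np
from math import comb

def bernstein_eval(coeffs, x):
    """Evaluate p(x) = sum_k c_k * B_{k,n}(x) on [0, 1]. Since the basis
    functions are non-negative and sum to 1, coefficients in [lo, hi]
    guarantee p(x) in [lo, hi] for all x in [0, 1]."""
    n = len(coeffs) - 1
    return sum(c * comb(n, k) * x**k * (1 - x)**(n - k)
               for k, c in enumerate(coeffs))

coeffs = np.clip(np.random.default_rng(1).normal(0.5, 1.0, 6), 0.0, 1.0)
xs = np.linspace(0.0, 1.0, 1001)
vals = np.array([bernstein_eval(coeffs, x) for x in xs])
print(vals.min(), vals.max())  # stays within [0, 1] (up to rounding)
```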
Learning tasks play an increasingly prominent role in quantum information and computation. They range from fundamental problems such as state discrimination and metrology, through the framework of quantum probably approximately correct (PAC) learning, to the recently proposed shadow variants of state tomography. However, the many directions of quantum learning theory have so far evolved separately. We propose a general mathematical formalism for describing quantum learning by training on classical-quantum data and then testing how well the learned hypothesis generalizes to new data. In this framework, we prove bounds on the expected generalization error of a quantum learner in terms of classical and quantum information-theoretic quantities measuring how strongly the learner's hypothesis depends on the specific data seen during training. To achieve this, we use tools from quantum optimal transport and quantum concentration inequalities to establish non-commutative versions of decoupling lemmas that underlie recent information-theoretic generalization bounds for classical machine learning. Our framework encompasses and gives intuitively accessible generalization bounds for a variety of quantum learning scenarios such as quantum state discrimination, PAC learning quantum states, quantum parameter estimation, and quantumly PAC learning classical functions. Our work thereby lays a foundation for a unifying quantum information-theoretic perspective on quantum learning.
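For orientation, the classical decoupling-based bound that such results extend (Xu and Raginsky, 2017) states that, for a loss that is $\sigma$-subgaussian under the data distribution,
\[
  \bigl|\mathbb{E}\,\mathrm{gen}(W,S)\bigr| \;\le\; \sqrt{\frac{2\sigma^2}{n}\, I(W;S)},
\]
where $I(W;S)$ is the mutual information between the learned hypothesis $W$ and the training sample $S$ of size $n$; the quantum bounds replace this dependence measure with quantum information-theoretic analogues.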
This paper considers the extension of data-enabled predictive control (DeePC) to nonlinear systems via general basis functions. First, we formulate a basis functions DeePC behavioral predictor and identify necessary and sufficient conditions for its equivalence with a corresponding basis functions multi-step identified predictor. The derived conditions yield a dynamic regularization cost function that enables a well-posed (i.e., consistent) basis functions formulation of nonlinear DeePC. To improve the computational efficiency of basis functions DeePC, we further develop two alternative formulations that use a simpler, sparse regularization cost function and ridge regression, respectively. Consistency implications for Koopman DeePC, as well as several methods for constructing the basis functions representation, are also presented. The effectiveness of the developed consistent basis functions DeePC formulations is illustrated on a benchmark nonlinear pendulum state-space model, for both noise-free and noisy data.
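To fix ideas, a minimal sketch of a basis functions predictor fitted by ridge regression; the lifting map, data, and dynamics below are hypothetical placeholders, not the paper's formulation.

```python
import numpy as np

def lift(x):
    """Hypothetical basis functions for a 2-state pendulum-like system:
    the state itself, its squares, and the sine of the angle."""
    return np.concatenate([x, x**2, np.sin(x[:1])])

def fit_ridge_predictor(X, U, Y, lam=1e-3):
    """Fit Y ~ Theta @ [lift(x); u] by ridge regression -- the regularized
    least-squares step behind the ridge variant of basis functions DeePC."""
    Phi = np.vstack([np.concatenate([lift(x), u]) for x, u in zip(X, U)])
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ Y).T  # Theta: outputs-by-features

# usage on synthetic data from a placeholder one-step dynamics
rng = np.random.default_rng(0)
X, U = rng.normal(size=(200, 2)), rng.normal(size=(200, 1))
Y = np.stack([x + 0.1 * np.concatenate([np.sin(x[:1]), u]) for x, u in zip(X, U)])
Theta = fit_ridge_predictor(X, U, Y)
```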
In this paper, we investigate the functional central limit theorem and the Marcinkiewicz strong law of large numbers for $U$-statistics with absolutely regular data taking values in a separable Hilbert space. The novelty of our approach consists in using coupling to formulate a deviation inequality for the original $U$-statistic, where the upper bound involves the mixing coefficient and the tails of several $U$-statistics of i.i.d. data. The presented results improve on known results in several directions: metric space-valued data are considered as well as Hilbert space-valued data, and the mixing rates are less restrictive in a wide range of parameters.
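For a symmetric kernel $h$ and a sample $X_1,\dots,X_n$, the $U$-statistics in question take the standard form
\[
  U_n \;=\; \binom{n}{2}^{-1} \sum_{1 \le i < j \le n} h(X_i, X_j),
\]
here with $h$ taking values in a separable Hilbert space and the $X_i$ absolutely regular ($\beta$-mixing) rather than independent.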
Causal representation learning algorithms discover lower-dimensional representations of data that admit a decipherable interpretation of cause and effect. Since achieving such interpretable representations is challenging, many causal learning algorithms rely on elements encoding prior information, such as (linear) structural causal models, interventional data, or weak supervision. Unfortunately, in exploratory causal representation learning, such elements and prior information may not be available or warranted. Alternatively, scientific datasets often have multiple modalities or physics-based constraints, and the use of such scientific, multimodal data has been shown to improve disentanglement in fully unsupervised settings. Consequently, we introduce a causal representation learning algorithm (causalPIMA) that can use multimodal data and known physics to discover important features with causal relationships. Our algorithm uses a new differentiable parametrization to learn a directed acyclic graph (DAG) together with a latent space of a variational autoencoder in an end-to-end differentiable framework via a single, tractable evidence lower bound loss function. We place a Gaussian mixture prior on the latent space and identify each of the mixture components with an outcome of a DAG node; this novel identification enables feature discovery with causal relationships. Tested on a synthetic and a scientific dataset, causalPIMA learns an interpretable causal structure while simultaneously discovering key features in a fully unsupervised setting.
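As an illustration of what a differentiable DAG parametrization can look like, here is a standard NOTEARS-style acyclicity penalty (a generic example, not necessarily the parametrization used in causalPIMA):

```python
import numpy as np
from scipy.linalg import expm

def acyclicity_penalty(W):
    """NOTEARS-style constraint: h(W) = tr(exp(W * W)) - d equals 0 iff the
    weighted adjacency matrix W encodes a DAG; being smooth in W, it can be
    added to an ELBO-type loss and optimized end to end."""
    d = W.shape[0]
    return np.trace(expm(W * W)) - d

W_dag = np.triu(np.random.default_rng(0).normal(size=(4, 4)), k=1)  # strictly upper-triangular => DAG
print(acyclicity_penalty(W_dag))            # ~0 for a DAG
print(acyclicity_penalty(np.ones((4, 4))))  # > 0 when cycles exist
```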
When modeling a vector of risk variables, extreme scenarios are often of special interest. The peaks-over-thresholds method hinges on the notion that, asymptotically, the excesses over a vector of high thresholds follow a multivariate generalized Pareto distribution. However, existing literature has primarily concentrated on the setting when all risk variables are always large simultaneously. In reality, this assumption is often not met, especially in high dimensions. In response to this limitation, we study scenarios where distinct groups of risk variables may exhibit joint extremes while others do not. These discernible groups are derived from the angular measure inherent in the corresponding max-stable distribution, whence the term extreme direction. We explore such extreme directions within the framework of multivariate generalized Pareto distributions, with a focus on their probability density functions in relation to an appropriate dominating measure. Furthermore, we provide a stochastic construction that allows any prespecified set of risk groups to constitute the distribution's extreme directions. This construction takes the form of a smoothed max-linear model and accommodates the full spectrum of conceivable max-stable dependence structures. Additionally, we introduce a generic simulation algorithm tailored for multivariate generalized Pareto distributions, offering specific implementations for extensions of the logistic and H\"usler-Reiss families capable of carrying arbitrary extreme directions.
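One generic construction for such simulations is the representation of Rootzén and Segers, in which a standard multivariate generalized Pareto vector is obtained as $X = E\mathbf{1} + T - \max_j T_j$ with $E \sim \mathrm{Exp}(1)$ independent of a generator vector $T$; the Gaussian generator below (giving H\"usler-Reiss-type dependence) is an illustrative choice, not the paper's algorithm.

```python
import numpy as np

def sample_mgpd(n, Sigma, rng):
    """Sample a standard multivariate generalized Pareto vector via
    X = E + T - max(T), with E ~ Exp(1) independent of the generator T
    (here Gaussian, giving Huesler-Reiss-type dependence). Coordinates
    with X_j > 0 are the jointly extreme ones."""
    d = Sigma.shape[0]
    E = rng.exponential(size=(n, 1))  # common exponential intensity
    T = rng.multivariate_normal(np.zeros(d), Sigma, size=n)
    return E + T - T.max(axis=1, keepdims=True)

rng = np.random.default_rng(7)
Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
print(sample_mgpd(5, Sigma, rng))  # each row: max coordinate equals E > 0
```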
We construct a Convolution Quadrature (CQ) scheme for the quasilinear subdiffusion equation and supply it with a fast and oblivious implementation. In particular, we find a condition for the CQ to be admissible and discretize the spatial part of the equation with the Finite Element Method. We prove the unconditional stability and convergence of the scheme and derive a bound on the error. As a by-product, we also obtain a discrete Gronwall inequality for the CQ, which is a crucial ingredient of our convergence proof based on the energy method. The paper concludes with numerical examples verifying convergence and the reduction in computation time achieved by the fast and oblivious quadrature.
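For concreteness, the simplest instance of CQ uses the backward-Euler generating function, whose weights are the Grünwald-Letnikov coefficients of $(1-z)^{\alpha}$ and satisfy a short recursion; the sketch below is generic and independent of the paper's admissibility condition.

```python
import numpy as np

def cq_weights_bdf1(alpha, n):
    """Convolution quadrature weights for the backward-Euler (BDF1) scheme:
    the Taylor coefficients of (1 - z)**alpha, i.e. Gruenwald-Letnikov
    weights w_0 = 1, w_j = w_{j-1} * (1 - (alpha + 1) / j)."""
    w = np.empty(n + 1)
    w[0] = 1.0
    for j in range(1, n + 1):
        w[j] = w[j - 1] * (1.0 - (alpha + 1.0) / j)
    return w

def caputo_cq(f_vals, alpha, tau):
    """Approximate the order-alpha fractional derivative of f on a uniform
    grid with step tau (subtracting f(0) first gives the Caputo version)."""
    w = cq_weights_bdf1(alpha, len(f_vals) - 1)
    g = f_vals - f_vals[0]
    return np.array([np.dot(w[:k + 1][::-1], g[:k + 1])
                     for k in range(len(g))]) / tau**alpha
```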