We study the statistical behavior of entanglement in bipartite quantum systems over fermionic Gaussian states, as measured by the von Neumann entropy and the entanglement capacity. The focus is on the variance of the von Neumann entropy and the mean entanglement capacity, which together constitute the second-order statistics considered here. The main results are exact and explicit formulas for the two second-order statistics at fixed differences of the subsystem dimensions. We also conjecture an exact formula for the variance of the von Neumann entropy valid for arbitrary subsystem dimensions. Based on the obtained results, we analytically study the numerically observed phenomena of Gaussianity of the von Neumann entropy and of linear growth of the average capacity.
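For reference (notation ours, not the paper's): for a subsystem with reduced density matrix $\rho_A$, the two quantities studied are the von Neumann entropy and the capacity of entanglement, the latter being the variance of the modular Hamiltonian $-\ln \rho_A$:
\[
S(\rho_A) = -\operatorname{Tr}\left(\rho_A \ln \rho_A\right),
\qquad
C(\rho_A) = \operatorname{Tr}\left(\rho_A \ln^2 \rho_A\right) - \left(\operatorname{Tr} \rho_A \ln \rho_A\right)^2 .
\]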
Markov chain Monte Carlo (MCMC) is one of the most powerful methods for sampling from a given probability distribution; the Metropolis-adjusted Langevin algorithm (MALA) is a variant in which the gradient of the log-density is used to accelerate convergence. However, being set up in a Euclidean framework, MALA may perform poorly in higher-dimensional problems or in those involving anisotropic densities, as the underlying non-Euclidean geometry of the sample space remains unaccounted for. We make use of concepts from differential geometry and stochastic calculus on Riemannian manifolds to geometrically adapt a stochastic differential equation with a non-trivial drift term. This adaptation is also referred to as a stochastic development. We apply this method specifically to the Langevin diffusion equation and arrive at a geometrically adapted Langevin algorithm (GALA). This new approach far outperforms MALA, certain manifold variants of MALA, and other approaches such as Hamiltonian Monte Carlo (HMC) and its adaptive variant, the no-U-turn sampler (NUTS) implemented in Stan, especially as the dimension of the problem increases, where GALA is often the only successful method. This is evidenced through several numerical examples that include parameter estimation for a broad class of probability distributions and a logistic regression problem.
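As background, here is a minimal sketch of one step of standard (Euclidean) MALA, the baseline that the geometric adaptation builds on; `log_prob` and `grad_log_prob` are assumed user-supplied callables, and this is not the authors' GALA implementation.

```python
import numpy as np

def mala_step(x, log_prob, grad_log_prob, step, rng):
    """One Metropolis-adjusted Langevin step targeting exp(log_prob)."""
    # Langevin proposal: drift along the gradient of the log-density.
    mean_fwd = x + 0.5 * step * grad_log_prob(x)
    prop = mean_fwd + np.sqrt(step) * rng.standard_normal(x.shape)

    # Gaussian proposal log-densities for the forward and reverse moves.
    mean_rev = prop + 0.5 * step * grad_log_prob(prop)
    log_q_fwd = -np.sum((prop - mean_fwd) ** 2) / (2.0 * step)
    log_q_rev = -np.sum((x - mean_rev) ** 2) / (2.0 * step)

    # Metropolis-Hastings accept/reject step restores exactness.
    log_alpha = log_prob(prop) - log_prob(x) + log_q_rev - log_q_fwd
    return (prop, True) if np.log(rng.uniform()) < log_alpha else (x, False)
```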
We show how to use Stein variational gradient descent (SVGD) to carry out inference in Gaussian process (GP) models with non-Gaussian likelihoods and large data volumes. Markov chain Monte Carlo (MCMC) is extremely computationally intensive in these settings, while the parametric assumptions required for efficient variational inference (VI) lead to incorrect inference on the multi-modal posterior distributions that are common for such models. SVGD provides a non-parametric alternative to VI that is substantially faster than MCMC. We prove that, for GP models with Lipschitz gradients, the SVGD algorithm monotonically decreases the Kullback-Leibler divergence from the sampling distribution to the true posterior. Our method is demonstrated on benchmark problems in both regression and classification, on a multimodal posterior, and on an air quality example with 550,134 spatiotemporal observations, showing substantial performance improvements over MCMC and VI.
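For concreteness, a minimal sketch of the generic SVGD particle update with a fixed-bandwidth RBF kernel; `grad_log_p` is an assumed user-supplied score function, and this is not the paper's GP-specific implementation.

```python
import numpy as np

def svgd_update(particles, grad_log_p, step, bandwidth):
    """One SVGD step: kernelized gradient flow decreasing KL(q || p)."""
    n = particles.shape[0]
    diff = particles[:, None, :] - particles[None, :, :]        # x_i - x_j, shape (n, n, d)
    k = np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * bandwidth ** 2))  # RBF kernel matrix

    scores = np.stack([grad_log_p(x) for x in particles])       # grad log p at each particle
    drive = k @ scores                                           # pulls particles toward mass
    repulse = np.sum(k[:, :, None] * diff, axis=1) / bandwidth ** 2  # keeps particles spread out
    return particles + (step / n) * (drive + repulse)
```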
The recently proposed statistical finite element (statFEM) approach synthesises measurement data with finite element models and allows for making predictions about the true system response. We provide a probabilistic error analysis for a prototypical statFEM setup based on a Gaussian process prior under the assumption that the noisy measurement data are generated by a deterministic true system response function that satisfies a second-order elliptic partial differential equation for an unknown true source term. In certain cases, properties such as the smoothness of the source term may be misspecified by the Gaussian process model. The error estimates we derive are for the expectation with respect to the measurement noise of the $L^2$-norm of the difference between the true system response and the mean of the statFEM posterior. The estimates imply polynomial rates of convergence in the numbers of measurement points and finite element basis functions and depend on the Sobolev smoothness of the true source term and the Gaussian process model. A numerical example for Poisson's equation is used to illustrate these theoretical results.
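In symbols (our notation): with $u$ the true system response, $\bar{m}$ the mean of the statFEM posterior, and $\mathbb{E}$ the expectation over the measurement noise, the quantity bounded is
\[
\mathbb{E}\left[\, \lVert u - \bar{m} \rVert_{L^2}\, \right],
\]
with rates polynomial in the number of measurement points and the number of finite element basis functions.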
We couple the L1 discretization of the Caputo derivative in time with a spectral Galerkin method in space to devise a scheme for solving quasilinear subdiffusion equations. Both the diffusivity and the source are allowed to be nonlinear functions of the solution. We prove the method's stability and convergence, with spectral accuracy in space; the temporal order depends on the solution's regularity in time. Further, we support our results with numerical simulations that exploit parallelism in the spatial discretization. Moreover, as a side result we find asymptotically exact values of the error constants, along with their remainders, for discretizations of the Caputo derivative and of fractional integrals. These constants are the smallest possible, which improves previously established results from the literature.
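As an illustration of the temporal discretization, a minimal sketch of the classical L1 approximation of the Caputo derivative of order $\alpha \in (0,1)$ on a uniform grid; this is the standard formula, not the paper's full quasilinear scheme.

```python
import numpy as np
from math import gamma

def caputo_l1(u, tau, alpha):
    """L1 approximation of the Caputo derivative of order alpha in (0, 1),
    evaluated at the final time t_n = n * tau.

    u: samples u(0), u(tau), ..., u(n * tau) on a uniform grid.
    """
    n = len(u) - 1
    j = np.arange(n)
    b = (j + 1) ** (1.0 - alpha) - j ** (1.0 - alpha)   # L1 weights b_j
    du = np.diff(u)                                     # increments u^{k+1} - u^k
    # Weight b_{n-1-k} multiplies the increment over [t_k, t_{k+1}].
    return (b[::-1] * du).sum() / (gamma(2.0 - alpha) * tau ** alpha)
```

For $u(t) = t$ the formula reproduces the exact value $t_n^{1-\alpha}/\Gamma(2-\alpha)$, since the L1 scheme is exact on piecewise-linear functions.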
We consider infinite-horizon discounted Markov decision problems with finite state and action spaces. We show that, with direct parametrization in the policy space, the weighted value function, although non-convex in general, is both quasi-convex and quasi-concave. While quasi-convexity helps explain the convergence of policy gradient methods to global optima, quasi-concavity hints at their convergence guarantees using arbitrarily large step sizes that are not dictated by the Lipschitz constant characterizing the smoothness of the value function. In particular, we show that when using geometrically increasing step sizes, a general class of policy mirror descent methods, including the natural policy gradient method and a projected Q-descent method, all enjoy a linear rate of convergence without relying on entropy or other strongly convex regularization. In addition, we develop a theory of weak gradient-mapping dominance and use it to prove a sharper sublinear convergence rate for the projected policy gradient method. Finally, we also analyze the convergence rate of an inexact policy mirror descent method and estimate its sample complexity under a simple generative model.
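For concreteness, a minimal tabular sketch of the KL-based member of the policy mirror descent family (the natural-policy-gradient-style multiplicative update) with a geometric step-size schedule; the MDP arrays and hyperparameters are illustrative, not the paper's setting.

```python
import numpy as np

def policy_mirror_descent(P, R, gamma, iters=50, eta0=1.0, growth=2.0):
    """KL-regularized policy mirror descent on a tabular MDP.

    P: transitions, shape (S, A, S);  R: rewards, shape (S, A).
    """
    S, A, _ = P.shape
    pi = np.full((S, A), 1.0 / A)                    # uniform initial policy
    eta = eta0
    for _ in range(iters):
        # Exact policy evaluation: solve (I - gamma * P_pi) V = r_pi.
        P_pi = np.einsum('sa,sap->sp', pi, P)
        r_pi = np.sum(pi * R, axis=1)
        V = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
        Q = R + gamma * np.einsum('sap,p->sa', P, V)
        # Mirror step with KL divergence: pi <- pi * exp(eta * Q), renormalized.
        logits = np.log(pi) + eta * Q
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        pi = np.exp(logits)
        pi /= pi.sum(axis=1, keepdims=True)
        eta *= growth                                # geometrically increasing step size
    return pi, V
```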
We investigate the numerical implementation of the limiting equation for the phonon transport equation in the small Knudsen number regime. The main contribution is that we derive the limiting equation that achieves second-order convergence, and we provide a numerical recipe for computing the Robin coefficients. These coefficients are obtained by solving an auxiliary half-space equation. Numerically, the half-space equation is solved by a spectral method that relies on an even-odd decomposition to eliminate the corner-point singularity. Numerical evidence is presented to justify the second-order asymptotic convergence rate.
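For orientation (notation ours, not the paper's): the Robin coefficients enter through boundary conditions of the generic form
\[
u + c\, \partial_n u = g \quad \text{on the boundary},
\]
where the coefficient $c$ is the quantity extracted from the solution of the auxiliary half-space problem.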
This paper introduces a new theoretical framework for optimizing second-order behaviors of wireless networks. Unlike existing techniques for network utility maximization, which only consider first-order statistics, this framework models every random process by its mean and temporal variance. The inclusion of temporal variance makes this framework well-suited for modeling stateful fading wireless channels and emerging network performance metrics such as age-of-information (AoI). Using this framework, we sharply characterize the second-order capacity region of wireless access networks. We also propose a simple scheduling policy and prove that it can achieve every interior point in the second-order capacity region. To demonstrate the utility of this framework, we apply it to an important open problem: the optimization of AoI over Gilbert-Elliott channels. We show that this framework provides a very accurate characterization of AoI. Moreover, it leads to a tractable scheduling policy that outperforms existing work.
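For concreteness, an illustrative simulation of the AoI process over a Gilbert-Elliott channel under an always-transmit policy; the two-state Markov channel is standard, but the parameters and the convention that AoI resets to 1 on a successful delivery are our assumptions, not the paper's model.

```python
import numpy as np

def simulate_aoi(p_gb, p_bg, p_good, p_bad, T, rng):
    """Age-of-information over a two-state (good/bad) Markov fading channel.

    p_gb, p_bg: transition probabilities good->bad and bad->good;
    p_good, p_bad: per-slot delivery probabilities in each state.
    """
    good, aoi, trace = True, 1, []
    for _ in range(T):
        success = rng.uniform() < (p_good if good else p_bad)
        aoi = 1 if success else aoi + 1      # reset on delivery, else age grows
        trace.append(aoi)
        # Channel state evolves as a two-state Markov chain.
        good = (rng.uniform() >= p_gb) if good else (rng.uniform() < p_bg)
    # The empirical mean and variance of the trace are the first- and
    # second-order statistics the framework characterizes.
    return np.asarray(trace)
```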
In this paper, we revisit the problem of Differentially Private Stochastic Convex Optimization (DP-SCO) and establish excess population risk bounds for some special classes of functions that are sharper than the previous results for general convex and strongly convex functions. In the first part of the paper, we study the case where the population risk function satisfies the Tsybakov Noise Condition (TNC) with some parameter $\theta>1$. Specifically, we first show that, under some mild assumptions on the loss functions, there is an algorithm whose output achieves an upper bound of $\tilde{O}((\frac{1}{\sqrt{n}}+\frac{\sqrt{d\log \frac{1}{\delta}}}{n\epsilon})^\frac{\theta}{\theta-1})$ for $(\epsilon, \delta)$-DP when $\theta\geq 2$, where $n$ is the sample size and $d$ is the dimension of the space. Then we address the inefficiency issue, improve the upper bounds by $\text{Poly}(\log n)$ factors, and extend to the case where $\theta\geq \bar{\theta}>1$ for some known $\bar{\theta}$. Next we show that the excess population risk of population functions satisfying TNC with parameter $\theta\geq 2$ is always lower bounded by $\Omega((\frac{d}{n\epsilon})^\frac{\theta}{\theta-1})$ and $\Omega((\frac{\sqrt{d\log \frac{1}{\delta}}}{n\epsilon})^\frac{\theta}{\theta-1})$ for $\epsilon$-DP and $(\epsilon, \delta)$-DP, respectively. In the second part, we focus on the special case where the population risk function is strongly convex. Unlike previous studies, here we assume the loss function is {\em non-negative} and {\em the optimal value of the population risk is sufficiently small}. With these additional assumptions, we propose a new method whose output achieves an upper bound of $O(\frac{d\log\frac{1}{\delta}}{n^2\epsilon^2}+\frac{1}{n^{\tau}})$ for any $\tau\geq 1$ in the $(\epsilon,\delta)$-DP model, provided the sample size $n$ is sufficiently large.
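As a point of reference, a minimal sketch of generic gradient perturbation (full-batch descent with per-sample clipping and Gaussian noise); the clipping threshold and noise calibration follow the textbook Gaussian-mechanism recipe, not the authors' algorithm.

```python
import numpy as np

def dp_gradient_descent(grad_fn, theta, data, epochs, lr, clip, sigma, rng):
    """Full-batch gradient descent privatized by clipping and Gaussian noise."""
    n = len(data)
    for _ in range(epochs):
        g = np.zeros_like(theta)
        for x in data:
            gi = grad_fn(theta, x)
            gi *= min(1.0, clip / (np.linalg.norm(gi) + 1e-12))  # bound l2 sensitivity
            g += gi
        noise = sigma * clip * rng.standard_normal(theta.shape)  # Gaussian mechanism
        theta = theta - lr * (g + noise) / n
    return theta
```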
(Gradient) Expectation Maximization (EM) is a widely used algorithm for maximum likelihood estimation in mixture models and incomplete-data problems. A major challenge facing this popular technique is how to effectively preserve the privacy of sensitive data. Previous research on this problem has already led to the discovery of some Differentially Private (DP) algorithms for (Gradient) EM. However, unlike in the non-private case, existing techniques are not yet able to provide finite-sample statistical guarantees. To address this issue, we propose in this paper the first DP version of the (Gradient) EM algorithm with statistical guarantees. Moreover, we apply our general framework to three canonical models: Gaussian Mixture Model (GMM), Mixture of Regressions Model (MRM), and Linear Regression with Missing Covariates (RMC). Specifically, for GMM in the DP model, our estimation error is near optimal in some cases. For the other two models, we provide the first finite-sample statistical guarantees. Our theory is supported by thorough numerical experiments.
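As a schematic, one step of a gradient-EM update for a one-dimensional GMM in which the M-step gradient is privatized; the clipping and noise scale here are illustrative placeholders, not the paper's calibrated procedure.

```python
import numpy as np

def dp_gradient_em_step(x, mu, w, sigma2, lr, clip, noise_sd, rng):
    """One private gradient-EM step updating the means of a 1-D Gaussian mixture.

    x: data (n,);  mu, w: component means and weights (K,);  sigma2: known variance.
    """
    # E-step: posterior responsibilities gamma[i, k].
    logp = -0.5 * (x[:, None] - mu[None, :]) ** 2 / sigma2 + np.log(w)[None, :]
    logp -= logp.max(axis=1, keepdims=True)
    gamma = np.exp(logp)
    gamma /= gamma.sum(axis=1, keepdims=True)

    # M-step as a gradient step on the means: per-sample terms are clipped
    # to bound sensitivity, then Gaussian noise is added before the update.
    per_sample = gamma * (x[:, None] - mu[None, :]) / sigma2    # shape (n, K)
    per_sample = np.clip(per_sample, -clip, clip)
    grad = per_sample.mean(axis=0)
    grad += noise_sd * rng.standard_normal(grad.shape)
    return mu + lr * grad
```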
In the present paper we study the asymptotic behavior of the auto-covariance function of Ornstein-Uhlenbeck (OU) processes driven by Gaussian noises with stationary and non-stationary increments, and of Hermite OU processes. Our results generalize the corresponding results of Cheridito et al. \cite{CKM} and Kaarakka and Salminen \cite{KS}.
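For comparison (a classical fact, not a result of the paper): for an OU process driven by standard Brownian motion, $dX_t = -\theta X_t\, dt + \sigma\, dB_t$ with $\theta > 0$, the stationary auto-covariance decays exponentially,
\[
\operatorname{Cov}(X_s, X_{s+t}) = \frac{\sigma^2}{2\theta}\, e^{-\theta t}, \qquad t \ge 0,
\]
which is the baseline behavior that the driven-noise and Hermite generalizations modify.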