The multivariate inverse hypergeometric (MIH) distribution is an extension of the negative multinomial (NM) model that accounts for sampling without replacement in a finite population. Even though most studies on longitudinal count data with a specific number of `failures' occur in a finite setting, the NM model is typically chosen over the more accurate MIH model. This raises the question: How much information is lost when inferring with the approximate NM model instead of the true MIH model? The loss is quantified by a measure called deficiency in statistics. In this paper, asymptotic bounds for the deficiencies between MIH and NM experiments are derived, as well as between MIH and the corresponding multivariate normal experiments with the same mean-covariance structure. The findings are supported by a local approximation for the log-ratio of the MIH and NM probability mass functions, and by Hellinger distance bounds.
I propose an alternative algorithm to compute the MMS voting rule. Instead of using linear programming, in this new algorithm the maximin support value of a committee is computed using a sequence of maximum flow problems.
The key result of this paper is to find all the joint distributions of random vectors whose sums $S=X_1+\ldots+X_d$ are minimal in convex order in the class of symmetric Bernoulli distributions. The minimal convex sums distributions are known to be strongly negatively dependent. Beyond their interest per se, these results enable us to explore negative dependence within the class of copulas. In fact, there are two classes of copulas that can be built from multivariate symmetric Bernoulli distributions: the extremal mixture copulas, and the FGM copulas. We study the extremal negative dependence structure of the copulas corresponding to symmetric Bernoulli vectors with minimal convex sums and we explicitly find a class of minimal dependence copulas. Our main results stem from the geometric and algebraic representations of multivariate symmetric Bernoulli distributions, which effectively encode several of their statistical properties.
We introduce a flexible method to simultaneously infer both the drift and volatility functions of a discretely observed scalar diffusion. We introduce spline bases to represent these functions and develop a Markov chain Monte Carlo algorithm to infer, a posteriori, the coefficients of these functions in the spline basis. A key innovation is that we use spline bases to model transformed versions of the drift and volatility functions rather than the functions themselves. The output of the algorithm is a posterior sample of plausible drift and volatility functions that are not constrained to any particular parametric family. The flexibility of this approach provides practitioners a powerful investigative tool, allowing them to posit a variety of parametric models to better capture the underlying dynamics of their processes of interest. We illustrate the versatility of our method by applying it to challenging datasets from finance, paleoclimatology, and astrophysics. In view of the parametric diffusion models widely employed in the literature for those examples, some of our results are surprising since they call into question some aspects of these models.
We develop a numerical method for the Westervelt equation, an important equation in nonlinear acoustics, in the form where the attenuation is represented by a class of non-local in time operators. A semi-discretisation in time based on the trapezoidal rule and A-stable convolution quadrature is stated and analysed. Existence and regularity analysis of the continuous equations informs the stability and error analysis of the semi-discrete system. The error analysis includes the consideration of the singularity at $t = 0$ which is addressed by the use of a correction in the numerical scheme. Extensive numerical experiments confirm the theory.
We present algorithms for computing strongly singular and near-singular surface integrals over curved triangular patches, based on singularity subtraction, the continuation approach, and transplanted Gauss quadrature. We demonstrate the accuracy and robustness of our method for quadratic basis functions and quadratic triangles by integrating it into a boundary element code and solving several scattering problems in 3D. We also give numerical evidence that the utilization of curved boundary elements enhances computational efficiency compared to conventional planar elements.
The nonlinear Poisson-Boltzmann equation (NPBE) is an elliptic partial differential equation used in applications such as protein interactions and biophysical chemistry (among many others). It describes the nonlinear electrostatic potential of charged bodies submerged in an ionic solution. The kinetic presence of the solvent molecules introduces randomness to the shape of a protein, and thus a more accurate model that incorporates these random perturbations of the domain is analyzed to compute the statistics of quantities of interest of the solution. When the parameterization of the random perturbations is high-dimensional, this calculation is intractable as it is subject to the curse of dimensionality. However, if the solution of the NPBE varies analytically with respect to the random parameters, the problem becomes amenable to techniques such as sparse grids and deep neural networks. In this paper, we show analyticity of the solution of the NPBE with respect to analytic perturbations of the domain by using the analytic implicit function theorem and the domain mapping method. Previous works have shown analyticity of solutions to linear elliptic equations but not for nonlinear problems. We further show how to derive \emph{a priori} bounds on the size of the region of analyticity. This method is applied to the trypsin molecule to demonstrate that the convergence rates of the quantity of interest are consistent with the analyticity result. Furthermore, the approach developed here is sufficiently general enough to be applied to other nonlinear problems in uncertainty quantification.
The complexity class Quantum Statistical Zero-Knowledge ($\mathsf{QSZK}$) captures computational difficulties of the time-bounded quantum state testing problem with respect to the trace distance, known as the Quantum State Distinguishability Problem (QSDP) introduced by Watrous (FOCS 2002). However, QSDP is in $\mathsf{QSZK}$ merely within the constant polarizing regime, similar to its classical counterpart shown by Sahai and Vadhan (JACM 2003) due to the polarization lemma (error reduction for SDP). Recently, Berman, Degwekar, Rothblum, and Vasudevan (TCC 2019) extended the $\mathsf{SZK}$ containment for SDP beyond the polarizing regime via the time-bounded distribution testing problems with respect to the triangular discrimination and the Jensen-Shannon divergence. Our work introduces proper quantum analogs for these problems by defining quantum counterparts for triangular discrimination. We investigate whether the quantum analogs behave similarly to their classical counterparts and examine the limitations of existing approaches to polarization regarding quantum distances. These new $\mathsf{QSZK}$-complete problems improve $\mathsf{QSZK}$ containments for QSDP beyond the polarizing regime and establish a simple $\mathsf{QSZK}$-hardness for the quantum entropy difference problem (QEDP) defined by Ben-Aroya, Schwartz, and Ta-Shma (ToC 2010). Furthermore, we prove that QSDP with some exponentially small errors is in $\mathsf{PP}$, while the same problem without error is in $\mathsf{NQP}$.
We consider a kinetic model of an N-species gas mixture modeled with quantum Bhatnagar-Gross-Krook (BGK) collision operators. The collision operators consist of a relaxation to a Maxwell distribution in the classical case, a Fermi distribution for fermions and a Bose-Einstein distribution for bosons. In this paper we present a numerical method for simulating this model, which uses an Implicit-Explicit (IMEX) scheme to minimize a certain potential function. This is motivated by theoretical considerations coming from entropy minimization. We show that theoretical properties such as conservation of mass, total momentum and total energy as well as positivity of the distribution functions are preserved by the numerical method presented in this paper, and illustrate its usefulness and effectiveness with numerical examples
It often happens that free algebras for a given theory satisfy useful reasoning principles that are not preserved under homomorphisms of algebras, and hence need not hold in an arbitrary algebra. For instance, if $M$ is the free monoid on a set $A$, then the scalar multiplication function $A\times M \to M$ is injective. Therefore, when reasoning in the formal theory of monoids under $A$, it is possible to use this injectivity law to make sound deductions even about monoids under $A$ for which scalar multiplication is not injective -- a principle known in algebra as the permanence of identity. Properties of this kind are of fundamental practical importance to the logicians and computer scientists who design and implement computerized proof assistants like Lean and Coq, as they enable the formal reductions of equational problems that make type checking tractable. As type theories have become increasingly more sophisticated, it has become more and more difficult to establish the useful properties of their free models that enable effective implementation. These obstructions have facilitated a fruitful return to foundational work in type theory, which has taken on a more geometrical flavor than ever before. Here we expose a modern way to prove a highly non-trivial injectivity law for free models of Martin-L\"of type theory, paying special attention to the ways that contemporary methods in type theory have been influenced by three important ideas of the Grothendieck school: the relative point of view, the language of universes, and the recollement of generalized spaces.
We study the multivariate deconvolution problem of recovering the distribution of a signal from independent and identically distributed observations additively contaminated with random errors (noise) from a known distribution. For errors with independent coordinates having ordinary smooth densities, we derive an inversion inequality relating the $L^1$-Wasserstein distance between two distributions of the signal to the $L^1$-distance between the corresponding mixture densities of the observations. This smoothing inequality outperforms existing inversion inequalities. As an application of the inversion inequality to the Bayesian framework, we consider $1$-Wasserstein deconvolution with Laplace noise in dimension one using a Dirichlet process mixture of normal densities as a prior measure on the mixing distribution (or distribution of the signal). We construct an adaptive approximation of the sampling density by convolving the Laplace density with a well-chosen mixture of normal densities and show that the posterior measure concentrates around the sampling density at a nearly minimax rate, up to a log-factor, in the $L^1$-distance. The same posterior law is also shown to automatically adapt to the unknown Sobolev regularity of the mixing density, thus leading to a new Bayesian adaptive estimation procedure for mixing distributions with regular densities under the $L^1$-Wasserstein metric. We illustrate utility of the inversion inequality also in a frequentist setting by showing that an appropriate isotone approximation of the classical kernel deconvolution estimator attains the minimax rate of convergence for $1$-Wasserstein deconvolution in any dimension $d\geq 1$, when only a tail condition is required on the latent mixing density and we derive sharp lower bounds for these problems