The density power divergence (DPD) [Basu et al. (1998), Biometrika], designed for robust estimation of the underlying distribution of observations, involves an integral of a power of the parametric density model being estimated. While this integral admits a closed form for some specific densities (such as the normal and exponential densities), its computational intractability has prevented the application of DPD-based estimation to more general parametric densities for over a quarter of a century since DPD was proposed. This study proposes a stochastic optimization approach to minimizing the DPD for general parametric density models and justifies its adequacy by appealing to conventional theory on stochastic optimization. The proposed approach can also be applied to the minimization of another density-power-based divergence, the $\gamma$-divergence, with the aid of unnormalized models [Kanamori and Fujisawa (2015), Biometrika].
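To make the intractable term concrete: since $\int f_\theta(x)^{1+\gamma}\,dx = \mathbb{E}_{X\sim f_\theta}[f_\theta(X)^\gamma]$, the DPD objective admits an unbiased Monte Carlo estimate from model samples. The sketch below is our illustration, not the authors' algorithm: it minimizes a sample-average approximation of the DPD for a univariate normal model using common random numbers; all names and values are illustrative.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
# Contaminated sample: N(0,1) with 10% outliers near 10.
data = np.concatenate([rng.normal(0, 1, 180), rng.normal(10, 1, 20)])
gamma = 0.5
eps = rng.normal(size=4096)  # common random numbers for the reparameterised integral

def dpd_objective(theta):
    mu, log_sig = theta
    sig = np.exp(log_sig)
    # For a location-scale model, X = mu + sig*eps ~ f_theta and
    # f_theta(X) = norm.pdf(eps)/sig, so the integral term
    # \int f_theta^{1+gamma} dx = E[f_theta(X)^gamma] is estimated by:
    integral = np.mean((norm.pdf(eps) / sig) ** gamma)
    fit = np.mean(norm.pdf(data, mu, sig) ** gamma)
    return integral - (1.0 + 1.0 / gamma) * fit

res = minimize(dpd_objective, x0=np.array([np.median(data), 0.0]),
               method="Nelder-Mead")
mu_hat, sig_hat = res.x[0], np.exp(res.x[1])
print(f"robust fit: mu = {mu_hat:.3f}, sigma = {sig_hat:.3f}")  # close to (0, 1)
```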
One of the central quantities in probabilistic seismic risk assessment is the fragility curve, which represents the probability of failure of a mechanical structure conditional on a scalar measure derived from the seismic ground motion. Estimating such curves is difficult because, for most structures of interest, few data are available. For this reason, many methods in the literature rely on a parametric log-normal model. Bayesian approaches allow efficient learning of the model parameters, but the choice of prior distribution has a non-negligible influence on the posterior distribution, and therefore on any resulting estimate. We propose a thorough study of this parametric Bayesian estimation problem when the data are binary (i.e. they indicate only the state of the structure, failure or non-failure). Using reference prior theory as a foundation, we suggest an objective approach to the prior choice. This approach leads to the Jeffreys prior, which is derived explicitly for this problem for the first time. The posterior distribution is proven to be proper (i.e. it integrates to unity) with the Jeffreys prior, and improper with some classical priors from the literature. The posterior distribution with the Jeffreys prior is also shown to vanish at the boundaries of the parameter domain, so sampling the posterior does not produce anomalously small or large parameter values and hence does not produce degenerate fragility curves such as unit step functions. Numerical results on three different case studies illustrate these theoretical predictions.
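For concreteness, here is a minimal sketch of the binary-data likelihood under the standard log-normal fragility model $P(\text{failure}\mid a) = \Phi(\ln(a/\alpha)/\beta)$, fitted by maximum likelihood on synthetic data; the paper's Bayesian machinery with the Jeffreys prior is not reproduced, and all names and values are illustrative.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def neg_log_lik(theta, a, z):
    """Binary-data negative log-likelihood, log-normal fragility model."""
    log_alpha, log_beta = theta            # log-parameterisation keeps alpha, beta > 0
    p = norm.cdf((np.log(a) - log_alpha) / np.exp(log_beta))
    p = np.clip(p, 1e-12, 1 - 1e-12)       # guard the logs at the boundaries
    return -np.sum(z * np.log(p) + (1 - z) * np.log(1 - p))

rng = np.random.default_rng(1)
a = rng.lognormal(mean=0.0, sigma=0.6, size=200)   # synthetic intensity measures
z = (rng.random(200) < norm.cdf(np.log(a / 1.0) / 0.4)).astype(int)  # alpha=1, beta=0.4
mle = minimize(neg_log_lik, x0=np.zeros(2), args=(a, z), method="Nelder-Mead").x
print("alpha_hat =", np.exp(mle[0]), "beta_hat =", np.exp(mle[1]))
```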
Devising deep latent variable models for multi-modal data has been a long-standing theme in machine learning research. Multi-modal Variational Autoencoders (VAEs) are a popular class of generative models that learn latent representations which jointly explain multiple modalities. Various objective functions for such models have been suggested, often motivated as lower bounds on the multi-modal data log-likelihood or from information-theoretic considerations. To encode latent variables from different modality subsets, Product-of-Experts (PoE) or Mixture-of-Experts (MoE) aggregation schemes have been routinely used and shown to yield different trade-offs, for instance regarding generative quality or consistency across modalities. In this work, we consider a variational bound that can tightly lower-bound the data log-likelihood. We develop more flexible aggregation schemes that generalise PoE and MoE approaches by combining encoded features from different modalities based on permutation-invariant neural networks. Our numerical experiments illustrate trade-offs for multi-modal variational bounds and various aggregation schemes. We show that tighter variational bounds and more flexible aggregation models can be beneficial when one wants to approximate the true joint distribution over observed modalities and latent variables in identifiable models.
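As background for the aggregation schemes discussed, here is a minimal sketch of the standard Gaussian Product-of-Experts combination, i.e. the baseline that such work generalises, not the paper's learned permutation-invariant aggregator; numbers are illustrative.

```python
import numpy as np

def poe(mus, vars_):
    """Precision-weighted product of Gaussian experts N(mu_i, diag(var_i))."""
    mus = np.asarray(mus, float)
    precisions = 1.0 / np.asarray(vars_, float)
    var = 1.0 / precisions.sum(axis=0)            # combined variance
    mu = var * (precisions * mus).sum(axis=0)     # precision-weighted mean
    return mu, var

# Two modality encoders over a 3-dimensional latent:
mu, var = poe(mus=[[0.0, 1.0, -1.0], [0.5, 0.5, 0.0]],
              vars_=[[1.0, 0.5, 2.0], [0.2, 1.0, 1.0]])
print(mu, var)
# An MoE aggregator would instead mix the experts, e.g. sample one uniformly.
```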
This study focuses on the presence of (multi)fractal structures in confined hadronic matter, probed through the momentum distributions of mesons produced in proton-proton collisions at energies between 23 GeV and 63 GeV. The analysis demonstrates that the $q$-exponential behaviour of the particle momentum distributions is consistent with fractal characteristics, exhibiting fractal structures in confined hadronic matter with features similar to those observed in the deconfined quark-gluon plasma (QGP) regime. Furthermore, a systematic analysis of meson production in hadronic collisions at energies below 1 TeV suggests that some fractal parameters are universal, independent of confinement or deconfinement, while others may be influenced by the quark content of the produced meson. These results pave the way for further research on the implications of fractal structures for various physical distributions and offer insights into the nature of the phase transition between the confined and deconfined regimes.
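For reference, the $q$-exponential (Tsallis) form commonly fitted to such momentum spectra is $e_q(x) = [1+(1-q)x]_+^{1/(1-q)}$, which recovers the ordinary exponential as $q \to 1$. A minimal sketch with illustrative parameter values (not the paper's fits) follows; the spectrum parameterisation is one common minimal choice.

```python
import numpy as np

def q_exp(x, q):
    """Tsallis q-exponential e_q(x) = [1 + (1-q) x]_+^{1/(1-q)}."""
    if np.isclose(q, 1.0):
        return np.exp(x)                     # q -> 1 limit
    base = np.maximum(1.0 + (1.0 - q) * x, 0.0)
    return base ** (1.0 / (1.0 - q))

def spectrum(pT, q, T, A=1.0):
    """dN/dpT ~ A * pT * e_q(-pT/T): power-law tail instead of exponential."""
    return A * pT * q_exp(-pT / T, q)

pT = np.linspace(0.1, 5.0, 50)               # transverse momentum grid (GeV)
print(spectrum(pT, q=1.1, T=0.12)[:5])       # q=1.1, T=0.12 GeV, illustrative
```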
We define Schnyder-type combinatorial structures on a class of planar triangulations of the pentagon that are closely related to 5-connected triangulations. The combinatorial structures have three incarnations, defined in terms of orientations, corner-labelings, and woods, respectively. The wood incarnation consists of 5 spanning trees crossing each other in an orderly fashion. Similarly to Schnyder woods on triangulations, it induces, for each vertex, a partition of the inner triangles into face-connected regions (5 regions here). We show that the induced barycentric vertex placement, where each vertex is at the barycenter of the 5 outer vertices with weights given by the number of faces in each region, yields a planar straight-line drawing.
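The placement rule itself is elementary once the face counts are known; a minimal sketch follows, with the 5 region face counts assumed to come from the combinatorial structure (values here are illustrative).

```python
import numpy as np

def place(outer, weights):
    """Weighted barycenter of the 5 outer vertices, weights = region face counts."""
    outer = np.asarray(outer, float)        # 5 x 2 outer-pentagon coordinates
    w = np.asarray(weights, float)          # 5 nonnegative face counts
    return (w[:, None] * outer).sum(axis=0) / w.sum()

# Regular pentagon as the outer face:
pentagon = [(np.cos(2*np.pi*k/5 + np.pi/2), np.sin(2*np.pi*k/5 + np.pi/2))
            for k in range(5)]
print(place(pentagon, weights=[3, 1, 2, 2, 4]))   # one inner vertex's position
```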
Quantum channel capacity is a fundamental quantity for understanding how well quantum information can be transmitted or corrected in the presence of noise. However, it is generally not known how to compute such quantities, since the coherent information of a quantum channel is not additive for all channels, implying that it must be maximized over an unbounded number of channel uses. This leads to the phenomenon known as superadditivity: the regularized coherent information of $n$ channel uses can exceed the one-shot coherent information. In this article, we study how the gain in quantum capacity of qudit depolarizing channels relates to the dimension of the systems considered. We use an argument based on the no-cloning bound to prove that the possible superadditive effects decrease as a function of the dimension for this family of channels. In addition, we prove that the capacity of the qudit depolarizing channel coincides with its coherent information as $d\rightarrow\infty$. We also discuss the private classical capacity and obtain similar results. We conclude that for high-dimensional qudits experiencing depolarizing noise, the coherent information of the channel is not only an achievable rate but essentially the maximum possible rate for any quantum block code.
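For context, the one-shot coherent information of the qudit depolarizing channel $\rho \mapsto (1-p)\rho + p\,I/d$ evaluated on a maximally entangled input has a standard closed form (the output is an isotropic state). The sketch below computes this baseline quantity; it is our illustration, not the paper's no-cloning argument.

```python
import numpy as np

def coherent_info(d, p):
    """One-shot coherent information (in qubits per use) of the qudit
    depolarizing channel on half of a maximally entangled state."""
    lam0 = (1 - p) + p / d**2                # eigenvalue along |Phi><Phi|
    lam1 = p / d**2                          # (d^2 - 1)-fold degenerate eigenvalue
    def h(x):                                # entropy contribution -x log2 x
        return 0.0 if x <= 0 else -x * np.log2(x)
    s_ab = h(lam0) + (d**2 - 1) * h(lam1)    # joint output entropy S(AB)
    return np.log2(d) - s_ab                 # S(B) - S(AB), with S(B) = log2 d

for d in (2, 4, 16, 64):                     # rate grows toward log2(d) as d increases
    print(d, round(coherent_info(d, p=0.1), 4))
```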
We combine the unbiased estimators of Rhee and Glynn (Operations Research, 63(5), 1026-1043, 2015) with the Heston model with stochastic interest rates. Specifically, we first develop a semi-exact log-Euler scheme for the Heston model with stochastic interest rates. Then, under mild assumptions, we show that the convergence rate in the $L^2$ norm is $O(h)$, where $h$ is the step size. The result applies to a large class of models, such as the Heston-Hull-White model, the Heston-CIR model, and the Heston-Black-Karasinski model. Numerical experiments support our theoretical convergence rate.
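To recall the estimator being combined: the Rhee-Glynn coupled-sum construction is $Z = Y_0 + \sum_{n=1}^{N} (Y_n - Y_{n-1})/P(N\ge n)$ with a random truncation level $N$, where the $Y_n$ are increasingly fine discretizations built from a common Brownian path. The sketch below applies it to a toy geometric Brownian motion rather than the paper's Heston-type models, using a Milstein scheme so that an order-1 $L^2$ rate (as in the paper's $O(h)$ result) makes a geometric truncation with $1/4 < q < 1/2$ give both finite variance and finite expected cost.

```python
import numpy as np

rng = np.random.default_rng(2)

def coupled_payoffs(level_max, rng, S0=1.0, r=0.03, sig=0.2, T=1.0):
    """Milstein estimates Y_0..Y_{level_max} of S_T from one shared Brownian path."""
    n_fine = 2 ** level_max
    dW = rng.normal(0.0, np.sqrt(T / n_fine), n_fine)
    ys = []
    for lvl in range(level_max + 1):
        n = 2 ** lvl
        dWl = dW.reshape(n, -1).sum(axis=1)      # coarsen the same path
        S, h = S0, T / n
        for inc in dWl:                          # Milstein step for GBM
            S += S * (r * h + sig * inc + 0.5 * sig**2 * (inc**2 - h))
        ys.append(S)
    return ys

def rhee_glynn(rng, q=0.4):
    N = 0                                        # geometric level: P(N >= n) = q**n
    while rng.random() < q:
        N += 1
    ys = coupled_payoffs(N, rng)
    z = ys[0]
    for n in range(1, N + 1):
        z += (ys[n] - ys[n - 1]) / q ** n        # coupled-sum debiasing
    return z

est = np.mean([rhee_glynn(rng) for _ in range(20000)])
print(f"{est:.4f} vs exact {np.exp(0.03):.4f}")  # E[S_T] = S0 * exp(r*T)
```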
We present a multigrid algorithm for efficiently solving the large saddle-point systems of equations that typically arise in PDE-constrained optimization under uncertainty. The algorithm is based on a collective smoother that, at each iteration, sweeps over the nodes of the computational mesh and solves a reduced saddle-point system whose size depends on the number $N$ of samples used to discretize the probability space. We show that this reduced system can be solved with optimal $O(N)$ complexity. We test the multigrid method on three problems: a linear-quadratic problem, for which the multigrid method is used to solve the linear optimality system directly; a nonsmooth problem with box constraints and $L^1$-norm penalization of the control, in which the multigrid scheme is used within a semismooth Newton iteration; and a risk-averse problem with the smoothed CVaR risk measure, in which the multigrid method is called within a preconditioned Newton iteration. In all cases, the multigrid algorithm exhibits very good performance and robustness with respect to all parameters of interest.
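To illustrate how an $O(N)$ solve of such a reduced system can work: systems of this kind typically have an arrowhead structure, sample-wise blocks bordered by shared control unknowns, which a Schur complement on the control eliminates in linear time. Below is a minimal scalar-block sketch of this idea, illustrative rather than the paper's exact smoother.

```python
import numpy as np

def solve_arrowhead(d, e, delta, f, g):
    """Solve [[diag(d), e], [e^T, delta]] [x; u] = [f; g] in O(N) via the
    Schur complement on the single shared unknown u."""
    dinv_f = f / d
    dinv_e = e / d
    u = (g - e @ dinv_f) / (delta - e @ dinv_e)   # scalar Schur complement solve
    x = dinv_f - dinv_e * u                       # back-substitution, O(N)
    return x, u

rng = np.random.default_rng(3)
N = 1000
d = 2.0 + rng.random(N); e = rng.random(N); delta = -1.0
f = rng.random(N); g = 0.7
x, u = solve_arrowhead(d, e, delta, f, g)
# Verify against the assembled dense system:
M = np.block([[np.diag(d), e[:, None]], [e[None, :], np.array([[delta]])]])
print(np.linalg.norm(M @ np.concatenate([x, [u]]) - np.concatenate([f, [g]])))
```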
In genetic studies, haplotype data provide more refined information than data on separate genetic markers. However, large-scale studies that genotype hundreds to thousands of individuals may only provide pooled data, in which only the total allele counts of each marker in each pool are reported. Methods for inferring haplotype frequencies from pooled genetic data that scale well with pool size rely on a normal approximation, which we observe to produce unreliable inference when applied to real data. We illustrate cases where the approximation breaks down because the normal covariance matrix is near-singular. As an alternative to approximate methods, we propose exact methods to infer haplotype frequencies from pooled genetic data based on a latent multinomial model, in which the observed allele counts are integer combinations of latent, unobserved haplotype counts. One of our methods, latent count sampling via Markov bases, achieves approximately linear runtime with respect to pool size. Our exact methods produce more accurate inference than existing approximate methods on synthetic data and on data based on haplotype information from the 1000 Genomes Project. We also demonstrate how our methods can be applied to time series of pooled genetic data, as a proof of concept of their relevance to more complex hierarchical settings, such as spatiotemporal models.
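A minimal toy sketch of the latent-count idea follows, for two biallelic markers in a haploid pool, deliberately much simpler than the paper's setting. The latent haplotype counts $z = (z_{00}, z_{01}, z_{10}, z_{11})$ are sampled over the fiber $\{z \ge 0 : Az = y\}$ fixed by the pool size and the two allele counts; the single move $(+1,-1,-1,+1)$ is the Markov basis element for this design.

```python
import numpy as np
from scipy.stats import multinomial

def metropolis_fiber(z0, probs, n_iter, rng):
    """Metropolis sampling of latent haplotype counts on the fiber Az = y."""
    move = np.array([1, -1, -1, 1])          # preserves pool size and allele counts
    z = np.array(z0, int)
    n = z.sum()
    logpmf = lambda z_: multinomial.logpmf(z_, n, probs)
    for _ in range(n_iter):
        prop = z + rng.choice([-1, 1]) * move    # symmetric proposal
        if prop.min() >= 0:                      # stay in the nonnegative fiber
            if np.log(rng.random()) < logpmf(prop) - logpmf(z):
                z = prop
    return z

rng = np.random.default_rng(4)
# z0 is consistent with observed counts n=100, y1=z10+z11=70, y2=z01+z11=70:
print(metropolis_fiber([10, 20, 20, 50], probs=[0.1, 0.2, 0.2, 0.5],
                       n_iter=2000, rng=rng))
```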
We investigate a class of parametric elliptic eigenvalue problems with homogeneous essential boundary conditions whose coefficients (and hence the solution $u$) may depend on a parameter $y$. Toward the efficient approximate evaluation of parameter sensitivities of the first eigenpair over the entire parameter space, we establish and analyse Gevrey-class and analytic regularity of the solution with respect to the parameters. This is made possible by a novel proof technique, which we introduce and demonstrate in this paper. Our regularity result has immediate implications for the convergence of various numerical schemes for parametric elliptic eigenvalue problems, in particular for elliptic eigenvalue problems with infinitely many parameters arising from elliptic differential operators with random coefficients.
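A representative problem of this class, written in illustrative notation (the paper's precise setting may differ), is the affine-parametric Dirichlet eigenproblem:

```latex
% Find the first eigenpair (\lambda(y), u(\cdot,y)) for each parameter y:
\begin{aligned}
  -\nabla\cdot\bigl(a(x,y)\,\nabla u(x,y)\bigr) &= \lambda(y)\, u(x,y)
      && \text{in } D,\\
  u(x,y) &= 0 && \text{on } \partial D,
  \qquad \|u(\cdot,y)\|_{L^2(D)} = 1,
\end{aligned}
\qquad
a(x,y) = a_0(x) + \sum_{j\ge 1} y_j\,\psi_j(x),
\quad y = (y_j)_{j\ge 1} \in [-1,1]^{\mathbb{N}}.
```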
We present a framework for the multiscale modeling of finite-strain magneto-elasticity based on physics-augmented neural networks (NNs). By using a set of problem-specific invariants as inputs and an energy functional as the output, and by adding several non-trainable expressions to the overall total energy density functional, the model fulfills multiple physical principles by construction, e.g., thermodynamic consistency and material symmetry. Three NN-based models with varying requirements on an extended polyconvexity condition for the magneto-elastic potential are presented. First, polyconvexity, which is a global concept, is enforced via input convex neural networks (ICNNs). We then formulate a relaxed, local version of polyconvexity and fulfill it in a weak sense by adding a tailored loss term. As an alternative, we propose a loss term that locally enforces the weaker requirement of strong ellipticity, which can be favorable for achieving a better trade-off between compatibility with data and physical constraints. Databases for training the models are generated via computational homogenization for both compressible and quasi-incompressible magneto-active polymers (MAPs). To reduce the computational cost, 2D statistical volume elements and an invariant-based sampling technique for the pre-selection of relevant states are used. All models are calibrated on the database, with interpolation and extrapolation considered separately. Furthermore, the performance of the NN models is compared to a conventional model from the literature. The numerical study suggests that the proposed physics-augmented NN approach is advantageous over the conventional model for MAPs. The two more flexible NN models, in combination with the weakly enforced local polyconvexity, lead to good results, whereas the model based only on ICNNs proves too restrictive.
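For the first model variant, a minimal sketch of the generic input convex neural network construction follows (random weights for illustration, not the trained magneto-elastic potential): convexity in the input is guaranteed because the weights on the hidden $z$-path are kept nonnegative and the activation is convex and nondecreasing.

```python
import numpy as np

def softplus(x):
    """Convex, nondecreasing activation (numerically stable log(1 + e^x))."""
    return np.logaddexp(0.0, x)

def icnn(x, Wx, Wz, b):
    """Scalar potential convex in x: z_{k+1} = act(|Wz_k| z_k + Wx_k x + b_k)."""
    z = softplus(Wx[0] @ x + b[0])                    # first layer: no z-path
    for k in range(1, len(Wx)):
        z = softplus(np.abs(Wz[k - 1]) @ z + Wx[k] @ x + b[k])  # Wz >= 0 via abs
    return z.sum()                                    # scalar energy output

rng = np.random.default_rng(5)
dims = [6, 16, 16, 1]                                 # e.g. 6 invariants in, scalar out
Wx = [rng.normal(size=(dims[k + 1], dims[0])) for k in range(3)]
Wz = [rng.normal(size=(dims[k + 1], dims[k])) for k in range(1, 3)]
b = [rng.normal(size=dims[k + 1]) for k in range(3)]
print(icnn(rng.normal(size=6), Wx, Wz, b))
```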