We study and develop multilevel methods for the numerical approximation of a log-concave probability distribution $\pi$ on $\mathbb{R}^d$, based on the (over-damped) Langevin diffusion. Continuing \cite{art:egeapanloup2021multilevel}, which focused on the uniformly log-concave setting, we study the procedure here in the absence of the uniformity assumption. More precisely, we first adapt an idea of \cite{art:DalalyanRiouKaragulyan} by adding a penalization term to the potential to recover the uniformly convex setting. This approach leads to an \textit{$\varepsilon$-complexity} of the order $\varepsilon^{-5} \pi(|.|^2)^{3} d$ (up to logarithmic terms). Then, in the spirit of \cite{art:gadat2020cost}, we explore the robustness of the method in a weakly convex parametric setting where the lowest eigenvalue of the Hessian of the potential $U$ is controlled by the function $U(x)^{-r}$ for $r \in (0,1)$. In this intermediate framework between the strongly convex setting ($r=0$) and the ``Laplace case'' ($r=1$), we show that, with the help of a control of the exponential moments of the Euler scheme, we can adapt some properties that are fundamental to the efficiency of the method. In the ``best'' setting, where $U$ is ${\mathcal{C}}^3$ and $U(x)^{-r}$ controls the largest eigenvalue of the Hessian, we obtain an $\varepsilon$-complexity of the order $c_{\rho,\delta}\varepsilon^{-2-\rho} d^{1+\frac{\rho}{2}+(4-\rho+\delta) r}$ for any $\rho>0$ (but with a constant $c_{\rho,\delta}$ that increases as $\rho$ and $\delta$ go to $0$).
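To fix ideas, the following is a minimal sketch (not the tuned procedure analyzed in the paper) of the Euler scheme for the penalized potential $U_\alpha(x)=U(x)+\tfrac{\alpha}{2}|x|^2$, which is $\alpha$-strongly convex whenever $U$ is convex; the potential gradient, step size and penalization strength below are illustrative placeholders.
\begin{verbatim}
import numpy as np

def penalized_ula(grad_U, x0, gamma, alpha, n_steps, rng=None):
    """Euler (unadjusted Langevin) scheme for the penalized potential
    U_alpha(x) = U(x) + (alpha/2)|x|^2."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x0, dtype=float).copy()
    path = [x.copy()]
    for _ in range(n_steps):
        drift = grad_U(x) + alpha * x          # gradient of the penalized potential
        noise = rng.standard_normal(x.shape)
        x = x - gamma * drift + np.sqrt(2.0 * gamma) * noise
        path.append(x.copy())
    return np.array(path)

# Illustrative use: a convex but not strongly convex potential U(x) = |x|^3 / 3.
grad_U = lambda x: np.linalg.norm(x) * x
samples = penalized_ula(grad_U, x0=np.zeros(5), gamma=1e-3, alpha=1e-2, n_steps=10_000)
\end{verbatim}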
This paper presents a clustering technique that reduces susceptibility to data noise by learning and clustering the data distributions and then assigning each data point to the cluster of its distribution; in the process, the impact of noise on the clustering results is reduced. The method introduces a new distance between distributions, the expectation distance (denoted ED), that goes beyond the state-of-the-art distribution distance of optimal mass transport (denoted $W_2$ for $2$-Wasserstein): the latter essentially depends only on the marginal distributions, while the former also employs information about the joint distribution. Using the ED, the paper extends classical $K$-means and $K$-medoids clustering from raw data to data distributions, and introduces $K$-medoids clustering using $W_2$. The paper also presents closed-form expressions for the $W_2$ and ED distance measures. Results of applying the proposed ED and $W_2$ distance measures to cluster real-world weather data as well as stock data are presented; this involves efficiently extracting and using the underlying data distributions -- Gaussians for the weather data and lognormals for the stock data. The results show a striking performance improvement over classical clustering of raw data, with higher accuracy realized for ED. Moreover, distribution-based clustering not only offers higher accuracy but also lowers the computation time, owing to reduced time complexity.
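Only the standard Gaussian closed form for $W_2$ is sketched below (the ED closed form is specific to the paper and not reproduced here); such a pairwise distance could serve, for instance, as the dissimilarity inside a $K$-medoids over fitted Gaussians.
\begin{verbatim}
import numpy as np
from scipy.linalg import sqrtm

def w2_gaussian_sq(m1, S1, m2, S2):
    """Squared 2-Wasserstein distance between N(m1, S1) and N(m2, S2)."""
    S2_half = sqrtm(S2)
    cross = np.real(sqrtm(S2_half @ S1 @ S2_half))
    bures = np.trace(S1 + S2 - 2.0 * cross)
    return float(np.sum((np.asarray(m1) - np.asarray(m2)) ** 2) + bures)

# Illustrative use with two 2-D Gaussians fitted to (hypothetical) data clusters.
m1, S1 = np.zeros(2), np.eye(2)
m2, S2 = np.array([1.0, 0.0]), np.diag([2.0, 0.5])
print(w2_gaussian_sq(m1, S1, m2, S2))
\end{verbatim}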
We investigate the problem of bandits with expert advice when the experts are fixed and known distributions over the actions. Improving on previous analyses, we show that the regret in this setting is controlled by information-theoretic quantities that measure the similarity between experts. In some natural special cases, this allows us to obtain the first regret bound for EXP4 that can get arbitrarily close to zero if the experts are similar enough. For a different algorithm, we provide another bound that describes the similarity between the experts in terms of the KL-divergence, and we show that this bound can be smaller than that of EXP4 in some cases. Additionally, we provide lower bounds for certain classes of experts, showing that the algorithms we analyzed are nearly optimal in some cases.
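As a point of reference, here is a minimal sketch of EXP4 in this setting, where each expert is a fixed, known distribution over the $K$ actions; the learning rate and loss model are illustrative, and the information-theoretic refinements of the analysis are not reflected in the code.
\begin{verbatim}
import numpy as np

def exp4_fixed_experts(experts, loss_fn, T, eta, rng=None):
    """EXP4 with N fixed experts, each a distribution over K actions.

    experts: (N, K) array, each row a probability vector over actions.
    loss_fn: callable round -> length-K array of losses in [0, 1]
             (only the chosen action's loss is revealed to the learner).
    """
    rng = np.random.default_rng(rng)
    N, K = experts.shape
    q = np.full(N, 1.0 / N)                      # weights over experts
    for t in range(T):
        p = q @ experts                          # induced distribution over actions
        a = rng.choice(K, p=p)
        loss = loss_fn(t)[a]                     # bandit feedback for the chosen action
        loss_hat = np.zeros(K)
        loss_hat[a] = loss / p[a]                # importance-weighted loss estimate
        expert_loss = experts @ loss_hat         # estimated loss of each expert
        q = q * np.exp(-eta * expert_loss)
        q /= q.sum()
    return q
\end{verbatim}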
In this paper, we compare three model-based risk measures by evaluating their strengths and weaknesses qualitatively and testing them quantitatively on a set of real longitudinal and intersection scenarios. We start with the traditional heuristic Time-To-Collision (TTC), which we extend to 2D operation and non-crash cases to obtain the Time-To-Closest-Encounter (TTCE). The second risk measure models position uncertainty with a Gaussian distribution and uses spatial occupancy probabilities for collision risks. We then derive a novel risk measure based on the statistics of sparse critical events and so-called survival conditions. The resulting survival analysis is shown to detect crashes earlier and to produce fewer false positive detections in near-crash and non-crash cases, supported by its solid theoretical grounding. It can be seen as a generalization of the TTCE and the Gaussian method, and it is suitable for the validation of ADAS and AD.
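A minimal sketch of the TTCE under a constant-velocity, point-object assumption follows (the survival-based measure itself is developed in the paper and not reproduced here); all numerical values are illustrative.
\begin{verbatim}
import numpy as np

def ttce(p_ego, v_ego, p_obj, v_obj):
    """Time-To-Closest-Encounter and the corresponding closest distance,
    assuming constant velocities and point-like objects in 2D."""
    dp = np.asarray(p_obj, float) - np.asarray(p_ego, float)   # relative position
    dv = np.asarray(v_obj, float) - np.asarray(v_ego, float)   # relative velocity
    dv2 = float(dv @ dv)
    if dv2 < 1e-12:
        # no relative motion: distance stays constant (convention: encounter is now)
        return 0.0, float(np.linalg.norm(dp))
    t_star = max(0.0, -float(dp @ dv) / dv2)     # closest encounter, not in the past
    dce = float(np.linalg.norm(dp + t_star * dv))
    return t_star, dce

# Example: object approaching laterally at an intersection.
print(ttce(p_ego=[0, 0], v_ego=[10, 0], p_obj=[50, -20], v_obj=[0, 5]))
\end{verbatim}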
Bayes factors for composite hypotheses have difficulty in encoding vague prior knowledge, as improper priors cannot be used and objective priors may be subjectively unreasonable. To address these issues we revisit the posterior Bayes factor, in which the posterior distribution from the data at hand is re-used in the Bayes factor for the same data. We argue that this is biased when calibrated against proper Bayes factors, but propose adjustments to allow interpretation on the same scale. In the important case of a regular normal model, the bias in log scale is half the number of parameters. The resulting empirical Bayes factor is closely related to the widely applicable information criterion. We develop test-based empirical Bayes factors for several standard tests and propose an extension to multiple testing closely related to the optimal discovery procedure. For non-parametric tests the empirical Bayes factor is approximately 10 times the P-value. We propose interpreting the strength of Bayes factors on a logarithmic scale with base 3.73, reflecting the sharpest distinction between weaker and stronger belief. This provides an objective framework for interpreting statistical evidence, realising a Bayesian/frequentist compromise.
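To make the proposed reporting scale concrete (as a hedged illustration of the abstract's statement, not the paper's formal definition), the strength of evidence would be read off as the base-3.73 logarithm of the Bayes factor:
\[
k \;=\; \log_{3.73} \mathrm{BF},
\qquad
\mathrm{BF}\approx 3.73 \;\Leftrightarrow\; k=1,
\qquad
\mathrm{BF}\approx 3.73^{2}\approx 13.9 \;\Leftrightarrow\; k=2 .
\]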
This work sheds some light on the relationship between a distribution's standard deviation and its range, a topic that has been discussed extensively in the literature. While many previous studies have proposed inequalities or relationships that depend on the shape of the population distribution, the approach here is built on a family of bounded probability distributions based on skewing functions. We offer closed-form expressions for the moments of this family and describe its asymptotic behavior as the support's semi-range tends to zero and to infinity. We also establish an inequality of which the well-known Popoviciu inequality is a special case. Finally, we provide an example using US dollar prices in four different currencies traded on foreign exchange markets to illustrate the results developed here.
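For reference, the special case referred to above is Popoviciu's inequality on variances: for any random variable $X$ supported on a bounded interval $[a,b]$,
\[
\operatorname{Var}(X)\;\le\;\frac{(b-a)^{2}}{4},
\qquad\text{equivalently}\qquad
\sigma\;\le\;\frac{b-a}{2},
\]
i.e., the standard deviation never exceeds the semi-range.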
Geometric high-order regularization methods, such as mean curvature and Gaussian curvature regularization, have been studied intensively during the last decades owing to their ability to preserve geometric properties, including image edges, corners, and contrast. However, the trade-off between restoration quality and computational efficiency remains an essential roadblock for high-order methods. In this paper, we propose fast multi-grid algorithms for minimizing both mean curvature and Gaussian curvature energy functionals without sacrificing accuracy for efficiency. Unlike existing approaches based on operator splitting and the augmented Lagrangian method (ALM), no artificial parameters are introduced in our formulation, which guarantees the robustness of the proposed algorithm. Meanwhile, we adopt a domain decomposition method to promote parallel computing and use the fine-to-coarse structure to accelerate convergence. Numerical experiments are presented on image denoising, CT, and MRI reconstruction problems to demonstrate the superiority of our method in preserving geometric structures and fine details. The proposed method is also shown to be effective for large-scale image processing problems, recovering an image of size $1024\times 1024$ within $40$s, while the ALM method requires around $200$s.
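To fix the fine-to-coarse idea on a much simpler surrogate (a 1-D quadratic, Tikhonov-type denoising problem rather than the curvature functionals above), here is a minimal two-grid cycle with weighted-Jacobi smoothing and a Galerkin coarse operator; the smoother, transfer operators, and parameters are illustrative.
\begin{verbatim}
import numpy as np

def restriction(n):
    """Full-weighting restriction from n fine interior points (n odd) to (n-1)//2."""
    m = (n - 1) // 2
    R = np.zeros((m, n))
    for j in range(m):
        R[j, 2*j:2*j+3] = [0.25, 0.5, 0.25]
    return R

def two_grid(A_h, f, u, sweeps=3, omega=2.0/3.0):
    """One two-grid cycle (pre-smooth, Galerkin coarse correction, post-smooth)
    for a symmetric positive-definite system A_h u = f."""
    D = np.diag(A_h)
    for _ in range(sweeps):                        # pre-smoothing (weighted Jacobi)
        u = u + omega * (f - A_h @ u) / D
    R = restriction(len(f))
    P = 2.0 * R.T                                  # linear-interpolation prolongation
    A_H = R @ A_h @ P                              # Galerkin coarse-grid operator
    e_H = np.linalg.solve(A_H, R @ (f - A_h @ u))  # exact coarse solve of residual eq.
    u = u + P @ e_H
    for _ in range(sweeps):                        # post-smoothing
        u = u + omega * (f - A_h @ u) / D
    return u

# Illustrative use: quadratic 1-D denoising, A = I + lam * (Dirichlet Laplacian).
n, lam = 63, 10.0
L = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
A = np.eye(n) + lam * L
f = np.sin(np.linspace(0, np.pi, n)) + 0.1 * np.random.default_rng(0).standard_normal(n)
u = np.zeros(n)
for _ in range(10):
    u = two_grid(A, f, u)
\end{verbatim}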
Forward simulation-based uncertainty quantification that studies the output distribution of quantities of interest (QoI) is a crucial component for computationally robust statistics and engineering. There is a large body of literature devoted to accurately assessing statistics of QoI, and in particular, multilevel or multifidelity approaches are known to be effective, leveraging cost-accuracy tradeoffs between a given ensemble of models. However, effective algorithms that can estimate the full distribution of outputs are still under active development. In this paper, we introduce a general multifidelity framework for estimating the cumulative distribution functions (CDFs) of vector-valued QoI associated with a high-fidelity model under a budget constraint. Given a family of appropriate control variates obtained from lower fidelity surrogates, our framework involves identifying the most cost-effective model subset and then using it to build an approximate control variates estimator for the target CDF. We instantiate the framework by constructing a family of control variates using intermediate linear approximators and rigorously analyze the corresponding algorithm. Our analysis reveals that the resulting CDF estimator is uniformly consistent and budget-asymptotically optimal, with only mild moment and regularity assumptions. The approach provides a robust multifidelity CDF estimator that is adaptive to the available budget, does not require \textit{a priori} knowledge of cross-model statistics or model hierarchy, and is applicable to general output dimensions. We demonstrate the efficiency and robustness of the approach using several test examples.
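As a minimal sketch of the underlying control-variate idea for CDF estimation, consider one cheap low-fidelity surrogate with a large unpaired sample; the model-subset selection and the linear-approximator construction of the paper are not reproduced, and all names are illustrative.
\begin{verbatim}
import numpy as np

def cv_cdf_estimate(q_hf, q_lf_paired, q_lf_extra, t_grid):
    """Control-variate estimate of F(t) = P(Q_HF <= t).

    q_hf, q_lf_paired: paired high-/low-fidelity outputs (small, expensive sample).
    q_lf_extra:        large, cheap low-fidelity-only sample.
    """
    F = np.empty(len(t_grid))
    for k, t in enumerate(t_grid):
        ind_hf = (q_hf <= t).astype(float)
        ind_lf = (q_lf_paired <= t).astype(float)
        var_lf = ind_lf.var()
        cov = np.mean((ind_hf - ind_hf.mean()) * (ind_lf - ind_lf.mean()))
        alpha = cov / var_lf if var_lf > 0 else 0.0
        # plain Monte Carlo term, corrected by the low-fidelity control variate
        F[k] = ind_hf.mean() + alpha * ((q_lf_extra <= t).mean() - ind_lf.mean())
    return np.clip(F, 0.0, 1.0)
\end{verbatim}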
Rather than refining individual candidate solutions for a general non-convex optimization problem, by analogy to evolution, we consider minimizing the average loss for a parametric distribution over hypotheses. In this setting, we prove that Fisher-Rao natural gradient descent (FR-NGD) optimally approximates the continuous-time replicator equation (an essential model of evolutionary dynamics) by minimizing the mean-squared error for the relative fitness of competing hypotheses. We term this finding "conjugate natural selection" and demonstrate its utility by numerically solving an example non-convex optimization problem over a continuous strategy space. Next, by developing known connections between discrete-time replicator dynamics and Bayes's rule, we show that when absolute fitness corresponds to the negative KL-divergence of a hypothesis's predictions from actual observations, FR-NGD provides the optimal approximation of continuous Bayesian inference. We use this result to demonstrate a novel method for estimating the parameters of stochastic processes.
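As a minimal sketch of the finite (categorical) analogue, the following is a discrete-time multiplicative, replicator-style update over a finite hypothesis set, of the kind the paper relates to FR-NGD and Bayes's rule; the losses and step size are illustrative.
\begin{verbatim}
import numpy as np

def replicator_step(p, losses, eta):
    """Multiplicative-weights / discrete-time replicator update on the simplex:
    hypotheses with below-average loss gain mass, others lose it."""
    w = p * np.exp(-eta * np.asarray(losses))
    return w / w.sum()

# Illustrative use: a population over 5 hypotheses and a fixed loss vector.
p = np.full(5, 0.2)
losses = np.array([0.9, 0.4, 0.1, 0.7, 0.3])
for _ in range(100):
    p = replicator_step(p, losses, eta=0.1)
print(p)   # mass concentrates on the lowest-loss hypothesis
\end{verbatim}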
We introduce weak barycenters of a family of probability distributions, based on the recently developed notion of optimal weak transport of mass by Gozlan et al. (2017) and Backhoff-Veraguas et al. (2020). We provide a theoretical analysis of this object and discuss its interpretation in the light of convex ordering between probability measures. In particular, we show that, rather than averaging the input distributions in a geometric way (as the Wasserstein barycenter based on classical optimal transport does), weak barycenters extract common geometric information shared by all the input distributions, encoded as a latent random variable that underlies all of them. We also provide an iterative algorithm to compute a weak barycenter for a finite family of input distributions, and a stochastic algorithm that computes them for arbitrary populations of laws. The latter approach is particularly well suited for the streaming setting, i.e., when distributions are observed sequentially. The notion of weak barycenter and our approaches to compute it are illustrated on synthetic examples, validated on 2D real-world data and compared to standard Wasserstein barycenters.
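For contrast only, here is a minimal sketch of the classical object mentioned above: the 1-D Wasserstein barycenter obtained by averaging quantile functions. This is the geometric averaging that weak barycenters deliberately depart from, not the paper's weak barycenter algorithm; sample sizes and weights are illustrative.
\begin{verbatim}
import numpy as np

def wasserstein_barycenter_1d(samples_list, weights=None, n_quantiles=200):
    """Classical 1-D Wasserstein-2 barycenter of empirical measures,
    returned as its quantile function evaluated on a common grid of levels."""
    k = len(samples_list)
    weights = np.full(k, 1.0 / k) if weights is None else np.asarray(weights)
    levels = (np.arange(n_quantiles) + 0.5) / n_quantiles
    quantiles = np.stack([np.quantile(s, levels) for s in samples_list])
    return weights @ quantiles        # weighted average of quantile functions

# Example: barycenter of two shifted Gaussian samples sits halfway between them.
rng = np.random.default_rng(0)
bary = wasserstein_barycenter_1d([rng.normal(-2, 1, 1000), rng.normal(2, 1, 1000)])
\end{verbatim}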
This study develops a new statistical model and method for analyzing the precision of binary measurement methods in collaborative studies. The model is based on beta-binomial distributions: it assumes that the sensitivity of each laboratory follows a beta distribution and that the binary measured values, given the sensitivity, follow a binomial distribution. We propose the key precision measures of repeatability and reproducibility for the model and provide their unbiased estimators. Further, through consideration of a number of statistical tests for homogeneity of proportions, we propose appropriate methods for assessing laboratory effects in the new model. Finally, we apply the results to real-world examples in the fields of food safety and chemical risk assessment and management.
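A minimal generative sketch of the model just described follows, together with population moment summaries (these are not the paper's unbiased repeatability and reproducibility estimators; all parameter values are illustrative).
\begin{verbatim}
import numpy as np

def simulate_collaborative_study(a, b, n_labs, n_replicates, rng=None):
    """Beta-binomial model: each laboratory draws a sensitivity p_i ~ Beta(a, b),
    then reports x_i ~ Binomial(n_replicates, p_i) positive results."""
    rng = np.random.default_rng(rng)
    p = rng.beta(a, b, size=n_labs)          # lab-specific sensitivities
    x = rng.binomial(n_replicates, p)        # binary results per laboratory
    return p, x

# Population moment summaries of the model (illustrative parameter values):
a, b = 8.0, 2.0
mean_sens = a / (a + b)                                         # overall sensitivity
between_lab_var = a * b / ((a + b) ** 2 * (a + b + 1))          # variance of p across labs
within_lab_var = mean_sens * (1 - mean_sens) - between_lab_var  # E[p(1 - p)]
\end{verbatim}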