
We detail an approach to develop Stein's method for bounding integral metrics on probability measures defined on a Riemannian manifold $\mathbf M$. Our approach exploits the relationship between the generator of a diffusion on $\mathbf M$ with target invariant measure and its characterising Stein operator. We consider a pair of such diffusions with different starting points, and through analysis of the distance process between the pair, derive Stein factors, which bound the solution to the Stein equation and its derivatives. The Stein factors contain curvature-dependent terms and reduce to those currently available for $\mathbb R^m$; moreover, they imply that the bounds for $\mathbb R^m$ remain valid when $\mathbf M$ is a flat manifold.
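To spell out the mechanism the abstract alludes to: if $\mathcal A$ is the generator of a diffusion on $\mathbf M$ with invariant measure $\mu$, the associated Stein equation for a test function $h$ is
$$\mathcal A f_h = h - \int_{\mathbf M} h \, \mathrm{d}\mu,$$
so that for any probability measure $\nu$ on $\mathbf M$,
$$\Big| \int_{\mathbf M} h \, \mathrm{d}\nu - \int_{\mathbf M} h \, \mathrm{d}\mu \Big| = \Big| \int_{\mathbf M} \mathcal A f_h \, \mathrm{d}\nu \Big|.$$
Taking suprema over a class of test functions $h$ yields an integral metric, and the right-hand side is controlled by bounds on $f_h$ and its derivatives; those bounds are exactly the Stein factors the paper derives.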

Related content

This paper introduces a numerical approach to singularly perturbed convection-diffusion boundary value problems for second-order ordinary differential equations featuring a small positive parameter $\epsilon$ multiplying the highest derivative. We specifically examine Dirichlet boundary conditions. We propose an upwind finite difference method and incorporate the Shishkin mesh scheme to capture the solution near boundary layers. Our solver is both direct and of high accuracy, with computation time that scales linearly with the number of grid points. MATLAB code of the numerical recipe is made publicly available. We present numerical results, in tables and graphs, that validate the theoretical results and demonstrate that the proposed method offers a highly accurate approximation of the exact solution.
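The paper's reference implementation is in MATLAB; as a rough illustration of the ingredients named above (a piecewise-uniform Shishkin mesh plus upwind differencing), here is a minimal Python sketch for $-\epsilon u'' + b(x)u' = f(x)$ with Dirichlet conditions and a layer at $x=1$. The transition-point constant and the dense solve are simplifications; the paper's linear-in-$N$ cost corresponds to solving the same tridiagonal system with the Thomas algorithm.

```python
import numpy as np

def shishkin_mesh(N, eps, beta=1.0):
    # Piecewise-uniform Shishkin mesh on [0,1] for a layer at x = 1:
    # transition point tau = min(1/2, 2*eps/beta * ln N).
    tau = min(0.5, 2.0 * eps / beta * np.log(N))
    return np.concatenate([np.linspace(0.0, 1.0 - tau, N // 2 + 1),
                           np.linspace(1.0 - tau, 1.0, N // 2 + 1)[1:]])

def upwind_solve(N, eps, b, f, u0=0.0, u1=0.0):
    # Upwind finite differences for -eps*u'' + b(x)*u' = f(x) on the
    # Shishkin mesh, u(0)=u0, u(1)=u1, assuming b(x) >= beta > 0.
    x = shishkin_mesh(N, eps)
    h = np.diff(x)
    A = np.zeros((N + 1, N + 1))
    rhs = np.zeros(N + 1)
    A[0, 0] = A[N, N] = 1.0
    rhs[0], rhs[N] = u0, u1
    for i in range(1, N):
        hm, hp, hbar = h[i - 1], h[i], 0.5 * (h[i - 1] + h[i])
        A[i, i - 1] = -eps / (hm * hbar) - b(x[i]) / hm  # backward (upwind) difference
        A[i, i] = eps / (hm * hbar) + eps / (hp * hbar) + b(x[i]) / hm
        A[i, i + 1] = -eps / (hp * hbar)
        rhs[i] = f(x[i])
    # Dense solve for brevity; a tridiagonal (Thomas) solve of the same
    # system gives the linear scaling claimed in the abstract.
    return x, np.linalg.solve(A, rhs)

x, u = upwind_solve(64, 1e-3, b=lambda t: 1.0, f=lambda t: 1.0)
```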

Bayesian neural networks often approximate the weight posterior with a Gaussian distribution. However, practical posteriors are often, even locally, highly non-Gaussian, and empirical performance deteriorates as a result. We propose a simple parametric approximate posterior that adapts to the shape of the true posterior through a Riemannian metric determined by the log-posterior gradient. We develop a Riemannian Laplace approximation whose samples naturally fall into weight regions with low negative log-posterior. We show that these samples can be drawn by solving a system of ordinary differential equations, which can be done efficiently by leveraging the structure of the Riemannian metric and automatic differentiation. Empirically, we demonstrate that our approach consistently improves over the conventional Laplace approximation across tasks. We further show that, unlike the conventional Laplace approximation, our method is not overly sensitive to the choice of prior, which alleviates a practical pitfall of current approaches.
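As a point of reference, the conventional Laplace approximation the paper improves on can be sketched in a few lines. The Riemannian variant described above would additionally transport each Gaussian draw along an ODE determined by the gradient-dependent metric; that step is model-specific and omitted here.

```python
import numpy as np

def laplace_samples(theta_map, hessian, n_samples, rng):
    # Conventional Laplace approximation: q = N(theta_MAP, H^{-1}), with H
    # the Hessian of the negative log-posterior at the MAP.  The paper's
    # Riemannian method would follow each draw along a metric-dependent ODE
    # instead of using the straight-line (Euclidean) sample directly.
    L = np.linalg.cholesky(np.linalg.inv(hessian))
    v = rng.standard_normal((n_samples, theta_map.size))
    return theta_map + v @ L.T   # covariance of v @ L.T is L L^T = H^{-1}

rng = np.random.default_rng(0)
samples = laplace_samples(np.zeros(4), 10.0 * np.eye(4), 100, rng)
```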

One-shot channel simulation is a fundamental data compression problem concerned with encoding a single sample from a target distribution $Q$ using a coding distribution $P$, with as few bits as possible on average. Algorithms that solve this problem find applications in neural data compression and differential privacy, and can serve as a more efficient alternative to quantization-based methods. Sadly, existing solutions are too slow or have limited applicability, preventing widespread adoption. In this paper, we conclusively solve one-shot channel simulation for one-dimensional problems where the target-proposal density ratio is unimodal by describing an algorithm with optimal runtime. We achieve this by constructing a rejection sampling procedure equivalent to greedily searching over the points of a Poisson process. Hence, we call our algorithm greedy Poisson rejection sampling (GPRS) and analyze the correctness and time complexity of several of its variants. Finally, we empirically verify our theorems, demonstrating that GPRS significantly outperforms the current state-of-the-art method, A* coding.
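For orientation, the classical rejection sampler that GPRS refines looks as follows. This sketch uses a fixed density-ratio bound $M$ (a hypothetical, loosely chosen constant); GPRS instead replaces the fixed bound by a greedy search over the points of a Poisson process, which is what yields optimal expected runtime for unimodal ratios and doubles as a channel-simulation code under common randomness.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def rejection_sample(r, sample_p, M):
    # Textbook rejection sampling with density ratio r = dQ/dP <= M:
    # accept X ~ P with probability r(X)/M.
    steps = 0
    while True:
        steps += 1
        x = sample_p()
        if rng.uniform() * M < r(x):
            return x, steps

# Toy example with a unimodal ratio: Q = N(0.5, 0.5^2), proposal P = N(0, 1).
r = lambda x: norm.pdf(x, 0.5, 0.5) / norm.pdf(x)
M = 2.5   # any valid upper bound on r (here max r = 2*exp(1/6) < 2.5)
x, steps = rejection_sample(r, lambda: rng.standard_normal(), M)
```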

Simple stochastic momentum methods are widely used in machine learning optimization, but their good practical performance is at odds with an absence of theoretical guarantees of acceleration in the literature. In this work, we aim to close the gap between theory and practice by showing that stochastic heavy ball momentum retains the fast linear rate of (deterministic) heavy ball momentum on quadratic optimization problems, at least when minibatching with a sufficiently large batch size. The algorithm we study can be interpreted as an accelerated randomized Kaczmarz algorithm with minibatching and heavy ball momentum. The analysis relies on carefully decomposing the momentum transition matrix, and using new spectral norm concentration bounds for products of independent random matrices. We provide numerical illustrations demonstrating that our bounds are reasonably sharp.
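A minimal sketch of the algorithm as described, minibatch randomized Kaczmarz-style steps combined with heavy ball momentum, might look as follows. The step size and momentum values below are hypothetical tuning constants; the paper derives admissible ranges from the spectrum of the momentum transition matrix.

```python
import numpy as np

def minibatch_kaczmarz_hb(A, b, x0, batch, eta, beta, iters, rng):
    # Heavy ball momentum on minibatched least-squares residuals:
    #   x_{k+1} = x_k - eta * A_S^T (A_S x_k - b_S)/|S| + beta * (x_k - x_{k-1})
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(iters):
        S = rng.choice(A.shape[0], size=batch, replace=False)
        g = A[S].T @ (A[S] @ x - b[S]) / batch
        x, x_prev = x - eta * g + beta * (x - x_prev), x
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 50))
x_star = rng.standard_normal(50)
x = minibatch_kaczmarz_hb(A, A @ x_star, np.zeros(50),
                          batch=32, eta=0.2, beta=0.3, iters=3000, rng=rng)
```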

In this paper, we study the well-known "Heavy Ball" (HB) method for convex and nonconvex optimization introduced by Polyak in 1964, and establish its convergence under a variety of situations. Traditionally, most algorithms use "full-coordinate update," that is, at each step, every component of the argument is updated. However, when the dimension of the argument is very high, it is more efficient to update some but not all components of the argument at each iteration. We refer to this as "batch updating" in this paper. When gradient-based algorithms are used together with batch updating, in principle it is sufficient to compute only those components of the gradient for which the argument is to be updated. However, if a method such as backpropagation is used to compute these components, computing only some components of the gradient does not offer much savings over computing the entire gradient. Therefore, to achieve a noticeable reduction in CPU usage at each step, one can use first-order differences to approximate the gradient. The resulting estimates are biased and also have unbounded variance. Thus some delicate analysis is required to ensure that the HB algorithm converges when batch updating is used instead of full-coordinate updating, and/or approximate gradients are used instead of true gradients. In this paper, we establish the almost sure convergence of the iterations to the stationary point(s) of the objective function under suitable conditions; in addition, we also derive upper bounds on the rate of convergence. To the best of our knowledge, there is no other paper that combines all of these features. This paper is dedicated to the memory of Boris Teodorovich Polyak.
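A minimal sketch of one HB step with batch updating and forward-difference gradient approximations, under the setup described above; the step size, momentum coefficient, and difference increment are hypothetical choices:

```python
import numpy as np

def hb_batch_step(f, x, v, idx, lr, beta, h):
    # One Heavy Ball step with batch updating: only the coordinates in idx
    # move, and their partial derivatives are approximated by (biased)
    # forward differences with increment h.
    fx = f(x)
    g = np.empty(len(idx))
    for k, i in enumerate(idx):
        e = np.zeros_like(x)
        e[i] = h
        g[k] = (f(x + e) - fx) / h
    v[idx] = beta * v[idx] - lr * g     # momentum on the chosen batch only
    x[idx] = x[idx] + v[idx]
    return x, v

rng = np.random.default_rng(0)
f = lambda x: np.sum((x - 1.0) ** 2)    # toy smooth objective
x, v = np.zeros(100), np.zeros(100)
for _ in range(500):
    idx = rng.choice(100, size=10, replace=False)
    x, v = hb_batch_step(f, x, v, idx, lr=0.1, beta=0.5, h=1e-4)
```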

We present a new approach to semiparametric inference using corrected posterior distributions. The method allows us to leverage the adaptivity, regularization and predictive power of nonparametric Bayesian procedures to estimate low-dimensional functionals of interest without being restricted by the holistic Bayesian formalism. Starting from a conventional nonparametric posterior, we target the functional of interest by transforming the entire distribution with a Bayesian bootstrap correction. We provide conditions for the resulting \textit{one-step posterior} to possess calibrated frequentist properties and specialize the results for several canonical examples: the integrated squared density, the mean of a missing-at-random outcome, and the average causal treatment effect on the treated. The procedure is computationally attractive, requiring only a simple, efficient post-processing step that can be attached to any posterior sampling algorithm. Using the ACIC 2016 causal data analysis competition, we illustrate that our approach can outperform the existing state-of-the-art through the propagation of Bayesian uncertainty.
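As a schematic of the post-processing step for one of the canonical examples, the mean of a missing-at-random outcome, assuming posterior draws of the outcome regression and missingness propensity are already available from some nonparametric posterior sampler:

```python
import numpy as np

def one_step_posterior(Y, R, m_draws, pi_draws, rng):
    # Schematic one-step posterior for psi = E[Y] when Y is missing at random.
    # m_draws[b] and pi_draws[b] hold the b-th posterior draw of the outcome
    # regression m(X_i) and the propensity pi(X_i) at the observed covariates
    # (hypothetical inputs).  Each draw is corrected via its efficient
    # influence function and reweighted with Bayesian bootstrap (flat
    # Dirichlet) weights.
    n = len(Y)
    Y_obs = np.where(R == 1, Y, 0.0)     # missing outcomes never enter
    psi = []
    for m, pi in zip(m_draws, pi_draws):
        w = rng.dirichlet(np.ones(n))    # Bayesian bootstrap weights
        psi.append(np.sum(w * (m + R / pi * (Y_obs - m))))
    return np.array(psi)
```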

In recent years, there has been a surge of interest in the development of probabilistic approaches to problems that might appear to be purely deterministic. One example is the numerical solution of partial differential equations. Since numerical solvers require some approximation of the infinite-dimensional solution space, there is an inherent uncertainty in the solution that is obtained. In this work, the uncertainty associated with the finite element discretization error is modeled following the Bayesian paradigm. First, a continuous formulation is derived, where a Gaussian process prior over the solution space is updated based on observations from a finite element discretization. Due to intractable integrals, a second, finer, discretization is introduced that is assumed sufficiently dense to represent the true solution field. The prior distribution assumed over the fine discretization is then updated based on observations from the coarse discretization. This yields a posterior distribution with a mean close to the deterministic fine-scale solution that is endowed with an uncertainty measure. The prior distribution over the solution space is defined implicitly by assigning a white noise distribution to the right-hand side. This allows for a sparse representation of the prior distribution, and guarantees that the prior samples have the appropriate level of smoothness for the problem at hand. Special attention is paid to inhomogeneous Dirichlet and Neumann boundary conditions, and how these can be used to enhance this white noise prior distribution. For various problems, we demonstrate how regions of large discretization error are captured in the structure of the posterior standard deviation. The effects of the hyperparameters and observation noise on the quality of the posterior mean and standard deviation are investigated in detail.
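A heavily simplified 1D sketch of the Gaussian conditioning at the heart of this construction, for $-u'' = f$ with homogeneous Dirichlet conditions on nested grids: the fine-scale prior is induced by a white noise right-hand side, and the coarse FEM solution is treated as a noisy linear observation. The observation-noise variance is a hypothetical hyperparameter, and the full method's boundary-condition treatment is omitted.

```python
import numpy as np

def fem_1d(n):
    # P1 stiffness and mass matrices for -u'' = f on (0,1), homogeneous
    # Dirichlet BCs, n interior nodes.
    h = 1.0 / (n + 1)
    A = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
    M = h * (4 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)) / 6
    return A, M

# Prior over the fine-grid solution from a white-noise right-hand side:
# load b ~ N(0, M_f) implies u_f = A_f^{-1} b ~ N(0, A_f^{-1} M_f A_f^{-T}).
nf, nc = 63, 7                        # fine/coarse interior nodes, nested grids
Af, Mf = fem_1d(nf)
Sigma = np.linalg.solve(Af, np.linalg.solve(Af, Mf).T)

# Observe the coarse FEM solution (for f = 1) at the shared nodes.
Ac, _ = fem_1d(nc)
y = np.linalg.solve(Ac, np.full(nc, 1.0 / (nc + 1)))
H = np.zeros((nc, nf))
for j in range(nc):                   # coarse node j sits at fine node 8*(j+1)-1
    H[j, 8 * (j + 1) - 1] = 1.0
tau2 = 1e-6                           # observation-noise variance (hyperparameter)

# Standard Gaussian conditioning gives the posterior mean and covariance;
# the posterior standard deviation flags regions of large discretization error.
S = H @ Sigma @ H.T + tau2 * np.eye(nc)
K = Sigma @ H.T @ np.linalg.inv(S)
post_mean = K @ y
post_cov = Sigma - K @ H @ Sigma
```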

We consider solving linear least squares problems in a framework where only evaluations of the linear map are possible. We derive randomized methods that need no matrix operations other than forward evaluations; in particular, no evaluation of the adjoint map is needed. Our method is motivated by the simple observation that one can get an unbiased estimate of the application of the adjoint. We show convergence of the method and then derive a more efficient method that uses an exact linesearch. This method, called random descent, resembles known methods in other contexts and has the randomized coordinate descent method as a special case. We provide a convergence analysis of the random descent method, emphasizing the dependence on the underlying distribution of the random vectors. Furthermore, we investigate the applicability of the method in the context of ill-posed inverse problems and show that the method can have beneficial properties when the unknown solution is rough. We illustrate the theoretical findings in numerical examples. One particular result is that the random descent method actually outperforms established transpose-free methods (TFQMR and CGS) in examples.
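The "simple observation" can be made concrete: if $u$ has identity second moment, $\mathbb E[u u^\top] = I$, then $\langle Au, r\rangle\, u$ is an unbiased estimate of $A^\top r$, computable from forward evaluations alone. A minimal sketch of random descent with exact linesearch under this assumption:

```python
import numpy as np

def random_descent(Aop, b, x0, iters, rng):
    # Least squares min ||Ax - b||^2 using only the forward map Aop(v) = A v.
    # For u with E[u u^T] = I, the vector <A u, r> u is an unbiased estimate
    # of the gradient A^T r, so the adjoint is never applied.
    x = x0.copy()
    for _ in range(iters):
        r = Aop(x) - b
        u = rng.standard_normal(x.shape)
        d = (Aop(u) @ r) * u           # E[d] = A^T r
        Ad = Aop(d)
        t = -(Ad @ r) / (Ad @ Ad)      # exact linesearch along d
        x = x + t * d
    return x

# Toy usage with an explicit matrix standing in for a black-box forward map.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
x_star = rng.standard_normal(20)
x = random_descent(lambda v: A @ v, A @ x_star, np.zeros(20), 2000, rng)
```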

This paper investigates the problem of simultaneously predicting multiple binary responses by utilizing a shared set of covariates. Our approach incorporates machine learning techniques for binary classification, without making assumptions about the underlying observations. Instead, our focus lies on a group of predictors, aiming to identify the one that minimizes prediction error. Unlike previous studies that primarily address estimation error, we directly analyze the prediction error of our method using PAC-Bayesian bound techniques. In this paper, we introduce a pseudo-Bayesian approach capable of handling incomplete response data. Our strategy is efficiently implemented using the Langevin Monte Carlo method. Through simulation studies and a practical application using real data, we demonstrate the effectiveness of our proposed method, producing results comparable, and sometimes superior, to the current state-of-the-art method.
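The sampler in question is standard: the unadjusted Langevin algorithm targets a density proportional to $\exp(-U)$, where for a pseudo-Bayesian method $U$ would be a tempered empirical risk plus a negative log prior. A minimal sketch, with a hypothetical stand-in for $U$:

```python
import numpy as np

def langevin_sampler(grad_U, theta0, step, n_iter, rng):
    # Unadjusted Langevin algorithm targeting exp(-U(theta)):
    #   theta <- theta - step * grad U(theta) + sqrt(2 * step) * noise
    theta = theta0.copy()
    for _ in range(n_iter):
        theta = (theta - step * grad_U(theta)
                 + np.sqrt(2.0 * step) * rng.standard_normal(theta.shape))
    return theta

rng = np.random.default_rng(0)
grad_U = lambda th: th   # stand-in: U(theta) = ||theta||^2/2, a standard Gaussian
sample = langevin_sampler(grad_U, np.zeros(5), step=1e-2, n_iter=5000, rng=rng)
```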

Score-based diffusion models learn to reverse a stochastic differential equation that maps data to noise. However, for complex tasks, numerical error can compound and result in highly unnatural samples. Previous work mitigates this drift with thresholding, which projects to the natural data domain (such as pixel space for images) after each diffusion step, but this leads to a mismatch between the training and generative processes. To incorporate data constraints in a principled manner, we present Reflected Diffusion Models, which instead reverse a reflected stochastic differential equation evolving on the support of the data. Our approach learns the perturbed score function through a generalized score matching loss and extends key components of standard diffusion models including diffusion guidance, likelihood-based training, and ODE sampling. We also bridge the theoretical gap with thresholding: such schemes are just discretizations of reflected SDEs. On standard image benchmarks, our method is competitive with or surpasses the state of the art without architectural modifications and, for classifier-free guidance, our approach enables fast exact sampling with ODEs and produces more faithful samples under high guidance weight.
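To illustrate the forward dynamics being reversed, here is a minimal sketch of Euler-Maruyama simulation of an SDE with reflection at the boundary of $[0,1]^d$; the drift and diffusion coefficient are generic placeholders, and the paper's models learn to reverse such a process via generalized score matching.

```python
import numpy as np

def reflected_euler_maruyama(x0, drift, sigma, dt, steps, rng):
    # Euler-Maruyama for an SDE constrained to [0,1]^d by reflection:
    # after each unconstrained step, excursions are folded back into the
    # domain (period-2 folding implements reflection at both faces).
    x = x0.copy()
    for _ in range(steps):
        x = x + drift(x) * dt + sigma * np.sqrt(dt) * rng.standard_normal(x.shape)
        x = np.abs(x) % 2.0                   # fold at the period-2 boundary
        x = np.where(x > 1.0, 2.0 - x, x)     # reflect back across x = 1
    return x

rng = np.random.default_rng(0)
x = reflected_euler_maruyama(np.full(3, 0.5), lambda x: -x, 1.0, 1e-3, 1000, rng)
```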
