
We establish disintegrated PAC-Bayesian generalisation bounds for models trained with gradient descent methods or continuous gradient flows. Contrary to standard practice in the PAC-Bayesian setting, our result applies to optimisation algorithms that are deterministic, without requiring any de-randomisation step. Our bounds are fully computable, depending on the density of the initial distribution and the Hessian of the training objective over the trajectory. We show that our framework can be applied to a variety of iterative optimisation algorithms, including stochastic gradient descent (SGD), momentum-based schemes, and damped Hamiltonian dynamics.

Related content

Integration · Reducible · Jacobian · Performer · Model evaluation

May 24, 2023

Probabilistic Exponential Integrators

Nathanael Bosch, Philipp Hennig, Filip Tronarp

Probabilistic solvers provide a flexible and efficient framework for simulation, uncertainty quantification, and inference in dynamical systems. However, like standard solvers, they suffer performance penalties for certain stiff systems, where small steps are required not for reasons of numerical accuracy but for the sake of stability. This issue is greatly alleviated in semi-linear problems by the probabilistic exponential integrators developed in this paper. By including the fast, linear dynamics in the prior, we arrive at a class of probabilistic integrators with favorable properties. Namely, they are proven to be L-stable, and in a certain case reduce to a classic exponential integrator -- with the added benefit of providing a probabilistic account of the numerical error. The method is also generalized to arbitrary non-linear systems by imposing piece-wise semi-linearity on the prior via Jacobians of the vector field at the previous estimates, resulting in probabilistic exponential Rosenbrock methods. We evaluate the proposed methods on multiple stiff differential equations and demonstrate their improved stability and efficiency over established probabilistic solvers. The present contribution thus expands the range of problems that can be effectively tackled within probabilistic numerics.
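
The abstract notes that, in a certain case, the method reduces to a classic exponential integrator. As a rough illustration of that classic (deterministic) building block only, here is a minimal sketch of the exponential Euler method for a scalar semi-linear problem y' = lam * y + N(y). The function names, the test problem, and the step size are assumptions chosen for illustration; nothing here reproduces the probabilistic solver from the paper.

```python
import math


def phi1(z):
    """phi_1(z) = (exp(z) - 1) / z, with the limit value 1 at z = 0."""
    if abs(z) < 1e-12:
        return 1.0
    return math.expm1(z) / z


def exponential_euler(lam, nonlinear, y0, h, steps):
    """Integrate y' = lam * y + nonlinear(y) with exponential Euler.

    The linear part is propagated exactly through exp(h * lam);
    only the nonlinear part is frozen over each step.
    """
    y = y0
    for _ in range(steps):
        y = math.exp(h * lam) * y + h * phi1(h * lam) * nonlinear(y)
    return y


def explicit_euler(lam, nonlinear, y0, h, steps):
    """Plain explicit Euler, shown only for comparison on a stiff problem."""
    y = y0
    for _ in range(steps):
        y = y + h * (lam * y + nonlinear(y))
    return y


# Hypothetical stiff test problem: fast linear decay plus constant forcing.
lam = -1000.0
forcing = lambda y: 500.0  # constant nonlinearity, so the true limit is 0.5

y_exp = exponential_euler(lam, forcing, 1.0, 0.01, 100)
y_eul = explicit_euler(lam, forcing, 1.0, 0.01, 100)
print("exponential Euler:", y_exp)
print("explicit Euler:   ", y_eul)
```

With this step size the explicit Euler iterate grows without bound because |1 + h * lam| > 1, whereas the exponential Euler iterate stays at the steady state. This is the stability behaviour the abstract refers to, shown here in the simplest possible setting.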

Directed acyclic graph · DAG · Directed · Graph · Adversarial learning

May 23, 2023

Testing Directed Acyclic Graph via Structural, Supervised and Generative Adversarial Learning

Chengchun Shi, Yunzhe Zhou, Lexin Li

In this article, we propose a new hypothesis testing method for directed acyclic graphs (DAGs). While there is a rich class of DAG estimation methods, there is a relative paucity of DAG inference solutions. Moreover, the existing methods often impose specific model structures, such as linear or additive models, and assume independent data observations. Our proposed test instead allows the associations among the random variables to be nonlinear and the data to be time-dependent. We build the test on highly flexible neural network learners. We establish the asymptotic guarantees of the test, while allowing either the number of subjects or the number of time points for each subject to diverge to infinity. We demonstrate the efficacy of the test through simulations and a brain connectivity network analysis.

Interpretability · Gaussian distribution · Kernelization · CASES · Sample

May 23, 2023

Towards Understanding the Dynamics of Gaussian--Stein Variational Gradient Descent

Tianle Liu, Promit Ghosal, Krishnakumar Balasubramanian, Natesh Pillai

from arXiv, 57 pages, 6 figures

Stein Variational Gradient Descent (SVGD) is a nonparametric particle-based deterministic sampling algorithm. Despite its wide usage, understanding the theoretical properties of SVGD has remained a challenging problem. For sampling from a Gaussian target, the SVGD dynamics with a bilinear kernel will remain Gaussian as long as the initializer is Gaussian. Inspired by this fact, we undertake a detailed theoretical study of the Gaussian-SVGD, i.e., SVGD projected to the family of Gaussian distributions via the bilinear kernel, or equivalently Gaussian variational inference (GVI) with SVGD. We present a complete picture by considering both the mean-field PDE and discrete particle systems. When the target is strongly log-concave, the mean-field Gaussian-SVGD dynamics is proven to converge linearly to the Gaussian distribution closest to the target in KL divergence. In the finite-particle setting, there is both uniform in time convergence to the mean-field limit and linear convergence in time to the equilibrium if the target is Gaussian. In the general case, we propose a density-based and a particle-based implementation of the Gaussian-SVGD, and show that several recent algorithms for GVI, proposed from different perspectives, emerge as special cases of our unified framework. Interestingly, one of the new particle-based instances from this framework empirically outperforms existing approaches. Our results make concrete contributions towards obtaining a deeper understanding of both SVGD and GVI.
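
To make the particle update concrete, the following is a minimal one-dimensional sketch of standard SVGD with a Gaussian target. The specific kernel form k(x, y) = x * y + 1 is an assumption used to stand in for "a bilinear kernel"; the target parameters, particle count, step size, and function names are likewise hypothetical. It is not the density-based or particle-based Gaussian-SVGD implementation proposed in the paper.

```python
def svgd_step(particles, mean, var, step_size):
    """One SVGD update for a 1D Gaussian target N(mean, var).

    Assumed bilinear kernel: k(x, y) = x * y + 1.
    Each particle moves along the average of
    k(xj, xi) * score(xj) + d/dxj k(xj, xi) over all particles xj.
    """
    n = len(particles)
    updated = []
    for xi in particles:
        direction = 0.0
        for xj in particles:
            score = -(xj - mean) / var   # derivative of log density at xj
            kernel = xj * xi + 1.0       # k(xj, xi)
            grad_kernel = xi             # derivative of k(xj, xi) w.r.t. xj
            direction += kernel * score + grad_kernel
        updated.append(xi + step_size * direction / n)
    return updated


def sample_mean_var(xs):
    """Empirical mean and (population) variance of a list of numbers."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / len(xs)
    return m, v


# Ten evenly spaced particles on [-1, 1] as a deterministic initialiser.
particles = [-1.0 + 2.0 * i / 9 for i in range(10)]
for _ in range(10000):
    particles = svgd_step(particles, mean=2.0, var=0.25, step_size=0.005)

m, v = sample_mean_var(particles)
print("empirical mean:", m)
print("empirical variance:", v)
```

With this kernel the update applied to each particle is an affine function of its position, so the particle cloud is only shifted and rescaled at every step. That is consistent with the abstract's remark that Gaussian initialisations stay Gaussian under a bilinear kernel, and the fixed point matches the target's mean and variance.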

Scaling · Linear · Finite difference · CASES · ENJOY

May 22, 2023

High order asymptotic preserving scheme for linear kinetic equations with diffusive scaling

Megala Anandan, Benjamin Boutin, Nicolas Crouseilles

In this work, high order asymptotic preserving schemes are constructed and analysed for kinetic equations under a diffusive scaling. The framework covers several cases: the diffusion equation, the advection-diffusion equation, and the presence of inflow boundary conditions. Starting from the micro-macro reformulation of the original kinetic equation, high order time integrators are introduced. This class of numerical schemes enjoys the Asymptotic Preserving (AP) property for arbitrary initial data and degenerates, when $\epsilon$ goes to zero, into a high order scheme that is implicit for the diffusion term, which makes it free from the usual diffusion stability condition. The space discretization is also discussed, and high order methods based on classical finite difference schemes are proposed. The Asymptotic Preserving property is analysed and numerical results are presented to illustrate the properties of the proposed schemes in different regimes.

Continuity · Estimation/Estimator · Mean · Regularization term · Discretization

May 22, 2023

Error estimates of a theta-scheme for second-order mean field games

J. Frédéric Bonnans, Kang Liu, Laurent Pfeiffer

from arXiv, 35 pages

We introduce and analyze a new finite-difference scheme, relying on the theta-method, for solving monotone second-order mean field games. These games consist of a coupled system of the Fokker-Planck and the Hamilton-Jacobi-Bellman equation. The theta-method is used for discretizing the diffusion terms: we approximate them with a convex combination of an implicit and an explicit term. In contrast, we use an explicit centered scheme for the first-order terms. Assuming that the running cost is strongly convex and regular, we first prove the monotonicity and the stability of our theta-scheme, under a CFL condition. Taking advantage of the regularity of the solution of the continuous problem, we estimate the consistency error of the theta-scheme. Our main result is a convergence rate of order $\mathcal{O}(h^r)$ for the theta-scheme, where $h$ is the step length of the space variable and $r \in (0,1)$ is related to the Hölder continuity of the solution of the continuous problem and some of its derivatives.
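
The "convex combination of an implicit and an explicit term" can be illustrated on a much simpler problem than a mean field game. The sketch below applies the theta-method to the 1D heat equation u_t = u_xx with zero Dirichlet boundaries. The grid, time step, and helper names are assumptions made for illustration, and the coupled Fokker-Planck / Hamilton-Jacobi-Bellman system from the paper is not modelled.

```python
import math


def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal linear system with the Thomas algorithm.

    lower[i] and upper[i] are the sub- and super-diagonal entries of row i
    (lower[0] and upper[-1] are ignored).
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[n - 1] = d[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def theta_step(u, dt, dx, theta):
    """Advance the interior values of u_t = u_xx by one theta-method step.

    theta = 0 is fully explicit, theta = 1 is fully implicit, and
    theta = 0.5 is Crank-Nicolson. Boundary values are fixed at zero.
    """
    n = len(u)
    r = dt / dx ** 2
    padded = [0.0] + list(u) + [0.0]
    # Explicit part, weighted by (1 - theta).
    rhs = [
        padded[i]
        + (1.0 - theta) * r * (padded[i - 1] - 2.0 * padded[i] + padded[i + 1])
        for i in range(1, n + 1)
    ]
    # Implicit part, weighted by theta: (I - theta * r * Laplacian) u_new = rhs.
    off = [-theta * r] * n
    diag = [1.0 + 2.0 * theta * r] * n
    return thomas_solve(off, diag, off, rhs)


# Hypothetical setup: 49 interior points on (0, 1), one sine mode as data.
dx = 1.0 / 50
dt = 1e-3
theta = 0.5
u0 = [math.sin(math.pi * (i + 1) * dx) for i in range(49)]
u1 = theta_step(u0, dt, dx, theta)

# A discrete sine mode is damped by a known factor under the theta-method.
s2 = math.sin(math.pi * dx / 2.0) ** 2
r = dt / dx ** 2
g = (1.0 - (1.0 - theta) * 4.0 * r * s2) / (1.0 + theta * 4.0 * r * s2)
print("damping factor per step:", g)
```

Here dt / dx^2 = 2.5, which is beyond the stability limit of the fully explicit scheme (theta = 0), yet the step with theta = 0.5 damps the sine mode by a factor strictly between 0 and 1. The CFL condition mentioned in the abstract concerns the paper's full scheme and is not analysed in this sketch.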

Minimax · Linear · Optimizer · Functional · Sample complexity

May 22, 2023

Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice

Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo

from arXiv, ICML 2023 accepted

Mirror descent value iteration (MDVI), an abstraction of Kullback-Leibler (KL) and entropy-regularized reinforcement learning (RL), has served as the basis for recent high-performing practical RL algorithms. However, despite the use of function approximation in practice, the theoretical understanding of MDVI has been limited to tabular Markov decision processes (MDPs). We study MDVI with linear function approximation through its sample complexity required to identify an $\varepsilon$-optimal policy with probability $1-\delta$ under the settings of an infinite-horizon linear MDP, generative model, and G-optimal design. We demonstrate that least-squares regression weighted by the variance of an estimated optimal value function of the next state is crucial to achieving minimax optimality. Based on this observation, we present Variance-Weighted Least-Squares MDVI (VWLS-MDVI), the first theoretical algorithm that achieves nearly minimax optimal sample complexity for infinite-horizon linear MDPs. Furthermore, we propose a practical VWLS algorithm for value-based deep RL, Deep Variance Weighting (DVW). Our experiments demonstrate that DVW improves the performance of popular value-based deep RL algorithms on a set of MinAtar benchmarks.

Boosting (a technique for accelerating model training) · Optimizer · Linear regression · Linear · Threshold

May 20, 2023

Improved Differentially Private Regression via Gradient Boosting

Shuai Tang, Sergul Aydore, Michael Kearns, Saeyoung Rho, Aaron Roth, Yichen Wang, Yu-Xiang Wang, Zhiwei Steven Wu

We revisit the problem of differentially private squared error linear regression. We observe that existing state-of-the-art methods are sensitive to the choice of hyperparameters -- including the "clipping threshold" that cannot be set optimally in a data-independent way. We give a new algorithm for private linear regression based on gradient boosting. We show that our method consistently improves over the previous state of the art when the clipping threshold is taken to be fixed without knowledge of the data, rather than optimized in a non-private way -- and that even when we optimize the hyperparameters of competitor algorithms non-privately, our algorithm is no worse and often better. In addition to a comprehensive set of experiments, we give theoretical insights to explain this behavior.

Stochastic gradient descent · Loss · Noise · Analysis · SGD

May 20, 2023

Uniform-in-Time Wasserstein Stability Bounds for (Noisy) Stochastic Gradient Descent

Lingjiong Zhu, Mert Gurbuzbalaban, Anant Raj, Umut Simsekli

from arXiv, 47 pages

Algorithmic stability is an important notion that has proven powerful for deriving generalization bounds for practical algorithms. The last decade has witnessed an increasing number of stability bounds for different algorithms applied on different classes of loss functions. While these bounds have illuminated various properties of optimization algorithms, the analysis of each case typically required a different proof technique with significantly different mathematical tools. In this study, we make a novel connection between learning theory and applied probability and introduce a unified guideline for proving Wasserstein stability bounds for stochastic optimization algorithms. We illustrate our approach on stochastic gradient descent (SGD) and we obtain time-uniform stability bounds (i.e., the bound does not increase with the number of iterations) for strongly convex losses and non-convex losses with additive noise, where we recover similar results to the prior art or extend them to more general cases by using a single proof technique. Our approach is flexible and can be generalized to other popular optimizers, as it mainly requires developing Lyapunov functions, which are often readily available in the literature. It also illustrates that ergodicity is an important component for obtaining time-uniform bounds -- which might not be achieved for convex or non-convex losses unless additional noise is injected into the iterates. Finally, we slightly stretch our analysis technique and prove time-uniform bounds for SGD under convex and non-convex losses (without additional additive noise), which, to our knowledge, is novel.

Point cloud · Parameter space · Projection · Reducible · Extensibility

May 19, 2023

Efficient and Deterministic Search Strategy Based on Residual Projections for Point Cloud Registration

Xinyi Li, Yinlong Liu, Hu Cao, Xueli Liu, Feihu Zhang, Alois Knoll

Estimating the rigid transformation between two LiDAR scans through putative 3D correspondences is a typical point cloud registration paradigm. Current 3D feature matching approaches commonly lead to numerous outlier correspondences, making outlier-robust registration techniques indispensable. Many recent studies have adopted the branch and bound (BnB) optimization framework to solve the correspondence-based point cloud registration problem globally and deterministically. Nonetheless, BnB-based methods are slow when searching the entire 6-dimensional parameter space, since their computational complexity is exponential in the dimension of the solution domain. In order to enhance algorithm efficiency, existing works attempt to decouple the original 6 degrees of freedom (DOF) problem into two 3-DOF sub-problems, thereby reducing the dimension of the parameter space. In contrast, our proposed approach introduces a novel pose decoupling strategy based on residual projections, effectively decomposing the raw problem into three 2-DOF rotation search sub-problems. Subsequently, we employ a novel BnB-based search method to solve these sub-problems, achieving efficient and deterministic registration. Furthermore, our method can be adapted to address the challenging problem of simultaneous pose and correspondence registration (SPCR). Through extensive experiments conducted on synthetic and real-world datasets, we demonstrate that our proposed method outperforms state-of-the-art methods in terms of efficiency, while simultaneously ensuring robustness.

Networking · Linear · Regularization term · Neural Networks · state-of-the-art

May 19, 2023

A Compound Gaussian Network for Solving Linear Inverse Problems

Carter Lyons, Raghu G. Raj, Margaret Cheney

from arXiv, 13 pages, 7 figures, 5 tables; references updated

For solving linear inverse problems, particularly of the type that appear in tomographic imaging and compressive sensing, this paper develops two new approaches. The first approach is an iterative algorithm that minimizes a regularized least squares objective function where the regularization is based on a compound Gaussian prior distribution. The Compound Gaussian prior subsumes many of the commonly used priors in image reconstruction, including those of sparsity-based approaches. The developed iterative algorithm gives rise to the paper's second new approach, which is a deep neural network that corresponds to an "unrolling" or "unfolding" of the iterative algorithm. Unrolled deep neural networks have interpretable layers and outperform standard deep learning methods. This paper includes a detailed computational theory that provides insight into the construction and performance of both algorithms. The conclusion is that both algorithms outperform other state-of-the-art approaches to tomographic image formation and compressive sensing, especially in the difficult regime of low training.