
We study the gradient flow for a relaxed approximation to the Kullback-Leibler (KL) divergence between a moving source and a fixed target distribution. This approximation, termed the KALE (KL approximate lower-bound estimator), solves a regularized version of the Fenchel dual problem defining the KL over a restricted class of functions. When using a Reproducing Kernel Hilbert Space (RKHS) to define the function class, we show that the KALE continuously interpolates between the KL and the Maximum Mean Discrepancy (MMD). Like the MMD and other Integral Probability Metrics, the KALE remains well defined for mutually singular distributions. Nonetheless, the KALE inherits from the limiting KL a greater sensitivity to mismatch in the support of the distributions, compared with the MMD. These two properties make the KALE gradient flow particularly well suited when the target distribution is supported on a low-dimensional manifold. Under an assumption of sufficient smoothness of the trajectories, we show the global convergence of the KALE flow. We propose a particle implementation of the flow given initial samples from the source and the target distribution, which we use to empirically confirm the KALE's properties.
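To make the MMD end of this interpolation concrete, the following is a minimal sketch of a sample-based estimate of the squared MMD. It is not the KALE itself and not the authors' implementation: the Gaussian kernel, the `bandwidth` value, and the function names `gaussian_kernel` and `mmd_squared` are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of x and the rows of y."""
    sq_dists = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    k_xx = gaussian_kernel(x, x, bandwidth).mean()
    k_yy = gaussian_kernel(y, y, bandwidth).mean()
    k_xy = gaussian_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy

# Illustrative data (assumed): a 2-D Gaussian source and two candidate targets.
rng = np.random.default_rng(0)
source = rng.normal(size=(200, 2))
target_same = source.copy()
target_shifted = source + 3.0

mmd_same = mmd_squared(source, target_same)        # identical samples
mmd_shifted = mmd_squared(source, target_shifted)  # shifted samples
```

The estimate stays finite even when the two samples share no points, which is the property of Integral Probability Metrics that the abstract refers to.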

Related content

Maximum Mean Discrepancy


Approximation · CASE · Performer · CASES · Optimizer

December 30, 2021

DPG methods for a fourth-order div problem

Thomas Führer, Pablo Herrera, Norbert Heuer

from arxiv, Supported by ANID-Chile through FONDECYT projects 1190009, 1210391

We study a fourth-order div problem and its approximation by the discontinuous Petrov-Galerkin method with optimal test functions. We present two variants, based on first and second-order systems. In both cases we prove well-posedness of the formulation and quasi-optimal convergence of the approximation. Our analysis includes the fully-discrete schemes with approximated test functions, for general dimension and polynomial degree in the first-order case, and for two dimensions and lowest-order approximation in the second-order case. Numerical results illustrate the performance for quasi-uniform and adaptively refined meshes.

INFORMS · Dynamical systems · Approximation · Estimation/Estimator · MoDELS

December 29, 2021

Lagrangian uncertainty quantification and information inequalities for stochastic flows

Michal Branicki, Kenneth Uda

We develop a systematic information-theoretic framework for quantification and mitigation of error in probabilistic Lagrangian (i.e., path-based) predictions which are obtained from dynamical systems generated by uncertain (Eulerian) vector fields. This work is motivated by the desire to improve Lagrangian predictions in complex dynamical systems based either on analytically simplified or data-driven models. We derive a hierarchy of general information bounds on uncertainty in estimates of statistical observables $\mathbb{E}^{\nu}[f]$, evaluated on trajectories of the approximating dynamical system, relative to the "true" observables $\mathbb{E}^{\mu}[f]$ in terms of certain $\varphi$-divergences, $\mathcal{D}_\varphi(\mu\|\nu)$, which quantify discrepancies between probability measures $\mu$ associated with the original dynamics and their approximations $\nu$. We then derive two distinct bounds on $\mathcal{D}_\varphi(\mu\|\nu)$ itself in terms of the Eulerian fields. This new framework provides a rigorous way for quantifying and mitigating uncertainty in Lagrangian predictions due to Eulerian model error.

Canonical · Correlation coefficient · Nonconvex · Canonical correlation analysis · Stochastic gradient descent

December 29, 2021

Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector Problems

Chris Junchi Li, Michael I. Jordan

Motivated by the problem of online canonical correlation analysis, we propose the Stochastic Scaled-Gradient Descent (SSGD) algorithm for minimizing the expectation of a stochastic function over a generic Riemannian manifold. SSGD generalizes the idea of projected stochastic gradient descent and allows the use of scaled stochastic gradients instead of stochastic gradients. In the special case of a spherical constraint, which arises in generalized eigenvector problems, we establish a nonasymptotic finite-sample bound of $\sqrt{1/T}$, and show that this rate is minimax optimal, up to a polylogarithmic factor of relevant parameters. On the asymptotic side, a novel trajectory-averaging argument allows us to achieve local asymptotic normality with a rate that matches that of Ruppert-Polyak-Juditsky averaging. We bring these ideas together in an application to online canonical correlation analysis, deriving, for the first time in the literature, an optimal one-time-scale algorithm with an explicit rate of local asymptotic convergence to normality. Numerical studies of canonical correlation analysis are also provided for synthetic data.
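As a point of reference for the baseline that SSGD generalizes, here is a minimal sketch of plain projected stochastic gradient ascent on the unit sphere, applied to an ordinary (not generalized) leading-eigenvector problem. It is not the SSGD algorithm: the step-size schedule, the synthetic data, and the name `projected_sgd_top_eigvec` are assumptions chosen for illustration.

```python
import numpy as np

def projected_sgd_top_eigvec(samples, step_scale=1.0, step_offset=10.0, seed=0):
    """Projected SGD on the unit sphere for the top eigenvector of E[x x^T]."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=samples.shape[1])
    w /= np.linalg.norm(w)
    for t, x in enumerate(samples):
        step = step_scale / (t + step_offset)  # assumed 1/t-type schedule
        grad = x * (x @ w)                     # stochastic gradient of 0.5 * (x @ w)**2
        w = w + step * grad                    # ascent step
        w /= np.linalg.norm(w)                 # projection back onto the sphere
    return w

# Synthetic data (assumed): covariance diag(9, 1, 1, 1, 1), so the leading
# eigenvector is the first coordinate axis.
rng = np.random.default_rng(1)
std_devs = np.array([3.0, 1.0, 1.0, 1.0, 1.0])
data = rng.normal(size=(5000, 5)) * std_devs

w_hat = projected_sgd_top_eigvec(data)
alignment = abs(w_hat[0])  # |cosine| between the estimate and the true eigenvector
```

Renormalizing after each step is the projection onto the sphere; per the abstract, SSGD replaces the stochastic gradient in this kind of scheme with a scaled stochastic gradient.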

Discretization · Class · CASES · Estimation/Estimator · Optimizer

December 29, 2021

Error analysis of an unfitted HDG method for a class of non-linear elliptic problems

Nestor Sánchez, Tonatiuh Sánchez-Vizuet, Manuel E. Solano

We study Hybridizable Discontinuous Galerkin (HDG) discretizations for a class of non-linear interior elliptic boundary value problems posed in curved domains where both the source term and the diffusion coefficient are non-linear. We consider the cases where the non-linear diffusion coefficient depends on the solution and on the gradient of the solution. To sidestep the need for curved elements, the discrete solution is computed on a polygonal subdomain that is not assumed to interpolate the true boundary, giving rise to an unfitted computational mesh. We show that, under mild assumptions on the source term and the computational domain, the discrete systems are well posed. Furthermore, we provide a priori error estimates showing that the discrete solution will have optimal order of convergence as long as the distance between the curved boundary and the computational boundary remains of the same order of magnitude as the mesh parameter.

MCMC · Estimation/Estimator · Sampling methods · Markov chain · Approximation

December 29, 2021

Bounding Wasserstein distance with couplings

Niloy Biswas, Lester Mackey

from arxiv, 50 pages, 8 figures

Markov chain Monte Carlo (MCMC) provides asymptotically consistent estimates of intractable posterior expectations as the number of iterations tends to infinity. However, in large data applications, MCMC can be computationally expensive per iteration. This has catalyzed interest in sampling methods such as approximate MCMC, which trade off asymptotic consistency for improved computational speed. In this article, we propose estimators based on couplings of Markov chains to assess the quality of such asymptotically biased sampling methods. The estimators give empirical upper bounds of the Wasserstein distance between the limiting distribution of the asymptotically biased sampling method and the original target distribution of interest. We establish theoretical guarantees for our upper bounds and show that our estimators can remain effective in high dimensions. We apply our quality measures to stochastic gradient MCMC, variational Bayes, and Laplace approximations for tall data and to approximate MCMC for Bayesian logistic regression in 4500 dimensions and Bayesian linear regression in 50000 dimensions.
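The basic fact behind coupling-based bounds is that the expected distance under any coupling of two distributions is at least their Wasserstein distance. The sketch below illustrates only that inequality on one-dimensional samples; it does not couple Markov chains and is not the estimator proposed in the article. The sample sizes, the distributions, and the name `coupling_cost` are assumptions.

```python
import numpy as np

def coupling_cost(x, y):
    """Mean |x_i - y_i| under the coupling that pairs samples index by index."""
    return np.mean(np.abs(x - y))

# Two equal-sized 1-D samples (assumed for illustration).
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=1000)
y = rng.normal(loc=0.5, scale=1.0, size=1000)

# An arbitrary coupling: pair the samples in the order they were drawn.
arbitrary_pairing = coupling_cost(x, y)

# In one dimension the sorted pairing is an optimal coupling, so this is the
# 1-Wasserstein distance between the two empirical distributions.
sorted_pairing = coupling_cost(np.sort(x), np.sort(y))
```

Any pairing gives a valid upper bound, and a better coupling gives a tighter one.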

Estimation/Estimator · Reducible · MoDELS · Trial · Collinearity

December 28, 2021

Analysis of N-of-1 trials using Bayesian distributed lag model with autocorrelated errors

Ziwei Liao, Min Qian, Ian M. Kronish, Ying Kuen Cheung

An N-of-1 trial is a multi-period crossover trial performed in a single individual, with the primary goal of estimating the treatment effect on the individual instead of population-level mean responses. As in a conventional crossover trial, it is critical to understand carryover effects of the treatment in an N-of-1 trial, especially when no washout periods between treatment periods are instituted to reduce trial duration. To deal with this issue in situations where a high volume of measurements is made during the study, we introduce a novel Bayesian distributed lag model that facilitates the estimation of carryover effects, while accounting for temporal correlations using an autoregressive model. Specifically, we propose a prior variance-covariance structure on the lag coefficients to address collinearity caused by the fact that treatment exposures are typically identical on successive days. A connection between the proposed Bayesian model and penalized regression is noted. Simulation results demonstrate that the proposed model substantially reduces the root mean squared error in the estimation of carryover effects and immediate effects when compared to other existing methods, while being comparable in the estimation of the total effects. We also apply the proposed method to assess the extent of carryover effects of light therapies in relieving depressive symptoms in cancer survivors.

Discretization · MoDELS · Sample · Likelihood · Approximation

June 6, 2021

Oops I Took A Gradient: Scalable Sampling for Discrete Distributions

Will Grathwohl, Kevin Swersky, Milad Hashemi, David Duvenaud, Chris J. Maddison

from arxiv, Energy-Based Models, Deep generative models, MCMC sampling

We propose a general and scalable approximate sampling strategy for probabilistic models with discrete variables. Our approach uses gradients of the likelihood function with respect to its discrete inputs to propose updates in a Metropolis-Hastings sampler. We show empirically that this approach outperforms generic samplers in a number of difficult settings including Ising models, Potts models, restricted Boltzmann machines, and factorial hidden Markov models. We also demonstrate the use of our improved sampler for training deep energy-based models on high dimensional discrete data. This approach outperforms variational auto-encoders and existing energy-based models. Finally, we give bounds showing that our approach is near-optimal in the class of samplers which propose local updates.
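One way gradients with respect to discrete inputs can inform a proposal is through a first-order estimate of how the log-probability changes when a single bit is flipped. The sketch below is an illustration under assumptions, not the authors' sampler: it uses a hypothetical quadratic (Ising-like) model on binary vectors, for which the first-order estimate happens to be exact, and the names `log_prob`, `flip_scores`, and `flip_proposal` are made up for this example.

```python
import numpy as np

def log_prob(x, weights, bias):
    """Unnormalized log-probability f(x) = x^T W x + b^T x for binary x."""
    return x @ weights @ x + bias @ x

def grad_log_prob(x, weights, bias):
    """Gradient of f with x treated as continuous (weights must be symmetric)."""
    return 2.0 * weights @ x + bias

def flip_scores(x, weights, bias):
    """First-order estimate of f(flip_i(x)) - f(x) for every coordinate i."""
    return -(2.0 * x - 1.0) * grad_log_prob(x, weights, bias)

def flip_proposal(x, weights, bias):
    """Softmax over half the flip scores: which bit to propose flipping."""
    logits = flip_scores(x, weights, bias) / 2.0
    logits = logits - logits.max()  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

# Hypothetical model: symmetric couplings with zero diagonal, random bias.
rng = np.random.default_rng(0)
dim = 6
a = rng.normal(size=(dim, dim))
weights = (a + a.T) / 2.0
np.fill_diagonal(weights, 0.0)
bias = rng.normal(size=dim)
x = rng.integers(0, 2, size=dim).astype(float)

# Brute-force differences, for comparison with the gradient-based scores.
exact = np.array([
    log_prob(np.where(np.arange(dim) == i, 1.0 - x, x), weights, bias)
    - log_prob(x, weights, bias)
    for i in range(dim)
])
estimated = flip_scores(x, weights, bias)
probs = flip_proposal(x, weights, bias)
```

A single gradient evaluation scores all `dim` candidate flips at once, whereas the brute-force comparison needs one function evaluation per flip. A complete Metropolis-Hastings step would also need the reverse proposal probability and an accept/reject test, which are omitted here.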

Optimizer · Lipschitz continuity · Regularization term · Continuity · Lipschitz

June 1, 2018

Optimal Algorithms for Non-Smooth Distributed Optimization in Networks

Kevin Scaman, Francis Bach, Sébastien Bubeck, Yin Tat Lee, Laurent Massoulié

from arxiv, 17 pages

In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.

Maximum Mean Discrepancy · Optimizer · Performer · CASES · tuning

January 30, 2018

Stable Distribution Alignment Using the Dual of the Adversarial Distance

Ben Usman, Kate Saenko, Brian Kulis

from arxiv, ICLR 2018 Conference Invite to Workshop

Methods that align distributions by minimizing an adversarial distance between them have recently achieved impressive results. However, these approaches are difficult to optimize with gradient descent and they often do not converge well without careful hyperparameter tuning and proper initialization. We investigate whether turning the adversarial min-max problem into an optimization problem by replacing the maximization part with its dual improves the quality of the resulting alignment and explore its connections to Maximum Mean Discrepancy. Our empirical results suggest that using the dual formulation for the restricted family of linear discriminators results in a more stable convergence to a desirable solution when compared with the performance of a primal min-max GAN-like objective and an MMD objective under the same restrictions. We test our hypothesis on the problem of aligning two synthetic point clouds on a plane and on a real-image domain adaptation problem on digits. In both cases, the dual formulation yields an iterative procedure that gives more stable and monotonic improvement over time.

Optimizer · Extensibility · Dual problem · Smoothing · INTERACT

December 1, 2017

Optimal Algorithms for Distributed Optimization

César A. Uribe, Soomin Lee, Alexander Gasnikov, Angelia Nedić

In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely when the function $F(\mathbf{x}) \triangleq \sum_{i=1}^{m} f_i(\mathbf{x})$ is strongly convex and smooth, strongly convex only, smooth only, or just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors) with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions to the proposed setup, such as proximal-friendly functions, time-varying graphs, and improvement of the condition numbers.