
Computational optimal transport (OT) has recently emerged as a powerful framework with applications in various fields. In this paper we focus on a relaxation of the original OT problem, the entropic OT problem, which admits efficient and practical algorithmic solutions, even in high-dimensional settings. This formulation, also known as the Schr\"odinger Bridge problem, notably connects with Stochastic Optimal Control (SOC) and can be solved with the popular Sinkhorn algorithm. In the case of discrete state spaces, this algorithm is known to converge exponentially; however, achieving a similar rate of convergence in a more general setting is still an active area of research. In this work, we analyze the convergence of the Sinkhorn algorithm for probability measures defined on the $d$-dimensional torus $\mathbb{T}_L^d$ that admit densities with respect to the Haar measure of $\mathbb{T}_L^d$. In particular, we prove pointwise exponential convergence of the Sinkhorn iterates and their gradients. Our proof relies on the connection between these iterates and the evolution along the Hamilton-Jacobi-Bellman equations of value functions obtained from SOC problems. Our approach is novel in that it is purely probabilistic and relies on coupling-by-reflection techniques for controlled diffusions on the torus.
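For orientation, the discrete-state Sinkhorn iteration mentioned in the abstract can be sketched as follows. This is only an illustration of the standard finite-dimensional scaling scheme, not the torus setting analyzed in the paper. The function name `sinkhorn`, the regularization value `eps`, and the toy marginals and cost matrix are all assumptions chosen for the example.

```python
import math

def sinkhorn(mu, nu, cost, eps=0.5, n_iters=200):
    """Discrete entropic OT via Sinkhorn scaling (illustrative sketch)."""
    n, m = len(mu), len(nu)
    # Gibbs kernel K_ij = exp(-C_ij / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows so the coupling matches the first marginal: u = mu / (K v)
        u = [mu[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Rescale columns so it matches the second marginal: v = nu / (K^T u)
        v = [nu[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Coupling P_ij = u_i * K_ij * v_j
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy data (assumed for illustration only)
mu = [0.5, 0.3, 0.2]
nu = [0.4, 0.6]
cost = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]

P = sinkhorn(mu, nu, cost)
row_sums = [sum(row) for row in P]
col_sums = [sum(P[i][j] for i in range(len(mu))) for j in range(len(nu))]
```

After enough iterations, `row_sums` and `col_sums` should agree with `mu` and `nu` up to a small tolerance, which is the finite-dimensional analogue of the convergence the paper studies.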

Related content

Coupling


Controller · Inference · Episode · Weight · Feasible

May 31, 2023

Hierarchical Policy Blending as Inference for Reactive Robot Control

Kay Hansel,Julen Urain,Jan Peters,Georgia Chalvatzaki

from arxiv, 8 pages, 5 figures, 1 table, accepted at ICRA 2023

Motion generation in cluttered, dense, and dynamic environments is a central topic in robotics, rendered as a multi-objective decision-making problem. Current approaches trade off safety against performance. On the one hand, reactive policies guarantee fast response to environmental changes at the risk of suboptimal behavior. On the other hand, planning-based motion generation provides feasible trajectories, but the high computational cost may limit the control frequency and thus safety. To combine the benefits of reactive policies and planning, we propose a hierarchical motion generation method. Moreover, we adopt probabilistic inference methods to formalize the hierarchical model and stochastic optimization. We realize this approach as a weighted product of stochastic, reactive expert policies, where planning is used to adaptively compute the optimal weights over the task horizon. This stochastic optimization avoids local optima and proposes feasible reactive plans that find paths in cluttered and dense environments. Our extensive experimental study in planar navigation and 6DoF manipulation shows that our proposed hierarchical motion generation method outperforms both myopic reactive controllers and online re-planning methods.

Linear · Deterministic policy · Scenario · Computational learning theory · Machine learning

May 31, 2023

On the Linear Convergence of Policy Gradient under Hadamard Parameterization

Jiacai Liu,Jinchi Chen,Ke Wei

The convergence of deterministic policy gradient under the Hadamard parametrization is studied in the tabular setting, and the global linear convergence of the algorithm is established. To this end, we first show that the error decreases at an $O(\frac{1}{k})$ rate for all iterations. Based on this result, we further show that the algorithm has a faster local linear convergence rate after $k_0$ iterations, where $k_0$ is a constant that depends only on the MDP problem and the step size. Overall, the algorithm displays a linear convergence rate for all iterations, with a looser constant than that of the local linear convergence rate.

Estimation/Estimator · Linear · Sample · Feasible · Knowledge

May 31, 2023

The error and perturbation bounds for the absolute value equations with some applications

Shi-Liang Wu,Cui-Xia Li

In this paper, inspired by the previously published work of Zamani and Hlad\'{\i}k [Math. Program., 198 (2023), pp. 85-113], we focus on the error and perturbation bounds for the general absolute value equations because, to our knowledge, these bounds have not yet been discussed. To fill this gap, we introduce a class of absolute value functions and study the error bounds and perturbation bounds for two types of absolute value equations (AVEs): $Ax-B|x|=b$ and $Ax-|Bx|=b$. Some useful error bounds and perturbation bounds for these two types of absolute value equations are presented. By applying the absolute value equations, we also obtain the error and perturbation bounds for the horizontal linear complementarity problem (HLCP). In addition, a new perturbation bound for the LCP without constraint conditions is given, which is weaker, in a sense, than that presented in [SIAM J. Optim., 2007, 18: 1250-1265]. Besides, without limiting the matrix type, some computable estimates for the above upper bounds are given, which are sharper than some existing results under certain conditions. Some numerical examples for the AVEs from the LCP are given to show the feasibility of the perturbation bounds.

Sample complexity · Linear · Policy evaluation · Functional · Sample

May 30, 2023

Sharp high-probability sample complexities for policy evaluation with linear function approximation

Gen Li,Weichen Wu,Yuejie Chi,Cong Ma,Alessandro Rinaldo,Yuting Wei

from arxiv, The first two authors contributed equally

This paper is concerned with the problem of policy evaluation with linear function approximation in discounted infinite-horizon Markov decision processes. We investigate the sample complexities required to guarantee a predefined estimation error of the best linear coefficients for two widely used policy evaluation algorithms: the temporal difference (TD) learning algorithm and the two-timescale linear TD with gradient correction (TDC) algorithm. In both the on-policy setting, where observations are generated from the target policy, and the off-policy setting, where samples are drawn from a behavior policy potentially different from the target policy, we establish the first sample complexity bound with a high-probability convergence guarantee that attains the optimal dependence on the tolerance level. We also exhibit an explicit dependence on problem-related quantities, and show in the on-policy setting that our upper bound matches the minimax lower bound on crucial problem parameters, including the choice of the feature maps and the problem dimension.
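As a rough illustration of the TD learning algorithm with linear function approximation that the abstract refers to, the sketch below runs TD(0) on a two-state deterministic chain. It is a toy under stated assumptions, not the sampling model or the analysis of the paper: the function name `td0_linear`, the one-hot features, the rewards, the discount factor, and the constant step size are all chosen here for the example.

```python
def td0_linear(features, transitions, rewards, gamma=0.5, alpha=0.1, n_steps=2000):
    """TD(0) with linear value estimates V(s) = theta . phi(s) (illustrative sketch)."""
    theta = [0.0] * len(features[0])
    s = 0
    for _ in range(n_steps):
        s_next = transitions[s]
        v = sum(t * f for t, f in zip(theta, features[s]))
        v_next = sum(t * f for t, f in zip(theta, features[s_next]))
        # Temporal-difference error for the observed transition
        delta = rewards[s] + gamma * v_next - v
        # Semi-gradient step along the feature vector of the current state
        theta = [t + alpha * delta * f for t, f in zip(theta, features[s])]
        s = s_next
    return theta

# Toy chain (assumed): state 0 -> 1 -> 0, reward 1 when leaving state 0
features = [[1.0, 0.0], [0.0, 1.0]]
transitions = [1, 0]
rewards = [1.0, 0.0]

theta = td0_linear(features, transitions, rewards)
```

With `gamma = 0.5` the Bellman equations for this chain give values 4/3 and 2/3, and because the toy chain has no sampling noise, `theta` approaches those values directly. The sample complexity question studied in the paper arises once transitions are random.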

Optimizer · Predictor/Decision function · Loss · Reducible · Global optimization

May 30, 2023

When Does Optimizing a Proper Loss Yield Calibration?

Jarosław Błasiok,Parikshit Gopalan,Lunjia Hu,Preetum Nakkiran

Optimizing proper loss functions is popularly believed to yield predictors with good calibration properties; the intuition being that for such losses, the global optimum is to predict the ground-truth probabilities, which is indeed calibrated. However, typical machine learning models are trained to approximately minimize loss over restricted families of predictors, that are unlikely to contain the ground truth. Under what circumstances does optimizing proper loss over a restricted family yield calibrated models? What precise calibration guarantees does it give? In this work, we provide a rigorous answer to these questions. We replace the global optimality with a local optimality condition stipulating that the (proper) loss of the predictor cannot be reduced much by post-processing its predictions with a certain family of Lipschitz functions. We show that any predictor with this local optimality satisfies smooth calibration as defined in Kakade-Foster (2008), B{\l}asiok et al. (2023). Local optimality is plausibly satisfied by well-trained DNNs, which suggests an explanation for why they are calibrated from proper loss minimization alone. Finally, we show that the connection between local optimality and calibration error goes both ways: nearly calibrated predictors are also nearly locally optimal.

Reducible · binary · INTERACT · MoDELS · Performer

May 29, 2023

Global high-order numerical schemes for the time evolution of the general relativistic radiation magneto-hydrodynamics equations

Manuel R. Izquierdo,Lorenzo Pareschi,Borja Miñano,Joan Massó,Carlos Palenzuela

from arxiv, 18 + 6. Updated manuscript matching published version + additional appendix "Comparing the convergence order of IMEX and semi-implicit schemes"

Correctly modeling the transport of neutrinos is crucial in some astrophysical scenarios such as core-collapse supernovae and binary neutron star mergers. In this paper, we focus on the truncated-moment formalism, considering only the first two moments (M1 scheme) within the grey approximation, which reduces the seven-dimensional Boltzmann equation to a system of $3+1$ equations closely resembling the hydrodynamic ones. Solving the M1 scheme is still mathematically challenging, since it is necessary to model the radiation-matter interaction in regimes where the evolution equations become stiff and behave as an advection-diffusion problem. Here, we present different global, high-order time integration schemes based on Implicit-Explicit Runge-Kutta (IMEX) methods designed to overcome the time-step restriction caused by such behavior while allowing us to use the explicit RK schemes commonly employed for the MHD and Einstein equations. Finally, we analyze their performance in several numerical tests.

INTERACT · Lipschitz · Approximation · Analysis · Reducible

May 29, 2023

Convergence analysis of an explicit method and its random batch approximation for the McKean-Vlasov equations with non-globally Lipschitz conditions

Qian Guo,Jie He,Lei Li

In this paper, we present a numerical approach to solve the McKean-Vlasov equations, which are distribution-dependent stochastic differential equations, under some non-globally Lipschitz conditions for both the drift and diffusion coefficients. We establish a propagation of chaos result, based on which the McKean-Vlasov equation is approximated by an interacting particle system. A truncated Euler scheme is then proposed for the interacting particle system, allowing for a Khasminskii-type condition on the coefficients. To reduce the computational cost, the random batch approximation proposed in [Jin et al., J. Comput. Phys., 400(1), 2020] is extended to the interacting particle system where the interaction could take place in the diffusion term. An almost half order of convergence is proved in the $L^p$ sense. Numerical tests are performed to verify the theoretical results.

Estimation/Estimator · Frequentist school · state-of-the-art · Performer · Regular

May 29, 2023

A nonparametric regression alternative to empirical Bayes approaches to simultaneous estimation

Alton Barbehenn,Sihai Dave Zhao

The simultaneous estimation of multiple unknown parameters lies at the heart of a broad class of important problems across science and technology. Currently, the state-of-the-art performance in such problems is achieved by nonparametric empirical Bayes methods. However, these approaches still suffer from two major issues. First, they solve a frequentist problem but do so by following Bayesian reasoning, posing a philosophical dilemma that has contributed to somewhat uneasy attitudes toward empirical Bayes methodology. Second, their computation relies on certain density estimates that become extremely unreliable in some complex simultaneous estimation problems. In this paper, we study these issues in the context of the canonical Gaussian sequence problem. We propose an entirely frequentist alternative to nonparametric empirical Bayes methods by establishing a connection between simultaneous estimation and penalized nonparametric regression. We use flexible regularization strategies, such as shape constraints, to derive accurate estimators without appealing to Bayesian arguments. We prove that our estimators achieve asymptotically optimal regret and show that they are competitive with or can outperform nonparametric empirical Bayes methods in simulations and an analysis of spatially resolved gene expression data.

Approximation · Empirical risk minimization · Empirical risk · Early stopping · TOOLS

May 27, 2023

Iterative Approximate Cross-Validation

Yuetian Luo,Zhimei Ren,Rina Foygel Barber

Cross-validation (CV) is one of the most popular tools for assessing and selecting predictive models. However, standard CV suffers from high computational cost when the number of folds is large. Recently, under the empirical risk minimization (ERM) framework, a line of work has proposed efficient methods to approximate CV based on the solution of the ERM problem trained on the full dataset. However, in large-scale problems, it can be hard to obtain the exact solution of the ERM problem, either due to limited computational resources or due to early stopping as a way of preventing overfitting. In this paper, we propose a new paradigm to efficiently approximate CV when the ERM problem is solved via an iterative first-order algorithm, without running until convergence. Our new method extends existing guarantees for CV approximation to hold along the whole trajectory of the algorithm, including at convergence, thus generalizing existing CV approximation methods. Finally, we illustrate the accuracy and computational efficiency of our method through a range of empirical studies.

Dictionary learning · Learning · MoDELS · Directed · Approximation

May 26, 2023

Convergence of alternating minimisation algorithms for dictionary learning

Simon Ruetz,Karin Schnass

In this paper we derive sufficient conditions for the convergence of two popular alternating minimisation algorithms for dictionary learning - the Method of Optimal Directions (MOD) and Online Dictionary Learning (ODL), which can also be thought of as approximative K-SVD. We show that given a well-behaved initialisation that is either within distance at most $1/\log(K)$ to the generating dictionary or has a special structure ensuring that each element of the initialisation only points to one generating element, both algorithms will converge with geometric convergence rate to the generating dictionary. This is done even for data models with non-uniform distributions on the supports of the sparse coefficients. These allow the appearance frequency of the dictionary elements to vary heavily and thus model real data more closely.