Much of the literature on optimal design of bandit algorithms is based on minimization of expected regret. It is well known that designs that are optimal over certain exponential families can achieve expected regret that grows logarithmically in the number of arm plays, at a rate governed by the Lai-Robbins lower bound. In this paper, we show that when one uses such optimized designs, the regret distribution of the associated algorithms necessarily has a very heavy tail, specifically, that of a truncated Cauchy distribution. Furthermore, for $p>1$, the $p$'th moment of the regret distribution grows much faster than poly-logarithmically, in particular as a power of the total number of arm plays. We show that optimized UCB bandit designs are also fragile in an additional sense, namely when the problem is even slightly mis-specified, the regret can grow much faster than the conventional theory suggests. Our arguments are based on standard change-of-measure ideas, and indicate that the most likely way that regret becomes larger than expected is when the optimal arm returns below-average rewards in the first few arm plays, thereby causing the algorithm to believe that the arm is sub-optimal. To alleviate the fragility issues exposed, we show that UCB algorithms can be modified so as to ensure a desired degree of robustness to mis-specification. In doing so, we also provide a sharp trade-off between the amount of UCB exploration and the tail exponent of the resulting regret distribution.
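To make the setting concrete, here is a minimal sketch of a UCB-style index policy on a Bernoulli bandit. It is not the optimized design analyzed in the paper: the index form `mean + sqrt(c * log(t) / n)`, the exploration constant `c`, and the function names are illustrative assumptions. Larger `c` means more exploration, which is the quantity the abstract trades off against the tail exponent of the regret distribution.

```python
import math
import random


def ucb_run(means, horizon, c=2.0, seed=0):
    """Play a Bernoulli bandit with a UCB index; return per-arm pull counts.

    `c` scales the exploration bonus (an assumed, generic UCB1-style form).
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once to initialise the estimates
        else:
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(c * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


def pseudo_regret(means, counts):
    """Sum over arms of (gap to the best mean) times (number of pulls)."""
    best = max(means)
    return sum((best - m) * n for m, n in zip(means, counts))


if __name__ == "__main__":
    counts = ucb_run([0.3, 0.7], horizon=2000, c=2.0, seed=1)
    print(counts, pseudo_regret([0.3, 0.7], counts))
```

Repeating the run over many seeds and inspecting the upper quantiles of `pseudo_regret`, rather than its mean, is one way to look at the tail behaviour the abstract describes: a few unlucky early draws from the best arm can leave it under-sampled for a long time.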

Related content

Bandits / slot machines

MoDELS · Learning · Performer · Optimizer · Moment matching

March 1, 2023

The Virtues of Laziness in Model-based RL: A Unified Objective and Algorithms

Anirudh Vemula,Yuda Song,Aarti Singh,J. Andrew Bagnell,Sanjiban Choudhury

We propose a novel approach to addressing two fundamental challenges in Model-based Reinforcement Learning (MBRL): the computational expense of repeatedly finding a good policy in the learned model, and the objective mismatch between model fitting and policy computation. Our "lazy" method leverages a novel unified objective, Performance Difference via Advantage in Model, to capture the performance difference between the learned policy and expert policy under the true dynamics. This objective demonstrates that optimizing the expected policy advantage in the learned model under an exploration distribution is sufficient for policy computation, resulting in a significant boost in computational efficiency compared to traditional planning methods. Additionally, the unified objective uses a value moment matching term for model fitting, which is aligned with the model's usage during policy computation. We present two no-regret algorithms to optimize the proposed objective, and demonstrate their statistical and computational gains compared to existing MBRL methods through simulated benchmarks.

Rank · Range · Score · Paper · Social computing

February 28, 2023

Ranked Choice Voting And the Center Squeeze in the Alaska 2022 Special Election: How Might Other Voting Methods Compare?

Jeanne N. Clelland

from arxiv, 12 pages, 8 tables, 3 figures

The August 2022 special election for U.S. House Representative in Alaska featured three main candidates and was conducted by the single-winner ranked choice voting method known as "instant runoff voting." The results of this election displayed a well-known but relatively rare phenomenon known as the "center squeeze": The most centrist candidate, Mark Begich, was eliminated in the first round despite winning an overwhelming majority of second-place votes. In fact, Begich was the Condorcet winner of this election: Based on the cast vote record, he would have defeated both of the other two candidates in head-to-head contests, but he was eliminated in the first round of ballot counting due to receiving the fewest first-place votes. The purpose of this paper is to use the data in the cast vote record to explore the range of likely outcomes if this election had been conducted under two alternative voting methods: Approval Voting and STAR ("Score Then Automatic Runoff") Voting. We find that under the best assumptions available about voter behavior, the most likely outcomes are that Peltola would still have won the election under Approval Voting, while Begich would have won under STAR Voting.
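The center squeeze can be reproduced on a small made-up ballot profile. The sketch below is illustrative only: the ballots are hypothetical and are not the Alaska cast vote record, and the candidate labels `A`, `B`, `C` and the function names are assumptions. It computes the instant-runoff winner and the Condorcet winner and shows that they can differ.

```python
def irv_winner(ballots):
    """Instant-runoff winner. `ballots` is a list of (count, ranking) pairs."""
    remaining = {c for _, ranking in ballots for c in ranking}
    while True:
        tally = {c: 0 for c in remaining}
        for count, ranking in ballots:
            # Each ballot counts for its highest-ranked remaining candidate.
            for c in ranking:
                if c in remaining:
                    tally[c] += count
                    break
        total = sum(tally.values())
        leader = max(tally, key=tally.get)
        if 2 * tally[leader] > total or len(remaining) == 1:
            return leader
        # Eliminate the candidate with the fewest votes (ties broken arbitrarily).
        remaining.remove(min(tally, key=tally.get))


def condorcet_winner(ballots):
    """Candidate who beats every other head-to-head, or None if there is none."""
    candidates = {c for _, ranking in ballots for c in ranking}

    def prefers(a, b):
        # Voters ranking a above b; unranked candidates count as ranked last.
        n = 0
        for count, ranking in ballots:
            pos_a = ranking.index(a) if a in ranking else len(ranking)
            pos_b = ranking.index(b) if b in ranking else len(ranking)
            if pos_a < pos_b:
                n += count
        return n

    for a in candidates:
        if all(prefers(a, b) > prefers(b, a) for b in candidates if b != a):
            return a
    return None


# Hypothetical profile: B is the centrist with the fewest first-place votes.
toy_ballots = [
    (40, ("A", "B", "C")),
    (25, ("B", "A", "C")),
    (35, ("C", "B", "A")),
]

if __name__ == "__main__":
    print(irv_winner(toy_ballots), condorcet_winner(toy_ballots))
```

In this toy profile, `B` is eliminated in the first round with 25 first-place votes, yet beats `A` by 60 to 40 and `C` by 65 to 35 in head-to-head comparisons.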

Optimizer · Design · Processing (programming language) · ML · Reducible

February 28, 2023

Fusion of ML with numerical simulation for optimized propeller design

Harsh Vardhan,Peter Volgyesi,Janos Sztipanovits

In computer-aided engineering design, the designer's goal is to find an optimal design for a given requirement using a numerical simulator in the loop with an optimization method. A good design optimization process is one that reduces the time from inception to design. In this work, we consider a class of design problems that is computationally cheap to evaluate but has a high-dimensional design space. In such cases, traditional surrogate-based optimization does not offer any benefits. We propose an alternative way to use an ML model to surrogate the design process: it formulates the search problem as an inverse problem and can save time by finding the optimal design, or at least a good initial seed design for optimization. By using this trained surrogate model with the traditional optimization method, we can get the best of both worlds. We call this Surrogate Assisted Optimization (SAO), a hybrid approach that mixes an ML surrogate with the traditional optimization method. Empirical evaluations of propeller design problems show that a more efficient design can be found in fewer evaluations using SAO.

Normalized · Discretization · Continuity · Estimation/estimator · Manifold

February 28, 2023

Transport map unadjusted Langevin algorithms

Benjamin J. Zhang,Youssef M. Marzouk,Konstantinos Spiliopoulos

from arxiv, 30 pages, 10 figures

Langevin dynamics are widely used in sampling high-dimensional, non-Gaussian distributions whose densities are known up to a normalizing constant. In particular, there is strong interest in unadjusted Langevin algorithms (ULA), which directly discretize Langevin dynamics to estimate expectations over the target distribution. We study the use of transport maps that approximately normalize a target distribution as a way to precondition and accelerate the convergence of Langevin dynamics. We show that in continuous time, when a transport map is applied to Langevin dynamics, the result is a Riemannian manifold Langevin dynamics (RMLD) with metric defined by the transport map. This connection suggests more systematic ways of learning metrics, and also yields alternative discretizations of the RMLD described by the map, which we study. Moreover, we show that under certain conditions, when the transport map is used in conjunction with ULA, we can improve the geometric rate of convergence of the output process in the 2--Wasserstein distance. Illustrative numerical results complement our theoretical claims.
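For reference, the plain unadjusted Langevin algorithm that the abstract takes as its starting point can be sketched as follows. This is the standard one-dimensional ULA update $x_{k+1} = x_k + h \nabla \log \pi(x_k) + \sqrt{2h}\,\xi_k$ with $\xi_k \sim N(0,1)$, not the transport-map variant studied in the paper; the function name, step size, and Gaussian target are illustrative assumptions.

```python
import math
import random


def ula(grad_log_density, x0, step, n_steps, seed=0):
    """Unadjusted Langevin algorithm in one dimension.

    Only the gradient of the log-density is needed, so the normalizing
    constant of the target never has to be known.
    """
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x + step * grad_log_density(x) + math.sqrt(2.0 * step) * noise
        samples.append(x)
    return samples


if __name__ == "__main__":
    # Assumed target: standard normal, for which grad log pi(x) = -x.
    chain = ula(lambda x: -x, x0=0.0, step=0.1, n_steps=20000, seed=1)
    kept = chain[2000:]  # discard a burn-in prefix
    mean = sum(kept) / len(kept)
    var = sum((v - mean) ** 2 for v in kept) / len(kept)
    print(mean, var)
```

Because there is no Metropolis correction, the chain is biased for any fixed step size: for this Gaussian target the stationary variance is $1/(1 - h/2)$ rather than 1, which is one reason the convergence rate and discretization matter.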

Robustness · Identifiable · Extensibility · Estimation/estimator · Discretization

February 27, 2023

A polytopal method for the Brinkman problem robust in all regimes

Daniele A. Di Pietro,Jérôme Droniou

In this work we develop a discretisation method for the Brinkman problem that is uniformly well-behaved in all regimes (as identified by a local dimensionless number with the meaning of a friction coefficient) and supports general meshes as well as arbitrary approximation orders. The method is obtained by combining ideas from the Hybrid High-Order and Discrete de Rham methods, and its robustness rests on a potential reconstruction and stabilisation terms that change in nature according to the value of the local friction coefficient. We derive error estimates that, thanks to the presence of cut-off factors, are valid across all regimes, and we provide extensive numerical validation.

Optimizer · Analysis · Scalar · Continuity · Discretization

February 27, 2023

On the Calculation of the Brinkman Penalization Term in Density-Based Topology Optimization of Fluid-Dependent Problems

Mohamed Abdelhamid,Aleksander Czekanski

In topology optimization of fluid-dependent problems, there is a need to interpolate within the design domain between fluid and solid in a continuous fashion. In density-based methods, the concept of inverse permeability in the form of a volumetric force is utilized to enforce zero fluid velocity in non-fluid regions. This volumetric force consists of a scalar term multiplied by the fluid velocity. This scalar term takes a value between two limits as determined by a convex interpolation function. The maximum inverse permeability limit is typically chosen through a trial-and-error analysis of the initial form of the optimization problem, such that the resolved fields resemble those obtained through an analysis of a pure fluid domain with a body-fitted mesh. In this work, we investigate the dependency of the maximum inverse permeability limit on the mesh size and the flow conditions by analyzing the Navier-Stokes equation in its strong as well as discretized finite element forms. We use numerical experiments to verify and characterize these dependencies.
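The abstract does not state which convex interpolation function is used. As an illustration, the sketch below assumes the widely used form $\alpha(\rho) = \alpha_{\max} + (\alpha_{\min} - \alpha_{\max})\,\rho\,(1+q)/(\rho+q)$, with $\rho = 1$ denoting pure fluid and $\rho = 0$ pure solid. The parameter `q`, the default limits, the sign convention of the force, and the function names are all assumptions made for the example.

```python
def inverse_permeability(rho, alpha_min, alpha_max, q=0.1):
    """Convex interpolation of the inverse permeability (assumed form).

    rho = 0 (solid) gives alpha_max; rho = 1 (fluid) gives alpha_min.
    Smaller q makes the curve more strongly convex.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    return alpha_max + (alpha_min - alpha_max) * rho * (1.0 + q) / (rho + q)


def brinkman_force(rho, velocity, alpha_min=0.0, alpha_max=1.0e4, q=0.1):
    """Volumetric penalization force, taken here as -alpha(rho) * u."""
    alpha = inverse_permeability(rho, alpha_min, alpha_max, q)
    return tuple(-alpha * u for u in velocity)


if __name__ == "__main__":
    for rho in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(rho, inverse_permeability(rho, 0.0, 1.0e4))
```

The value chosen for `alpha_max` is exactly the limit whose dependence on mesh size and flow conditions the paper investigates; the default above is a placeholder, not a recommendation.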

Scenario · Online · Optimizer · CASE · Smooth

February 27, 2023

Near-Optimal Algorithms for Private Online Optimization in the Realizable Regime

Hilal Asi,Vitaly Feldman,Tomer Koren,Kunal Talwar

We consider online learning problems in the realizable setting, where there is a zero-loss solution, and propose new Differentially Private (DP) algorithms that obtain near-optimal regret bounds. For the problem of online prediction from experts, we design new algorithms that obtain near-optimal regret ${O} \big( \varepsilon^{-1} \log^{1.5}{d} \big)$ where $d$ is the number of experts. This significantly improves over the best existing regret bounds for the DP non-realizable setting which are ${O} \big( \varepsilon^{-1} \min\big\{d, T^{1/3}\log d\big\} \big)$. We also develop an adaptive algorithm for the small-loss setting with regret $O(L^\star\log d + \varepsilon^{-1} \log^{1.5}{d})$ where $L^\star$ is the total loss of the best expert. Additionally, we consider DP online convex optimization in the realizable setting and propose an algorithm with near-optimal regret $O \big(\varepsilon^{-1} d^{1.5} \big)$, as well as an algorithm for the smooth case with regret $O \big( \varepsilon^{-2/3} (dT)^{1/3} \big)$, both significantly improving over existing bounds in the non-realizable regime.

Analysis · Functional · Reducible · Hamming distance · Precision/accuracy

February 27, 2023

Runtime Analysis for Permutation-based Evolutionary Algorithms

Benjamin Doerr,Yassine Ghannane,Marouane Ibn Brahim

from arxiv, Journal version of our paper at GECCO 2022. 51 pages. arXiv admin note: substantial text overlap with arXiv:2204.07637

While the theoretical analysis of evolutionary algorithms (EAs) has made significant progress for pseudo-Boolean optimization problems in the last 25 years, only sporadic theoretical results exist on how EAs solve permutation-based problems. To overcome the lack of permutation-based benchmark problems, we propose a general way to transfer the classic pseudo-Boolean benchmarks into benchmarks defined on sets of permutations. We then conduct a rigorous runtime analysis of the permutation-based $(1+1)$ EA proposed by Scharnow, Tinnefeld, and Wegener (2004) on the analogues of the LeadingOnes and Jump benchmarks. The latter shows that, unlike for bit-strings, it is not only the Hamming distance that determines how difficult it is to mutate a permutation $\sigma$ into another one $\tau$, but also the precise cycle structure of $\sigma \tau^{-1}$. For this reason, we also consider the more symmetric scramble mutation operator. We observe that it not only leads to simpler proofs, but also reduces the runtime on jump functions with odd jump size by a factor of $\Theta(n)$. Finally, we show that a heavy-tailed version of the scramble operator, as in the bit-string case, leads to a speed-up of order $m^{\Theta(m)}$ on jump functions with jump size $m$. A short empirical analysis confirms these findings, but also reveals that small implementation details like the rate of void mutations can make an important difference.
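A permutation-based $(1+1)$ EA can be sketched as below. The details are assumptions rather than the paper's exact definitions: the mutation applies $k \sim \mathrm{Poisson}(1)$ random transpositions (so $k = 0$ is a "void" mutation), and the LeadingOnes analogue counts the leading positions $i$ with $\sigma(i) = i$. Function names and parameters are hypothetical.

```python
import math
import random


def leading_ones_perm(perm):
    """Number of leading positions i with perm[i] == i (assumed benchmark)."""
    count = 0
    for i, v in enumerate(perm):
        if v != i:
            break
        count += 1
    return count


def sample_poisson(rng, lam=1.0):
    """Poisson sample via Knuth's multiplication method (stdlib only)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def one_plus_one_ea(n, max_iters, seed=0):
    """(1+1) EA on permutations of range(n); returns (permutation, fitness)."""
    rng = random.Random(seed)
    parent = list(range(n))
    rng.shuffle(parent)
    fitness = leading_ones_perm(parent)
    for _ in range(max_iters):
        if fitness == n:
            break
        child = parent[:]
        # Mutation: k random transpositions; k == 0 leaves the child unchanged.
        for _ in range(sample_poisson(rng)):
            i, j = rng.sample(range(n), 2)
            child[i], child[j] = child[j], child[i]
        child_fitness = leading_ones_perm(child)
        if child_fitness >= fitness:  # accept if not worse
            parent, fitness = child, child_fitness
    return parent, fitness


if __name__ == "__main__":
    print(one_plus_one_ea(8, max_iters=100000, seed=3))
```

Whether void mutations are counted as iterations, or resampled until $k \ge 1$, is the kind of implementation detail that the abstract notes can change measured runtimes.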

Approximation · Performer · Extensibility · Mutually independent · Approximation error

February 24, 2023

Randomized low-rank approximation of parameter-dependent matrices

Daniel Kressner,Hei Yin Lam

This work considers the low-rank approximation of a matrix $A(t)$ depending on a parameter $t$ in a compact set $D \subset \mathbb{R}^d$. Application areas that give rise to such problems include computational statistics and dynamical systems. Randomized algorithms are an increasingly popular approach for performing low-rank approximation and they usually proceed by multiplying the matrix with random dimension reduction matrices (DRMs). Applying such algorithms directly to $A(t)$ would involve different, independent DRMs for every $t$, which is not only expensive but also leads to inherently non-smooth approximations. In this work, we propose to use constant DRMs, that is, $A(t)$ is multiplied with the same DRM for every $t$. The resulting parameter-dependent extensions of two popular randomized algorithms, the randomized singular value decomposition and the generalized Nyström method, are computationally attractive, especially when $A(t)$ admits an affine linear decomposition with respect to $t$. We perform a probabilistic analysis for both algorithms, deriving bounds on the expected value as well as failure probabilities for the $L^2$ approximation error when using Gaussian random DRMs. Both the theoretical results and the numerical experiments show that the use of constant DRMs does not impair their effectiveness; our methods reliably return quasi-best low-rank approximations.

Conformer · Discretization · Trial · Integration · Comprehensibility

February 24, 2023

Numerical study of conforming space-time methods for Maxwell's equations

Julia I. M. Hauser,Marco Zank

Time-dependent Maxwell's equations govern electromagnetics. Under certain conditions, we can rewrite these equations into a partial differential equation of second order, which in this case is the vectorial wave equation. For the vectorial wave equation, we investigate the numerical application and the challenges in the implementation. For this purpose, we consider a space-time variational setting, i.e. time is just another spatial dimension. More specifically, we apply integration by parts in time as well as in space, leading to a space-time variational formulation with different trial and test spaces. Conforming discretizations of tensor-product type result in a Galerkin--Petrov finite element method that requires a CFL condition for stability. For this Galerkin--Petrov variational formulation, we study the CFL condition and its sharpness. To overcome the CFL condition, we use a Hilbert-type transformation that leads to a variational formulation with equal trial and test spaces. Conforming space-time discretizations result in a new Galerkin--Bubnov finite element method that is unconditionally stable. In numerical examples, we demonstrate the effectiveness of this Galerkin--Bubnov finite element method. Furthermore, we investigate different projections of the right-hand side and their influence on the convergence rates. This paper is the first step towards a more stable computation and a better understanding of vectorial wave equations in a conforming space-time approach.