
For the misspecified linear Markov decision process (MLMDP) model of Jin et al. [2020], we propose an algorithm with three desirable properties. (P1) Its regret after $K$ episodes scales as $K \max \{ \varepsilon_{\text{mis}}, \varepsilon_{\text{tol}} \}$, where $\varepsilon_{\text{mis}}$ is the degree of misspecification and $\varepsilon_{\text{tol}}$ is a user-specified error tolerance. (P2) Its space and per-episode time complexities remain bounded as $K \rightarrow \infty$. (P3) It does not require $\varepsilon_{\text{mis}}$ as input. To our knowledge, this is the first algorithm satisfying all three properties. For concrete choices of $\varepsilon_{\text{tol}}$, we also improve existing regret bounds (up to log factors) while achieving either (P2) or (P3) (existing algorithms satisfy neither). At a high level, our algorithm generalizes (to MLMDPs) and refines the Sup-Lin-UCB algorithm, which Takemura et al. [2021] recently showed satisfies (P3) for contextual bandits. We also provide an intuitive interpretation of their result, which informs the design of our algorithm.

Related content



December 16, 2021

Error Estimates for the Variational Training of Neural Networks with Boundary Penalty

Johannes Müller, Marius Zeinhofer

from arxiv, 16 pages, no figures

We establish estimates on the error made by the Deep Ritz Method for elliptic problems on the space $H^1(\Omega)$ with different boundary conditions. For Dirichlet boundary conditions, we estimate the error when the boundary values are approximately enforced through the boundary penalty method. Our results apply to arbitrary and, in general, nonlinear classes $V\subseteq H^1(\Omega)$ of ansatz functions and estimate the error in terms of the optimization accuracy, the approximation capabilities of the ansatz class and, in the case of Dirichlet boundary values, the penalisation strength $\lambda$. For non-essential boundary conditions the error of the Ritz method decays at the same rate as the approximation rate of the ansatz classes. For essential boundary conditions, given an approximation rate of $r$ in $H^1(\Omega)$ and an approximation rate of $s$ in $L^2(\partial\Omega)$ of the ansatz classes, the optimal decay rate of the estimated error is $\min(s/2, r)$, achieved by choosing $\lambda_n\sim n^{s}$. We discuss the implications for ansatz classes which are given through ReLU networks and the relation to existing estimates for finite element functions.


December 16, 2021

Uplink Transceiver Design and Optimization for Transmissive RMS Multi-Antenna Systems

Zhendong Li, Wen Chen, Jianmin Lu, Kunlun Wang, Jun Li

In this paper, a novel uplink communication for the transmissive reconfigurable metasurface (RMS) multi-antenna system with orthogonal frequency division multiple access (OFDMA) is investigated. Specifically, a transmissive RMS-based receiver equipped with a single receiving antenna is first proposed, and a far-near field channel model based on planar waves and spherical waves is given. Then, in order to maximize the system sum-rate of uplink communications, we formulate a joint optimization problem over subcarrier allocation, power allocation and RMS transmissive coefficient design. Due to the coupling of optimization variables, the optimization problem is non-convex, so it is challenging to solve it directly. In order to tackle this problem, the alternating optimization (AO) algorithm is used to decouple the optimization variables and divide the problem into two sub-problems to solve. First, the problem of joint subcarrier allocation and power allocation is solved via the Lagrangian dual decomposition method. Then, the RMS transmissive coefficient design can be obtained by applying difference-of-convex (DC) programming, successive convex approximation (SCA) and penalty function methods. Finally, the two sub-problems are iterated alternately until convergence is achieved. Numerical simulation results verify that the proposed algorithm has good convergence performance and can improve system sum-rate compared with other benchmark algorithms.


December 16, 2021

Improved Online Correlated Selection

Ruiquan Gao, Zhongtian He, Zhiyi Huang, Zipei Nie, Bijun Yuan, Yan Zhong

from arxiv, Compared to the first version, this version adds a discussion on two concurrent works on the same topic, gives a more accurate description of previous results, and improves the presentation based on feedback from anonymous reviewers. The conference version appears in FOCS 2021

This paper studies the online correlated selection (OCS) problem. It was introduced by Fahrbach, Huang, Tao, and Zadimoghaddam (2020) to obtain the first edge-weighted online bipartite matching algorithm that breaks the $0.5$ barrier. Suppose that we receive a pair of elements in each round and immediately select one of them. Can we select with negative correlation to be more effective than independent random selections? Our contributions are threefold. For semi-OCS, which considers the probability that an element remains unselected after appearing in $k$ rounds, we give an optimal algorithm that minimizes this probability for all $k$. It leads to $0.536$-competitive unweighted and vertex-weighted online bipartite matching algorithms that randomize over only two options in each round, improving the $0.508$-competitive ratio by Fahrbach et al. (2020). Further, we develop the first multi-way semi-OCS that allows an arbitrary number of elements with arbitrary masses in each round. As an application, it rounds the Balance algorithm in unweighted and vertex-weighted online bipartite matching and is $0.593$-competitive. Finally, we study OCS, which further considers the probability that an element is unselected in an arbitrary subset of rounds. We prove that the optimal "level of negative correlation" is between $0.167$ and $0.25$, improving the previous bounds of $0.109$ and $1$ by Fahrbach et al. (2020). Our OCS gives a $0.519$-competitive edge-weighted online bipartite matching algorithm, improving the previous $0.508$-competitive ratio by Fahrbach et al. (2020).
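For intuition about the baseline that negative correlation is meant to beat, the sketch below covers only independent fair coin flips: an element that appears in $k$ rounds is left unselected with probability $2^{-k}$. It is not the OCS algorithm from the paper; the function names and the Monte Carlo check are illustrative assumptions.

```python
import random
from fractions import Fraction


def unselected_prob_independent(k: int) -> Fraction:
    """Exact probability that an element is never picked in k rounds
    when each round picks one of its two elements with a fair coin."""
    return Fraction(1, 2 ** k)


def simulate_unselected(k: int, trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability (hypothetical helper)."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        # The element loses a round when the coin lands on its partner.
        if all(rng.random() >= 0.5 for _ in range(k)):
            misses += 1
    return misses / trials


if __name__ == "__main__":
    for k in range(1, 5):
        print(k, unselected_prob_independent(k), simulate_unselected(k))
```

A semi-OCS aims to make this probability smaller than $2^{-k}$ for every $k$ by correlating the choices across rounds.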


December 15, 2021

Provably Faster Algorithms for Bilevel Optimization

Junjie Yang, Kaiyi Ji, Yingbin Liang

from arxiv, This paper is accepted in NeurIPS 2021

Bilevel optimization has been widely applied in many important machine learning applications such as hyperparameter optimization and meta-learning. Recently, several momentum-based algorithms have been proposed to solve bilevel optimization problems faster. However, those momentum-based algorithms do not achieve provably better computational complexity than the $\widetilde{\mathcal{O}}(\epsilon^{-2})$ of the SGD-based algorithm. In this paper, we propose two new algorithms for bilevel optimization, where the first algorithm adopts momentum-based recursive iterations, and the second algorithm adopts recursive gradient estimations in nested loops to decrease the variance. We show that both algorithms achieve a complexity of $\widetilde{\mathcal{O}}(\epsilon^{-1.5})$, which outperforms all existing algorithms by an order of magnitude. Our experiments validate our theoretical results and demonstrate the superior empirical performance of our algorithms in hyperparameter applications.


December 15, 2021

On multivariate randomized classification trees: $l_0$-based sparsity, VC dimension and decomposition methods

Edoardo Amaldi, Antonio Consolo, Andrea Manno

Decision trees are widely used classification and regression models because of their interpretability and good accuracy. Classical methods such as CART are based on greedy approaches, but growing attention has recently been devoted to optimal decision trees. We investigate the nonlinear continuous optimization formulation proposed in Blanquero et al. (EJOR, vol. 284, 2020; COR, vol. 132, 2021) for (sparse) optimal randomized classification trees. Sparsity is important not only for feature selection but also to improve interpretability. We first consider alternative methods to sparsify such trees based on concave approximations of the $l_{0}$ "norm". Promising results are obtained on 24 datasets in comparison with $l_1$ and $l_{\infty}$ regularizations. Then, we derive bounds on the VC dimension of multivariate randomized classification trees. Finally, since training is computationally challenging for large datasets, we propose a general decomposition scheme and an efficient version of it. Experiments on larger datasets show that the proposed decomposition method is able to significantly reduce the training times without compromising the accuracy.


December 15, 2021

Learning Adversarial Markov Decision Processes with Delayed Feedback

Tal Lancewicki, Aviv Rosenberg, Yishay Mansour

from arxiv, AAAI 2022

Reinforcement learning typically assumes that agents observe feedback for their actions immediately, but in many real-world applications (like recommendation systems) feedback is observed with delay. This paper studies online learning in episodic Markov decision processes (MDPs) with unknown transitions, adversarially changing costs and unrestricted delayed feedback. That is, the costs and trajectory of episode $k$ are revealed to the learner only at the end of episode $k + d^k$, where the delays $d^k$ are neither identical nor bounded, and are chosen by an oblivious adversary. We present novel algorithms based on policy optimization that achieve near-optimal high-probability regret of $\sqrt{K + D}$ under full-information feedback, where $K$ is the number of episodes and $D = \sum_{k} d^k$ is the total delay. Under bandit feedback, we prove similar $\sqrt{K + D}$ regret assuming the costs are stochastic, and $(K + D)^{2/3}$ regret in the general case. We are the first to consider regret minimization in the important setting of MDPs with delayed feedback.


December 15, 2021

Approximation algorithms for $k$-median with lower-bound constraints

Ameet Gadekar, Bruno Ordozgoiti, Suhas Thejaswi

We study a variant of the classical $k$-median problem known as diversity-aware $k$-median (introduced by Thejaswi et al. 2021), where we are given a collection of facility subsets, and a solution must contain at least a specified number of facilities from each subset. We investigate the fixed-parameter tractability of this problem and show several hardness and inapproximability results, even when we allow running time exponential in some parameters of the problem. Motivated by these results, we present a fixed-parameter approximation algorithm with approximation ratio $(1 + \frac{2}{e} +\epsilon)$, and argue that this ratio is essentially tight assuming the gap-exponential time hypothesis. We also present a simple, practical local-search algorithm that gives a bicriteria $(2k, 3+\epsilon)$ approximation with better running time bounds.


December 15, 2021

An improved constant factor for the unit distance problem

Péter Ágoston, Dömötör Pálvölgyi

We prove that the number of unit distances among $n$ planar points is at most $1.94\cdot n^{4/3}$, improving on the previous best bound of $8n^{4/3}$. We also give better upper and lower bounds for several small values of $n$. We also prove some variants of the crossing lemma and improve some constant factors.
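To make the quantity being bounded concrete, here is a minimal brute-force sketch that counts unit distances in a small point set and compares the count with $1.94\cdot n^{4/3}$. The helper names, the floating-point tolerance and the $3\times 3$ grid example are assumptions chosen for illustration; nothing here reflects the proof technique of the paper.

```python
import math
from itertools import combinations


def count_unit_distances(points, tol=1e-9):
    """Count unordered pairs of points at Euclidean distance 1 (within tol)."""
    return sum(
        1
        for p, q in combinations(points, 2)
        if abs(math.dist(p, q) - 1.0) <= tol
    )


def unit_distance_upper_bound(n):
    """The upper bound 1.94 * n^(4/3) stated in the abstract."""
    return 1.94 * n ** (4 / 3)


if __name__ == "__main__":
    # Illustrative point set: the 3x3 integer grid.
    grid = [(x, y) for x in range(3) for y in range(3)]
    print(count_unit_distances(grid), unit_distance_upper_bound(len(grid)))
```

The check is quadratic in $n$, which is fine for sanity-testing small configurations but not for searching for extremal ones.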


December 14, 2021

Simulating Random Walks in Random Streams

John Kallaugher, Michael Kapralov, Eric Price

The random order graph streaming model has received significant attention recently, with problems such as matching size estimation, component counting, and the evaluation of bounded degree constant query testable properties shown to admit surprisingly space efficient algorithms. The main result of this paper is a space efficient single pass random order streaming algorithm for simulating nearly independent random walks that start at uniformly random vertices. We show that the distribution of $k$-step walks from $b$ vertices chosen uniformly at random can be approximated up to error $\varepsilon$ per walk using $(1/\varepsilon)^{O(k)} 2^{O(k^2)}\cdot b$ words of space with a single pass over a randomly ordered stream of edges, solving an open problem of Peng and Sohler [SODA '18]. Applications of our result include the estimation of the average return probability of the $k$-step walk (the trace of the $k^\text{th}$ power of the random walk matrix) as well as the estimation of PageRank. We complement our algorithm with a strong impossibility result for directed graphs.
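As a reference point for the return-probability application, the sketch below computes the average $k$-step return probability exactly, with the whole graph held in memory. It is a plain offline computation, not the streaming algorithm of the paper; the adjacency-list format and the function names are assumptions, and every vertex is assumed to have at least one neighbor.

```python
from fractions import Fraction


def k_step_distribution(adj, start, k):
    """Exact distribution of a simple random walk after k steps from start."""
    dist = {start: Fraction(1)}
    for _ in range(k):
        nxt = {}
        for v, p in dist.items():
            share = p / len(adj[v])  # move to a uniformly random neighbor
            for w in adj[v]:
                nxt[w] = nxt.get(w, Fraction(0)) + share
        dist = nxt
    return dist


def average_return_probability(adj, k):
    """Average over start vertices of P[walk is back at its start after k steps],
    i.e. trace(P^k) / n for the random walk matrix P."""
    total = sum(
        k_step_distribution(adj, v, k).get(v, Fraction(0)) for v in adj
    )
    return total / len(adj)


if __name__ == "__main__":
    # Illustrative graph: an undirected 4-cycle.
    cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    print(average_return_probability(cycle, 2))
```

This needs the full graph, whereas the point of the streaming result is to approximate such walk statistics in small space from a single pass over randomly ordered edges.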


December 13, 2021

Naive probability

Zalan Gyenis, Andras Kornai

from arxiv, 8 pages

We describe a rational, but low resolution model of probability.