
Bandits · Packing · Linear · Vectorization · Estimation/Estimator

May 31, 2023

Improved Algorithms for Multi-period Multi-class Packing Problems with Bandit Feedback

Wonyoung Kim, Garud Iyengar, Assaf Zeevi

From arXiv, accepted at ICML 2023, 44 pages including appendix

We consider the linear contextual multi-class multi-period packing problem (LMMP) where the goal is to pack items such that the total vector of consumption is below a given budget vector and the total value is as large as possible. We consider the setting where the reward and the consumption vector associated with each action are class-dependent linear functions of the context, and the decision-maker receives bandit feedback. LMMP includes linear contextual bandits with knapsacks and online revenue management as special cases. We establish a new estimator which guarantees a faster convergence rate, and consequently, a lower regret in such problems. We propose a bandit policy that is a closed-form function of said estimated parameters. When the contexts are non-degenerate, the regret of the proposed policy is sublinear in the context dimension, the number of classes, and the time horizon $T$ when the budget grows at least as $\sqrt{T}$. We also resolve an open problem posed by Agrawal & Devanur (2016) and extend the result to a multi-class setting. Our numerical experiments clearly demonstrate that the performance of our policy is superior to other benchmarks in the literature.

Related content

INFORMS · Functional · Estimation/Estimator · Inference · Identifiable

July 21, 2023

Estimating and using information in inverse problems

Wolfgang Bangerth, Chris R. Johnson, Dennis K. Njeru, Bart van Bloemen Waanders

In inverse problems, one attempts to infer spatially variable functions from indirect measurements of a system. To practitioners of inverse problems, the concept of "information" is familiar when discussing key questions such as which parts of the function can be inferred accurately and which cannot. For example, it is generally understood that we can identify system parameters accurately only close to detectors, or along ray paths between sources and detectors, because we have "the most information" for these places. Although referenced in many publications, the "information" that is invoked in such contexts is not a well understood and clearly defined quantity. Herein, we present a definition of information density that is based on the variance of coefficients as derived from a Bayesian reformulation of the inverse problem. We then discuss three areas in which this information density can be useful in practical algorithms for the solution of inverse problems, and illustrate the usefulness in one of these areas -- how to choose the discretization mesh for the function to be reconstructed -- using numerical experiments.

Robustness · Optimizer · Learning · Reinforcement learning · Bellman optimality equation

July 20, 2023

Efficient Action Robust Reinforcement Learning with Probabilistic Policy Execution Uncertainty

Guanlin Liu, Zhihan Zhou, Han Liu, Lifeng Lai

Robust reinforcement learning (RL) aims to find a policy that optimizes the worst-case performance in the face of uncertainties. In this paper, we focus on action robust RL with probabilistic policy execution uncertainty, in which, instead of always carrying out the action specified by the policy, the agent takes the action specified by the policy with probability $1-\rho$ and an alternative adversarial action with probability $\rho$. We establish the existence of an optimal policy on the action robust MDPs with probabilistic policy execution uncertainty and provide the action robust Bellman optimality equation for its solution. We then develop the Action Robust Reinforcement Learning with Certificates (ARRLC) algorithm, which achieves minimax optimal regret and sample complexity. Finally, we conduct numerical experiments to validate our approach's robustness, demonstrating that ARRLC outperforms non-robust RL algorithms and converges faster than the robust TD algorithm in the presence of action perturbations.
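The execution model described in this abstract can be illustrated with a minimal sketch. The function names, the string-valued actions, and the one-step value formula are illustrative assumptions, not the paper's ARRLC algorithm; the formula simply assumes the agent picks the best action and the adversary the worst one for a single step.

```python
import random


def executed_action(policy_action, adversary_action, rho, rng):
    """Return the action that is actually carried out (hypothetical helper).

    With probability rho the adversarial action replaces the policy's action;
    otherwise the policy's action is executed unchanged.
    """
    if rng.random() < rho:
        return adversary_action
    return policy_action


def robust_one_step_value(q_values, rho):
    """One-step worst-case value under the assumed execution model.

    Assumption: the agent chooses the action with the largest value and the
    adversary the action with the smallest value.
    """
    return (1.0 - rho) * max(q_values) + rho * min(q_values)


# Simulate many executions to see how often the adversary takes over.
rng = random.Random(0)
draws = [executed_action("policy", "adversary", 0.2, rng) for _ in range(10_000)]
adversary_share = draws.count("adversary") / len(draws)
print(f"adversary executed in {adversary_share:.1%} of steps")
```

With `rho = 0` the policy's action is always executed, and with `rho = 1` the adversary's action always is, which makes the two extremes easy to check by hand.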

Regularization term · Weight · Identifiable · Design · Specialization

July 19, 2023

Weighted inhomogeneous regularization for inverse problems with indirect and incomplete measurement data

Bosu Choi, Jihun Han, Yoonsang Lee

Regularization promotes well-posedness in solving an inverse problem with incomplete measurement data. The regularization term is typically designed based on a priori characterization of the unknown signal, such as sparsity or smoothness. The standard inhomogeneous regularization incorporates a spatially changing exponent $p$ of the standard $\ell_p$ norm-based regularization to recover a signal whose characteristic varies spatially. This study proposes a weighted inhomogeneous regularization that extends the standard inhomogeneous regularization through new exponent design and weighting using spatially varying weights. The new exponent design avoids misclassification when different characteristics stay close to each other. The weights address a second issue: the region of one characteristic may be too small to be recovered effectively by the $\ell_p$ norm-based regularization even after it has been identified correctly. A suite of numerical tests shows the efficacy of the proposed weighted inhomogeneous regularization, including synthetic image experiments and real sea ice recovery from its incomplete wave measurements.
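As a rough illustration of a penalty with spatially varying exponents and weights, the sketch below evaluates $\sum_i w_i |u_i|^{p_i}$ on a discrete signal. This functional form, the function name, and the sample values are assumptions made for illustration; the paper's exact regularizer and its exponent design may differ.

```python
def weighted_inhomogeneous_penalty(signal, exponents, weights):
    """Evaluate sum_i w_i * |u_i| ** p_i for a discrete signal (assumed form)."""
    if not (len(signal) == len(exponents) == len(weights)):
        raise ValueError("signal, exponents and weights must have equal length")
    return sum(w * abs(u) ** p for u, p, w in zip(signal, exponents, weights))


# Hypothetical example: exponent 1 (sparsity-like) at the first sample,
# exponent 2 (smoothness-like) with a smaller weight at the second.
penalty = weighted_inhomogeneous_penalty([1.0, -2.0], [1, 2], [1.0, 0.5])
print(penalty)
```

Setting every weight to 1 and every exponent to 1 reduces this sketch to the ordinary $\ell_1$ penalty.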

Path · Weight · Labeling · Optimizer · Continuity

July 19, 2023

Labeling Methods for Partially Ordered Paths

Ricardo Euler, Pedro Maristany de las Casas

The landscape of applications and subroutines relying on shortest path computations continues to grow steadily. This growth is driven by the undeniable success of shortest path algorithms in theory and practice. It also introduces new challenges as the models and assessing the optimality of paths become more complicated. Hence, multiple recent publications in the field adapt existing labeling methods in an ad-hoc fashion to their specific problem variant without considering the underlying general structure: they always deal with multi-criteria scenarios and those criteria define different partial orders on the paths. In this paper, we introduce the partial order shortest path problem (POSP), a generalization of the multi-objective shortest path problem (MOSP) and in turn also of the classical shortest path problem. POSP captures the particular structure of many shortest path applications as special cases. In this generality, we study optimality conditions or the lack of them, depending on the objective functions' properties. Our final contribution is a big lookup table summarizing our findings and providing the reader an easy way to choose among the most recent multicriteria shortest path algorithms depending on their problem's weight structure. Examples range from time-dependent shortest path and bottleneck path problems to the fuzzy shortest path problem and complex financial weight functions studied in the public transportation community. Our results hold for general digraphs and therefore surpass previous generalizations that were limited to acyclic graphs.
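Labeling methods of this kind keep, at each node, only the path labels that no other label dominates. The sketch below shows that pruning step for the multi-objective special case, where labels are cost tuples compared componentwise. The function names and sample labels are hypothetical, and the general partial orders studied in the paper are not covered here.

```python
def dominates(a, b):
    """True if cost tuple a is no worse than b in every criterion and differs from b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def nondominated(labels):
    """Keep only the labels that no other label dominates (order preserved)."""
    return [lab for lab in labels if not any(dominates(other, lab) for other in labels)]


# Hypothetical two-criteria labels (for example, time and cost) at one node.
labels = [(1, 2), (2, 1), (2, 2), (3, 3)]
print(nondominated(labels))
```

Here `(1, 2)` and `(2, 1)` are incomparable, so both survive, which is the reason a labeling method must store a set of labels per node rather than a single best distance.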

Optimizer · Learning · Exchangeable · Loss · Empirical loss

July 19, 2023

Network-GIANT: Fully distributed Newton-type optimization via harmonic Hessian consensus

Alessio Maritan, Ganesh Sharma, Luca Schenato, Subhrakanti Dey

This paper considers the problem of distributed multi-agent learning, where the global aim is to minimize a sum of local objective (empirical loss) functions through local optimization and information exchange between neighbouring nodes. We introduce a Newton-type fully distributed optimization algorithm, Network-GIANT, which is based on GIANT, a Federated learning algorithm that relies on a centralized parameter server. The Network-GIANT algorithm is designed via a combination of gradient-tracking and a Newton-type iterative algorithm at each node with consensus based averaging of local gradient and Newton updates. We prove that our algorithm guarantees semi-global and exponential convergence to the exact solution over the network assuming strongly convex and smooth loss functions. We provide empirical evidence of the superior convergence performance of Network-GIANT over other state-of-art distributed learning algorithms such as Network-DANE and Newton-Raphson Consensus.

Learning · Neural Networks · Networking · Reducible · Networks

September 1, 2022

Learning with Differentiable Algorithms

From arXiv, PhD thesis (summa cum laude), University of Konstanz, 162 pages

Classic algorithms and machine learning systems like neural networks are both abundant in everyday life. While classic computer science algorithms are suitable for precise execution of exactly defined tasks such as finding the shortest path in a large graph, neural networks allow learning from data to predict the most likely answer in more complex tasks such as image classification, which cannot be reduced to an exact algorithm. To get the best of both worlds, this thesis explores combining both concepts leading to more robust, better performing, more interpretable, more computationally efficient, and more data efficient architectures. The thesis formalizes the idea of algorithmic supervision, which allows a neural network to learn from or in conjunction with an algorithm. When integrating an algorithm into a neural architecture, it is important that the algorithm is differentiable such that the architecture can be trained end-to-end and gradients can be propagated back through the algorithm in a meaningful way. To make algorithms differentiable, this thesis proposes a general method for continuously relaxing algorithms by perturbing variables and approximating the expectation value in closed form, i.e., without sampling. In addition, this thesis proposes differentiable algorithms, such as differentiable sorting networks, differentiable renderers, and differentiable logic gate networks. Finally, this thesis presents alternative training strategies for learning with algorithms.

Optimizer · INTERACT · Networking · Knowledge · Performer

May 11, 2022

Dynamic neighbourhood optimisation for task allocation using multi-agent

Niall Creech, Natalia Criado Pacheco, Simon Miles

From arXiv, 28 pages

In large-scale systems there are fundamental challenges when centralised techniques are used for task allocation. The number of interactions is limited by resource constraints such as those on computation, storage, and network communication. We can increase scalability by implementing the system as a distributed task-allocation system, sharing tasks across many agents. However, this also increases the resource cost of communications and synchronisation, and is difficult to scale. In this paper we present four algorithms to solve these problems. The combination of these algorithms enables each agent to improve its task allocation strategy through reinforcement learning, while changing how much it explores the system in response to how optimal it believes its current strategy is, given its past experience. We focus on distributed agent systems where the agents' behaviours are constrained by resource usage limits, limiting agents to local rather than system-wide knowledge. We evaluate these algorithms in a simulated environment where agents are given a task composed of multiple subtasks that must be allocated to other agents with differing capabilities, to then carry out those tasks. We also simulate real-life system effects such as networking instability. Our solution is shown to solve the task allocation problem to 6.7% of the theoretical optimal within the system configurations considered. It provides 5x better performance recovery over no-knowledge retention approaches when system connectivity is impacted, and is tested against systems of up to 100 agents with less than a 9% impact on the algorithms' performance.

Multi-task learning · Performer · Learned · INFORMS · Generalization theory

March 29, 2021

A Survey on Multi-Task Learning

Yu Zhang, Qiang Yang

From arXiv, accepted by IEEE Transactions on Knowledge and Data Engineering

Multi-Task Learning (MTL) is a learning paradigm in machine learning and its aim is to leverage useful information contained in multiple related tasks to help improve the generalization performance of all the tasks. In this paper, we give a survey for MTL from the perspective of algorithmic modeling, applications and theoretical analyses. For algorithmic modeling, we give a definition of MTL and then classify different MTL algorithms into five categories, including feature learning approach, low-rank approach, task clustering approach, task relation learning approach and decomposition approach as well as discussing the characteristics of each approach. In order to improve the performance of learning tasks further, MTL can be combined with other learning paradigms including semi-supervised learning, active learning, unsupervised learning, reinforcement learning, multi-view learning and graphical models. When the number of tasks is large or the data dimensionality is high, we review online, parallel and distributed MTL models as well as dimensionality reduction and feature hashing to reveal their computational and storage advantages. Many real-world applications use MTL to boost their performance and we review representative works in this paper. Finally, we present theoretical analyses and discuss several future directions for MTL.

Hyperparameter · Performer · Weight · Ensemble · Robustness

June 24, 2020

Hyperparameter Ensembles for Robustness and Uncertainty Quantification

Florian Wenzel, Jasper Snoek, Dustin Tran, Rodolphe Jenatton

Ensembles over neural network weights trained from different random initialization, known as deep ensembles, achieve state-of-the-art accuracy and calibration. The recently introduced batch ensembles provide a drop-in replacement that is more parameter efficient. In this paper, we design ensembles not only over weights, but over hyperparameters to improve the state of the art in both settings. For best performance independent of budget, we propose hyper-deep ensembles, a simple procedure that involves a random search over different hyperparameters, themselves stratified across multiple random initializations. Its strong performance highlights the benefit of combining models with both weight and hyperparameter diversity. We further propose a parameter efficient version, hyper-batch ensembles, which builds on the layer structure of batch ensembles and self-tuning networks. The computational and memory costs of our method are notably lower than typical ensembles. On image classification tasks, with MLP, LeNet, and Wide ResNet 28-10 architectures, our methodology improves upon both deep and batch ensembles.

Loss function (machine learning) · Learning to learn · Learned · Entity · Functional

September 9, 2019

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Jiawei Wu, Wenhan Xiong, William Yang Wang

From arXiv, 11 pages, 5 figures, accepted to EMNLP 2019

Many tasks in natural language processing can be viewed as multi-label classification problems. However, most of the existing models are trained with the standard cross-entropy loss function and use a fixed prediction policy (e.g., a threshold of 0.5) for all the labels, which completely ignores the complexity and dependencies among different labels. In this paper, we propose a meta-learning method to capture these complex label dependencies. More specifically, our method utilizes a meta-learner to jointly learn the training policies and prediction policies for different labels. The training policies are then used to train the classifier with the cross-entropy loss function, and the prediction policies are further implemented for prediction. Experimental results on fine-grained entity typing and text classification demonstrate that our proposed method can obtain more accurate multi-label classification results.
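The difference between a fixed prediction policy and a label-specific one can be seen in a small sketch. The probabilities and per-label thresholds below are made-up values for illustration; in the paper the prediction policies are learned by a meta-learner, which this sketch does not implement.

```python
def predict_labels(probabilities, thresholds):
    """Turn per-label probabilities into 0/1 decisions using per-label thresholds."""
    return [int(p >= t) for p, t in zip(probabilities, thresholds)]


# Hypothetical classifier outputs for three labels.
probs = [0.6, 0.4, 0.55]

# Fixed policy: the same 0.5 threshold for every label.
fixed = predict_labels(probs, [0.5] * len(probs))

# Label-specific policy: assumed thresholds, one per label.
per_label = predict_labels(probs, [0.7, 0.3, 0.5])

print(fixed, per_label)
```

The same probabilities yield different label sets under the two policies, which is the degree of freedom a learned prediction policy can exploit.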
