We study the performance of the gradient play algorithm for stochastic games (SGs), where each agent tries to maximize its own total discounted reward by making decisions independently based on current state information which is shared between agents. Policies are directly parameterized by the probability of choosing a certain action at a given state. We show that Nash equilibria (NEs) and first-order stationary policies are equivalent in this setting, and give a local convergence rate around strict NEs. Further, for a subclass of SGs called Markov potential games (which includes the setting with identical rewards as an important special case), we design a sample-based reinforcement learning algorithm and give a non-asymptotic global convergence rate analysis for both exact gradient play and our sample-based learning algorithm. Our result shows that the number of iterations to reach an $\epsilon$-NE scales linearly, instead of exponentially, with the number of agents. Local geometry and local stability are also considered, where we prove that strict NEs are local maxima of the total potential function and fully-mixed NEs are saddle points.
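To make the update rule concrete, the following is a minimal sketch of gradient play with direct policy parameterization, restricted to a single-state, two-agent game with identical rewards (the simplest instance of a Markov potential game). The reward matrix, step size, starting policies, and the function names `project_simplex` and `gradient_play` are illustrative assumptions, not taken from the paper.

```python
def project_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        if u_k - candidate > 0:
            theta = candidate  # keep the threshold of the last valid prefix
    return [max(x - theta, 0.0) for x in v]


def gradient_play(reward, x, y, step=0.1, iters=200):
    """Simultaneous projected gradient ascent for two agents sharing `reward`."""
    n, m = len(reward), len(reward[0])
    for _ in range(iters):
        # Each agent differentiates the shared value x^T R y w.r.t. its own policy,
        # using the other agent's current policy (both gradients use old x and y).
        grad_x = [sum(reward[i][j] * y[j] for j in range(m)) for i in range(n)]
        grad_y = [sum(reward[i][j] * x[i] for i in range(n)) for j in range(m)]
        x = project_simplex([x[i] + step * grad_x[i] for i in range(n)])
        y = project_simplex([y[j] + step * grad_y[j] for j in range(m)])
    return x, y


# Hypothetical coordination game: both agents receive reward[a1][a2].
reward = [[1.0, 0.0], [0.0, 2.0]]
x, y = gradient_play(reward, [0.4, 0.6], [0.4, 0.6])
print(x, y)
```

In this toy game both pure coordination outcomes are strict NEs, so which one the iterates approach depends on the starting policies; the sketch illustrates the local behaviour only and says nothing about the paper's rates.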

Related content

Stationary

ILP · Learning · Inductive logic programming · Identifiable · Reducible

January 29, 2024

Learning logic programs by finding minimal unsatisfiable subprograms

Andrew Cropper,Céline Hocquette

The goal of inductive logic programming (ILP) is to search for a logic program that generalises training examples and background knowledge. We introduce an ILP approach that identifies minimal unsatisfiable subprograms (MUSPs). We show that finding MUSPs allows us to efficiently and soundly prune the search space. Our experiments on multiple domains, including program synthesis and game playing, show that our approach can reduce learning times by 99%.

Anomaly detection · Examples · Applied statistics

January 29, 2024

anomaly : Detection of Anomalous Structure in Time Series Data

Alex Fisch,Daniel Grose,Idris A. Eckley,Paul Fearnhead,Lawrence Bardwell

from arxiv, 24 pages, 6 figures. An R package that implements the methods discussed in the paper can be obtained from The Comprehensive R Archive Network (CRAN) via //cran.r-project.org/web/packages/anomaly/index.html

One of the contemporary challenges in anomaly detection is the ability to detect, and differentiate between, both point and collective anomalies within a data sequence or time series. The anomaly package has been developed to provide users with a choice of anomaly detection methods and, in particular, provides an implementation of the recently proposed Collective And Point Anomaly family of anomaly detection algorithms. This article describes the methods implemented whilst also highlighting their application to simulated data as well as real data examples contained in the package.

Logistic regression · Ridge regression · MoDELS · Hyperparameters · Scaling

January 28, 2024

Prevalidated ridge regression is a highly-efficient drop-in replacement for logistic regression for high-dimensional data

Angus Dempster,Geoffrey I. Webb,Daniel F. Schmidt

from arxiv, 13 pages, 11 figures

Logistic regression is a ubiquitous method for probabilistic classification. However, the effectiveness of logistic regression depends upon careful and relatively computationally expensive tuning, especially for the regularisation hyperparameter, and especially in the context of high-dimensional data. We present a prevalidated ridge regression model that closely matches logistic regression in terms of classification error and log-loss, particularly for high-dimensional data, while being significantly more computationally efficient and having effectively no hyperparameters beyond regularisation. We scale the coefficients of the model so as to minimise log-loss for a set of prevalidated predictions derived from the estimated leave-one-out cross-validation error. This exploits quantities already computed in the course of fitting the ridge regression model in order to find the scaling parameter with nominal additional computational expense.
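The idea of reusing quantities from the ridge fit can be illustrated with a small sketch. It assumes a single feature, no intercept, labels in {-1, +1}, a fixed regularisation value, and a simple grid search for the scaling parameter; none of these choices, nor the function names, come from the paper. The leave-one-out predictions use the standard hat-matrix shortcut, so no refitting is needed.

```python
import math


def ridge_loo_predictions(x, y, lam):
    """Single-feature ridge fit (no intercept) plus leave-one-out predictions."""
    s = sum(v * v for v in x) + lam
    beta = sum(a * b for a, b in zip(x, y)) / s
    loo = []
    for x_i, y_i in zip(x, y):
        leverage = x_i * x_i / s            # diagonal entry of the hat matrix
        residual = y_i - beta * x_i
        loo.append(y_i - residual / (1.0 - leverage))  # LOO shortcut, no refit
    return beta, loo


def log_loss(scale, scores, labels):
    """Mean logistic loss for labels in {-1, +1} and scores multiplied by `scale`."""
    total = 0.0
    for s_i, t_i in zip(scores, labels):
        z = t_i * scale * s_i
        total += max(-z, 0.0) + math.log1p(math.exp(-abs(z)))  # stable log(1 + e^-z)
    return total / len(scores)


def fit_scale(scores, labels, grid=None):
    """Pick the scale that minimises log-loss on the prevalidated scores (grid search)."""
    grid = grid or [10 ** (k / 20.0) for k in range(-40, 41)]
    return min(grid, key=lambda c: log_loss(c, scores, labels))


# Hypothetical toy data with two noisy labels.
x = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
y = [-1.0, -1.0, -1.0, 1.0, -1.0, 1.0, 1.0, 1.0]
beta, loo = ridge_loo_predictions(x, y, lam=1.0)
scale = fit_scale(loo, y)
scaled_beta = scale * beta  # coefficient used for probabilistic predictions
print(beta, scale, scaled_beta)
```

Fitting the scale on leave-one-out predictions rather than in-sample predictions is the point of the sketch: each score comes from a model that never saw its own label.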

Finite differences · Time step · Regularization term · BASIC · Examples

January 26, 2024

An explicit-implicit Generalized Finite Difference scheme for a parabolic-elliptic density-suppressed motility system

Federico Herrero-Hervás

In this work, a Generalized Finite Difference (GFD) scheme is presented for effectively computing the numerical solution of a parabolic-elliptic system modelling a bacterial strain with density-suppressed motility. The GFD method is a meshless method known for its simplicity for solving non-linear boundary value problems over irregular geometries. The paper first introduces the basic elements of the GFD method, and then an explicit-implicit scheme is derived. The convergence of the method is proven under a bound for the time step, and an algorithm is provided for its computational implementation. Finally, some examples are considered comparing the results obtained with a regular mesh and an irregular cloud of points.

Statistic · Controller · Bonferroni correction · Conformer · Correlation coefficient

January 25, 2024

A powerful rank-based correction to multiple testing under positive dependency

Alexander Timans,Christoph-Nikolas Straehle,Kaspar Sakmann,Eric Nalisnick

from arxiv, 12 pages, 3 figures; Note: We have been made aware of our proposal being highly related and/or identical to the Westfall-Young multiple testing procedure - we are currently investigating this connection

We develop a novel multiple hypothesis testing correction with family-wise error rate (FWER) control that efficiently exploits positive dependencies between potentially correlated statistical hypothesis tests. Our proposed algorithm $\texttt{max-rank}$ is conceptually straightforward, relying on the use of a $\max$-operator in the rank domain of computed test statistics. We compare our approach to the frequently employed Bonferroni correction, theoretically and empirically demonstrating its superiority over Bonferroni in the case of existing positive dependency, and its equivalence otherwise. Our advantage over Bonferroni increases as the number of tests rises, and we maintain high statistical power whilst ensuring FWER control. We specifically frame our algorithm in the context of parallel permutation testing, a scenario that arises in our primary application of conformal prediction, a recently popularized approach for quantifying uncertainty in complex predictive settings.

Regularization term · Scenario · Analysis · Range · Regularization

January 25, 2024

A regularized variance-reduced modified extragradient method for stochastic hierarchical games

Shisheng Cui,Uday V. Shanbhag,Mathias Staudigl

from arxiv, Largely revised version, added application to virtual power plants, submitted for publication

We consider an N-player hierarchical game in which the i-th player's objective comprises an expectation-valued term, parametrized by rival decisions, and a hierarchical term. Such a framework allows for capturing a broad range of stochastic hierarchical optimization problems, Stackelberg equilibrium problems, and leader-follower games. We develop an iteratively regularized and smoothed variance-reduced modified extragradient framework for iteratively approaching hierarchical equilibria in a stochastic setting. We equip our analysis with rate statements, complexity guarantees, and almost-sure convergence results. We then extend these statements to settings where the lower-level problem is solved inexactly and provide the corresponding rate and complexity statements. Our model framework encompasses many game-theoretic equilibrium problems studied in the context of power markets. We present a realistic application to virtual power plants, emphasizing the role of hierarchical decision making and regularization.
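For readers unfamiliar with the extragradient building block, here is a minimal sketch of the plain deterministic extragradient step on the toy two-player zero-sum game min_x max_y x·y. It is not the authors' regularized, smoothed, variance-reduced method and has no hierarchical term; the game, step size, and function names are assumptions chosen only to show why the extrapolation step matters.

```python
def operator(x, y):
    """Game operator for min_x max_y x*y: (dL/dx, -dL/dy) with L = x*y."""
    return y, -x


def gradient_step(x, y, step):
    """Plain simultaneous gradient descent-ascent step."""
    gx, gy = operator(x, y)
    return x - step * gx, y - step * gy


def extragradient_step(x, y, step):
    """Extrapolate to a midpoint, then update the original point with the midpoint operator."""
    gx, gy = operator(x, y)
    x_half, y_half = x - step * gx, y - step * gy
    gx, gy = operator(x_half, y_half)
    return x - step * gx, y - step * gy


def run(step_fn, x=1.0, y=1.0, step=0.5, iters=100):
    """Iterate `step_fn` and return the distance to the equilibrium (0, 0)."""
    for _ in range(iters):
        x, y = step_fn(x, y, step)
    return (x * x + y * y) ** 0.5


print("gradient descent-ascent:", run(gradient_step))
print("extragradient:", run(extragradient_step))
```

On this bilinear game the plain step spirals away from the equilibrium while the extragradient step contracts toward it, which is the usual motivation for using extragradient-type schemes in equilibrium problems.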

Rank · Principle · Learning · search engine · INFORMS

January 25, 2024

The Search for Stability: Learning Dynamics of Strategic Publishers with Initial Documents

Omer Madmon,Idan Pipano,Itamar Reinman,Moshe Tennenholtz

We study a game-theoretic information retrieval model in which strategic publishers aim to maximize their chances of being ranked first by the search engine while maintaining the integrity of their original documents. We show that the commonly used Probability Ranking Principle (PRP) ranking scheme results in an unstable environment where games often fail to reach pure Nash equilibrium. We propose the Relative Ranking Principle (RRP) as an alternative ranking principle and introduce two families of ranking functions that are instances of the RRP. We provide both theoretical and empirical evidence that these methods lead to a stable search ecosystem, by providing positive results on the learning dynamics convergence. We also define the publishers' and users' welfare, demonstrate a possible publisher-user trade-off, and provide means for a search system designer to control it. Finally, we show how instability harms long-term users' welfare.

MoDELS · Monte Carlo · Samples · Joint distribution · Autoregressive generative model

January 25, 2024

MCCE: Monte Carlo sampling of realistic counterfactual explanations

Annabelle Redelmeier,Martin Jullum,Kjersti Aas,Anders Løland

We introduce MCCE: Monte Carlo sampling of valid and realistic Counterfactual Explanations for tabular data, a novel counterfactual explanation method that generates on-manifold, actionable and valid counterfactuals by modeling the joint distribution of the mutable features given the immutable features and the decision. Unlike other on-manifold methods that tend to rely on variational autoencoders and have strict prediction model and data requirements, MCCE handles any type of prediction model and categorical features with more than two levels. MCCE first models the joint distribution of the features and the decision with an autoregressive generative model where the conditionals are estimated using decision trees. Then, it samples a large set of observations from this model, and finally, it removes the samples that do not obey certain criteria. We compare MCCE with a range of state-of-the-art on-manifold counterfactual methods using four well-known data sets and show that MCCE outperforms these methods on all common performance metrics and speed. In particular, including the decision in the modeling process improves the efficiency of the method substantially.

MoDELS · Mathematics · Functional · Applied statistics

January 19, 2024

New model for the decision-making and the game style in football

Brahim Boudine

We propose a new mathematical model for the decision-making of players in football (soccer) and the efficiency of the game style. Our approach is based on $4$-networks, a mathematical concept that we introduce. The decision of players is expressed by a mathematical function depending on the game style chosen by the coach. Moreover, we measure the efficiency of the game style by a sequence of $4$-networks.

MoDELS · Better · Vision · Processing (programming language) · Natural language processing

February 21, 2022

VLP: A Survey on Vision-Language Pre-training

Feilong Chen,Duzhen Zhang,Minglun Han,Xiuyi Chen,Jing Shi,Shuang Xu,Bo Xu

from arxiv, A Survey on Vision-Language Pre-training

In the past few years, the emergence of pre-training models has brought uni-modal fields such as computer vision (CV) and natural language processing (NLP) to a new era. A substantial body of work has shown that such models benefit downstream uni-modal tasks and avoid the need to train a new model from scratch. So can such pre-trained models be applied to multi-modal tasks? Researchers have explored this problem and made significant progress. This paper surveys recent advances and new frontiers in vision-language pre-training (VLP), including image-text and video-text pre-training. To give readers a better overall grasp of VLP, we first review its recent advances from five aspects: feature extraction, model architecture, pre-training objectives, pre-training datasets, and downstream tasks. Then, we summarize the specific VLP models in detail. Finally, we discuss the new frontiers in VLP. To the best of our knowledge, this is the first survey on VLP. We hope that this survey can shed light on future research in the VLP field.