蜜芽亚洲精品国产品国语在线试看_欧美PLAY视频海量性欧美_正在播放一区二区欧美黑白配_亚洲国产精品激情在线观看_免费AV高清在线女人本色_美女精品一级一区二区三区_欧美z0zo人禽交播放

Lukasz Kaiser,Mohammad Babaeizadeh,Piotr Milos,Blazej Osinski,Roy H Campbell,Konrad Czechowski,Dumitru Erhan,Chelsea Finn,Piotr Kozakowski,Sergey Levine,Afroz Mohiuddin,Ryan Sepassi,George Tucker,Henryk Michalewski

Model-free reinforcement learning (RL) can be used to learn effective policies for complex tasks, such as Atari games, even from image observations. However, this typically requires very large amounts of interaction -- substantially more, in fact, than a human would need to learn the same games. How can people learn so quickly? Part of the answer may be that people can learn how the game works and predict which actions will lead to desirable outcomes. In this paper, we explore how video prediction models can similarly enable agents to solve Atari games with fewer interactions than model-free methods. We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting. Our experiments evaluate SimPLe on a range of Atari games in low data regime of 100k interactions between the agent and the environment, which corresponds to two hours of real-time play. In most games SimPLe outperforms state-of-the-art model-free algorithms, in some games by over an order of magnitude.

相關內容

Atari

關注 0

Learning · 聯邦學習 · 得分 · 容差 · ReLU ·

2024 年 5 月 14 日

Byzantine-Resilient Secure Aggregation for Federated Learning Without Privacy Compromises

Yue Xia,Christoph Hofmeister,Maximilian Egger,Rawad Bitar

Federated learning (FL) shows great promise in large scale machine learning, but brings new risks in terms of privacy and security. We propose ByITFL, a novel scheme for FL that provides resilience against Byzantine users while keeping the users' data private from the federator and private from other users. The scheme builds on the preexisting non-private FLTrust scheme, which tolerates malicious users through trust scores (TS) that attenuate or amplify the users' gradients. The trust scores are based on the ReLU function, which we approximate by a polynomial. The distributed and privacy-preserving computation in ByITFL is designed using a combination of Lagrange coded computing, verifiable secret sharing and re-randomization steps. ByITFL is the first Byzantine resilient scheme for FL with full information-theoretic privacy.

泛化理論 · 不變 · 輸入空間 · Learning · 穩健性 ·

2024 年 5 月 14 日

Cross-Domain Feature Augmentation for Domain Generalization

Yingnan Liu,Yingtian Zou,Rui Qiao,Fusheng Liu,Mong Li Lee,Wynne Hsu

from arxiv, Accepted to the 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024); Code is available at //github.com/NancyQuris/XDomainMix

Domain generalization aims to develop models that are robust to distribution shifts. Existing methods focus on learning invariance across domains to enhance model robustness, and data augmentation has been widely used to learn invariant predictors, with most methods performing augmentation in the input space. However, augmentation in the input space has limited diversity whereas in the feature space is more versatile and has shown promising results. Nonetheless, feature semantics is seldom considered and existing feature augmentation methods suffer from a limited variety of augmented features. We decompose features into class-generic, class-specific, domain-generic, and domain-specific components. We propose a cross-domain feature augmentation method named XDomainMix that enables us to increase sample diversity while emphasizing the learning of invariant representations to achieve domain generalization. Experiments on widely used benchmark datasets demonstrate that our proposed method is able to achieve state-of-the-art performance. Quantitative analysis indicates that our feature augmentation approach facilitates the learning of effective models that are invariant across different domains.

優化器 · tuning · Learning · 最優化 · 超參數 ·

2024 年 5 月 10 日

LiveTune: Dynamic Parameter Tuning for Feedback-Driven Optimization

Soheil Zibakhsh Shabgahi,Nojan Sheybani,Aiden Tabrizi,Farinaz Koushanfar

Feedback-driven optimization, such as traditional machine learning training, is a static process that lacks real-time adaptability of hyperparameters. Tuning solutions for optimization require trial and error paired with checkpointing and schedulers, in many cases feedback from the algorithm is overlooked. Adjusting hyperparameters during optimization usually requires the program to be restarted, wasting utilization and time, while placing unnecessary strain on memory and processors. We present LiveTune, a novel framework allowing real-time parameter adjustment of optimization loops through LiveVariables. Live Variables allow for continuous feedback-driven optimization by storing parameters on designated ports on the system, allowing them to be dynamically adjusted. Extensive evaluations of our framework on standard machine learning training pipelines show saving up to 60 seconds and 5.4 Kilojoules of energy per hyperparameter change. We also show the feasibility and value of LiveTune in a reinforcement learning application where the users change the dynamics of the reward structure while the agent is learning showing 5x improvement over the baseline. Finally, we outline a fully automated workflow to provide end-to-end, unsupervised feedback-driven optimization.

多任務學習 · 學成 · 可理解性 · INFORMS · 泛化理論 ·

2022 年 3 月 28 日

Multi-Task Learning for Visual Scene Understanding

Simon Vandenhende

from arxiv, PhD Thesis

Despite the recent progress in deep learning, most approaches still go for a silo-like solution, focusing on learning each task in isolation: training a separate neural network for each individual task. Many real-world problems, however, call for a multi-modal approach and, therefore, for multi-tasking models. Multi-task learning (MTL) aims to leverage useful information across tasks to improve the generalization capability of a model. This thesis is concerned with multi-task learning in the context of computer vision. First, we review existing approaches for MTL. Next, we propose several methods that tackle important aspects of multi-task learning. The proposed methods are evaluated on various benchmarks. The results show several advances in the state-of-the-art of multi-task learning. Finally, we discuss several possibilities for future work.

MoDELS · 語言模型化 · 任務對話系統 · 話題 · Vision ·

2022 年 3 月 26 日

A Roadmap for Big Model

Sha Yuan,Hanyu Zhao,Shuai Zhao,Jiahong Leng,Yangxiao Liang,Xiaozhi Wang,Jifan Yu,Xin Lv,Zhou Shao,Jiaao He,Yankai Lin,Xu Han,Zhenghao Liu,Ning Ding,Yongming Rao,Yizhao Gao,Liang Zhang,Ming Ding,Cong Fang,Yisen Wang,Mingsheng Long,Jing Zhang,Yinpeng Dong,Tianyu Pang,Peng Cui,Lingxiao Huang,Zheng Liang,Huawei Shen,Hui Zhang,Quanshi Zhang,Qingxiu Dong,Zhixing Tan,Mingxuan Wang,Shuo Wang,Long Zhou,Haoran Li,Junwei Bao,Yingwei Pan,Weinan Zhang,Zhou Yu,Rui Yan,Chence Shi,Minghao Xu,Zuobai Zhang,Guoqiang Wang,Xiang Pan,Mengjie Li,Xiaoyu Chu,Zijun Yao,Fangwei Zhu,Shulin Cao,Weicheng Xue,Zixuan Ma,Zhengyan Zhang,Shengding Hu,Yujia Qin,Chaojun Xiao,Zheni Zeng,Ganqu Cui,Weize Chen,Weilin Zhao,Yuan Yao,Peng Li,Wenzhao Zheng,Wenliang Zhao,Ziyi Wang,Borui Zhang,Nanyi Fei,Anwen Hu,Zenan Ling,Haoyang Li,Boxi Cao,Xianpei Han,Weidong Zhan,Baobao Chang,Hao Sun,Jiawen Deng,Juanzi Li,Lei Hou,Xigang Cao,Jidong Zhai,Zhiyuan Liu,Maosong Sun,Jiwen Lu,Zhiwu Lu,Qin Jin,Ruihua Song,Ji-Rong Wen,Zhouchen Lin,Liwei Wang,Hang Su,Jun Zhu,Zhifang Sui,Jiajun Zhang,Yang Liu,Xiaodong He,Minlie Huang,Jian Tang,Jie Tang

With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks becomes a popular paradigm. Researchers have achieved various outcomes in the construction of BMs and the BM application in many fields. At present, there is a lack of research work that sorts out the overall progress of BMs and guides the follow-up research. In this paper, we cover not only the BM technologies themselves but also the prerequisites for BM training and applications with BMs, dividing the BM review into four parts: Resource, Models, Key Technologies and Application. We introduce 16 specific BM-related topics in those four parts, they are Data, Knowledge, Computing System, Parallel Training System, Language Model, Vision Model, Multi-modal Model, Theory&Interpretability, Commonsense Reasoning, Reliability&Security, Governance, Evaluation, Machine Translation, Text Generation, Dialogue and Protein Research. In each topic, we summarize clearly the current studies and propose some future research directions. At the end of this paper, we conclude the further development of BMs in a more general view.

循環網絡 · Networking · 數據生成過程 · 聯合分布 · state-of-the-art ·

2020 年 12 月 24 日

Memory-Gated Recurrent Networks

Yaquan Zhang,Qi Wu,Nanbo Peng,Min Dai,Jing Zhang,Hu Wang

from arxiv, This paper was accepted and will be published in the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)

The essence of multivariate sequential learning is all about how to extract dependencies in data. These data sets, such as hourly medical records in intensive care units and multi-frequency phonetic time series, often time exhibit not only strong serial dependencies in the individual components (the "marginal" memory) but also non-negligible memories in the cross-sectional dependencies (the "joint" memory). Because of the multivariate complexity in the evolution of the joint distribution that underlies the data generating process, we take a data-driven approach and construct a novel recurrent network architecture, termed Memory-Gated Recurrent Networks (mGRN), with gates explicitly regulating two distinct types of memories: the marginal memory and the joint memory. Through a combination of comprehensive simulation studies and empirical experiments on a range of public datasets, we show that our proposed mGRN architecture consistently outperforms state-of-the-art architectures targeting multivariate time series.

entity · 小樣本學習 · 注意力機制 · 圖 · Networking ·

2020 年 10 月 19 日

Adaptive Attentional Network for Few-Shot Knowledge Graph Completion

Jiawei Sheng,Shu Guo,Zhenyu Chen,Juwei Yue,Lihong Wang,Tingwen Liu,Hongbo Xu

from arxiv, 11 pages, 3 figures

Few-shot Knowledge Graph (KG) completion is a focus of current research, where each task aims at querying unseen facts of a relation given its few-shot reference entity pairs. Recent attempts solve this problem by learning static representations of entities and references, ignoring their dynamic properties, i.e., entities may exhibit diverse roles within task relations, and references may make different contributions to queries. This work proposes an adaptive attentional network for few-shot KG completion by learning adaptive entity and reference representations. Specifically, entities are modeled by an adaptive neighbor encoder to discern their task-oriented roles, while references are modeled by an adaptive query-aware aggregator to differentiate their contributions. Through the attention mechanism, both entities and references can capture their fine-grained semantic meanings, and thus render more expressive representations. This will be more predictive for knowledge acquisition in the few-shot scenario. Evaluation in link prediction on two public datasets shows that our approach achieves new state-of-the-art results with different few-shot sizes.

INFORMS · 表示學習 · Extensibility · 模型評估 · 圖 ·

2019 年 12 月 28 日

Rule-Guided Compositional Representation Learning on Knowledge Graphs

Guanglin Niu,Yongfei Zhang,Bo Li,Peng Cui,Si Liu,Jingyang Li,Xiaowei Zhang

from arxiv, The full version of a paper accepted to AAAI 2020

Representation learning on a knowledge graph (KG) is to embed entities and relations of a KG into low-dimensional continuous vector spaces. Early KG embedding methods only pay attention to structured information encoded in triples, which would cause limited performance due to the structure sparseness of KGs. Some recent attempts consider paths information to expand the structure of KGs but lack explainability in the process of obtaining the path representations. In this paper, we propose a novel Rule and Path-based Joint Embedding (RPJE) scheme, which takes full advantage of the explainability and accuracy of logic rules, the generalization of KG embedding as well as the supplementary semantic structure of paths. Specifically, logic rules of different lengths (the number of relations in rule body) in the form of Horn clauses are first mined from the KG and elaborately encoded for representation learning. Then, the rules of length 2 are applied to compose paths accurately while the rules of length 1 are explicitly employed to create semantic associations among relations and constrain relation embeddings. Besides, the confidence level of each rule is also considered in optimization to guarantee the availability of applying the rule to representation learning. Extensive experimental results illustrate that RPJE outperforms other state-of-the-art baselines on KG completion task, which also demonstrate the superiority of utilizing logic rules as well as paths for improving the accuracy and explainability of representation learning.

小樣本學習 · 學成 · 示例 · 泛函 · 訓練實例 ·

2018 年 12 月 10 日

Learning Embedding Adaptation for Few-Shot Learning

Han-Jia Ye,Hexiang Hu,De-Chuan Zhan,Fei Sha

Learning with limited data is a key challenge for visual recognition. Few-shot learning methods address this challenge by learning an instance embedding function from seen classes and apply the function to instances from unseen classes with limited labels. This style of transfer learning is task-agnostic: the embedding function is not learned optimally discriminative with respect to the unseen classes, where discerning among them is the target task. In this paper, we propose a novel approach to adapt the embedding model to the target classification task, yielding embeddings that are task-specific and are discriminative. To this end, we employ a type of self-attention mechanism called Transformer to transform the embeddings from task-agnostic to task-specific by focusing on relating instances from the test instances to the training instances in both seen and unseen classes. Our approach also extends to both transductive and generalized few-shot classification, two important settings that have essential use cases. We verify the effectiveness of our model on two standard benchmark few-shot classification datasets --- MiniImageNet and CUB, where our approach demonstrates state-of-the-art empirical performance.

SOFT · Continuity · Better · Performer · state-of-the-art ·

2018 年 4 月 25 日

Multiagent Soft Q-Learning

Ermo Wei,Drew Wicke,David Freelan,Sean Luke

from arxiv, Accepted in AAAI 18 Spring Symposium

Policy gradient methods are often applied to reinforcement learning in continuous multiagent games. These methods perform local search in the joint-action space, and as we show, they are susceptable to a game-theoretic pathology known as relative overgeneralization. To resolve this issue, we propose Multiagent Soft Q-learning, which can be seen as the analogue of applying Q-learning to continuous controls. We compare our method to MADDPG, a state-of-the-art approach, and show that our method achieves better coordination in multiagent cooperative tasks, converging to better local optima in the joint action space.