2020久久精品亚洲热综合,国产精品久久久久一级毛片,青草青草久热精品视频网,狠狠色狠狠色狠狠五月毛片

Recent work has uncovered close links between between classical reinforcement learning algorithms, Bayesian filtering, and Active Inference which lets us understand value functions in terms of Bayesian posteriors. An alternative, but less explored, model-free RL algorithm is the successor representation, which expresses the value function in terms of a successor matrix of expected future state occupancies. In this paper, we derive the probabilistic interpretation of the successor representation in terms of Bayesian filtering and thus design a novel active inference agent architecture utilizing successor representations instead of model-based planning. We demonstrate that active inference successor representations have significant advantages over current active inference agents in terms of planning horizon and computational cost. Moreover, we demonstrate how the successor representation agent can generalize to changing reward functions such as variants of the expected free energy.

相關內容

Agent

關注 15

Learning · Continuity · 控制器 · MoDELS · 隨機性策略 ·

2022 年 9 月 16 日

Learning Policies for Continuous Control via Transition Models

Justus Huebotter,Serge Thill,Marcel van Gerven,Pablo Lanillos

It is doubtful that animals have perfect inverse models of their limbs (e.g., what muscle contraction must be applied to every joint to reach a particular location in space). However, in robot control, moving an arm's end-effector to a target position or along a target trajectory requires accurate forward and inverse models. Here we show that by learning the transition (forward) model from interaction, we can use it to drive the learning of an amortized policy. Hence, we revisit policy optimization in relation to the deep active inference framework and describe a modular neural network architecture that simultaneously learns the system dynamics from prediction errors and the stochastic policy that generates suitable continuous control commands to reach a desired reference position. We evaluated the model by comparing it against the baseline of a linear quadratic regulator, and conclude with additional steps to take toward human-like motor control.

Analysis · 不變 · 近似 · 似然 · 向量空間 ·

2022 年 9 月 15 日

On the detrimental effect of invariances in the likelihood for variational inference

Richard Kurle,Ralf Herbrich,Tim Januschowski,Yuyang Wang,Jan Gasthaus

Variational Bayesian posterior inference often requires simplifying approximations such as mean-field parametrisation to ensure tractability. However, prior work has associated the variational mean-field approximation for Bayesian neural networks with underfitting in the case of small datasets or large model sizes. In this work, we show that invariances in the likelihood function of over-parametrised models contribute to this phenomenon because these invariances complicate the structure of the posterior by introducing discrete and/or continuous modes which cannot be well approximated by Gaussian mean-field distributions. In particular, we show that the mean-field approximation has an additional gap in the evidence lower bound compared to a purpose-built posterior that takes into account the known invariances. Importantly, this invariance gap is not constant; it vanishes as the approximation reverts to the prior. We proceed by first considering translation invariances in a linear model with a single data point in detail. We show that, while the true posterior can be constructed from a mean-field parametrisation, this is achieved only if the objective function takes into account the invariance gap. Then, we transfer our analysis of the linear model to neural networks. Our analysis provides a framework for future work to explore solutions to the invariance problem.

Continuity · Performer · 時間步 · 曲率 · 有向 ·

2022 年 9 月 15 日

Asymptotically preserving particle methods for strongly magnetizedplasmas in a torus

Francis Filbet,Luis Miguel Miguel Rodrigues

We propose and analyze a class of particle methods for the Vlasov equation with a strong external magnetic field in a torus configuration. In this regime, the time step can be subject to stability constraints related to the smallness of Larmor radius. To avoid this limitation, our approach is based on higher-order semi-implicit numerical schemes already validated on dissipative systems [3] and for magnetic fields pointing in a fixed direction [9, 10, 12]. It hinges on asymptotic insights gained in [11] at the continuous level. Thus, when the magnitude of the external magnetic field is large, this scheme provides a consistent approximation of the guiding-center system taking into account curvature and variation of the magnetic field. Finally, we carry out a theoretical proof of consistency and perform several numerical experiments that establish a solid validation of the method and its underlying concepts.

優化器 · 特化 · 控制器 · 表示 · 可辨認的 ·

2022 年 9 月 15 日

Sparsity Inducing Representations for Policy Decompositions

Ashwin Khadke,Hartmut Geyer

Policy Decomposition (PoDec) is a framework that lessens the curse of dimensionality when deriving policies to optimal control problems. For a given system representation, i.e. the state variables and control inputs describing a system, PoDec generates strategies to decompose the joint optimization of policies for all control inputs. Thereby, policies for different inputs are derived in a decoupled or cascaded fashion and as functions of some subsets of the state variables, leading to reduction in computation. However, the choice of system representation is crucial as it dictates the suboptimality of the resulting policies. We present a heuristic method to find a representation more amenable to decomposition. Our approach is based on the observation that every decomposition enforces a sparsity pattern in the resulting policies at the cost of optimality and a representation that already leads to a sparse optimal policy is likely to produce decompositions with lower suboptimalities. As the optimal policy is not known we construct a system representation that sparsifies its LQR approximation. For a simplified biped, a 4 degree-of-freedom manipulator, and a quadcopter, we discover decompositions that offer 10% reduction in trajectory costs over those identified by vanilla PoDec. Moreover, the decomposition policies produce trajectories with substantially lower costs compared to policies obtained from state-of-the-art reinforcement learning algorithms.

Learning · Analysis · 表示學習 · 泛化理論 · Seven ·

2022 年 9 月 15 日

Generalized Representations Learning for Time Series Classification

Wang Lu,Jindong Wang,Xinwei Sun,Yiqiang Chen,Xing Xie

from arxiv, Technical report; 18 pages

Time series classification is an important problem in real world. Due to its non-stationary property that the distribution changes over time, it remains challenging to build models for generalization to unseen distributions. In this paper, we propose to view the time series classification problem from the distribution perspective. We argue that the temporal complexity attributes to the unknown latent distributions within. To this end, we propose DIVERSIFY to learn generalized representations for time series classification. DIVERSIFY takes an iterative process: it first obtains the worst-case distribution scenario via adversarial training, then matches the distributions of the obtained sub-domains. We also present some theoretical insights. We conduct experiments on gesture recognition, speech commands recognition, wearable stress and affect detection, and sensor-based human activity recognition with a total of seven datasets in different settings. Results demonstrate that DIVERSIFY significantly outperforms other baselines and effectively characterizes the latent distributions by qualitative and quantitative analysis.

Learning · 語言模型化 · contrastive · 對比學習 · INFORMS ·

2022 年 9 月 15 日

CLOWER: A Pre-trained Language Model with Contrastive Learning over Word and Character Representations

Borun Chen,Hongyin Tang,Jiahao Bu,Kai Zhang,Jingang Wang,Qifan Wang,Hai-Tao Zheng,Wei Wu,Liqian Yu

from arxiv, Accepted in COLING 2022

Pre-trained Language Models (PLMs) have achieved remarkable performance gains across numerous downstream tasks in natural language understanding. Various Chinese PLMs have been successively proposed for learning better Chinese language representation. However, most current models use Chinese characters as inputs and are not able to encode semantic information contained in Chinese words. While recent pre-trained models incorporate both words and characters simultaneously, they usually suffer from deficient semantic interactions and fail to capture the semantic relation between words and characters. To address the above issues, we propose a simple yet effective PLM CLOWER, which adopts the Contrastive Learning Over Word and charactER representations. In particular, CLOWER implicitly encodes the coarse-grained information (i.e., words) into the fine-grained representations (i.e., characters) through contrastive learning on multi-grained information. CLOWER is of great value in realistic scenarios since it can be easily incorporated into any existing fine-grained based PLMs without modifying the production pipelines.Extensive experiments conducted on a range of downstream tasks demonstrate the superior performance of CLOWER over several state-of-the-art baselines.

Networks · Networking · 學成 · 可辨認的 · 可理解性 ·

2021 年 5 月 17 日

The Confluence of Networks, Games and Learning

Tao Li,Guanze Peng,Quanyan Zhu,Tamer Basar

from arxiv, The manuscript has been submitted to IEEE control system magazine under review, as part of the special issue "Distributed Nash Equilibrium Seeking over Networks"

Recent years have witnessed significant advances in technologies and services in modern network applications, including smart grid management, wireless communication, cybersecurity as well as multi-agent autonomous systems. Considering the heterogeneous nature of networked entities, emerging network applications call for game-theoretic models and learning-based approaches in order to create distributed network intelligence that responds to uncertainties and disruptions in a dynamic or an adversarial environment. This paper articulates the confluence of networks, games and learning, which establishes a theoretical underpinning for understanding multi-agent decision-making over networks. We provide an selective overview of game-theoretic learning algorithms within the framework of stochastic approximation theory, and associated applications in some representative contexts of modern network systems, such as the next generation wireless communication networks, the smart grid and distributed machine learning. In addition to existing research works on game-theoretic learning over networks, we highlight several new angles and research endeavors on learning in games that are related to recent developments in artificial intelligence. Some of the new angles extrapolate from our own research interests. The overall objective of the paper is to provide the reader a clear picture of the strengths and challenges of adopting game-theoretic learning methods within the context of network systems, and further to identify fruitful future research directions on both theoretical and applied studies.

有向 · Extensibility · Processing（編程語言） · MoDELS · 編譯器 ·

2020 年 12 月 16 日

Communicative Message Passing for Inductive Relation Reasoning

Sijie Mai,Shuangjia Zheng,Yuedong Yang,Haifeng Hu

from arxiv, Accepted by AAAI-2021

Relation prediction for knowledge graphs aims at predicting missing relationships between entities. Despite the importance of inductive relation prediction, most previous works are limited to a transductive setting and cannot process previously unseen entities. The recent proposed subgraph-based relation reasoning models provided alternatives to predict links from the subgraph structure surrounding a candidate triplet inductively. However, we observe that these methods often neglect the directed nature of the extracted subgraph and weaken the role of relation information in the subgraph modeling. As a result, they fail to effectively handle the asymmetric/anti-symmetric triplets and produce insufficient embeddings for the target triplets. To this end, we introduce a \textbf{C}\textbf{o}mmunicative \textbf{M}essage \textbf{P}assing neural network for \textbf{I}nductive re\textbf{L}ation r\textbf{E}asoning, \textbf{CoMPILE}, that reasons over local directed subgraph structures and has a vigorous inductive bias to process entity-independent semantic relations. In contrast to existing models, CoMPILE strengthens the message interactions between edges and entitles through a communicative kernel and enables a sufficient flow of relation information. Moreover, we demonstrate that CoMPILE can naturally handle asymmetric/anti-symmetric relations without the need for explosively increasing the number of model parameters by extracting the directed enclosing subgraphs. Extensive experiments show substantial performance gains in comparison to state-of-the-art methods on commonly used benchmark datasets with variant inductive settings.

結構化學習 · 圖 · 學成 · 匯聚 · Neural Networks ·

2019 年 11 月 14 日

Hierarchical Graph Pooling with Structure Learning

Zhen Zhang,Jiajun Bu,Martin Ester,Jianfeng Zhang,Chengwei Yao,Zhi Yu,Can Wang

from arxiv, Accepted to AAAI-2020; Code is available at //github.com/cszhangzhen/HGP-SL

Graph Neural Networks (GNNs), which generalize deep neural networks to graph-structured data, have drawn considerable attention and achieved state-of-the-art performance in numerous graph related tasks. However, existing GNN models mainly focus on designing graph convolution operations. The graph pooling (or downsampling) operations, that play an important role in learning hierarchical representations, are usually overlooked. In this paper, we propose a novel graph pooling operator, called Hierarchical Graph Pooling with Structure Learning (HGP-SL), which can be integrated into various graph neural network architectures. HGP-SL incorporates graph pooling and structure learning into a unified module to generate hierarchical representations of graphs. More specifically, the graph pooling operation adaptively selects a subset of nodes to form an induced subgraph for the subsequent layers. To preserve the integrity of graph's topological information, we further introduce a structure learning mechanism to learn a refined graph structure for the pooled graph at each layer. By combining HGP-SL operator with graph neural networks, we perform graph level representation learning with focus on graph classification task. Experimental results on six widely used benchmarks demonstrate the effectiveness of our proposed model.

分布式機器學習 · Machine Learning · 學成 · Storage · 優化器 ·

2019 年 9 月 18 日

Distributed Machine Learning on Mobile Devices: A Survey

Renjie Gu,Shuo Yang,Fan Wu

In recent years, mobile devices have gained increasingly development with stronger computation capability and larger storage. Some of the computation-intensive machine learning and deep learning tasks can now be run on mobile devices. To take advantage of the resources available on mobile devices and preserve users' privacy, the idea of mobile distributed machine learning is proposed. It uses local hardware resources and local data to solve machine learning sub-problems on mobile devices, and only uploads computation results instead of original data to contribute to the optimization of the global model. This architecture can not only relieve computation and storage burden on servers, but also protect the users' sensitive information. Another benefit is the bandwidth reduction, as various kinds of local data can now participate in the training process without being uploaded to the server. In this paper, we provide a comprehensive survey on recent studies of mobile distributed machine learning. We survey a number of widely-used mobile distributed machine learning methods. We also present an in-depth discussion on the challenges and future directions in this area. We believe that this survey can demonstrate a clear overview of mobile distributed machine learning and provide guidelines on applying mobile distributed machine learning to real applications.