午夜剧场成年免费视,人妻超清中文字幕乱码一区,午夜精品人妻一区二区三区

from arxiv, [v1]: Combined main text of 6 pages, 3 figures and supplemental materials of 7 pages, 1 figure. [v2]: Accepted version, Phys. Rev. Lett. Combined main text of 8 pages, 4 figures and supplemental materials of 5 pages

We present a framework to integrate tensor network (TN) methods with reinforcement learning (RL) for solving dynamical optimisation tasks. We consider the RL actor-critic method, a model-free approach for solving RL problems, and introduce TNs as the approximators for its policy and value functions. Our "actor-critic with tensor networks" (ACTeN) method is especially well suited to problems with large and factorisable state and action spaces. As an illustration of the applicability of ACTeN we solve the exponentially hard task of sampling rare trajectories in two paradigmatic stochastic models, the East model of glasses and the asymmetric simple exclusion process (ASEP), the latter being particularly challenging to other methods due to the absence of detailed balance. With substantial potential for further integration with the vast array of existing RL methods, the approach introduced here is promising both for applications in physics and to multi-agent RL problems more generally.

相關內容

On the Correspondence of Non-flat Assumption-based Argumentation and Logic Programming with Negation as Failure in the Head

Anna Rapberger,Markus Ulbricht,Francesca Toni

The relation between (a fragment of) assumption-based argumentation (ABA) and logic programs (LPs) under stable model semantics is well-studied. However, for obtaining this relation, the ABA framework needs to be restricted to being flat, i.e., a fragment where the (defeasible) assumptions can never be entailed, only assumed to be true or false. Here, we remove this restriction and show a correspondence between non-flat ABA and LPs with negation as failure in their head. We then extend this result to so-called set-stable ABA semantics, originally defined for the fragment of non-flat ABA called bipolar ABA. We showcase how to define set-stable semantics for LPs with negation as failure in their head and show the correspondence to set-stable ABA semantics.

Transformer-XL · Learning · 機器人 · Networking · Extensibility ·

2024 年 5 月 24 日

Transformer-XL for Long Sequence Tasks in Robotic Learning from Demonstration

Gao Tianci

from arxiv, 11 pages,4 figures

This paper presents an innovative application of Transformer-XL for long sequence tasks in robotic learning from demonstrations (LfD). The proposed framework effectively integrates multi-modal sensor inputs, including RGB-D images, LiDAR, and tactile sensors, to construct a comprehensive feature vector. By leveraging the advanced capabilities of Transformer-XL, particularly its attention mechanism and position encoding, our approach can handle the inherent complexities and long-term dependencies of multi-modal sensory data. The results of an extensive empirical evaluation demonstrate significant improvements in task success rates, accuracy, and computational efficiency compared to conventional methods such as Long Short-Term Memory (LSTM) networks and Convolutional Neural Networks (CNNs). The findings indicate that the Transformer-XL-based framework not only enhances the robot's perception and decision-making abilities but also provides a robust foundation for future advancements in robotic learning from demonstrations.

INFORMS · 潛在 · 估計/估計量 · 互信息 · Learning ·

2024 年 5 月 24 日

Revisiting Counterfactual Regression through the Lens of Gromov-Wasserstein Information Bottleneck

Hao Yang,Zexu Sun,Hongteng Xu,Xu Chen

from arxiv, 19 pages

As a promising individualized treatment effect (ITE) estimation method, counterfactual regression (CFR) maps individuals' covariates to a latent space and predicts their counterfactual outcomes. However, the selection bias between control and treatment groups often imbalances the two groups' latent distributions and negatively impacts this method's performance. In this study, we revisit counterfactual regression through the lens of information bottleneck and propose a novel learning paradigm called Gromov-Wasserstein information bottleneck (GWIB). In this paradigm, we learn CFR by maximizing the mutual information between covariates' latent representations and outcomes while penalizing the kernelized mutual information between the latent representations and the covariates. We demonstrate that the upper bound of the penalty term can be implemented as a new regularizer consisting of $i)$ the fused Gromov-Wasserstein distance between the latent representations of different groups and $ii)$ the gap between the transport cost generated by the model and the cross-group Gromov-Wasserstein distance between the latent representations and the covariates. GWIB effectively learns the CFR model through alternating optimization, suppressing selection bias while avoiding trivial latent distributions. Experiments on ITE estimation tasks show that GWIB consistently outperforms state-of-the-art CFR methods. To promote the research community, we release our project at //github.com/peteryang1031/Causal-GWIB.

優化器 · Microsoft Surface · MoDELS · Analysis · 極大 ·

2024 年 5 月 24 日

Beyond Diagonal Reconfigurable Intelligent Surfaces with Mutual Coupling: Modeling and Optimization

Hongyu Li,Shanpu Shen,Matteo Nerini,Marco Di Renzo,Bruno Clerckx

from arxiv, 5 pages, 3 figures, accepted by IEEE Commun. Lett

This work studies the modeling and optimization of beyond diagonal reconfigurable intelligent surface (BD-RIS) aided wireless communication systems in the presence of mutual coupling among the RIS elements. Specifically, we first derive the mutual coupling aware BD-RIS aided communication model using scattering and impedance parameter analysis. Based on the obtained communication model, we propose a general BD-RIS optimization algorithm applicable to different architectures of BD-RIS to maximize the channel gain. Numerical results validate the effectiveness of the proposed design and demonstrate that the larger the mutual coupling the larger the gain offered by BD-RIS over conventional diagonal RIS.

圖 · 情景 · 稀疏 · LICS · 全 ·

2024 年 5 月 23 日

First Order Logic of Sparse Graphs with Given Degree Sequences

Alberto Larrauri,Guillem Perarnau

We consider limit probabilities of first order properties in random graphs with a given degree sequence. Under mild conditions on the degree sequence, we show that the closure set of limit probabilities is a finite union of closed intervals. Moreover, we characterize the degree sequences for which this closure set is the interval $[0,1]$, a property that is intimately related with the probability that the random graph is acyclic. As a side result, we compile a full description of the cycle distribution of random graphs and study their fragment (disjoint union of unicyclic components) in the subcritical regime. Finally, we amend the proof of the existence of limit probabilities for first order properties in random graphs with a given degree sequence; this result was already claimed by Lynch~[IEEE LICS 2003] but his proof contained some inaccuracies.

Learning · 回合 · 多樣性 · INFORMS · 估計/估計量 ·

2024 年 5 月 22 日

Autonomous Algorithm for Training Autonomous Vehicles with Minimal Human Intervention

Sang-Hyun Lee,Daehyeok Kwon,Seung-Woo Seo

from arxiv, 8 pages, 6 figures, 2 tables, conference

Reinforcement learning (RL) provides a compelling framework for enabling autonomous vehicles to continue to learn and improve diverse driving behaviors on their own. However, training real-world autonomous vehicles with current RL algorithms presents several challenges. One critical challenge, often overlooked in these algorithms, is the need to reset a driving environment between every episode. While resetting an environment after each episode is trivial in simulated settings, it demands significant human intervention in the real world. In this paper, we introduce a novel autonomous algorithm that allows off-the-shelf RL algorithms to train an autonomous vehicle with minimal human intervention. Our algorithm takes into account the learning progress of the autonomous vehicle to determine when to abort episodes before it enters unsafe states and where to reset it for subsequent episodes in order to gather informative transitions. The learning progress is estimated based on the novelty of both current and future states. We also take advantage of rule-based autonomous driving algorithms to safely reset an autonomous vehicle to an initial state. We evaluate our algorithm against baselines on diverse urban driving tasks. The experimental results show that our algorithm is task-agnostic and achieves better driving performance with fewer manual resets than baselines.

監督 · Attention · Taxonomy · Learning · 有向 ·

2022 年 7 月 4 日

A Survey on Label-efficient Deep Segmentation: Bridging the Gap between Weak Supervision and Dense Prediction

Wei Shen,Zelin Peng,Xuehui Wang,Huayu Wang,Jiazhong Cen,Dongsheng Jiang,Lingxi Xie,Xiaokang Yang,Qi Tian

The rapid development of deep learning has made a great progress in segmentation, one of the fundamental tasks of computer vision. However, the current segmentation algorithms mostly rely on the availability of pixel-level annotations, which are often expensive, tedious, and laborious. To alleviate this burden, the past years have witnessed an increasing attention in building label-efficient, deep-learning-based segmentation algorithms. This paper offers a comprehensive review on label-efficient segmentation methods. To this end, we first develop a taxonomy to organize these methods according to the supervision provided by different types of weak labels (including no supervision, coarse supervision, incomplete supervision and noisy supervision) and supplemented by the types of segmentation problems (including semantic segmentation, instance segmentation and panoptic segmentation). Next, we summarize the existing label-efficient segmentation methods from a unified perspective that discusses an important question: how to bridge the gap between weak supervision and dense prediction -- the current methods are mostly based on heuristic priors, such as cross-pixel similarity, cross-label constraint, cross-view consistency, cross-image relation, etc. Finally, we share our opinions about the future research directions for label-efficient deep segmentation.

優化器 · 圖 · 圖形處理器 · Neural Networks · 核化 ·

2021 年 1 月 28 日

Interpreting and Unifying Graph Neural Networks with An Optimization Framework

Meiqi Zhu,Xiao Wang,Chuan Shi,Houye Ji,Peng Cui

from arxiv, WWW2021, 12 pages

Graph Neural Networks (GNNs) have received considerable attention on graph-structured data learning for a wide variety of tasks. The well-designed propagation mechanism which has been demonstrated effective is the most fundamental part of GNNs. Although most of GNNs basically follow a message passing manner, litter effort has been made to discover and analyze their essential relations. In this paper, we establish a surprising connection between different propagation mechanisms with a unified optimization problem, showing that despite the proliferation of various GNNs, in fact, their proposed propagation mechanisms are the optimal solution optimizing a feature fitting function over a wide class of graph kernels with a graph regularization term. Our proposed unified optimization framework, summarizing the commonalities between several of the most representative GNNs, not only provides a macroscopic view on surveying the relations between different GNNs, but also further opens up new opportunities for flexibly designing new GNNs. With the proposed framework, we discover that existing works usually utilize naive graph convolutional kernels for feature fitting function, and we further develop two novel objective functions considering adjustable graph kernels showing low-pass or high-pass filtering capabilities respectively. Moreover, we provide the convergence proofs and expressive power comparisons for the proposed models. Extensive experiments on benchmark datasets clearly show that the proposed GNNs not only outperform the state-of-the-art methods but also have good ability to alleviate over-smoothing, and further verify the feasibility for designing GNNs with our unified optimization framework.

INFORMS · 圖 · 可約的 · 知識圖譜 · 可辨認的 ·

2018 年 8 月 29 日

Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction

Yi Luan,Luheng He,Mari Ostendorf,Hannaneh Hajishirzi

We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks and develop a unified framework called Scientific Information Extractor (SciIE) for with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.

MoDELS · 注意力機制 · RNN · 標注 · Networking ·

2017 年 12 月 20 日

Order-Free RNN with Visual Attention for Multi-Label Classification

Shang-Fu Chen,Yi-Chen Chen,Chih-Kuan Yeh,Yu-Chiang Frank Wang

from arxiv, Accepted at 32nd AAAI Conference on Artificial Intelligence (AAAI-18)

In this paper, we propose the joint learning attention and recurrent neural network (RNN) models for multi-label classification. While approaches based on the use of either model exist (e.g., for the task of image captioning), training such existing network architectures typically require pre-defined label sequences. For multi-label classification, it would be desirable to have a robust inference process, so that the prediction error would not propagate and thus affect the performance. Our proposed model uniquely integrates attention and Long Short Term Memory (LSTM) models, which not only addresses the above problem but also allows one to identify visual objects of interests with varying sizes without the prior knowledge of particular label ordering. More importantly, label co-occurrence information can be jointly exploited by our LSTM model. Finally, by advancing the technique of beam search, prediction of multiple labels can be efficiently achieved by our proposed network model.