好男人在线观看免费2019_久久综合久久香蕉网欧美_亚洲日韩中文字幕无码专区_美女把腿张开让男人桶爽了_玖玖精品视频在线_欧美视频网站在线观看_精品久久久久久影視

Recent studies have uncovered the potential of Large Language Models (LLMs) in addressing complex sequential decision-making tasks through the provision of high-level instructions. However, LLM-based agents lack specialization in tackling specific target problems, particularly in real-time dynamic environments. Additionally, deploying an LLM-based agent in practical scenarios can be both costly and time-consuming. On the other hand, reinforcement learning (RL) approaches train agents that specialize in the target task but often suffer from low sampling efficiency and high exploration costs. In this paper, we introduce a novel framework that addresses these challenges by training a smaller, specialized student RL agent using instructions from an LLM-based teacher agent. By incorporating the guidance from the teacher agent, the student agent can distill the prior knowledge of the LLM into its own model. Consequently, the student agent can be trained with significantly less data. Moreover, through further training with environment feedback, the student agent surpasses the capabilities of its teacher for completing the target task. We conducted experiments on challenging MiniGrid and Habitat environments, specifically designed for embodied AI research, to evaluate the effectiveness of our framework. The results clearly demonstrate that our approach achieves superior performance compared to strong baseline methods. Our code is available at //github.com/ZJLAB-AMMI/LLM4Teach.

相關內容

Agent

關注 16

多峰值 · 語言模型化 · 大語言模型 · MoDELS · 推斷 ·

2024 年 3 月 5 日

Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models

Gen Luo,Yiyi Zhou,Yuxin Zhang,Xiawu Zheng,Xiaoshuai Sun,Rongrong Ji

Despite remarkable progress, existing multimodal large language models (MLLMs) are still inferior in granular visual recognition. Contrary to previous works, we study this problem from the perspective of image resolution, and reveal that a combination of low- and high-resolution visual features can effectively mitigate this shortcoming. Based on this observation, we propose a novel and efficient method for MLLMs, termed Mixture-of-Resolution Adaptation (MRA). In particular, MRA adopts two visual pathways for images with different resolutions, where high-resolution visual information is embedded into the low-resolution pathway via the novel mixture-of-resolution adapters (MR-Adapters). This design also greatly reduces the input sequence length of MLLMs. To validate MRA, we apply it to a recent MLLM called LLaVA, and term the new model LLaVA-HR. We conduct extensive experiments on 11 vision-language (VL) tasks, which show that LLaVA-HR outperforms existing MLLMs on 8 VL tasks, e.g., +9.4% on TextVQA. More importantly, both training and inference of LLaVA-HR remain efficient with MRA, e.g., 20 training hours and 3$\times$ inference speed than LLaVA-1.5. Source codes are released at: //github.com/luogen1996/LLaVA-HR.

估計/估計量 · 優化器 · Learning · 可辨認的 · 強化學習 ·

2024 年 3 月 5 日

Constrained Policy Optimization with Explicit Behavior Density for Offline Reinforcement Learning

Jing Zhang,Chi Zhang,Wenjia Wang,Bing-Yi Jing

Due to the inability to interact with the environment, offline reinforcement learning (RL) methods face the challenge of estimating the Out-of-Distribution (OOD) points. Existing methods for addressing this issue either control policy to exclude the OOD action or make the $Q$ function pessimistic. However, these methods can be overly conservative or fail to identify OOD areas accurately. To overcome this problem, we propose a Constrained Policy optimization with Explicit Behavior density (CPED) method that utilizes a flow-GAN model to explicitly estimate the density of behavior policy. By estimating the explicit density, CPED can accurately identify the safe region and enable optimization within the region, resulting in less conservative learning policies. We further provide theoretical results for both the flow-GAN estimator and performance guarantee for CPED by showing that CPED can find the optimal $Q$-function value. Empirically, CPED outperforms existing alternatives on various standard offline reinforcement learning tasks, yielding higher expected returns.

contrastive · Learning · 圖 · 遷移學習 · INTERACT ·

2024 年 3 月 5 日

Zero-Shot Cross-Lingual Document-Level Event Causality Identification with Heterogeneous Graph Contrastive Transfer Learning

Zhitao He,Pengfei Cao,Yubo Chen,Kang Liu,Zhiqiang Zhang,Mengshu Sun,Jun Zhao

from arxiv, Accepted at LREC-COLING 2024

Event Causality Identification (ECI) refers to detect causal relations between events in texts. However, most existing studies focus on sentence-level ECI with high-resource language, leaving more challenging document-level ECI (DECI) with low-resource languages under-explored. In this paper, we propose a Heterogeneous Graph Interaction Model with Multi-granularity Contrastive Transfer Learning (GIMC) for zero-shot cross-lingual document-level ECI. Specifically, we introduce a heterogeneous graph interaction network to model the long-distance dependencies between events that are scattered over document. Then, to improve cross-lingual transferability of causal knowledge learned from source language, we propose a multi-granularity contrastive transfer learning module to align the causal representations across languages. Extensive experiments show our framework outperforms previous state-of-the-art model by 9.4% and 8.2% of average F1 score on monolingual and multilingual scenarios respectively. Notably, in multilingual scenario, our zero-shot framework even exceeds GPT-3.5 with few-shot learning by 24.3% in overall performance.

有限差分 · 振蕩 · 模型評估 · 查準率/準確率 · 確切的 ·

2024 年 3 月 2 日

Efficient Alternative Finite Difference WENO Schemes for Hyperbolic Systems with Non-Conservative Products

Dinshaw S. Balsara,Deepak Bhoriya,Chi-Wang Shu,Harish Kumar

from arxiv, Accepted in Communications on Applied Mathematics and Computation

Higher order finite difference Weighted Essentially Non-Oscillatory (WENO) schemes for conservation laws represent a technology that has been reasonably consolidated. They are extremely popular because, when applied to multidimensional problems, they offer high order accuracy at a fraction of the cost of finite volume WENO or DG schemes. They come in two flavors. There is the classical finite difference WENO (FD-WENO) method (Shu and Osher, J. Comput. Phys., 83 (1989) 32-78). However, in recent years there is also an alternative finite difference WENO (AFD-WENO) method which has recently been formalized into a very useful general-purpose algorithm for conservation laws (Balsara et al., Efficient Alternative Finite Difference WENO Schemes for Hyperbolic Conservation Laws, submitted to CAMC (2023)). However, the FD-WENO algorithm has only very recently been formulated for hyperbolic systems with non-conservative products (Balsara et al., Efficient Finite Difference WENO Scheme for Hyperbolic Systems with Non-Conservative Products, to appear CAMC (2023)). In this paper we show that there are substantial advantages in obtaining an AFD-WENO algorithm for hyperbolic systems with non-conservative products. Such an algorithm is documented in this paper. We present an AFD-WENO formulation in fluctuation form that is carefully engineered to retrieve the flux form when that is warranted and nevertheless extends to non-conservative products. The method is flexible because it allows any Riemann solver to be used. The formulation we arrive at is such that when non-conservative products are absent it reverts exactly to the formulation in the second citation above which is in exact flux conservation form. The ability to transition to a precise conservation form when non-conservative products are absent ensures, via the Lax-Wendroff theorem, that shock locations will be exactly ...

Learning · 規范化的 · 異常檢測 · Prompt · MoDELS ·

2024 年 3 月 2 日

Learn Suspected Anomalies from Event Prompts for Video Anomaly Detection

Chenchen Tao,Chong Wang,Yuexian Zou,Xiaohao Peng,Jiafei Wu,Jiangbo Qian

Most models for weakly supervised video anomaly detection (WS-VAD) rely on multiple instance learning, aiming to distinguish normal and abnormal snippets without specifying the type of anomaly. The ambiguous nature of anomaly definitions across contexts introduces bias in detecting abnormal and normal snippets within the abnormal bag. Taking the first step to show the model why it is anomalous, a novel framework is proposed to guide the learning of suspected anomalies from event prompts. Given a textual prompt dictionary of potential anomaly events and the captions generated from anomaly videos, the semantic anomaly similarity between them could be calculated to identify the suspected anomalous events for each video snippet. It enables a new multi-prompt learning process to constrain the visual-semantic features across all videos, as well as provides a new way to label pseudo anomalies for self-training. To demonstrate effectiveness, comprehensive experiments and detailed ablation studies are conducted on four datasets, namely XD-Violence, UCF-Crime, TAD, and ShanghaiTech. Our proposed model outperforms most state-of-the-art methods in terms of AP or AUC (82.6\%, 87.7\%, 93.1\%, and 97.4\%). Furthermore, it shows promising performance in open-set and cross-dataset cases.

MoDELS · 回合 · Automator · 可辨認的 · Analysis ·

2024 年 3 月 1 日

A Conceptual Model for Data Storytelling Highlights in Business Intelligence Environments

Panos Vassiliadis,Patrick Marcel,Faten El Outa,Veronika Peralta,Dimos Gkitsakis

from arxiv, 16 pages, 4 figures

We introduce a conceptual model for highlights to support data analysis and storytelling in the domain of Business Intelligence, via the automated extraction, representation, and exploitation of highlights revealing key facts that are hidden in the data with which a data analyst works. The model builds on the concepts of Holistic and Elementary Highlights, along with their context, constituents and interrelationships, whose synergy can identify internal properties, patterns and key facts in a dataset being analyzed.

Performer · 學成 · Boosting（一種模型訓練加速方式） · MoDELS · 可辨認的 ·

2021 年 12 月 22 日

Hybrid Curriculum Learning for Emotion Recognition in Conversation

Lin Yang,Yi Shen,Yue Mao,Longjun Cai

from arxiv, Accepted by AAAI-2022

Emotion recognition in conversation (ERC) aims to detect the emotion label for each utterance. Motivated by recent studies which have proven that feeding training examples in a meaningful order rather than considering them randomly can boost the performance of models, we propose an ERC-oriented hybrid curriculum learning framework. Our framework consists of two curricula: (1) conversation-level curriculum (CC); and (2) utterance-level curriculum (UC). In CC, we construct a difficulty measurer based on "emotion shift" frequency within a conversation, then the conversations are scheduled in an "easy to hard" schema according to the difficulty score returned by the difficulty measurer. For UC, it is implemented from an emotion-similarity perspective, which progressively strengthens the model's ability in identifying the confusing emotions. With the proposed model-agnostic hybrid curriculum learning strategy, we observe significant performance boosts over a wide range of existing ERC models and we are able to achieve new state-of-the-art results on four public ERC datasets.

估計/估計量 · contrastive · INFORMS · 互信息 · 表示學習 ·

2021 年 6 月 25 日

Decomposed Mutual Information Estimation for Contrastive Representation Learning

Alessandro Sordoni,Nouha Dziri,Hannes Schulz,Geoff Gordon,Phil Bachman,Remi Tachet

from arxiv, ICML 2021

Recent contrastive representation learning methods rely on estimating mutual information (MI) between multiple views of an underlying context. E.g., we can derive multiple views of a given image by applying data augmentation, or we can split a sequence into views comprising the past and future of some step in the sequence. Contrastive lower bounds on MI are easy to optimize, but have a strong underestimation bias when estimating large amounts of MI. We propose decomposing the full MI estimation problem into a sum of smaller estimation problems by splitting one of the views into progressively more informed subviews and by applying the chain rule on MI between the decomposed views. This expression contains a sum of unconditional and conditional MI terms, each measuring modest chunks of the total MI, which facilitates approximation via contrastive bounds. To maximize the sum, we formulate a contrastive lower bound on the conditional MI which can be approximated efficiently. We refer to our general approach as Decomposed Estimation of Mutual Information (DEMI). We show that DEMI can capture a larger amount of MI than standard non-decomposed contrastive bounds in a synthetic setting, and learns better representations in a vision domain and for dialogue generation.

圖 · 鏈路預測 · 正交 · 知識圖譜 · Better ·

2020 年 4 月 15 日

Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding

Yun Tang,Jing Huang,Guangtao Wang,Xiaodong He,Bowen Zhou

from arxiv, Accepted by ACL 2020

Translational distance-based knowledge graph embedding has shown progressive improvements on the link prediction task, from TransE to the latest state-of-the-art RotatE. However, N-1, 1-N and N-N predictions still remain challenging. In this work, we propose a novel translational distance-based approach for knowledge graph link prediction. The proposed method includes two-folds, first we extend the RotatE from 2D complex domain to high dimension space with orthogonal transforms to model relations for better modeling capacity. Second, the graph context is explicitly modeled via two directed context representations. These context representations are used as part of the distance scoring function to measure the plausibility of the triples during training and inference. The proposed approach effectively improves prediction accuracy on the difficult N-1, 1-N and N-N cases for knowledge graph link prediction task. The experimental results show that it achieves better performance on two benchmark data sets compared to the baseline RotatE, especially on data set (FB15k-237) with many high in-degree connection nodes.

圖片分類 · 生成式對抗網絡 · Networking · 未標記 · GANs ·

2018 年 2 月 10 日

Generative Adversarial Networks and Probabilistic Graph Models for Hyperspectral Image Classification

Zilong Zhong,Jonathan Li

from arxiv, Accepted by AAAI-18

High spectral dimensionality and the shortage of annotations make hyperspectral image (HSI) classification a challenging problem. Recent studies suggest that convolutional neural networks can learn discriminative spatial features, which play a paramount role in HSI interpretation. However, most of these methods ignore the distinctive spectral-spatial characteristic of hyperspectral data. In addition, a large amount of unlabeled data remains an unexploited gold mine for efficient data use. Therefore, we proposed an integration of generative adversarial networks (GANs) and probabilistic graphical models for HSI classification. Specifically, we used a spectral-spatial generator and a discriminator to identify land cover categories of hyperspectral cubes. Moreover, to take advantage of a large amount of unlabeled data, we adopted a conditional random field to refine the preliminary classification results generated by GANs. Experimental results obtained using two commonly studied datasets demonstrate that the proposed framework achieved encouraging classification accuracy using a small number of data for training.