In this study, we present aLLM4TS, an innovative framework that adapts Large Language Models (LLMs) for time-series representation learning. Central to our approach is the reconception of time-series forecasting as a self-supervised, multi-patch prediction task, which captures temporal dynamics in patch representations more effectively than traditional mask-and-reconstruction methods. Our strategy comprises two training stages: (i) a causal continual pre-training phase on various time-series datasets, anchored on next-patch prediction, which aligns LLM capabilities with the intricacies of time-series data; and (ii) fine-tuning for multi-patch prediction in the targeted time-series context. A distinctive element of our framework is the patch-wise decoding layer, which departs from previous methods reliant on sequence-level decoding. This design transposes individual patches directly into temporal sequences, thereby significantly strengthening the model's ability to learn temporal patch-based representations. aLLM4TS demonstrates superior performance on several downstream tasks, proving its effectiveness in deriving temporal representations with enhanced transferability and marking a pivotal advancement in the adaptation of LLMs for time-series analysis.
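As a rough illustration of the patch-wise decoding idea (module names, dimensions, and the shared linear head below are assumptions for illustration, not the paper's implementation), a minimal PyTorch sketch contrasting it with a sequence-level head:

```python
import torch
import torch.nn as nn

class PatchwiseDecoder(nn.Module):
    """Map each patch representation back to its own patch of time points,
    rather than flattening all patches and decoding the sequence at once."""
    def __init__(self, d_model: int, patch_len: int):
        super().__init__()
        self.proj = nn.Linear(d_model, patch_len)  # shared across patches

    def forward(self, patch_repr: torch.Tensor) -> torch.Tensor:
        # patch_repr: (batch, num_patches, d_model)
        patches = self.proj(patch_repr)            # (batch, num_patches, patch_len)
        return patches.flatten(1)                  # (batch, num_patches * patch_len)

class SequenceLevelDecoder(nn.Module):
    """Baseline: one large linear head over the flattened patch representations."""
    def __init__(self, d_model: int, num_patches: int, horizon: int):
        super().__init__()
        self.proj = nn.Linear(d_model * num_patches, horizon)

    def forward(self, patch_repr: torch.Tensor) -> torch.Tensor:
        return self.proj(patch_repr.flatten(1))    # (batch, horizon)

if __name__ == "__main__":
    reprs = torch.randn(8, 16, 256)                          # 16 patches, d_model = 256
    print(PatchwiseDecoder(256, 32)(reprs).shape)            # torch.Size([8, 512])
    print(SequenceLevelDecoder(256, 16, 512)(reprs).shape)   # torch.Size([8, 512])
```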
In this article we present Enhanced Rhetorical Structure Theory (eRST), a new theoretical framework for computational discourse analysis, based on an expansion of Rhetorical Structure Theory (RST). The framework encompasses discourse relation graphs with tree-breaking, nonprojective and concurrent relations, as well as implicit and explicit signals which give explainable rationales to our analyses. We survey shortcomings of RST and other existing frameworks, such as Segmented Discourse Representation Theory (SDRT), the Penn Discourse Treebank (PDTB) and Discourse Dependencies, and address these using constructs in the proposed theory. We provide annotation, search and visualization tools for data, and present and evaluate a freely available corpus of English annotated according to our framework, encompassing 12 spoken and written genres with over 200K tokens. Finally, we discuss automatic parsing, evaluation metrics and applications for data in our framework.
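A hypothetical Python sketch of the kind of data structure such analyses suggest: a relation graph over discourse units in which a unit may enter several concurrent, possibly tree-breaking relations, each carrying explicit or implicit signals (field names and labels are illustrative, not the framework's actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class DiscourseUnit:
    uid: int
    text: str

@dataclass
class Signal:
    kind: str    # e.g. "dm" (discourse marker), "syntactic", "graphical"
    tokens: list # surface tokens anchoring the signal, if explicit

@dataclass
class Relation:
    source: int  # satellite / dependent unit id
    target: int  # nucleus / head unit id
    label: str   # e.g. "concession", "elaboration"
    signals: list = field(default_factory=list)

# Unlike a strict RST tree, a unit may participate in several (concurrent)
# relations, and edges may cross (nonprojective).
units = [DiscourseUnit(1, "Although it rained,"),
         DiscourseUnit(2, "we went hiking,"),
         DiscourseUnit(3, "which was fun.")]
graph = [Relation(1, 2, "concession", [Signal("dm", ["Although"])]),
         Relation(3, 2, "elaboration", [Signal("syntactic", ["which"])])]
```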
Recently, large-scale pre-trained Vision-Language Models (VLMs) have demonstrated great potential in learning open-world visual representations and exhibit remarkable performance across a wide range of downstream tasks through efficient fine-tuning. In this work, we introduce the concept of dual learning into the fine-tuning of VLMs: we learn not only what an image is, but also what an image is not. Building on this concept, we propose DualAdapter, a novel approach that enables dual-path adaptation of VLMs from both positive and negative perspectives with only limited annotated samples. In the inference stage, DualAdapter performs unified predictions by simultaneously conducting complementary positive selection and negative exclusion across target classes, thereby enhancing the overall recognition accuracy of VLMs on downstream tasks. Extensive experimental results across 15 datasets validate that the proposed DualAdapter outperforms existing state-of-the-art methods on both few-shot learning and domain generalization tasks while achieving competitive computational efficiency. Code is available at //github.com/zhangce01/DualAdapter.
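A minimal sketch of the dual-path inference idea, assuming positive and negative class prototypes and a simple weighted combination (the weighting scheme and prototype construction here are placeholders, not the paper's exact formulation):

```python
import torch
import torch.nn.functional as F

def dual_path_predict(image_feat, pos_protos, neg_protos, alpha=0.5):
    """Illustrative combination of positive selection and negative exclusion.

    image_feat: (d,) L2-normalized image feature
    pos_protos: (num_classes, d) prototypes for "what class c is"
    neg_protos: (num_classes, d) prototypes for "what class c is not"
    """
    pos_logits = pos_protos @ image_feat  # higher = more likely class c
    neg_logits = neg_protos @ image_feat  # higher = more likely NOT class c
    return F.softmax(pos_logits - alpha * neg_logits, dim=-1)

if __name__ == "__main__":
    d, num_classes = 512, 10
    feat = F.normalize(torch.randn(d), dim=-1)
    pos = F.normalize(torch.randn(num_classes, d), dim=-1)
    neg = F.normalize(torch.randn(num_classes, d), dim=-1)
    print(dual_path_predict(feat, pos, neg).shape)  # torch.Size([10])
```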
We present DRESS, a large vision language model (LVLM) that innovatively exploits natural language feedback (NLF) from Large Language Models to enhance its alignment and interactions, addressing two key limitations of state-of-the-art LVLMs. First, prior LVLMs generally rely only on the instruction fine-tuning stage to enhance alignment with human preferences. Without incorporating extra feedback, they remain prone to generating unhelpful, hallucinated, or harmful responses. Second, while visual instruction tuning data is generally structured in a multi-turn dialogue format, the connections and dependencies among consecutive conversational turns are weak, which reduces the capacity for effective multi-turn interactions. To address these limitations, we propose a novel categorization of NLF into two key types: critique and refinement. The critique NLF identifies the strengths and weaknesses of the responses and is used to align the LVLMs with human preferences. The refinement NLF offers concrete suggestions for improvement and is adopted to improve the interaction ability of the LVLMs, i.e., their ability to refine responses by incorporating feedback in multi-turn interactions. To address the non-differentiable nature of NLF, we generalize conditional reinforcement learning for training. Our experimental results demonstrate that DRESS generates more helpful (9.76%), honest (11.52%), and harmless (21.03%) responses, and learns more effectively from feedback during multi-turn interactions than SOTA LVLMs.
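One common way to realize such reward-as-conditioning training is sketched below, assuming the critique feedback is distilled into a discrete condition token prepended to the context; this is an illustrative approximation, not the paper's training code:

```python
import torch
import torch.nn.functional as F

def conditional_lm_loss(model, input_ids, labels, reward_token_id):
    """Prepend a condition token (e.g. a 'good'/'bad' bin derived from critique
    NLF) to the context and train with ordinary next-token prediction, so the
    non-differentiable feedback acts as conditioning rather than as a loss."""
    cond = torch.full((input_ids.size(0), 1), reward_token_id, dtype=torch.long)
    inputs = torch.cat([cond, input_ids], dim=1)  # (batch, 1 + seq)
    logits = model(inputs)[:, :-1]                # predictions for positions 1..seq
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)), labels.reshape(-1))

if __name__ == "__main__":
    vocab, d = 100, 32
    toy_lm = torch.nn.Sequential(torch.nn.Embedding(vocab, d), torch.nn.Linear(d, vocab))
    ids = torch.randint(1, vocab, (4, 16))
    print(conditional_lm_loss(toy_lm, ids, ids, reward_token_id=0).item())
```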
Recent advances in visual reasoning (VR), particularly with the aid of Large Vision-Language Models (VLMs), show promise but require access to large-scale datasets and face challenges such as high computational costs and limited generalization capabilities. Compositional visual reasoning approaches have emerged as effective strategies; however, they rely heavily on the commonsense knowledge encoded in Large Language Models (LLMs) to perform planning, reasoning, or both, without considering the effect of their decisions on the visual reasoning process, which can lead to errors or failed procedures. To address these challenges, we introduce HYDRA, a multi-stage dynamic compositional visual reasoning framework designed for reliable and incrementally progressive general reasoning. HYDRA integrates three essential modules: a planner, a Reinforcement Learning (RL) agent serving as a cognitive controller, and a reasoner. The planner and reasoner modules use an LLM to generate instruction samples and executable code from the selected instruction, respectively, while the RL agent dynamically interacts with these modules, making high-level decisions about which instruction sample to select, given information from the historical state maintained through a feedback loop. This adaptable design enables HYDRA to adjust its actions based on feedback received during the reasoning process, leading to more reliable reasoning outputs and ultimately enhancing its overall effectiveness. Our framework demonstrates state-of-the-art performance on various VR tasks across four widely used datasets.
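A sketch of the plan-select-execute control flow this describes, with the planner, RL controller, reasoner, and executor left as stand-in callables (all names and interfaces here are illustrative, not HYDRA's actual API):

```python
def hydra_style_loop(question, image, planner, controller, reasoner, executor, max_steps=5):
    """Illustrative control flow only: the callables stand in for the LLM-backed
    planner/reasoner, the RL cognitive controller, and a program executor."""
    state = {"question": question, "history": []}
    for _ in range(max_steps):
        candidates = planner(state)                      # LLM proposes instruction samples
        instruction = controller(state, candidates)      # RL agent selects the best one
        code = reasoner(instruction)                     # LLM writes executable code
        result = executor(code, image)                   # run the code on the image
        state["history"].append((instruction, result))   # feedback loop updates the state
        if result.get("final"):
            return result["answer"]
    return None

if __name__ == "__main__":
    # Trivial stubs just to show the wiring.
    planner = lambda s: ["locate the dog", "count the dogs"]
    controller = lambda s, cands: cands[0]
    reasoner = lambda instr: f"detect('{instr}')"
    executor = lambda code, img: {"final": True, "answer": "one dog"}
    print(hydra_style_loop("How many dogs?", None, planner, controller, reasoner, executor))
```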
This paper introduces SAMAug, a novel visual point augmentation method for the Segment Anything Model (SAM) that enhances interactive image segmentation performance. SAMAug generates augmented point prompts to provide SAM with more information about the user's intention. Starting from an initial point prompt, SAM produces an initial mask, which is then fed into SAMAug to generate augmented point prompts. By incorporating these extra points, SAM can generate augmented segmentation masks based on both the augmented point prompts and the initial prompt, resulting in improved segmentation performance. We evaluated four different point augmentation strategies: random sampling, sampling based on maximum difference entropy, maximum distance, and saliency. Experimental results on the COCO, Fundus, COVID QUEx, and ISIC2018 datasets show that SAMAug can boost SAM's segmentation results, especially when using the maximum distance and saliency strategies. SAMAug demonstrates the potential of visual prompt augmentation for computer vision. Code for SAMAug is available at github.com/yhydhx/SAMAug.
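As an example of one strategy, a minimal sketch of maximum-distance point augmentation, assuming the initial mask is a boolean array and the augmented prompt is simply the foreground pixel farthest from the initial click (a simplification for illustration):

```python
import numpy as np

def max_distance_point(mask: np.ndarray, initial_point: tuple) -> tuple:
    """Pick the mask pixel farthest from the initial click as the extra prompt.

    mask: (H, W) boolean array from SAM's first pass; initial_point: (row, col).
    """
    ys, xs = np.nonzero(mask)
    if len(ys) == 0:
        return initial_point
    d2 = (ys - initial_point[0]) ** 2 + (xs - initial_point[1]) ** 2
    i = int(np.argmax(d2))
    return (int(ys[i]), int(xs[i]))

if __name__ == "__main__":
    m = np.zeros((64, 64), dtype=bool)
    m[10:50, 10:50] = True
    print(max_distance_point(m, (12, 12)))  # a far corner of the mask, e.g. (49, 49)
```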
In this study, we propose a methodology for the Emotional Mimicry Intensity (EMI) Estimation task within the context of the 6th Workshop and Competition on Affective Behavior Analysis in-the-wild. Our approach leverages the Wav2Vec 2.0 framework, pre-trained on a comprehensive podcast dataset, to extract a broad range of audio features encompassing both linguistic and paralinguistic elements. We enhance the feature representation through a fusion technique that integrates individual features with a global mean vector, introducing global contextual information into our analysis. Additionally, we incorporate a pre-trained valence-arousal-dominance (VAD) module from the Wav2Vec 2.0 model. Our fusion employs a Long Short-Term Memory (LSTM) architecture for efficient temporal analysis of the audio data. Using only the provided audio data, our approach demonstrates significant improvements over the established baseline.
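A minimal PyTorch sketch of the global-mean fusion followed by an LSTM; the feature dimension of 768 and the six intensity targets are assumptions for illustration, not the submission's exact configuration:

```python
import torch
import torch.nn as nn

class GlobalMeanFusionLSTM(nn.Module):
    """Concatenate each frame-level feature with the sequence's global mean
    vector, then model the fused sequence with an LSTM regressor."""
    def __init__(self, feat_dim=768, hidden=256, num_targets=6):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim * 2, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_targets)

    def forward(self, feats):                # feats: (batch, time, feat_dim)
        g = feats.mean(dim=1, keepdim=True)  # global context: (batch, 1, feat_dim)
        fused = torch.cat([feats, g.expand_as(feats)], dim=-1)
        out, _ = self.lstm(fused)
        return self.head(out[:, -1])         # one intensity vector per clip

if __name__ == "__main__":
    x = torch.randn(4, 100, 768)             # e.g. Wav2Vec 2.0 frame features
    print(GlobalMeanFusionLSTM()(x).shape)   # torch.Size([4, 6])
```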
In this work, we introduce the Virtual In-Hand Eye Transformer (VIHE), a novel method designed to enhance 3D manipulation capabilities through action-aware view rendering. VIHE autoregressively refines actions over multiple stages by conditioning on views rendered at the poses predicted in earlier stages. These virtual in-hand views provide a strong inductive bias for recognizing the correct hand pose, especially for challenging high-precision tasks such as peg insertion. On 18 manipulation tasks in RLBench simulated environments, VIHE achieves a new state of the art with a 12% absolute improvement, increasing from 65% to 77% over the existing state-of-the-art model using 100 demonstrations per task. In real-world scenarios, VIHE can learn manipulation tasks with just a handful of demonstrations, highlighting its practical utility. Videos and code are available at our project site: //vihe-3d.github.io.
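A sketch of the multi-stage, action-aware refinement loop, with the policy and renderer left as placeholder callables (the interfaces below are assumptions, not VIHE's actual API):

```python
def refine_actions(obs, predict, render_inhand_views, num_stages=3):
    """Illustrative multi-stage refinement: each stage conditions the policy on
    views rendered at the pose predicted in the previous stage. `predict` and
    `render_inhand_views` stand in for the learned policy and the renderer."""
    views = obs["camera_views"]           # real camera observations
    action = predict(views, stage=0)      # coarse initial pose/action estimate
    for stage in range(1, num_stages):
        virtual = render_inhand_views(obs["point_cloud"], action)  # action-aware views
        action = predict(views + virtual, stage=stage)             # refined estimate
    return action

if __name__ == "__main__":
    # Trivial stubs just to show the control flow.
    obs = {"camera_views": ["front", "wrist"], "point_cloud": None}
    predict = lambda views, stage: {"xyz": (0.1 * (stage + 1), 0.0, 0.2), "grip": 1}
    render = lambda pc, act: [f"inhand@{act['xyz']}"]
    print(refine_actions(obs, predict, render))
```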
We present JOG, a framework that facilitates developing Java JIT peephole optimizations alongside JIT tests. JOG enables developers to write a pattern, in Java itself, that specifies a desired code transformation by giving the code before and after the optimization, as well as any necessary preconditions. Such patterns can be written in the same way that tests of the optimization are already written in OpenJDK. JOG translates each pattern into C/C++ code that can be integrated as a JIT optimization pass, and it also generates Java tests for the optimizations from the patterns. Furthermore, JOG can automatically detect possible shadow relations between pairs of optimizations, where the effect of the shadowed optimization is overridden by another. Our evaluation shows that JOG makes it easier to write readable JIT optimizations alongside tests without decreasing the effectiveness of JIT optimizations. We wrote 162 patterns, including 68 existing optimizations in OpenJDK, 92 new optimizations adapted from LLVM, and two new optimizations that we proposed. We opened eight pull requests (PRs) for OpenJDK, including six for new optimizations, one for removing shadowed optimizations, and one for newly generated JIT tests; seven PRs have already been integrated into the master branch of OpenJDK.
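JOG's actual patterns are written in Java; purely as a conceptual illustration (in Python, with hypothetical names and a deliberately crude matching rule), a before/after pattern and a rough shadow check might look like this:

```python
from dataclasses import dataclass

@dataclass
class PeepholePattern:
    name: str
    before: str          # code shape the JIT should match, e.g. "(x + 0)"
    after: str           # code it should be rewritten to, e.g. "x"
    precondition: str = "true"

def shadows(a: PeepholePattern, b: PeepholePattern) -> bool:
    """Very rough illustration of a shadow relation: if `a` already fires inside
    everything `b` would match (approximated here by substring containment of
    the normalized 'before' shapes), `b` never gets a chance to apply."""
    norm = lambda s: "".join(s.split())
    return a.name != b.name and norm(a.before) in norm(b.before)

add_zero = PeepholePattern("AddZero", "(x + 0)", "x")
add_zero_mul = PeepholePattern("AddZeroMul", "((x + 0) * y)", "(x * y)")
print(shadows(add_zero, add_zero_mul))  # True: the simpler rewrite fires first
```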
In this paper, we delve into the advancement of domain-specific Large Language Models (LLMs) with a focus on their application in software development. We introduce DevAssistLlama, a model developed through instruction tuning to assist developers in processing software-related natural language queries. This instruction-tuned LLM variant is particularly adept at handling intricate technical documentation, enhancing developers' capability on software-specific tasks. Creating DevAssistLlama involved constructing an extensive instruction dataset from various software systems, enabling effective handling of Named Entity Recognition (NER), Relation Extraction (RE), and Link Prediction (LP). Our results demonstrate DevAssistLlama's superior capabilities on these tasks in comparison with other models, including ChatGPT. This research not only highlights the potential of specialized LLMs in software development but also presents a pioneering LLM for this domain.
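A hedged sketch of what instruction-tuning records for the three tasks might look like; the schema, prompt template, and wording below are assumptions for illustration, not the dataset's actual format:

```python
# Illustrative instruction-tuning records for NER, RE, and LP.
examples = [
    {
        "task": "NER",
        "instruction": "List the software entities (libraries, APIs, versions) in the text.",
        "input": "Upgrade the service from Spring Boot 2.7 to 3.1 and switch to Jakarta EE.",
        "output": "Spring Boot 2.7; Spring Boot 3.1; Jakarta EE",
    },
    {
        "task": "RE",
        "instruction": "Extract (head, relation, tail) triples about software dependencies.",
        "input": "Our build uses Gradle, which invokes the Kotlin compiler.",
        "output": "(Gradle, invokes, Kotlin compiler)",
    },
    {
        "task": "LP",
        "instruction": "Given the partial triple, predict the missing tail entity.",
        "input": "(JUnit, used_for, ?)",
        "output": "unit testing",
    },
]

# Render each record into a single training prompt plus target response.
for ex in examples:
    prompt = f"### Instruction:\n{ex['instruction']}\n### Input:\n{ex['input']}\n### Response:"
    print(prompt, ex["output"], sep="\n")
```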
We present a large-scale study on unsupervised spatiotemporal representation learning from videos. Taking a unified perspective on four recent image-based frameworks, we study a simple objective that easily generalizes all of these methods to space-time. Our objective encourages temporally persistent features within the same video, and despite its simplicity, it works surprisingly well across (i) different unsupervised frameworks, (ii) pre-training datasets, (iii) downstream datasets, and (iv) backbone architectures. We draw a series of intriguing observations from this study; for example, we find that encouraging long-spanned persistency can be effective even when the timespan is 60 seconds. In addition to state-of-the-art results on multiple benchmarks, we report a few promising cases in which unsupervised pre-training can outperform its supervised counterpart. Code is made available at //github.com/facebookresearch/SlowFast.
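One way to instantiate such a temporal-persistency objective is an InfoNCE-style loss over two clips drawn from the same video, sketched below; this is an illustrative instantiation, not the released code:

```python
import torch
import torch.nn.functional as F

def temporal_persistence_loss(z1, z2, temperature=0.1):
    """InfoNCE-style objective: embeddings of two clips from the same video
    (z1[i], z2[i]) are positives; clips from other videos in the batch are
    negatives. Encourages features that persist across the video's timespan."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature   # (batch, batch) similarity matrix
    targets = torch.arange(z1.size(0))   # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

if __name__ == "__main__":
    b, d = 16, 128
    clip_a, clip_b = torch.randn(b, d), torch.randn(b, d)  # encoder outputs for two clips
    print(temporal_persistence_loss(clip_a, clip_b).item())
```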