
In-Context Learning (ICL) equips Large Language Models (LLMs) with the capacity to learn in context, achieving downstream generalization from a few in-context examples without gradient updates. Despite the encouraging empirical success, the underlying mechanism of ICL remains unclear, and existing research offers various viewpoints on it. These studies propose intuition-driven, ad-hoc technical solutions for interpreting ICL, leaving the road map ambiguous. In this paper, we adopt a data generation perspective to reinterpret recent efforts and demonstrate the potential broader usage of popular technical solutions, approaching the question from a systematic angle. For a conceptual definition, we rigorously adopt the terms skill learning and skill recognition: skill learning can learn new data generation functions from in-context data, whereas skill recognition only selects among data generation functions already acquired during pretraining. We also provide a comprehensive study of the merits and weaknesses of different solutions, and highlight the uniformity among them from the data generation perspective, establishing a technical foundation for future research that incorporates the strengths of different lines of work.
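
As a toy illustration of this distinction (a minimal sketch; the linear generators and function names below are illustrative assumptions, not the paper's formulation), skill recognition amounts to selecting the most plausible pretrained data generation function, while skill learning fits a new one directly from the in-context examples:

```python
import numpy as np

# Skill recognition vs. skill learning from a data generation perspective.
# Illustrative only: generators are linear functions y = w * x.

def gaussian_loglik(x, y, w, noise=0.1):
    """Log-likelihood of in-context pairs (x, y) under generator y = w * x."""
    resid = y - w * x
    return -0.5 * np.sum((resid / noise) ** 2)

def skill_recognition(x, y, pretrained_ws):
    """Select among generators seen in pretraining; no new function is learned."""
    scores = [gaussian_loglik(x, y, w) for w in pretrained_ws]
    return pretrained_ws[int(np.argmax(scores))]

def skill_learning(x, y):
    """Fit a new data generation function directly from the in-context data."""
    return np.dot(x, y) / np.dot(x, x)  # least-squares slope

x = np.array([1.0, 2.0, 3.0])
y = 2.5 * x                       # true generator w = 2.5, unseen in pretraining
pretrained = [1.0, 2.0, 3.0]
print(skill_recognition(x, y, pretrained))  # 2.0: closest pretrained skill
print(skill_learning(x, y))                 # 2.5: the new function is recovered
```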

Related Content

We give a novel nonparametric pointwise consistent statistical test (the Markov Checker) of the Markov condition for directed acyclic graph (DAG) or completed partially directed acyclic graph (CPDAG) models given a dataset. We also introduce the Cross-Algorithm Frugality Search (CAFS) for rejecting DAG models that either do not pass the Markov Checker test or are not edge minimal. Edge minimality has been used previously by Raskutti and Uhler as a nonparametric simplicity criterion, though CAFS readily generalizes to other simplicity conditions. Reference to the ground truth is not necessary for CAFS, so it is useful for identifying causal structure learning algorithms and tuning-parameter settings that output approximately true causal models for a given dataset. We provide a software tool for this analysis that is suitable even for quite large or dense models, provided a suitably fast pointwise consistent test of conditional independence is available. In addition, we show in simulation that the CAFS procedure can pick approximately correct models without knowing the ground truth.
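
To make the condition being tested concrete, the sketch below checks the local Markov property (each node is independent of its non-descendants given its parents), with a placeholder `ci_test` standing in for any pointwise consistent conditional independence test; the Markov Checker's actual statistic and aggregation may differ:

```python
import networkx as nx

def markov_check(dag: nx.DiGraph, data, ci_test, alpha=0.05):
    """Sketch: for each node x, test x _||_ y | parents(x) for every
    non-descendant y of x. `ci_test(data, x, y, cond)` is assumed to
    return a p-value; small p-values count as evidence against the model."""
    rejections = []
    for x in dag.nodes:
        parents = set(dag.predecessors(x))
        non_desc = set(dag.nodes) - nx.descendants(dag, x) - parents - {x}
        for y in non_desc:
            p = ci_test(data, x, y, tuple(sorted(parents)))
            if p < alpha:
                rejections.append((x, y, tuple(parents), p))
    return rejections  # empty: no evidence against the Markov condition
```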

Cyclic Queuing and Forwarding (CQF) is a key Time-Sensitive Networking (TSN) shaping mechanism that ensures bounded latency using a simple gate control list (GCL). Recently, variants of CQF, including Cycle Specific Queuing and Forwarding (CSQF) and Multi Cyclic Queuing and Forwarding (MCQF), have emerged. While popular TSN mechanisms such as the Time-Aware Shaper (TAS), Asynchronous Traffic Shaper (ATS), Credit-Based Shaper (CBS), and Strict Priority (SP) have been extensively studied, cyclic shapers have not been thoroughly evaluated. This paper presents a comprehensive analysis of CQF, CSQF, and MCQF, providing insights into their performance. We quantify delays through simulations and analytical evaluation on both synthetic and realistic networks. For the first time, we introduce an open-source framework based on OMNeT++ and INET 4.4 that is capable of modeling all three cyclic shaper variants. Our tool facilitates the validation of new algorithms and serves as a benchmark for cyclic shapers. Our evaluations reveal that MCQF supports diverse timing requirements, whereas CSQF, with its additional queue, often results in larger delays and jitter for some TT flows compared to CQF. Additionally, CSQF does not demonstrate significant advantages in TSN networks, where propagation delays are less critical than in wide-area networks (WANs).
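
For intuition about why the cycle time dominates delay and jitter in cyclic shapers, the classic CQF bound states that a frame traversing h hops with cycle time T arrives after at least (h - 1) * T and at most (h + 1) * T. A minimal sketch, assuming ideal clock synchronization (the framework above models the variants in far greater detail):

```python
def cqf_delay_bounds(hops: int, cycle_time_us: float):
    """Classic CQF end-to-end delay bounds under ideal synchronization:
    each hop forwards a frame exactly one cycle after receiving it, with up
    to one cycle of phase offset at the ends, so jitter is bounded by 2 * T."""
    lower = (hops - 1) * cycle_time_us
    upper = (hops + 1) * cycle_time_us
    return lower, upper

# e.g. 5 hops with a 100 us cycle: delay in [400 us, 600 us], jitter <= 200 us
print(cqf_delay_bounds(5, 100.0))
```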

Large Language Models (LLMs) have drawn widespread attention and research due to their astounding performance in text generation and reasoning tasks. Derivative products, like ChatGPT, have been extensively deployed and are highly sought after. Meanwhile, the evaluation and optimization of LLMs on software engineering tasks, such as code generation, have become a research focus. However, there is still a lack of systematic research on applying and evaluating LLMs in software engineering. Therefore, this paper comprehensively investigates and collates the research and products combining LLMs with software engineering, aiming to answer two questions: (1) What are the current integrations of LLMs with software engineering? (2) Can LLMs effectively handle software engineering tasks? To find the answers, we collected related literature as extensively as possible from seven mainstream databases and selected 123 timely papers published from 2022 onward for analysis. We categorized these papers in detail and reviewed the current research status of LLMs from the perspective of seven major software engineering tasks, hoping this will help researchers better grasp the research trends and address the issues that arise when applying LLMs. Meanwhile, we also organized and presented papers with evaluation content to reveal the performance and effectiveness of LLMs on various software engineering tasks, guiding researchers and developers in their optimization efforts.

The Shortest-Path Problem in Graph of Convex Sets (SPP in GCS) is a recently developed optimization framework that blends discrete and continuous decision making. Many relevant problems in robotics, such as collision-free motion planning, can be cast and solved as an SPP in GCS, yielding lower-cost solutions and faster runtimes than state-of-the-art algorithms. In this paper, we are motivated by motion planning of robot arms that must operate swiftly in static environments. We consider a multi-query extension of the SPP in GCS, where the goal is to efficiently precompute optimal paths between given sets of initial and target conditions. Our solution consists of two stages. Offline, we use semidefinite programming to compute a coarse lower bound on the problem's cost-to-go function. Online, this lower bound is used to incrementally generate feasible paths by solving short-horizon convex programs. For a robot arm with seven joints, our method designs higher-quality trajectories up to two orders of magnitude faster than existing motion planners.
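
As a sketch of how the offline bound can steer the online stage, the best-first search below uses the precomputed cost-to-go lower bound as a heuristic; `edge_cost` abstracts away the short-horizon convex program solved across each edge, and the paper's actual online procedure and graph encoding are not shown:

```python
import heapq

def incremental_path(graph, source, target, edge_cost, cost_to_go_lb):
    """Sketch: best-first search over the graph of convex sets, prioritized
    by (cost so far + offline lower bound on cost-to-go). In the real
    method, edge_cost(u, v) would come from a short-horizon convex program
    optimizing the continuous trajectory piece; here it is just a number."""
    frontier = [(cost_to_go_lb(source), 0.0, source, [source])]
    visited = set()
    while frontier:
        _, g, u, path = heapq.heappop(frontier)
        if u == target:
            return g, path
        if u in visited:
            continue
        visited.add(u)
        for v in graph[u]:
            c = g + edge_cost(u, v)
            heapq.heappush(frontier, (c + cost_to_go_lb(v), c, v, path + [v]))
    return None  # no feasible path found
```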

The rapid advancement of Artificial Intelligence (AI) and the widespread use of model-sharing platforms like Model Zoo have increased the potential for exploitation of AI models. Attackers can embed malware within AI models through steganographic techniques, taking advantage of the substantial size of these models to conceal malicious data and use it for nefarious purposes, e.g., Remote Code Execution. Ensuring the security of AI models is a burgeoning area of research, essential for safeguarding the multitude of organizations and users relying on AI technologies. This study leverages well-studied image few-shot learning techniques by transferring AI models to the image domain using a novel image representation. Applying few-shot learning in this domain enables us to create practical models, a feat that previous works did not achieve. Our method addresses critical limitations in state-of-the-art detection techniques that hinder their practicality, reducing the required training dataset size from 40000 models to just 6. Furthermore, our method consistently detects subtle attacks with embedding rates as low as 25%, and even as low as 6% in some cases, whereas previous works were only shown to be effective for embedding rates between 50% and 100%. We employ a strict evaluation strategy to ensure the trained models generalize across various factors. In addition, we show that our trained models successfully detect novel spread-spectrum steganography attacks, demonstrating impressive robustness despite being trained on only one type of attack. We open-source our code to support reproducibility and advance research in this new field.
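
One hypothetical form such an image representation could take (the paper's exact representation may differ) is to expose the least significant mantissa bits of the weights, where LSB steganography hides its payload, as a grayscale image that an off-the-shelf image classifier can consume:

```python
import numpy as np

def weights_to_image(weights: np.ndarray, side: int = 64):
    """Hypothetical image representation of model weights: view each
    float32 as raw bits, keep the least significant byte (where LSB
    steganography embeds payload bits), and tile it into a grayscale image."""
    raw = weights.astype(np.float32).ravel().view(np.uint32)
    lsb_bytes = (raw & 0xFF).astype(np.uint8)   # low byte of the mantissa
    return np.resize(lsb_bytes, side * side).reshape(side, side)

img = weights_to_image(np.random.randn(100_000))
print(img.shape, img.dtype)  # (64, 64) uint8, ready for few-shot training
```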

Real-Time DEVS (RT-DEVS) can model systems with quantitative temporal requirements. Ensuring that such models satisfy given temporal properties requires more than simulation. In this work, we use the model checker Uppaal to verify a class of recurrent quantitative temporal properties appearing in RT-DEVS models. In addition, by introducing mutations to quantitative temporal properties, we are able to find errors in RT-DEVS models and their implementations. A case study from the railway domain is presented.
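
A minimal sketch of the property-mutation idea, assuming mutations that perturb numeric bounds and flip comparison operators in an Uppaal-style query (the query syntax and the mutation operators shown are simplified and illustrative):

```python
import re

def mutate_property(query: str):
    """Generate simple mutants of a quantitative temporal property by
    shifting each numeric bound by +/- 1 and flipping the first <= to >=.
    A mutant the model checker still accepts can expose an RT-DEVS model
    (or its implementation) that is under-constrained."""
    mutants = []
    for m in re.finditer(r"\d+", query):
        for new in (int(m.group()) - 1, int(m.group()) + 1):
            mutants.append(query[:m.start()] + str(new) + query[m.end():])
    if "<=" in query:
        mutants.append(query.replace("<=", ">=", 1))
    return mutants

# Illustrative Uppaal-style invariant: the bound 5 is perturbed and flipped
for q in mutate_property("A[] (Train.Approaching imply Train.x <= 5)"):
    print(q)
```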

Rhetorical Role Labeling (RRL) of legal documents is pivotal for various downstream tasks such as summarization, semantic case search, and argument mining. Existing approaches often overlook the varying difficulty levels inherent in legal document discourse styles and rhetorical roles. In this work, we propose HiCuLR, a hierarchical curriculum learning framework for RRL. It nests two curricula: a Rhetorical Role-level Curriculum (RC) on the outer layer and a Document-level Curriculum (DC) on the inner layer. DC categorizes documents by their difficulty, using metrics such as deviation from a standard discourse structure, and exposes the model to them in an easy-to-difficult fashion. RC progressively strengthens the model to discern coarse-to-fine-grained distinctions between rhetorical roles. Our experiments on four RRL datasets demonstrate the efficacy of HiCuLR, highlighting the complementary nature of DC and RC.
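
A minimal sketch of the document-level curriculum idea, with an illustrative difficulty score (the staging scheme and names below are assumptions, not HiCuLR's exact procedure):

```python
def document_curriculum(docs, difficulty, num_stages=4):
    """Document-level curriculum sketch: rank documents by a difficulty
    score, e.g. deviation from a standard discourse structure, and release
    them to the trainer in easy-to-difficult stages."""
    ranked = sorted(docs, key=difficulty)
    return [ranked[: int(len(ranked) * s / num_stages)]
            for s in range(1, num_stages + 1)]  # each stage adds harder docs

# Toy stand-in for the difficulty metric (e.g. edit distance between a
# document's rhetorical-role sequence and a canonical template):
scores = {"d1": 0.9, "d2": 0.1, "d3": 0.5, "d4": 0.3}
for stage in document_curriculum(["d1", "d2", "d3", "d4"], scores.get):
    print(stage)  # ['d2'], ['d2', 'd4'], ['d2', 'd4', 'd3'], then all four
```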

Reinforcement Learning (RL) has shown remarkable and generalizable capability in legged locomotion through sim-to-real transfer. However, while adaptive methods like domain randomization are expected to make the policy more robust across diverse environments, such comprehensiveness potentially detracts from the policy's performance in any specific environment, according to the No Free Lunch theorem, leading to a suboptimal solution once deployed in the real world. To address this issue, we propose a lifelong policy adaptation framework named LoopSR, which utilizes a transformer-based encoder to project real-world trajectories into a latent space and, accordingly, reconstruct the real-world environments back in simulation for further improvement. An autoencoder architecture and contrastive learning are adopted to better extract the characteristics of real-world dynamics. The simulation parameters for continual training are derived by combining the parameters predicted by the decoder with parameters retrieved from the simulation trajectory dataset. By leveraging this continual training, LoopSR achieves superior data efficiency compared with strong baselines, requiring only a limited amount of data to yield excellent performance in both sim-to-sim and sim-to-real experiments.
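
As a sketch of the contrastive component (LoopSR's actual encoder, loss, and pairing strategy are not specified here and may differ), an InfoNCE-style objective that aligns each real-world trajectory embedding with its matching simulated counterpart could look like:

```python
import torch
import torch.nn.functional as F

def info_nce(z_real, z_sim, temperature=0.1):
    """InfoNCE-style contrastive loss sketch: pull the embedding of each
    real-world trajectory toward its matching simulated trajectory and
    push it away from the other trajectories in the batch."""
    z_real = F.normalize(z_real, dim=-1)
    z_sim = F.normalize(z_sim, dim=-1)
    logits = z_real @ z_sim.T / temperature   # (B, B) cosine similarities
    targets = torch.arange(z_real.size(0))    # positives on the diagonal
    return F.cross_entropy(logits, targets)

loss = info_nce(torch.randn(8, 32), torch.randn(8, 32))  # random demo batch
print(loss.item())
```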

MAX NAE-SAT is a natural optimization problem, closely related to its better-known relative MAX SAT. The approximability status of MAX NAE-SAT is almost completely understood if all clauses have the same size $k$, for some $k\ge 2$. We refer to this problem as MAX NAE-$\{k\}$-SAT. For $k=2$, it is essentially the celebrated MAX CUT problem. For $k=3$, it is related to the MAX CUT problem in graphs that can be fractionally covered by triangles. For $k\ge 4$, it is known that an approximation ratio of $1-\frac{1}{2^{k-1}}$, obtained by choosing a random assignment, is optimal, assuming $P\ne NP$. For every $k\ge 2$, an approximation ratio of at least $\frac{7}{8}$ can be obtained for MAX NAE-$\{k\}$-SAT. There was some hope, therefore, that there is also a $\frac{7}{8}$-approximation algorithm for MAX NAE-SAT, where clauses of all sizes are allowed simultaneously. Our main result is that there is no $\frac{7}{8}$-approximation algorithm for MAX NAE-SAT, assuming the unique games conjecture (UGC). In fact, even for almost satisfiable instances of MAX NAE-$\{3,5\}$-SAT (i.e., MAX NAE-SAT where all clauses have size $3$ or $5$), the best approximation ratio that can be achieved, assuming UGC, is at most $\frac{3(\sqrt{21}-4)}{2}\approx 0.8739$. Using calculus of variations, we extend the analysis of O'Donnell and Wu for MAX CUT to MAX NAE-$\{3\}$-SAT. We obtain an optimal algorithm, assuming UGC, for MAX NAE-$\{3\}$-SAT, slightly improving on previous algorithms. The approximation ratio of the new algorithm is $\approx 0.9089$. We complement our theoretical results with some experimental results. We describe an approximation algorithm for almost satisfiable instances of MAX NAE-$\{3,5\}$-SAT with a conjectured approximation ratio of 0.8728, and an approximation algorithm for almost satisfiable instances of MAX NAE-SAT with a conjectured approximation ratio of 0.8698.
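
For reference, the random-assignment baseline follows from a one-line calculation: a size-$k$ NAE clause over distinct variables is violated only when all $k$ literals take the same value, i.e., in $2$ of the $2^k$ equally likely outcomes, so $\Pr[\text{clause satisfied}] = 1 - \frac{2}{2^k} = 1 - \frac{1}{2^{k-1}}$. This equals $\frac{1}{2}$ at $k=2$, exactly $\frac{7}{8}$ at $k=4$, and exceeds $\frac{7}{8}$ for every $k \ge 5$.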

Deep Convolutional Neural Networks (CNNs) are a special type of Neural Network that has shown state-of-the-art results on various competitive benchmarks. The powerful learning ability of deep CNNs is largely achieved through the use of multiple non-linear feature extraction stages that can automatically learn hierarchical representations from data. The availability of large amounts of data and improvements in hardware processing units have accelerated research in CNNs, and very interesting deep CNN architectures have recently been reported. The recent race in deep CNN architectures for achieving high performance on challenging benchmarks has shown that innovative architectural ideas, as well as parameter optimization, can improve CNN performance on various vision-related tasks. In this regard, different ideas in CNN design have been explored, such as the use of different activation and loss functions, parameter optimization, regularization, and the restructuring of processing units. However, the major improvement in representational capacity has been achieved by restructuring the processing units. In particular, the idea of using a block as a structural unit instead of a layer is gaining substantial appreciation. This survey thus focuses on the intrinsic taxonomy present in recently reported CNN architectures and, consequently, classifies the recent innovations in CNN architectures into seven categories, based on spatial exploitation, depth, multi-path, width, feature map exploitation, channel boosting, and attention. Additionally, it covers the elementary understanding of CNN components and sheds light on the current challenges and applications of CNNs.
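
To make the block-as-structural-unit idea concrete, the following PyTorch sketch (illustrative, not drawn from the survey) shows a residual block reused as the repeating unit of a network:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """A block as the structural unit: two conv layers plus an identity
    shortcut, stacked repeatedly instead of designing the net layer by layer."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return torch.relu(self.body(x) + x)  # the shortcut spans the block

net = nn.Sequential(*[ResidualBlock(16) for _ in range(4)])
print(net(torch.randn(1, 16, 32, 32)).shape)  # torch.Size([1, 16, 32, 32])
```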
