Dialogue systems are frequently updated to accommodate new services, but naively updating them by continually training with data for new services results in diminishing performance on previously learnt services. Motivated by the insight that dialogue state tracking (DST), a crucial component of dialogue systems that estimates the user's goal as a conversation proceeds, is a simple natural language understanding task, we propose reformulating it as a bundle of granular example-guided question answering tasks to minimize the task shift between services and thus benefit continual learning. Our approach alleviates service-specific memorization and teaches a model to contextualize the given question and example to extract the necessary information from the conversation. We find that a model with just 60M parameters can achieve a significant boost by learning to learn from in-context examples retrieved by a retriever trained to identify turns with similar dialogue state changes. Combining our method with dialogue-level memory replay, our approach attains state-of-the-art performance on DST continual learning metrics without relying on any complex regularization or parameter expansion methods.

Related content

February 2, 2024

Approximate Control for Continuous-Time POMDPs

Yannick Eich,Bastian Alt,Heinz Koeppl

from arxiv, To be published in AISTATS 2024

This work proposes a decision-making framework for partially observable systems in continuous time with discrete state and action spaces. As optimal decision-making becomes intractable for large state spaces we employ approximation methods for the filtering and the control problem that scale well with an increasing number of states. Specifically, we approximate the high-dimensional filtering distribution by projecting it onto a parametric family of distributions, and integrate it into a control heuristic based on the fully observable system to obtain a scalable policy. We demonstrate the effectiveness of our approach on several partially observed systems, including queueing systems and chemical reaction networks.

February 2, 2024

A Survey on Data-Centric Recommender Systems

Riwei Lai,Li Chen,Rui Chen,Chi Zhang

Recommender systems (RSs) have become an essential tool for mitigating information overload in a range of real-world applications. Recent trends in RSs have revealed a major paradigm shift, moving the spotlight from model-centric innovations to data-centric efforts (e.g., improving data quality and quantity). This evolution has given rise to the concept of data-centric recommender systems (Data-Centric RSs), marking a significant development in the field. This survey provides the first systematic overview of Data-Centric RSs, covering 1) the foundational concepts of recommendation data and Data-Centric RSs; 2) three primary issues of recommendation data; 3) recent research developed to address these issues; and 4) several potential future directions of Data-Centric RSs.

February 2, 2024

Low Acceptance Agreement Tests via Bounded-Degree Symplectic HDXs

Yotam Dikstein,Irit Dinur,Alexander Lubotzky

from arxiv, arXiv admin note: text overlap with arXiv:2312.15325

We solve the derandomized direct product testing question in the low acceptance regime, by constructing new high dimensional expanders that have no small connected covers. We show that our complexes have swap cocycle expansion, which allows us to deduce the agreement theorem by relying on previous work. Derandomized direct product testing, also known as agreement testing, is the following problem. Let X be a family of k-element subsets of [n] and let $\{f_s:s\to\Sigma\}_{s\in X}$ be an ensemble of local functions, each defined over a subset $s\subset [n]$. Suppose that we run the following so-called agreement test: choose a random pair of sets $s_1,s_2\in X$ that intersect on $\sqrt k$ elements, and accept if $f_{s_1},f_{s_2}$ agree on the elements in $s_1\cap s_2$. We denote the success probability of this test by $Agr(\{f_s\})$. Given that $Agr(\{f_s\})=\epsilon>0$, is there a global function $G:[n]\to\Sigma$ such that $f_s = G|_s$ for a non-negligible fraction of $s\in X$? We construct a family X of k-subsets of $[n]$ such that $|X| = O(n)$ and such that it satisfies the low acceptance agreement theorem. Namely, $Agr (\{f_s\}) > \epsilon \; \; \longrightarrow$ there is a function $G:[n]\to\Sigma$ such that $\Pr_s[f_s\overset{0.99}{\approx} G|_s]\geq poly(\epsilon)$. A key idea is to replace the well-studied LSV complexes by symplectic high dimensional expanders (HDXs). The family X is just the k-faces of the new symplectic HDXs. The latter serve our needs better since their fundamental group satisfies the congruence subgroup property, which implies that they lack small covers.
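
The agreement test described in this abstract can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the paper's construction: the family X here is the complete set of k-subsets of [n] (the paper's point is a sparse family with $|X| = O(n)$), and the function name `agreement_rate`, the two example ensembles, and the parameters n = 8, k = 4 are all hypothetical choices made for illustration.

```python
import itertools
import math
import random


def agreement_rate(local_fns, overlap, trials=2000, seed=0):
    """Monte Carlo estimate of Agr({f_s}).

    local_fns maps each set s (a frozenset) to its local function f_s,
    stored as a dict {element: symbol}.
    """
    rng = random.Random(seed)
    # All unordered pairs of sets whose intersection has exactly `overlap` elements.
    pairs = [(a, b) for a, b in itertools.combinations(local_fns, 2)
             if len(a & b) == overlap]
    if not pairs:
        raise ValueError("no pair of sets has the requested overlap")
    accepted = 0
    for _ in range(trials):
        s1, s2 = rng.choice(pairs)
        f1, f2 = local_fns[s1], local_fns[s2]
        # Accept when the two local functions agree on every shared element.
        if all(f1[x] == f2[x] for x in s1 & s2):
            accepted += 1
    return accepted / trials


# Toy instance (assumption): X is ALL k-subsets of [n], not a sparse HDX family.
n, k = 8, 4
overlap = math.isqrt(k)  # sqrt(k) = 2 shared elements
X = [frozenset(c) for c in itertools.combinations(range(n), k)]

# Ensemble 1: every f_s is the restriction of one global function G.
G = {x: x % 2 for x in range(n)}
consistent = {s: {x: G[x] for x in s} for s in X}

# Ensemble 2: f_s is constant on s with value min(s), so two sets
# agree on their intersection only when they share the same minimum.
inconsistent = {s: {x: min(s) for x in s} for s in X}

rate_consistent = agreement_rate(consistent, overlap)
rate_inconsistent = agreement_rate(inconsistent, overlap)
print(rate_consistent, rate_inconsistent)
```

An ensemble obtained by restricting a single global function passes every test, so its estimated rate is exactly 1.0. The second ensemble passes only some of the time, which is the regime where the question of recovering a global G becomes interesting.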

February 2, 2024

Adaptive Crowdsourcing Via Self-Supervised Learning

Anmol Kagrecha,Henrik Marklund,Benjamin Van Roy,Hong Jun Jeon,Richard Zeckhauser

from arxiv, 33 pages, 3 figures

Common crowdsourcing systems average estimates of a latent quantity of interest provided by many crowdworkers to produce a group estimate. We develop a new approach -- predict-each-worker -- that leverages self-supervised learning and a novel aggregation scheme. This approach adapts weights assigned to crowdworkers based on estimates they provided for previous quantities. When skills vary across crowdworkers or their estimates correlate, the weighted sum offers a more accurate group estimate than the average. Existing algorithms such as expectation maximization can, at least in principle, produce similarly accurate group estimates. However, their computational requirements become onerous when complex models, such as neural networks, are required to express relationships among crowdworkers. Predict-each-worker accommodates such complexity as well as many other practical challenges. We analyze the efficacy of predict-each-worker through theoretical and computational studies. Among other things, we establish asymptotic optimality as the number of engagements per crowdworker grows.
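
The claim that a weighted sum can beat the plain average when skills vary can be checked with a small simulation. The sketch below is not the predict-each-worker algorithm: it assumes independent Gaussian noise and assumes each worker's noise level is already known, so it can use textbook inverse-variance weights, whereas the paper adapts the weights from past estimates. The function name `simulate` and the noise levels are hypothetical.

```python
import random
import statistics


def simulate(noise_sds, n_quantities=5000, seed=0):
    """Compare mean squared error of the plain average and a weighted sum."""
    rng = random.Random(seed)
    # Inverse-variance weights, normalized to sum to one (assumes known noise levels).
    raw = [1.0 / sd ** 2 for sd in noise_sds]
    weights = [w / sum(raw) for w in raw]

    avg_sq_err, wtd_sq_err = [], []
    for _ in range(n_quantities):
        truth = rng.gauss(0.0, 1.0)  # latent quantity of interest
        estimates = [truth + rng.gauss(0.0, sd) for sd in noise_sds]
        average = sum(estimates) / len(estimates)
        weighted = sum(w * e for w, e in zip(weights, estimates))
        avg_sq_err.append((average - truth) ** 2)
        wtd_sq_err.append((weighted - truth) ** 2)
    return statistics.fmean(avg_sq_err), statistics.fmean(wtd_sq_err)


# One skilled worker, two mediocre ones, one very noisy one (hypothetical values).
avg_mse, wtd_mse = simulate([0.2, 1.0, 1.0, 3.0])
print(f"plain average MSE: {avg_mse:.3f}, weighted MSE: {wtd_mse:.3f}")
```

With these noise levels the plain average is dominated by the noisiest worker, while the weighted sum leans on the most skilled one. Correlated estimates, which the abstract also mentions, would need weights derived from the full covariance and are outside this sketch.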

February 1, 2024

The En Route Truck-Drone Delivery Problem

Danny Krizanc,Lata Narayanan,Jaroslav Opatrny,Denis Pankratov

We study the truck-drone cooperative delivery problem in a setting where a single truck carrying a drone travels at constant speed on a straight-line trajectory/street. Delivery to clients located in the plane and not on the truck's trajectory is performed by the drone, which has limited carrying capacity and flying range, and whose battery can be recharged when on the truck. We show that the problem of maximizing the number of deliveries is strongly NP-hard even in this simple setting. We present a 2-approximation algorithm for the problem, and an optimal algorithm for a non-trivial family of instances.

February 1, 2024

Probabilistic Design of Multi-Dimensional Spatially-Coupled Codes

Canberk İrimağzı,Ata Tanrıkulu,Ahmed Hareedy

from arxiv, 12 pages (double column), 5 figures, the short version has been submitted to the IEEE International Symposium on Information Theory (ISIT)

Because of their excellent asymptotic and finite-length performance, spatially-coupled (SC) codes are a class of low-density parity-check codes that is gaining increasing attention. Multi-dimensional (MD) SC codes are constructed by connecting copies of an SC code via relocations in order to mitigate various sources of non-uniformity and improve performance in many data storage and data transmission systems. As the number of degrees of freedom in the MD-SC code design increases, appropriately exploiting them becomes more difficult because of the complexity growth of the design process. In this paper, we propose a probabilistic framework for the MD-SC code design, which is based on the gradient-descent (GD) algorithm, to design better MD codes and address this challenge. In particular, we express the expected number of short cycles, which we seek to minimize, in the graph representation of the code in terms of entries of a probability-distribution matrix that characterizes the MD-SC code design. We then find a locally-optimal probability distribution, which serves as the starting point of a finite-length algorithmic optimizer that produces the final MD-SC code. We offer the theoretical analysis as well as the algorithms, and we present experimental results demonstrating that our MD codes, conveniently called GD-MD codes, have notably lower short cycle numbers compared with the available state-of-the-art. Moreover, our algorithms converge on solutions in few iterations, which confirms the complexity reduction as a result of limiting the search space via the locally-optimal GD-MD distributions.

January 31, 2024

Prompt-Driven LLM Safeguarding via Directed Representation Optimization

Chujie Zheng,Fan Yin,Hao Zhou,Fandong Meng,Jie Zhou,Kai-Wei Chang,Minlie Huang,Nanyun Peng

Prepending model inputs with safety prompts is a common practice of safeguarding large language models (LLMs) from complying with queries that contain harmful intents. However, the working mechanisms of safety prompts have not yet been fully understood, which hinders the potential for automatically optimizing them for improved LLM safety. Motivated by this problem, we investigate the impact of safety prompts from the perspective of model representations. We find that in models' representation space, harmful and harmless queries can be largely distinguished, but this is not noticeably enhanced by safety prompts. Instead, the queries' representations are moved by different safety prompts in similar directions, where models become more prone to refusal (i.e., refusing to provide assistance) even when the queries are harmless. Inspired by these findings, we propose a method called DRO (Directed Representation Optimization) for automatic safety prompt optimization. DRO treats safety prompts as continuous, trainable embeddings and learns to move the representations of harmful/harmless queries along/opposite the direction in which the model's refusal probability increases. We demonstrate that DRO remarkably improves the safeguarding performance of human-crafted safety prompts and outperforms strong baselines, as evaluated on out-of-domain benchmarks, without compromising the general model capability.

January 31, 2024

Collaborative Multi-Object Tracking with Conformal Uncertainty Propagation

Sanbao Su,Songyang Han,Yiming Li,Zhili Zhang,Chen Feng,Caiwen Ding,Fei Miao

from arxiv, This paper has been accepted by IEEE Robotics and Automation Letters

Object detection and multiple object tracking (MOT) are essential components of self-driving systems. Accurate detection and uncertainty quantification are both critical for onboard modules, such as perception, prediction, and planning, to improve the safety and robustness of autonomous vehicles. Collaborative object detection (COD) has been proposed to improve detection accuracy and reduce uncertainty by leveraging the viewpoints of multiple agents. However, little attention has been paid to how to leverage the uncertainty quantification from COD to enhance MOT performance. In this paper, as the first attempt to address this challenge, we design an uncertainty propagation framework called MOT-CUP. Our framework first quantifies the uncertainty of COD through direct modeling and conformal prediction, and propagates this uncertainty information into the motion prediction and association steps. MOT-CUP is designed to work with different collaborative object detectors and baseline MOT algorithms. We evaluate MOT-CUP on V2X-Sim, a comprehensive collaborative perception dataset, and demonstrate a 2% improvement in accuracy and a 2.67X reduction in uncertainty compared to the baselines, e.g. SORT and ByteTrack. In scenarios characterized by high occlusion levels, our MOT-CUP demonstrates a noteworthy $4.01\%$ improvement in accuracy. MOT-CUP demonstrates the importance of uncertainty quantification in both COD and MOT, and provides the first attempt to improve the accuracy and reduce the uncertainty in MOT based on COD through uncertainty propagation. Our code is public on //coperception.github.io/MOT-CUP/.
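
For readers unfamiliar with conformal prediction, the following is a minimal sketch of generic split conformal prediction on scalar residuals. It is not MOT-CUP's pipeline: the simulated "detector error", the function name `conformal_radius`, and the sample sizes are assumptions made for illustration. It only shows how a calibration set yields an uncertainty radius with approximately the requested coverage.

```python
import math
import random


def conformal_radius(abs_residuals, alpha):
    """Split-conformal radius: the ceil((n+1)(1-alpha))-th smallest residual."""
    n = len(abs_residuals)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(abs_residuals)[rank - 1]


rng = random.Random(0)


def simulated_error():
    # Hypothetical error of a detector on one box coordinate:
    # Gaussian noise whose scale varies from sample to sample.
    return rng.gauss(0.0, 1.0) * rng.choice([0.5, 1.0, 2.0])


alpha = 0.1  # target miscoverage, i.e. roughly 90% coverage
calibration = [abs(simulated_error()) for _ in range(1000)]
radius = conformal_radius(calibration, alpha)

# Check on fresh samples how often the true value falls within +/- radius.
test_errors = [abs(simulated_error()) for _ in range(2000)]
coverage = sum(e <= radius for e in test_errors) / len(test_errors)
print(f"radius={radius:.3f}, empirical coverage={coverage:.3f}")
```

The guarantee behind this recipe is marginal and relies on calibration and test errors being exchangeable, so the empirical coverage fluctuates around the 90% target rather than matching it exactly.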

January 31, 2024

Instruction-Guided Scene Text Recognition

Yongkun Du,Zhineng Chen,Yuchen Su,Caiyan Jia,Yu-Gang Jiang

Multi-modal models have recently shown appealing performance in visual tasks, as instruction-guided training has evoked the ability to understand fine-grained visual content. However, current methods cannot be trivially applied to scene text recognition (STR) due to the gap between natural and text images. In this paper, we introduce a novel paradigm that formulates STR as an instruction learning problem, and propose instruction-guided scene text recognition (IGTR) to achieve effective cross-modal learning. IGTR first generates rich and diverse instruction triplets of <condition,question,answer>, serving as guidance for nuanced text image understanding. Then, we devise an architecture with a dedicated cross-modal feature fusion module and a multi-task answer head to effectively fuse the required instruction and image features for answering questions. Built upon these designs, IGTR facilitates accurate text recognition by comprehending character attributes. Experiments on English and Chinese benchmarks show that IGTR outperforms existing models by significant margins. Furthermore, by adjusting the instructions, IGTR enables various recognition schemes. These include zero-shot prediction, where the model is trained based on instructions not explicitly targeting character recognition, and the recognition of rarely appearing and morphologically similar characters, which were previously challenging for existing models.

April 16, 2021

Deep Stable Learning for Out-Of-Distribution Generalization

Xingxuan Zhang,Peng Cui,Renzhe Xu,Linjun Zhou,Yue He,Zheyan Shen

Approaches based on deep neural networks have achieved striking performance when testing data and training data share a similar distribution, but can fail significantly otherwise. Therefore, eliminating the impact of distribution shifts between training and testing data is crucial for building performance-promising deep models. Conventional methods assume either the known heterogeneity of training data (e.g. domain labels) or the approximately equal capacities of different domains. In this paper, we consider a more challenging case where neither of the above assumptions holds. We propose to address this problem by removing the dependencies between features via learning weights for training samples, which helps deep models get rid of spurious correlations and, in turn, concentrate more on the true connection between discriminative features and labels. Through extensive experiments on distribution generalization benchmarks including PACS, VLCS, MNIST-M, and NICO, we show the effectiveness of our method compared with state-of-the-art counterparts.