Discovering new intents is of great significance for establishing a bootstrapped task-oriented dialogue system. Most existing methods either lack the ability to transfer prior knowledge from the known intent data or fall into the dilemma of forgetting that prior knowledge in later stages. More importantly, these methods do not deeply explore the intrinsic structure of unlabeled data, so they cannot seek out the characteristics that define an intent in general. In this paper, starting from the intuition that discovering intents could be beneficial to the identification of the known intents, we propose a probabilistic framework for discovering intents in which intent assignments are treated as latent variables. We adopt the Expectation-Maximization (EM) framework for optimization. Specifically, in the E-step, we discover intents and explore the intrinsic structure of unlabeled data through the posterior of intent assignments. In the M-step, we alleviate the forgetting of prior knowledge transferred from known intents by optimizing the discrimination of labeled data. Extensive experiments conducted on three challenging real-world datasets demonstrate that our method can achieve substantial improvements.
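As a rough illustration of the E-step/M-step alternation with latent assignments, the sketch below runs generic EM on a toy one-dimensional mixture of two unit-variance Gaussians. It is not the method of the paper: the data, the function name `em_two_gaussians`, the fixed variance, and the initial means are all assumptions chosen for the example, and the paper's M-step (which uses labeled data) is not reproduced here.

```python
import math

def em_two_gaussians(xs, mu_init, n_iter=50):
    """Generic EM for a 1-D mixture of two unit-variance Gaussians (toy example)."""
    mu = list(mu_init)   # component means
    pi = [0.5, 0.5]      # mixing weights
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each latent assignment given the data.
        resp = []
        for x in xs:
            w = [pi[k] * math.exp(-0.5 * (x - mu[k]) ** 2) for k in range(2)]
            total = sum(w)
            resp.append([wk / total for wk in w])
        # M-step: re-estimate weights and means from the soft assignments.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / len(xs)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
    return mu, pi, resp

# Hypothetical data: two well-separated groups around -2 and +2.
xs = [-2.1, -1.9, -2.0, -2.2, -1.8, 2.0, 1.9, 2.1, 2.2, 1.8]
mu, pi, resp = em_two_gaussians(xs, mu_init=(-1.0, 1.0))
```

Each row of `resp` is the posterior over the two latent components for one data point, which plays the role that the posterior of intent assignments plays in the abstract.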

Related content

Latent variable

Optimizer · Layer · Autoencoder · Variational autoencoder · Learning

December 5, 2022

Bi-Level Optimization Augmented with Conditional Variational Autoencoder for Autonomous Driving in Dense Traffic

Arun Kumar Singh,Jatan Shrestha,Nicola Albarella

Autonomous driving has a natural bi-level structure. The goal of the upper behavioural layer is to provide appropriate lane change, speeding up, and braking decisions to optimize a given driving task. However, this layer can only indirectly influence the driving efficiency through the lower-level trajectory planner, which takes in the behavioural inputs to produce motion commands. Existing sampling-based approaches do not fully exploit the strong coupling between the behavioural and planning layers. On the other hand, end-to-end Reinforcement Learning (RL) can learn a behavioural layer while incorporating feedback from the lower-level planner. However, purely data-driven approaches often fail in safety metrics in unseen environments. This paper presents a novel alternative: a parameterized bi-level optimization that jointly computes the optimal behavioural decisions and the resulting downstream trajectory. Our approach runs in real-time using a custom GPU-accelerated batch optimizer and a warm-start strategy learnt by a Conditional Variational Autoencoder. Extensive simulations show that our approach outperforms state-of-the-art model predictive control and RL approaches in terms of collision rate while being competitive in driving efficiency.

Estimation/Estimator · Latent · MoDELS · Latent variable · Cognition

December 5, 2022

Longitudinal modeling of age-dependent latent traits with generalized additive latent and mixed models

Øystein Sørensen,Anders M. Fjell,Kristine B. Walhovd

We present generalized additive latent and mixed models (GALAMMs) for analysis of clustered data with responses and latent variables depending smoothly on observed variables. A scalable maximum likelihood estimation algorithm is proposed, utilizing the Laplace approximation, sparse matrix computation, and automatic differentiation. Mixed response types, heteroscedasticity, and crossed random effects are naturally incorporated into the framework. The models developed were motivated by applications in cognitive neuroscience, and two case studies are presented. First, we show how GALAMMs can jointly model the complex lifespan trajectories of episodic memory, working memory, and speed/executive function, measured by the California Verbal Learning Test (CVLT), digit span tests, and Stroop tests, respectively. Next, we study the effect of socioeconomic status on brain structure, using data on education and income together with hippocampal volumes estimated by magnetic resonance imaging. By combining semiparametric estimation with latent variable modeling, GALAMMs allow a more realistic representation of how brain and cognition vary across the lifespan, while simultaneously estimating latent traits from measured items. Simulation experiments suggest that model estimates are accurate even with moderate sample sizes.

Factorized · Approximation · Estimation/Estimator · Analysis · MoDELS

December 4, 2022

Approximate Factor Models with Weaker Loadings

Jushan Bai,Serena Ng

Pervasive cross-section dependence is increasingly recognized as a characteristic of economic data, and the approximate factor model provides a useful framework for analysis. Assuming a strong factor structure where $\Lambda^{\prime}\Lambda/N^\alpha$ is positive definite in the limit when $\alpha=1$, early work established convergence of the principal component estimates of the factors and loadings up to a rotation matrix. This paper shows that the estimates are still consistent and asymptotically normal when $\alpha\in(0,1]$, albeit at slower rates and under additional assumptions on the sample size. The results hold whether $\alpha$ is constant or varies across factors. The framework developed for heterogeneous loadings and the simplified proofs, which can also be used in the strong-factor case, are of independent interest.
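To make "principal component estimates of the factors and loadings" concrete, here is a minimal sketch for a single factor, assuming the common normalization in which the estimated factors satisfy F'F/T = 1. The noiseless toy data, the power-iteration eigen-solver, and the function names are assumptions made for illustration; the paper's weak-loading theory is not implemented here.

```python
import math

def top_eigenvector(a, n_iter=200):
    """Power iteration for the leading eigenvector of a symmetric PSD matrix."""
    n = len(a)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-degenerate start vector
    for _ in range(n_iter):
        w = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def pc_one_factor(x):
    """x is a T-by-N list of lists; returns (factor estimates, loading estimates)."""
    t_len, n = len(x), len(x[0])
    # T-by-T matrix X X'.
    xxt = [[sum(x[s][i] * x[t][i] for i in range(n)) for t in range(t_len)]
           for s in range(t_len)]
    v = top_eigenvector(xxt)
    f_hat = [math.sqrt(t_len) * vi for vi in v]  # enforces F'F / T = 1
    lam_hat = [sum(x[t][i] * f_hat[t] for t in range(t_len)) / t_len
               for i in range(n)]
    return f_hat, lam_hat

# Hypothetical noiseless data generated by one factor.
f_true = [1.0, -0.5, 2.0, 0.3, -1.2]
lam_true = [0.8, -1.1, 0.4, 1.5]
x = [[ft * li for li in lam_true] for ft in f_true]
f_hat, lam_hat = pc_one_factor(x)
```

With one factor the "rotation matrix" reduces to a scalar: `f_hat` and `lam_hat` differ from `f_true` and `lam_true` by a scale and possibly a sign, while the product `f_hat[t] * lam_hat[i]` recovers the common component `x[t][i]`.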

Task-oriented dialogue system · Controller · Factorized · Dataset · Identifiable

December 4, 2022

Constructing Highly Inductive Contexts for Dialogue Safety through Controllable Reverse Generation

Zhexin Zhang,Jiale Cheng,Hao Sun,Jiawen Deng,Fei Mi,Yasheng Wang,Lifeng Shang,Minlie Huang

from arxiv, Findings of EMNLP 2022

Large pretrained language models can easily produce toxic or biased content, which is prohibitive for practical use. In order to detect such toxic generations, existing methods rely on templates, real-world data extraction, crowdsourcing workers, or automatic generation to construct adversarial contexts that are likely to induce toxic generations. However, what type of context is more likely to induce unsafe responses is still under-explored. In this paper, we identify that context toxicity and context category (e.g., profanity, insult, drugs, etc.) are two important factors that cause safety issues in response generation. Hence, we propose a method called reverse generation to construct adversarial contexts conditioned on a given response, with the flexibility to control category, toxicity level, and inductivity of the generated contexts. Via reverse generation, we augment the existing BAD dataset and construct a new dataset BAD+ which contains more than 120K diverse and highly inductive contexts in 12 categories. We test three popular pretrained dialogue models (Blender, DialoGPT, and Plato2) and find that BAD+ can largely expose their safety problems. Furthermore, we show that BAD+ can greatly enhance the safety of generation and reveal the key factors of safety improvement. Our code and dataset are available at //github.com/thu-coai/Reverse_Generation.

Estimation/Estimator · Structural Equation Modeling · Inference · MoDELS · Identifiable

December 2, 2022

Deep Counterfactual Estimation with Categorical Background Variables

Edward De Brouwer

from arxiv, Proceedings of the NeurIPS 2022 Conference

Referred to as the third rung of the causal inference ladder, counterfactual queries typically ask the "What if?" question retrospectively. The standard approach to estimating counterfactuals relies on a structural equation model that accurately reflects the underlying data generating process. However, such models are seldom available in practice and one usually wishes to infer them from observational data alone. Unfortunately, the correct structural equation model is in general not identifiable from the observed factual distribution. Nevertheless, in this work, we show that under the assumption that the main latent contributors to the treatment responses are categorical, the counterfactuals can still be reliably predicted. Building upon this assumption, we introduce CounterFactual Query Prediction (CFQP), a novel method to infer counterfactuals from continuous observations when the background variables are categorical. We show that our method significantly outperforms previously available deep-learning-based counterfactual methods, both theoretically and empirically on time series and image data. Our code is available at //github.com/edebrouwer/cfqp.

Analysis · Feature space · Cluster · Identifiable · Path

December 2, 2022

Clustering through Feature Space Sequence Discovery and Analysis

Shi Guobin

from arxiv, 19 pages, 20 figures; individual work, published individually. All benchmarks and experiments were designed and run with Python 3.7. I imagine the results have profound meaning; I think density may be a hidden dimension.

Identifying high-dimensional data patterns without a priori knowledge is an important task of data science. This paper proposes a simple and efficient nonparametric algorithm, Data Convert to Sequence Analysis (DCSA), which dynamically explores each point in the feature space without repetition until a directed Hamilton path is found. Based on change point analysis theory, the sequence corresponding to the path is cut into several fragments to achieve clustering. Experiments on real-world datasets from different fields, with dimensions ranging from 4 to 20531, confirm that the method is robust and that its results are visually interpretable.
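The general idea of turning points into a visiting sequence and then cutting that sequence can be sketched as follows. The abstract does not say how the path is built or how change points are detected, so this example assumes a greedy nearest-unvisited-neighbour walk and a plain step-length threshold as stand-ins; the function names, the threshold, and the data are hypothetical and this is not the DCSA algorithm itself.

```python
import math

def greedy_path(points, start=0):
    """Visit every point exactly once, always moving to the nearest unvisited point."""
    unvisited = set(range(len(points)))
    unvisited.remove(start)
    order = [start]
    steps = []  # length of each move along the path
    while unvisited:
        cur = points[order[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(cur, points[j]))
        steps.append(math.dist(cur, points[nxt]))
        order.append(nxt)
        unvisited.remove(nxt)
    return order, steps

def cut_sequence(order, steps, threshold):
    """Start a new fragment whenever a move is longer than the threshold."""
    clusters = [[order[0]]]
    for idx, step in zip(order[1:], steps):
        if step > threshold:
            clusters.append([])
        clusters[-1].append(idx)
    return clusters

# Hypothetical 2-D data with two well-separated groups.
points = [(0.0, 0.0), (0.1, 0.0), (0.25, 0.05), (5.0, 5.0), (5.1, 5.0), (5.3, 5.1)]
order, steps = greedy_path(points)
clusters = cut_sequence(order, steps, threshold=1.0)
```

On this toy input the only long move is the jump between the two groups, so the sequence is cut once and each fragment corresponds to one group.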

Mutually independent · Conditionally independent · Performer · Analysis · Performance metric

December 1, 2022

Towards Dynamic Causal Discovery with Rare Events: A Nonparametric Conditional Independence Test

Chih-Yuan Chiu,Kshitij Kulkarni,Shankar Sastry

Causal phenomena associated with rare events occur across a wide range of engineering problems, such as risk-sensitive safety analysis, accident analysis and prevention, and extreme value theory. However, current methods for causal discovery are often unable to uncover causal links, between random variables in a dynamic setting, that manifest only when the variables first experience low-probability realizations. To address this issue, we introduce a novel statistical independence test on data collected from time-invariant dynamical systems in which rare but consequential events occur. In particular, we exploit the time-invariance of the underlying data to construct a superimposed dataset of the system state before rare events happen at different timesteps. We then design a conditional independence test on the reorganized data. We provide non-asymptotic sample complexity bounds for the consistency of our method, and validate its performance across various simulated and real-world datasets, including incident data collected from the Caltrans Performance Measurement System (PeMS). Code containing the datasets and experiments is publicly available.

ASSETS · Processing (programming language) · Bitcoin · Blockchain · Learning

November 28, 2022

Predicting Digital Asset Prices using Natural Language Processing: a survey

Trang Tran

Blockchain technology has changed how people think about how they used to store and trade their assets, as it introduced us to a whole new way to transact: using digital currencies. One of the major innovations of blockchain technology is decentralization, meaning that traditional financial intermediaries, such as asset-backed security issuers and banks, are eliminated in the process. Even though blockchain technology has been utilized in a wide range of industries, its most prominent application is still cryptocurrencies, with Bitcoin being the first proposed. At its peak in 2021, the market cap for Bitcoin once surpassed 1 trillion US dollars. The open nature of the crypto market poses various challenges and concerns for both potential retail investors and institutional investors, as the price of the investment is highly volatile, and its fluctuations are unpredictable. The rise of Machine Learning, and Natural Language Processing, in particular, has shed some light on monitoring and predicting the price behaviors of cryptocurrencies. This paper aims to review and analyze the recent efforts in applying Machine Learning and Natural Language Processing methods to predict the prices and analyze the behaviors of digital assets such as Bitcoin and Ethereum.

Generalization theory · INFORMS · Performer · Test sample · state-of-the-art

March 29, 2021

Adaptive Methods for Real-World Domain Generalization

Abhimanyu Dubey,Vignesh Ramanathan,Alex Pentland,Dhruv Mahajan

from arxiv, To appear as an oral presentation in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021

Invariant approaches have been remarkably successful in tackling the problem of domain generalization, where the objective is to perform inference on data distributions different from those used in training. In our work, we investigate whether it is possible to leverage domain information from the unseen test samples themselves. We propose a domain-adaptive approach consisting of two steps: a) we first learn a discriminative domain embedding from unsupervised training examples, and b) we use this domain embedding as supplementary information to build a domain-adaptive model that takes both the input and its domain into account while making predictions. For unseen domains, our method simply uses a few unlabelled test examples to construct the domain embedding. This enables adaptive classification on any unseen domain. Our approach achieves state-of-the-art performance on various domain generalization benchmarks. In addition, we introduce the first real-world, large-scale domain generalization benchmark, Geo-YFCC, containing 1.1M samples over 40 training, 7 validation, and 15 test domains, orders of magnitude larger than prior work. We show that the existing approaches either do not scale to this dataset or underperform compared to the simple baseline of training a model on the union of data from all training domains. In contrast, our approach achieves a significant improvement.
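The two-step structure described above (build a domain embedding from a few unlabelled samples, then feed it to the model alongside the input) can be sketched in a few lines. This example assumes that a plain average of feature vectors stands in for the learned discriminative embedding and that the embedding is supplied by simple concatenation; both choices, along with the function names and the data, are illustrative assumptions and not the paper's implementation.

```python
def domain_embedding(samples):
    """Average the feature vectors of a few unlabelled samples from one domain."""
    dim = len(samples[0])
    return [sum(s[d] for s in samples) / len(samples) for d in range(dim)]

def augment(x, embedding):
    """Concatenate an input's features with the embedding of its domain."""
    return list(x) + list(embedding)

# Hypothetical features of three unlabelled test samples from an unseen domain.
unseen_domain_samples = [[1.0, 2.0], [3.0, 4.0], [2.0, 0.0]]
emb = domain_embedding(unseen_domain_samples)

# The downstream classifier would receive the input together with its domain embedding.
model_input = augment([0.5, 0.5], emb)
```

No labels from the unseen domain are needed at any point, which is the property the abstract emphasizes.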

Graph · Networking · INTERACT · INFORMS · Graphics processing unit (GPU)

November 25, 2020

Time-Series Event Prediction with Evolutionary State Graph

Wenjie Hu,Yang Yang,Ziqiang Cheng,Carl Yang,Xiang Ren

from arxiv, A long version of EvoNet (WSDM 2021)

The accurate and interpretable prediction of future events in time-series data often requires capturing the representative patterns (referred to as states) that underpin the observed data. To this end, most existing studies focus on the representation and recognition of states, but ignore the changing transitional relations among them. In this paper, we present the evolutionary state graph, a dynamic graph structure designed to systematically represent the evolving relations (edges) among states (nodes) over time. We analyze the dynamic graphs constructed from the time-series data and show that changes in the graph structures (e.g., edges connecting certain state nodes) can inform the occurrences of events (i.e., time-series fluctuation). Inspired by this, we propose a novel graph neural network model, Evolutionary State Graph Network (EvoNet), to encode the evolutionary state graph for accurate and interpretable time-series event prediction. Specifically, Evolutionary State Graph Network models both the node-level (state-to-state) and graph-level (segment-to-segment) propagation, and captures the node-graph (state-to-segment) interactions over time. Experimental results based on five real-world datasets show that our approach not only achieves clear improvements compared with 11 baselines, but also provides more insights towards explaining the results of event predictions.
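A simplified view of a graph whose edges between states evolve over time can be sketched with transition counts. This example assumes that each time step has already been assigned a single hard state label and that one graph is built per fixed-size window; the function names, the window size, and the change measure are hypothetical, and the sketch covers only graph construction, not the EvoNet model.

```python
from collections import Counter

def state_graphs(states, window):
    """Count state-to-state transitions (edges) within consecutive windows."""
    graphs = []
    for start in range(0, len(states) - 1, window):
        chunk = states[start:start + window + 1]
        graphs.append(Counter(zip(chunk, chunk[1:])))
    return graphs

def graph_change(g_prev, g_next):
    """L1 distance between the edge counts of two graphs."""
    edges = set(g_prev) | set(g_next)
    return sum(abs(g_prev[e] - g_next[e]) for e in edges)

# Hypothetical state labels: the series alternates between a and b, then settles in c.
states = ["a", "b", "a", "b", "a", "c", "c", "c", "c"]
graphs = state_graphs(states, window=4)
change = graph_change(graphs[0], graphs[1])
```

A large value of `change` between adjacent windows signals that the transition structure has shifted, which is the kind of structural change the abstract links to the occurrence of events.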