MoDELS · Learning · INFORMS · Performer · Correlation coefficient ·

April 4, 2024

Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling

Akash Srivastava, Yamini Bansal, Yukun Ding, Cole Lincoln Hurwitz, Kai Xu, Bernhard Egger, Prasanna Sattigeri, Joshua B. Tenenbaum, Phuong Le, Arun Prakash R, Nengfeng Zhou, Joel Vaughan, Yaquan Wang, Anwesha Bhattacharyya, Kristjan Greenewald, David D. Cox, Dan Gutfreund

Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors. This approach introduces a trade-off between disentangled representation learning and reconstruction quality, since the model does not have enough capacity to learn correlated latent variables that capture the detailed information present in most image data. To overcome this trade-off, we present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method; then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables, adding detailed information while maintaining conditioning on the previously learned disentangled factors. Taken together, our multi-stage modeling approach results in a single, coherent probabilistic model that is theoretically justified by the principle of D-separation and can be realized with a variety of model classes, including likelihood-based models such as variational autoencoders, implicit models such as generative adversarial networks, and tractable models like normalizing flows or mixtures of Gaussians. We demonstrate that our multi-stage model has higher reconstruction quality than current state-of-the-art methods with equivalent disentanglement performance across multiple standard benchmarks. In addition, we apply the multi-stage model to generate synthetic tabular datasets, showing improved performance over benchmark models across a variety of metrics. The interpretability analysis further indicates that the multi-stage model can effectively uncover distinct and meaningful features of variation from which the original distribution can be recovered.

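Related content

MoDELS

The ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems is the premier conference series for model-driven software and systems engineering, organized with the support of ACM-SIGSOFT and IEEE-TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. MODELS attendees come from diverse backgrounds, including researchers, academics, engineers, and industrial professionals. MODELS 2019 is a forum where participants can exchange cutting-edge research results and innovative practical experience around modeling and model-driven software and systems. This year's edition will give the modeling community an opportunity to further advance the foundations of modeling and to propose innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, open source, and sustainability. Official website link:

MoDELS · Identifiable · Annotation · Biased ·

May 27, 2024

Incremental Sequence Labeling: A Tale of Two Shifts

Shengjie Qiu, Junhao Zheng, Zhen Liu, Yicheng Luo, Qianli Ma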

from arxiv, accepted to ACL 2024

The incremental sequence labeling task involves continuously learning new classes over time while retaining knowledge of the previous ones. Our investigation identifies two significant semantic shifts: E2O (where the model mislabels an old entity as a non-entity) and O2E (where the model labels a non-entity or old entity as a new entity). Previous research has predominantly focused on addressing the E2O problem, neglecting the O2E issue. This neglect results in a model bias towards classifying new data samples as belonging to the new class during the learning process. To address these challenges, we propose a novel framework, Incremental Sequential Labeling without Semantic Shifts (IS3). Motivated by the identified semantic shifts (E2O and O2E), IS3 aims to mitigate catastrophic forgetting in models. As for the E2O problem, we use knowledge distillation to maintain the model's discriminative ability for old entities. Simultaneously, to tackle the O2E problem, we alleviate the model's bias towards new entities through debiased loss and optimization levels. Our experimental evaluation, conducted on three datasets with various incremental settings, demonstrates the superior performance of IS3 compared to the previous state-of-the-art method by a significant margin. The data, code, and scripts are publicly available at //github.com/zzz47zzz/codebase-for-incremental-learning-with-llm.

Perplexity · Performer · Robustness · Learning · Rank ·

May 27, 2024

On the Noise Robustness of In-Context Learning for Text Generation

Hongfu Gao, Feipeng Zhang, Wenyu Jiang, Jun Shu, Feng Zheng, Hongxin Wei

Large language models (LLMs) have shown impressive performance on downstream tasks by in-context learning (ICL), which heavily relies on the quality of demonstrations selected from a large set of annotated examples. Recent works claim that in-context learning is robust to noisy demonstrations in text classification. In this work, we show that, on text generation tasks, noisy annotations significantly hurt the performance of in-context learning. To circumvent the issue, we propose a simple and effective approach called Local Perplexity Ranking (LPR), which replaces the "noisy" candidates with their nearest neighbors that are more likely to be clean. Our method is motivated by analyzing the perplexity deviation caused by noisy labels and decomposing perplexity into inherent perplexity and matching perplexity. Our key idea behind LPR is thus to decouple the matching perplexity by performing the ranking among the neighbors in semantic space. Our approach can prevent the selected demonstrations from including mismatched input-label pairs while preserving the effectiveness of the original selection methods. Extensive experiments demonstrate the effectiveness of LPR, improving the EM score by up to 18.75 on common benchmarks with noisy annotations.

Language modeling · Knowledge · Vision · MoDELS · Extensibility ·

May 27, 2024

Diagnosing the Compositional Knowledge of Vision Language Models from a Game-Theoretic View

Jin Wang, Shichao Dong, Yapeng Zhu, Kelu Yao, Weidong Zhao, Chao Li, Ping Luo

from arxiv, 21 pages, 8 figures

Compositional reasoning capabilities are usually considered as fundamental skills to characterize human perception. Recent studies show that current Vision Language Models (VLMs) surprisingly lack sufficient knowledge with respect to such capabilities. To this end, we propose to thoroughly diagnose the composition representations encoded by VLMs, systematically revealing the potential cause for this weakness. Specifically, we propose evaluation methods from a novel game-theoretic view to assess the vulnerability of VLMs on different aspects of compositional understanding, e.g., relations and attributes. Extensive experimental results demonstrate and validate several insights to understand the incapabilities of VLMs on compositional reasoning, which provide useful and reliable guidance for future studies. The deliverables will be updated at //vlms-compositionality-gametheory.github.io/.

INFORMS · MoDELS · Approximation · Generative model · Data point ·

May 25, 2024

Approximations to the Fisher Information Metric of Deep Generative Models for Out-Of-Distribution Detection

Sam Dauncey, Chris Holmes, Christopher Williams, Fabian Falck

Likelihood-based deep generative models such as score-based diffusion models and variational autoencoders are state-of-the-art machine learning models approximating high-dimensional distributions of data such as images, text, or audio. One of many downstream tasks they can be naturally applied to is out-of-distribution (OOD) detection. However, seminal work by Nalisnick et al. which we reproduce showed that deep generative models consistently infer higher log-likelihoods for OOD data than data they were trained on, marking an open problem. In this work, we analyse using the gradient of a data point with respect to the parameters of the deep generative model for OOD detection, based on the simple intuition that OOD data should have larger gradient norms than training data. We formalise measuring the size of the gradient as approximating the Fisher information metric. We show that the Fisher information matrix (FIM) has large absolute diagonal values, motivating the use of chi-square distributed, layer-wise gradient norms as features. We combine these features to make a simple, model-agnostic and hyperparameter-free method for OOD detection which estimates the joint density of the layer-wise gradient norms for a given data point. We find that these layer-wise gradient norms are weakly correlated, rendering their combined usage informative, and prove that the layer-wise gradient norms satisfy the principle of (data representation) invariance. Our empirical results indicate that this method outperforms the Typicality test for most deep generative models and image dataset pairings.
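As a rough illustration of the intuition that OOD data should have larger gradient norms, the sketch below computes per-group squared gradient norms of a log-density with respect to its parameters. It is a toy stand-in, not the authors' method: the "model" is a diagonal Gaussian, its two parameter groups (`mean` and `log_std`) play the role of layers, and the function name `group_gradient_norms` is hypothetical.

```python
import math

def group_gradient_norms(x, mean, log_std):
    """Squared gradient norms of log N(x; mean, exp(log_std)^2), per parameter group.

    Toy assumption: the two parameter groups stand in for the layers of a
    deep generative model.
    """
    grad_mean = []
    grad_log_std = []
    for xi, mu, ls in zip(x, mean, log_std):
        std = math.exp(ls)
        z = (xi - mu) / std               # standardised residual
        grad_mean.append(z / std)         # d/d(mean) of the log-density
        grad_log_std.append(z * z - 1.0)  # d/d(log_std) of the log-density
    return (
        sum(g * g for g in grad_mean),
        sum(g * g for g in grad_log_std),
    )

# A point near the fitted mean versus a point far from it.
mean, log_std = [0.0, 0.0], [0.0, 0.0]
near = group_gradient_norms([0.1, -0.2], mean, log_std)
far = group_gradient_norms([5.0, 5.0], mean, log_std)
```

In this toy setting the far point has larger norms in both groups. In the abstract's setting, the analogous per-layer norms are the features whose joint density is estimated.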

Transformation · Comprehensibility · MoDELS · Layer · Approximation ·

May 24, 2024

Understanding the Expressive Power and Mechanisms of Transformer for Sequence Modeling

Mingze Wang, Weinan E

from arxiv, 70 pages

We conduct a systematic study of the approximation properties of Transformer for sequence modeling with long, sparse and complicated memory. We investigate the mechanisms through which different components of Transformer, such as the dot-product self-attention, positional encoding and feed-forward layer, affect its expressive power, and we study their combined effects through establishing explicit approximation rates. Our study reveals the roles of critical parameters in the Transformer, such as the number of layers and the number of attention heads. These theoretical insights are validated experimentally and offer natural suggestions for alternative architectures.

Biased · Markovian · Approximation · Noise · Lyapunov ·

May 23, 2024

Computing the Bias of Constant-step Stochastic Approximation with Markovian Noise

Sebastian Allmeier, Nicolas Gast

from arxiv, Preprint

We study stochastic approximation algorithms with Markovian noise and constant step-size $\alpha$. We develop a method based on infinitesimal generator comparisons to study the bias of the algorithm, which is the expected difference between $\theta_n$ -- the value at iteration $n$ -- and $\theta^*$ -- the unique equilibrium of the corresponding ODE. We show that, under some smoothness conditions, this bias is of order $O(\alpha)$. Furthermore, we show that the time-averaged bias is equal to $\alpha V + O(\alpha^2)$, where $V$ is a constant characterized by a Lyapunov equation, showing that $\mathbb{E}[\bar{\theta}_n] \approx \theta^* + \alpha V + O(\alpha^2)$, where $\bar{\theta}_n=(1/n)\sum_{k=1}^n\theta_k$ is the Polyak-Ruppert average. We also show that $\bar{\theta}_n$ converges with high probability around $\theta^*+\alpha V$. We illustrate how to combine this with Richardson-Romberg extrapolation to derive an iterative scheme with a bias of order $O(\alpha^2)$.
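The extrapolation step can be sketched in a few lines. If the averaged iterate at step size $\alpha$ is approximately $\theta^* + \alpha V + C\alpha^2$, then combining runs at $\alpha$ and $2\alpha$ as $2\bar{\theta}(\alpha) - \bar{\theta}(2\alpha)$ cancels the $\alpha V$ term and leaves $\theta^* - 2C\alpha^2$. Everything below is an illustrative assumption rather than the authors' scheme: the drift `x - theta - theta**2`, the two-state noise chain, and the function names are all hypothetical.

```python
import random

def averaged_iterate(alpha, n_steps=200_000, stay_prob=0.9, seed=0):
    """Polyak-Ruppert average of a toy constant-step scheme with Markovian noise."""
    rng = random.Random(seed)
    x = 0        # state of a two-state Markov chain on {0, 1}
    theta = 0.5  # iterate; stays in [0, 1] for alpha <= 0.5 with this drift
    total = 0.0
    for _ in range(n_steps):
        # Toy nonlinear drift f(theta, x) = x - theta - theta**2 (an assumption).
        theta += alpha * (x - theta - theta ** 2)
        total += theta
        # The noise keeps its state with probability stay_prob, else flips.
        if rng.random() > stay_prob:
            x = 1 - x
    return total / n_steps

def richardson_romberg(avg_alpha, avg_two_alpha):
    """Combine averages at step sizes alpha and 2*alpha to cancel the O(alpha) bias."""
    return 2.0 * avg_alpha - avg_two_alpha

# Usage: two runs with different seeds, then one extrapolated estimate.
alpha = 0.05
estimate = richardson_romberg(
    averaged_iterate(alpha, seed=1),
    averaged_iterate(2 * alpha, seed=2),
)
```

The extrapolated value is still random. The cancellation applies to the bias, not to the variance of the two averages.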

Perplexity · Language modeling · MoDELS · Performer · Lecture notes ·

May 22, 2024

Slaves to the Law of Large Numbers: An Asymptotic Equipartition Property for Perplexity in Generative Language Models

Raghu Mudumbai, Tyler Bell

We propose a new asymptotic equipartition property for the perplexity of a large piece of text generated by a language model and present theoretical arguments for this property. Perplexity, defined as an inverse likelihood function, is widely used as a performance metric for training language models. Our main result states that the logarithmic perplexity of any large text produced by a language model must asymptotically converge to the average entropy of its token distributions. This means that language models are constrained to only produce outputs from a "typical set", which, we show, is a vanishingly small subset of all possible grammatically correct outputs. We present preliminary experimental results from an open-source language model to support our theoretical claims. This work has possible practical applications for understanding and improving "AI detection" tools and theoretical implications for the uniqueness, predictability and creative potential of generative models.
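The convergence claim can be illustrated with a deliberately simplified toy source. The sketch assumes tokens are drawn independently from one fixed distribution, whereas a real language model has context-dependent token distributions and the abstract's result concerns the average of their entropies. The distribution `TOKEN_PROBS` and the function names are hypothetical.

```python
import math
import random

def entropy(probs):
    """Shannon entropy in nats of a discrete distribution."""
    return -sum(q * math.log(q) for q in probs if q > 0)

def log_perplexity(tokens, probs):
    """Average negative log-likelihood per token, i.e. the log of the perplexity."""
    return -sum(math.log(probs[t]) for t in tokens) / len(tokens)

def sample_tokens(probs, n, seed=0):
    """Draw n i.i.d. token ids from probs (toy stand-in for model generation)."""
    rng = random.Random(seed)
    return rng.choices(range(len(probs)), weights=probs, k=n)

TOKEN_PROBS = [0.5, 0.25, 0.125, 0.125]  # hypothetical 4-token vocabulary

tokens = sample_tokens(TOKEN_PROBS, 100_000)
gap = abs(log_perplexity(tokens, TOKEN_PROBS) - entropy(TOKEN_PROBS))
```

For long samples `gap` shrinks towards zero, which is the law-of-large-numbers behaviour the title refers to.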

Performer · Machine learning modeling · ML · MoDELS · Processing (programming language) ·

May 22, 2024

A Dynamic Model of Performative Human-ML Collaboration: Theory and Empirical Evidence

Tom Sühr, Samira Samadi, Chiara Farronato

from arxiv, 9 Pages and appendix

Machine learning (ML) models are increasingly used in various applications, from recommendation systems in e-commerce to diagnosis prediction in healthcare. In this paper, we present a novel dynamic framework for thinking about the deployment of ML models in a performative, human-ML collaborative system. In our framework, the introduction of ML recommendations changes the data generating process of human decisions, which are only a proxy to the ground truth and which are then used to train future versions of the model. We show that this dynamic process in principle can converge to different stable points, i.e. where the ML model and the Human+ML system have the same performance. Some of these stable points are suboptimal with respect to the actual ground truth. We conduct an empirical user study with 1,408 participants to showcase this process. In the study, humans solve instances of the knapsack problem with the help of machine learning predictions. This is an ideal setting because we can see how ML models learn to imitate human decisions and how this learning process converges to a stable point. We find that for many levels of ML performance, humans can improve the ML predictions to dynamically reach an equilibrium performance that is around 92% of the maximum knapsack value. We also find that the equilibrium performance could be even higher if humans rationally followed the ML recommendations. Finally, we test whether monetary incentives can increase the quality of human decisions, but we fail to find any positive effect. Our results have practical implications for the deployment of ML models in contexts where human decisions may deviate from the indisputable ground truth.

Knowledge · Language modeling · MoDELS · NLU · Learning ·

November 17, 2022

A Survey of Knowledge-Enhanced Pre-trained Language Models

Linmei Hu, Zeyi Liu, Ziwang Zhao, Lei Hou, Liqiang Nie, Juanzi Li

Pre-trained Language Models (PLMs), which are trained on large text corpora via self-supervised learning, have yielded promising performance on various tasks in Natural Language Processing (NLP). However, although PLMs with huge numbers of parameters can effectively retain rich knowledge learned from massive training text and benefit downstream tasks at the fine-tuning stage, they still have some limitations, such as poor reasoning ability due to the lack of external knowledge. Research has been dedicated to incorporating knowledge into PLMs to tackle these issues. In this paper, we present a comprehensive review of Knowledge-Enhanced Pre-trained Language Models (KE-PLMs) to provide a clear insight into this thriving field. We introduce appropriate taxonomies for Natural Language Understanding (NLU) and Natural Language Generation (NLG), respectively, to highlight these two main tasks of NLP. For NLU, we divide the types of knowledge into four categories: linguistic knowledge, text knowledge, knowledge graph (KG), and rule knowledge. The KE-PLMs for NLG are categorized into KG-based and retrieval-based methods. Finally, we point out some promising future directions of KE-PLMs.

INFORMS · Graph · Reducible · Knowledge graph · Identifiable ·

August 29, 2018

Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction

Yi Luan, Luheng He, Mari Ostendorf, Hannaneh Hajishirzi

We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks, and develop a unified framework called Scientific Information Extractor (SciIE) with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.

