Conformers have recently been proposed as a promising modelling approach for automatic speech recognition (ASR), outperforming recurrent neural network-based approaches and transformers. Nevertheless, the performance of these end-to-end models, especially attention-based models, degrades markedly on long utterances. To address this limitation, we propose adding a fully differentiable memory-augmented neural network between the encoder and decoder of a conformer. This external memory can improve generalization to longer utterances, since it allows the system to store and retrieve more information recurrently. Specifically, we explore the neural Turing machine (NTM), which results in our proposed Conformer-NTM model architecture for ASR. Experimental results using the Librispeech train-clean-100 and train-960 sets show that the proposed system outperforms the baseline conformer without memory for long utterances.
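As a rough illustration of how an NTM-style external memory is read, the sketch below implements content-based addressing: a key is compared to every memory row by cosine similarity, the similarities are sharpened by a key strength and normalized with a softmax, and the read vector is the weighted sum of the rows. This is a minimal sketch of the general NTM read mechanism, not the Conformer-NTM implementation; the names `content_read`, `memory`, `key` and `beta` and the toy values are assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)  # small epsilon avoids division by zero

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def content_read(memory, key, beta):
    """Content-based read: returns (addressing weights, read vector).

    memory: list of rows (each a list of floats), hypothetical toy memory
    key:    query vector emitted by a controller (assumed given here)
    beta:   key strength; larger values make the addressing sharper
    """
    weights = softmax([beta * cosine(row, key) for row in memory])
    width = len(memory[0])
    read = [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)]
    return weights, read

# Toy example: three memory slots of width 2.
memory = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
weights, read = content_read(memory, key=[1.0, 0.1], beta=5.0)
```

Because every step is differentiable, such a read can be trained jointly with the rest of the network, which is what "fully differentiable" refers to in the abstract above.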

Related content

Conformer

SSL · feature space · regularization term · MoDELS · Learning ·

November 7, 2023

Feature Space Renormalization for Semi-supervised Learning

Jun Sun,Zhongjie Mao,Chao Li,Chao Zhou,Xiao-Jun Wu

from arxiv, Version 1

Semi-supervised learning (SSL) has been proven to be a powerful method for leveraging unlabelled data to alleviate models' dependence on large labelled datasets. The common framework among recent approaches is to train the model on a large amount of unlabelled data with consistency regularization to constrain the model predictions to be invariant to input perturbation. However, the existing SSL frameworks still have room for improvement in the consistency regularization method. Instead of regularizing category predictions in the label space as in existing frameworks, this paper proposes a feature space renormalization (FSR) mechanism for SSL. First, we propose a feature space renormalization mechanism to substitute for the commonly used consistency regularization mechanism to learn better discriminative features. To apply this mechanism, we start by building a basic model and an empirical model and then introduce our mechanism to renormalize the feature learning of the basic model with the guidance of the empirical model. Second, we combine the proposed mechanism with pseudo-labelling to obtain a novel effective SSL model named FreMatch. The experimental results show that our method can achieve better performance on a variety of standard SSL benchmark datasets, and the proposed feature space renormalization mechanism can also enhance the performance of other SSL approaches.
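For readers unfamiliar with the two ingredients named above, the following is a minimal sketch of label-space consistency regularization and of confidence-thresholded pseudo-labelling. It illustrates the commonly used mechanisms that the paper builds on or replaces, not the FSR mechanism or FreMatch itself; the function names and the threshold of 0.95 are assumptions chosen for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def consistency_loss(logits_a, logits_b):
    """Mean squared difference between the class distributions predicted
    for two perturbed views of the same unlabelled input."""
    p, q = softmax(logits_a), softmax(logits_b)
    return sum((x - y) ** 2 for x, y in zip(p, q)) / len(p)

def pseudo_label_loss(logits_weak, logits_strong, threshold=0.95):
    """Cross-entropy of the strongly perturbed view against the hard label
    predicted on the weakly perturbed view, used only when that prediction
    is confident enough (threshold is a hypothetical value)."""
    p = softmax(logits_weak)
    confidence = max(p)
    if confidence < threshold:
        return 0.0  # low-confidence samples contribute nothing
    label = p.index(confidence)
    q = softmax(logits_strong)
    return -math.log(q[label])

# Toy logits for one unlabelled sample under two perturbations.
loss_consistency = consistency_loss([2.0, 0.5, -1.0], [1.5, 0.7, -0.8])
loss_pseudo = pseudo_label_loss([6.0, 0.0, 0.0], [2.0, 1.0, 0.0])
```

Both losses act on class predictions, that is, in the label space; the abstract proposes to move the regularization to the feature space instead.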

speech translation · Learning · end-to-end · SOTA · state-of-the-art ·

November 7, 2023

Rethinking and Improving Multi-task Learning for End-to-end Speech Translation

Yuhao Zhang,Chen Xu,Bei Li,Hao Chen,Tong Xiao,Chunliang Zhang,Jingbo Zhu

from arxiv, Accepted to EMNLP2023 main conference

Significant improvements in end-to-end speech translation (ST) have been achieved through the application of multi-task learning. However, the extent to which auxiliary tasks are consistent with the ST task, and how much this approach truly helps, have not been thoroughly studied. In this paper, we investigate the consistency between different tasks, considering different times and modules. We find that the textual encoder primarily facilitates cross-modal conversion, but the presence of noise in speech impedes the consistency between text and speech representations. Furthermore, we propose an improved multi-task learning (IMTL) approach for the ST task, which bridges the modal gap by mitigating the difference in length and representation. We conduct experiments on the MuST-C dataset. The results demonstrate that our method attains state-of-the-art results. Moreover, when additional data is used, we achieve a new SOTA result on the MuST-C English-to-Spanish task with 20.8% of the training time required by the current SOTA method.

learner · Performer · ensemble · CASES · Guidance ·

November 6, 2023

Practical considerations for variable screening in the Super Learner

Brian D. Williamson,Drew King,Ying Huang

from arxiv, 14 pages, 4 figures, 1 table

Estimating a prediction function is a fundamental component of many data analyses. The Super Learner ensemble, a particular implementation of stacking, has desirable theoretical properties and has been used successfully in many applications. Dimension reduction can be accomplished by using variable screening algorithms, including the lasso, within the ensemble prior to fitting other prediction algorithms. However, the performance of a Super Learner using the lasso for dimension reduction has not been fully explored in cases where the lasso is known to perform poorly. We provide empirical results that suggest that a diverse set of candidate screening algorithms should be used to protect against poor performance of any one screen, similar to the guidance for choosing a library of prediction algorithms for the Super Learner.

Wyner-Ziv · layer · Learning · code · decoding ·

November 6, 2023

Learned layered coding for Successive Refinement in the Wyner-Ziv Problem

Boris Joukovsky,Brent De Weerdt,Nikos Deligiannis

from arxiv, 5 pages, submitted to ICASSP 2024

We propose a data-driven approach to explicitly learn the progressive encoding of a continuous source, which is successively decoded with increasing levels of quality and with the aid of correlated side information. This setup refers to the successive refinement of the Wyner-Ziv coding problem. Assuming ideal Slepian-Wolf coding, our approach employs recurrent neural networks (RNNs) to learn layered encoders and decoders for the quadratic Gaussian case. The models are trained by minimizing a variational bound on the rate-distortion function of the successively refined Wyner-Ziv coding problem. We demonstrate that RNNs can explicitly retrieve layered binning solutions akin to scalable nested quantization. Moreover, the rate-distortion performance of the scheme is on par with the corresponding monolithic Wyner-Ziv coding approach and is close to the rate-distortion bound.
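As background for the rate-distortion bound mentioned above, the sketch below evaluates the classical Wyner-Ziv rate-distortion function for the quadratic Gaussian case, R(D) = max(0, 0.5 * log2(var(X|Y) / D)), at a sequence of decreasing distortion targets such as successive refinement layers might aim for. It assumes jointly Gaussian source and side information with correlation coefficient `rho`; the function names and the numeric values are illustrative assumptions, not taken from the paper.

```python
import math

def conditional_variance(var_x, rho):
    """Variance of X given Y for jointly Gaussian (X, Y) with correlation rho."""
    return var_x * (1.0 - rho ** 2)

def wyner_ziv_rate(distortion, var_x, rho):
    """Quadratic Gaussian Wyner-Ziv rate in bits per sample.

    Returns 0 when the side information alone already meets the
    mean-squared-error target.
    """
    var_cond = conditional_variance(var_x, rho)
    if distortion >= var_cond:
        return 0.0
    return 0.5 * math.log2(var_cond / distortion)

# Hypothetical targets for three refinement layers, each halving the distortion.
targets = (0.1, 0.05, 0.025)
rates = [wyner_ziv_rate(d, var_x=1.0, rho=0.9) for d in targets]
```

In this setting each halving of the distortion target costs an additional half bit per sample, which gives a simple reference curve against which a learned layered scheme can be compared.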

approximation · mutually independent · inference · approximation error · Automator ·

November 5, 2023

Independent finite approximations for Bayesian nonparametric inference

Tin D. Nguyen,Jonathan Huggins,Lorenzo Masoero,Lester Mackey,Tamara Broderick

from arxiv, The paper has been accepted for publication in Bayesian Analysis. Currently, it is posted on Bayesian Analysis Advance Publication

Completely random measures (CRMs) and their normalizations (NCRMs) offer flexible models in Bayesian nonparametrics. But their infinite dimensionality presents challenges for inference. Two popular finite approximations are truncated finite approximations (TFAs) and independent finite approximations (IFAs). While the former have been well-studied, IFAs lack similarly general bounds on approximation error, and there has been no systematic comparison between the two options. In the present work, we propose a general recipe to construct practical finite-dimensional approximations for homogeneous CRMs and NCRMs, in the presence or absence of power laws. We call our construction the automated independent finite approximation (AIFA). Relative to TFAs, we show that AIFAs facilitate more straightforward derivations and use of parallel computing in approximate inference. We upper bound the approximation error of AIFAs for a wide class of common CRMs and NCRMs -- and thereby develop guidelines for choosing the approximation level. Our lower bounds in key cases suggest that our upper bounds are tight. We prove that, for worst-case choices of observation likelihoods, TFAs are more efficient than AIFAs. Conversely, we find that in real-data experiments with standard likelihoods, AIFAs and TFAs perform similarly. Moreover, we demonstrate that AIFAs can be used for hyperparameter estimation even when other potential IFA options struggle or do not apply.

language modeling · MoDELS · entity · training data · Extensibility ·

November 5, 2023

Quantifying and Analyzing Entity-level Memorization in Large Language Models

Zhenhong Zhou,Jiuyang Xiang,Chaomeng Chen,Sen Su

from arxiv, 9 pages, 7 figures

Large language models (LLMs) have been proven capable of memorizing their training data, which can be extracted through specifically designed prompts. As the scale of datasets continues to grow, privacy risks arising from memorization have attracted increasing attention. Quantifying language model memorization helps evaluate potential privacy risks. However, prior works on quantifying memorization require access to the precise original data or incur substantial computational overhead, making them difficult to apply to real-world language models. To this end, we propose a fine-grained, entity-level definition to quantify memorization with conditions and metrics closer to real-world scenarios. In addition, we present an approach for efficiently extracting sensitive entities from autoregressive language models. We conduct extensive experiments based on the proposed definition, probing language models' ability to reconstruct sensitive entities under different settings. We find that language models have strong memorization at the entity level and are able to reproduce the training data even with partial leakages. The results demonstrate that LLMs not only memorize their training data but also understand associations between entities. These findings necessitate that trainers of LLMs exercise greater prudence regarding model memorization, adopting memorization mitigation techniques to preclude privacy violations.

graph · functional · lecture notes · paper · discrete mathematics ·

November 3, 2023

Functionality of box intersection graphs

Clément Dallard,Vadim Lozin,Martin Milanič,Kenny Štorgel,Viktor Zamaraev

from arxiv, 11 pages

Functionality is a graph complexity measure that extends a variety of parameters, such as vertex degree, degeneracy, clique-width, or twin-width. In the present paper, we show that functionality is bounded for box intersection graphs in $\mathbb{R}^1$, i.e. for interval graphs, and unbounded for box intersection graphs in $\mathbb{R}^3$. We also study a parameter known as symmetric difference, which is intermediate between twin-width and functionality, and show that this parameter is unbounded both for interval graphs and for unit box intersection graphs in $\mathbb{R}^2$.

MoDELS · Extensibility · Pivotal (company) · operations · generalization theory ·

November 3, 2023

Quantum circuit synthesis with diffusion models

Florian Fürrutter,Gorka Muñoz-Gil,Hans J. Briegel

from arxiv, Main Text: 6 pages and 4 figures; Appendix: 6 pages, 4 figures and 3 tables. Code available at: //github.com/FlorianFuerrutter/genQC

Quantum computing has recently emerged as a transformative technology. Yet, its promised advantages rely on efficiently translating quantum operations into viable physical realizations. In this work, we use generative machine learning models, specifically denoising diffusion models (DMs), to facilitate this transformation. Leveraging text-conditioning, we steer the model to produce desired quantum operations within gate-based quantum circuits. Notably, DMs make it possible to sidestep, during training, the exponential overhead inherent in the classical simulation of quantum dynamics -- a consistent bottleneck in preceding ML techniques. We demonstrate the model's capabilities across two tasks: entanglement generation and unitary compilation. The model excels at generating new circuits and supports typical DM extensions such as masking and editing to, for instance, align the circuit generation to the constraints of the targeted quantum device. Given their flexibility and generalization abilities, we envision DMs as pivotal in quantum circuit synthesis, enhancing both practical applications and insights into theoretical quantum computation.

latent · identifiable · latent variable · Learning · graph ·

November 3, 2023

Learning nonparametric latent causal graphs with unknown interventions

Yibo Jiang,Bryon Aragam

from arxiv, To appear at NeurIPS 2023

We establish conditions under which latent causal graphs are nonparametrically identifiable and can be reconstructed from unknown interventions in the latent space. Our primary focus is the identification of the latent structure in measurement models without parametric assumptions such as linearity or Gaussianity. Moreover, we do not assume the number of hidden variables is known, and we show that at most one unknown intervention per hidden variable is needed. This extends a recent line of work on learning causal representations from observations and interventions. The proofs are constructive and introduce two new graphical concepts -- imaginary subsets and isolated edges -- that may be useful in their own right. As a matter of independent interest, the proofs also involve a novel characterization of the limits of edge orientations within the equivalence class of DAGs induced by unknown interventions. These are the first results to characterize the conditions under which causal representations are identifiable without making any parametric assumptions in a general setting with unknown interventions and without faithfulness.

entity · annotation · deductive reasoning · Networking · Performer ·

September 13, 2021

Fine-grained Entity Typing via Label Reasoning

Qing Liu,Hongyu Lin,Xinyan Xiao,Xianpei Han,Le Sun,Hua Wu

from arxiv, Accepted to the main conference of EMNLP2021

Conventional entity typing approaches are based on independent classification paradigms, which makes it difficult for them to recognize inter-dependent, long-tailed and fine-grained entity types. In this paper, we argue that the implicitly entailed extrinsic and intrinsic dependencies between labels can provide critical knowledge to tackle the above challenges. To this end, we propose the Label Reasoning Network (LRN), which sequentially reasons fine-grained entity labels by discovering and exploiting the label dependency knowledge entailed in the data. Specifically, LRN utilizes an auto-regressive network to conduct deductive reasoning and a bipartite attribute graph to conduct inductive reasoning between labels, which can effectively model, learn and reason about complex label dependencies in a sequence-to-set, end-to-end manner. Experiments show that LRN achieves state-of-the-art performance on standard ultra fine-grained entity typing benchmarks, and can also resolve the long-tail label problem effectively.