五月丁香四月婷婷激情综合,日本高清区一区二区三区四区五区,成人A视频片在线观看免费,亚洲色无码专区在线观看精,色一欲一性一乱一区二区三区

Non-autoregressive (NAR) generation, which is first proposed in neural machine translation (NMT) to speed up inference, has attracted much attention in both machine learning and natural language processing communities. While NAR generation can significantly accelerate inference speed for machine translation, the speedup comes at the cost of sacrificed translation accuracy compared to its counterpart, autoregressive (AR) generation. In recent years, many new models and algorithms have been designed/proposed to bridge the accuracy gap between NAR generation and AR generation. In this paper, we conduct a systematic survey with comparisons and discussions of various non-autoregressive translation (NAT) models from different aspects. Specifically, we categorize the efforts of NAT into several groups, including data manipulation, modeling methods, training criterion, decoding algorithms, and the benefit from pre-trained models. Furthermore, we briefly review other applications of NAR models beyond machine translation, such as grammatical error correction, text summarization, text style transfer, dialogue, semantic parsing, automatic speech recognition, and so on. In addition, we also discuss potential directions for future exploration, including releasing the dependency of KD, reasonable training objectives, pre-training for NAR, and wider applications, etc. We hope this survey can help researchers capture the latest progress in NAR generation, inspire the design of advanced NAR models and algorithms, and enable industry practitioners to choose appropriate solutions for their applications. The web page of this survey is at \url{//github.com/LitterBrother-Xiao/Overview-of-Non-autoregressive-Applications}.

相關內容

Machine Translation

關注 209

機器翻譯（Machine Translation）涵蓋計算語言學和語言工程的所有分支，包含多語言方面。特色論文涵蓋理論，描述或計算方面的任何下列主題:雙語和多語語料庫的編寫和使用，計算機輔助語言教學，非羅馬字符集的計算含義，連接主義翻譯方法，對比語言學等。官網地址：

Analysis · 類別 · 情景 · Learning · Performer ·

2023 年 8 月 29 日

Rotation Augmented Distillation for Exemplar-Free Class Incremental Learning with Detailed Analysis

Xiuwei Chen,Xiaobin Chang

Class incremental learning (CIL) aims to recognize both the old and new classes along the increment tasks. Deep neural networks in CIL suffer from catastrophic forgetting and some approaches rely on saving exemplars from previous tasks, known as the exemplar-based setting, to alleviate this problem. On the contrary, this paper focuses on the Exemplar-Free setting with no old class sample preserved. Balancing the plasticity and stability in deep feature learning with only supervision from new classes is more challenging. Most existing Exemplar-Free CIL methods report the overall performance only and lack further analysis. In this work, different methods are examined with complementary metrics in greater detail. Moreover, we propose a simple CIL method, Rotation Augmented Distillation (RAD), which achieves one of the top-tier performances under the Exemplar-Free setting. Detailed analysis shows our RAD benefits from the superior balance between plasticity and stability. Finally, more challenging exemplar-free settings with fewer initial classes are undertaken for further demonstrations and comparisons among the state-of-the-art methods.

SAT · Continuity · SOTA · FFT · state-of-the-art ·

2023 年 8 月 29 日

Massively Parallel Continuous Local Search for Hybrid SAT Solving on GPUs

Yunuo Cen,Zhiwei Zhang,Xuanyao Fong

Although state-of-the-art (SOTA) SAT solvers based on conflict-driven clause learning (CDCL) have achieved remarkable engineering success, their sequential nature limits the parallelism that may be extracted for acceleration on platforms such as the graphics processing unit (GPU). In this work, we propose FastFourierSAT, a highly parallel hybrid SAT solver based on gradient-driven continuous local search (CLS). This is realized by a novel parallel algorithm inspired by the Fast Fourier Transform (FFT)-based convolution for computing the elementary symmetric polynomials (ESPs), which is the major computational task in previous CLS methods. The complexity of our algorithm matches the best previous result. Furthermore, the substantial parallelism inherent in our algorithm can leverage the GPU for acceleration, demonstrating significant improvement over the previous CLS approaches. We also propose to incorporate the restart heuristics in CLS to improve search efficiency. We compare our approach with the SOTA parallel SAT solvers on several benchmarks. Our results show that FastFourierSAT computes the gradient 100+ times faster than previous prototypes implemented on CPU. Moreover, FastFourierSAT solves most instances and demonstrates promising performance on larger-size instances.

INTERACT · 穩健性 · Learning · MoDELS · 遷移學習 ·

2023 年 8 月 28 日

Robust Activity Recognition for Adaptive Worker-Robot Interaction using Transfer Learning

Farid Shahnavaz,Riley Tavassoli,Reza Akhavian

from arxiv, 2023 ASCE International Conference on Computing in Civil Engineering (I3CE)

Human activity recognition (HAR) using machine learning has shown tremendous promise in detecting construction workers' activities. HAR has many applications in human-robot interaction research to enable robots' understanding of human counterparts' activities. However, many existing HAR approaches lack robustness, generalizability, and adaptability. This paper proposes a transfer learning methodology for activity recognition of construction workers that requires orders of magnitude less data and compute time for comparable or better classification accuracy. The developed algorithm transfers features from a model pre-trained by the original authors and fine-tunes them for the downstream task of activity recognition in construction. The model was pre-trained on Kinetics-400, a large-scale video-based human activity recognition dataset with 400 distinct classes. The model was fine-tuned and tested using videos captured from manual material handling (MMH) activities found on YouTube. Results indicate that the fine-tuned model can recognize distinct MMH tasks in a robust and adaptive manner which is crucial for the widespread deployment of collaborative robots in construction.

Machine Translation · 得分 · 數據集 · Learning · 設計 ·

2023 年 8 月 28 日

Training and Meta-Evaluating Machine Translation Evaluation Metrics at the Paragraph Level

Daniel Deutsch,Juraj Juraska,Mara Finkelstein,Markus Freitag

from arxiv, Removing extra "and" from author list

As research on machine translation moves to translating text beyond the sentence level, it remains unclear how effective automatic evaluation metrics are at scoring longer translations. In this work, we first propose a method for creating paragraph-level data for training and meta-evaluating metrics from existing sentence-level data. Then, we use these new datasets to benchmark existing sentence-level metrics as well as train learned metrics at the paragraph level. Interestingly, our experimental results demonstrate that using sentence-level metrics to score entire paragraphs is equally as effective as using a metric designed to work at the paragraph level. We speculate this result can be attributed to properties of the task of reference-based evaluation as well as limitations of our datasets with respect to capturing all types of phenomena that occur in paragraph-level translations.

語音合成 · Learning · 估計/估計量 · GANs · 對抗學習 ·

2023 年 8 月 28 日

Adversarial Learning of Intermediate Acoustic Feature for End-to-End Lightweight Text-to-Speech

Hyungchan Yoon,Seyun Um,Changwhan Kim,Hong-Goo Kang

from arxiv, INTERSPEECH 2023

To simplify the generation process, several text-to-speech (TTS) systems implicitly learn intermediate latent representations instead of relying on predefined features (e.g., mel-spectrogram). However, their generation quality is unsatisfactory as these representations lack speech variances. In this paper, we improve TTS performance by adding \emph{prosody embeddings} to the latent representations. During training, we extract reference prosody embeddings from mel-spectrograms, and during inference, we estimate these embeddings from text using generative adversarial networks (GANs). Using GANs, we reliably estimate the prosody embeddings in a fast way, which have complex distributions due to the dynamic nature of speech. We also show that the prosody embeddings work as efficient features for learning a robust alignment between text and acoustic features. Our proposed model surpasses several publicly available models with less parameters and computational complexity in comparative experiments.

穩健性 · Performer · MoDELS · 數據增強 · Learning ·

2023 年 8 月 28 日

Interpolation for Robust Learning: Data Augmentation on Wasserstein Geodesics

Jiacheng Zhu,Jielin Qiu,Aritra Guha,Zhuolin Yang,Xuanlong Nguyen,Bo Li,Ding Zhao

from arxiv, 34 pages, 3 figures, 18 tables

We propose to study and promote the robustness of a model as per its performance through the interpolation of training data distributions. Specifically, (1) we augment the data by finding the worst-case Wasserstein barycenter on the geodesic connecting subpopulation distributions of different categories. (2) We regularize the model for smoother performance on the continuous geodesic path connecting subpopulation distributions. (3) Additionally, we provide a theoretical guarantee of robustness improvement and investigate how the geodesic location and the sample size contribute, respectively. Experimental validations of the proposed strategy on \textit{four} datasets, including CIFAR-100 and ImageNet, establish the efficacy of our method, e.g., our method improves the baselines' certifiable robustness on CIFAR10 up to $7.7\%$, with $16.8\%$ on empirical robustness on CIFAR-100. Our work provides a new perspective of model robustness through the lens of Wasserstein geodesic-based interpolation with a practical off-the-shelf strategy that can be combined with existing robust training methods.

Integration · Processing（編程語言） · Performer · 代價 · FAST ·

2023 年 8 月 27 日

Integrated Variational Fourier Features for Fast Spatial Modelling with Gaussian Processes

Talay M Cheema,Carl Edward Rasmussen

Sparse variational approximations are popular methods for scaling up inference and learning in Gaussian processes to larger datasets. For $N$ training points, exact inference has $O(N^3)$ cost; with $M \ll N$ features, state of the art sparse variational methods have $O(NM^2)$ cost. Recently, methods have been proposed using more sophisticated features; these promise $O(M^3)$ cost, with good performance in low dimensional tasks such as spatial modelling, but they only work with a very limited class of kernels, excluding some of the most commonly used. In this work, we propose integrated Fourier features, which extends these performance benefits to a very broad class of stationary covariance functions. We motivate the method and choice of parameters from a convergence analysis and empirical exploration, and show practical speedup in synthetic and real world spatial regression tasks.

Tensor · 壓縮感知 · 近似 · 因子分解 · 流 ·

2023 年 8 月 25 日

Fast and Low-Memory Compressive Sensing Algorithms for Low Tucker-Rank Tensor Approximation from Streamed Measurements

Cullen Haselby,Mark A. Iwen,Deanna Needell,Elizaveta Rebrova,William Swartworth

from arxiv, 59 pages, 8 figures

In this paper we consider the problem of recovering a low-rank Tucker approximation to a massive tensor based solely on structured random compressive measurements. Crucially, the proposed random measurement ensembles are both designed to be compactly represented (i.e., low-memory), and can also be efficiently computed in one-pass over the tensor. Thus, the proposed compressive sensing approach may be used to produce a low-rank factorization of a huge tensor that is too large to store in memory with a total memory footprint on the order of the much smaller desired low-rank factorization. In addition, the compressive sensing recovery algorithm itself (which takes the compressive measurements as input, and then outputs a low-rank factorization) also runs in a time which principally depends only on the size of the sought factorization, making its runtime sub-linear in the size of the large tensor one is approximating. Finally, unlike prior works related to (streaming) algorithms for low-rank tensor approximation from such compressive measurements, we present a unified analysis of both Kronecker and Khatri-Rao structured measurement ensembles culminating in error guarantees comparing the error of our recovery algorithm's approximation of the input tensor to the best possible low-rank Tucker approximation error achievable for the tensor by any possible algorithm. We further include an empirical study of the proposed approach that verifies our theoretical findings and explores various trade-offs of parameters of interest.

秩 · Facebook AI Research · 優化器 · MoDELS · Better ·

2023 年 8 月 25 日

Optimizing Group-Fair Plackett-Luce Ranking Models for Relevance and Ex-Post Fairness

Sruthi Gorantla,Eshaan Bhansali,Amit Deshpande,Anand Louis

from arxiv, 20 pages

In learning-to-rank (LTR), optimizing only the relevance (or the expected ranking utility) can cause representational harm to certain categories of items. Moreover, if there is implicit bias in the relevance scores, LTR models may fail to optimize for true relevance. Previous works have proposed efficient algorithms to train stochastic ranking models that achieve fairness of exposure to the groups ex-ante (or, in expectation), which may not guarantee representation fairness to the groups ex-post, that is, after realizing a ranking from the stochastic ranking model. Typically, ex-post fairness is achieved by post-processing, but previous work does not train stochastic ranking models that are aware of this post-processing. In this paper, we propose a novel objective that maximizes expected relevance only over those rankings that satisfy given representation constraints to ensure ex-post fairness. Building upon recent work on an efficient sampler for ex-post group-fair rankings, we propose a group-fair Plackett-Luce model and show that it can be efficiently optimized for our objective in the LTR framework. Experiments on three real-world datasets show that our group-fair algorithm guarantees fairness alongside usually having better relevance compared to the LTR baselines. In addition, our algorithm also achieves better relevance than post-processing baselines, which also ensures ex-post fairness. Further, when implicit bias is injected into the training data, our algorithm typically outperforms existing LTR baselines in relevance.

entity · 小樣本學習 · 注意力機制 · 圖 · Networking ·

2020 年 10 月 19 日

Adaptive Attentional Network for Few-Shot Knowledge Graph Completion

Jiawei Sheng,Shu Guo,Zhenyu Chen,Juwei Yue,Lihong Wang,Tingwen Liu,Hongbo Xu

from arxiv, 11 pages, 3 figures

Few-shot Knowledge Graph (KG) completion is a focus of current research, where each task aims at querying unseen facts of a relation given its few-shot reference entity pairs. Recent attempts solve this problem by learning static representations of entities and references, ignoring their dynamic properties, i.e., entities may exhibit diverse roles within task relations, and references may make different contributions to queries. This work proposes an adaptive attentional network for few-shot KG completion by learning adaptive entity and reference representations. Specifically, entities are modeled by an adaptive neighbor encoder to discern their task-oriented roles, while references are modeled by an adaptive query-aware aggregator to differentiate their contributions. Through the attention mechanism, both entities and references can capture their fine-grained semantic meanings, and thus render more expressive representations. This will be more predictive for knowledge acquisition in the few-shot scenario. Evaluation in link prediction on two public datasets shows that our approach achieves new state-of-the-art results with different few-shot sizes.