We present an open-source toolkit for neural machine translation (NMT). The toolkit is built primarily on the Transformer architecture (Vaswani et al., 2017), along with many other improvements detailed below, to create a self-contained, simple-to-use, consistent, and comprehensive framework for machine translation tasks across various domains. It supports both bilingual and multilingual translation tasks, from building a model from the respective corpora, to running inference on new inputs, to packaging the model into a serving-capable JIT format.
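The last step above, exporting a trained model into a serving-capable JIT format, can be illustrated with a minimal PyTorch/TorchScript sketch; the tiny model below is a hypothetical stand-in, not the toolkit's actual API.

```python
# Minimal sketch of packaging a trained PyTorch model into a serving-capable
# TorchScript (JIT) artifact; the toolkit's real export API may differ.
import torch
import torch.nn as nn

class TinyTranslator(nn.Module):
    """Stand-in for a trained NMT model (hypothetical)."""
    def __init__(self, vocab_size=1000, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.proj = nn.Linear(dim, vocab_size)

    def forward(self, src_ids: torch.Tensor) -> torch.Tensor:
        return self.proj(self.embed(src_ids))

model = TinyTranslator().eval()
scripted = torch.jit.script(model)          # compile to TorchScript
scripted.save("translator_jit.pt")          # self-contained, serving-ready file
restored = torch.jit.load("translator_jit.pt")
print(restored(torch.tensor([[1, 2, 3]])).shape)
```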
Non-autoregressive (NAR) generation, which was first proposed in neural machine translation (NMT) to speed up inference, has attracted much attention in both the machine learning and natural language processing communities. While NAR generation can significantly accelerate inference for machine translation, the speedup comes at the cost of reduced translation accuracy compared to its autoregressive (AR) counterpart. In recent years, many new models and algorithms have been proposed to bridge the accuracy gap between NAR and AR generation. In this paper, we conduct a systematic survey with comparisons and discussions of various non-autoregressive translation (NAT) models from different aspects. Specifically, we categorize NAT efforts into several groups, including data manipulation, modeling methods, training criteria, decoding algorithms, and benefits from pre-trained models. Furthermore, we briefly review other applications of NAR models beyond machine translation, such as dialogue generation, text summarization, grammatical error correction, semantic parsing, speech synthesis, and automatic speech recognition. In addition, we discuss potential directions for future exploration, including relaxing the dependency on knowledge distillation (KD), dynamic length prediction, pre-training for NAR, and wider applications. We hope this survey can help researchers capture the latest progress in NAR generation, inspire the design of advanced NAR models and algorithms, and enable industry practitioners to choose appropriate solutions for their applications. The web page of this survey is at \url{//github.com/LitterBrother-Xiao/Overview-of-Non-autoregressive-Applications}.
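The core contrast between AR and NAR decoding can be shown with a toy sketch: AR decoding conditions each step on the tokens generated so far, while NAR decoding predicts all positions in a single parallel pass. The "model" below is a random scorer, purely illustrative.

```python
# Toy contrast between autoregressive (AR) and non-autoregressive (NAR)
# decoding; the scoring function is random and only illustrates the control flow.
import numpy as np

rng = np.random.default_rng(0)
vocab, length = 8, 5

def ar_decode():
    """AR: each step conditions on previously generated tokens (sequential)."""
    tokens = []
    for t in range(length):
        logits = rng.normal(size=vocab) + 0.1 * sum(tokens)  # depends on history
        tokens.append(int(np.argmax(logits)))
    return tokens

def nar_decode():
    """NAR: all positions are predicted in parallel, in a single pass."""
    logits = rng.normal(size=(length, vocab))                # no history dependence
    return np.argmax(logits, axis=-1).tolist()

print("AR :", ar_decode())
print("NAR:", nar_decode())
```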
Domain Adaptation (DA) of a Neural Machine Translation (NMT) model often relies on a pre-trained general NMT model that is adapted to the new domain on a sample of in-domain parallel data. Without parallel data, there is no way to estimate the potential benefit of DA, nor the number of parallel samples it would require. Such an estimate is, however, a desirable functionality that could help MT practitioners make an informed decision before investing resources in dataset creation. We propose a Domain adaptation Learning Curve prediction (DaLC) model that predicts prospective DA performance based on in-domain monolingual samples in the source language. Our model relies on the NMT encoder representations combined with various instance- and corpus-level features. We demonstrate that instance-level features are better able to distinguish between different domains than the corpus-level frameworks proposed in previous studies. Finally, we perform in-depth analyses of the results, highlighting the limitations of our approach, and provide directions for future research.
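A minimal sketch of what instance-level features derived from encoder representations might look like, assuming placeholder arrays stand in for real NMT encoder outputs; the specific features (mean-pooled encoding plus distance to a general-domain centroid) are illustrative choices, not the paper's exact feature set.

```python
# Sketch of instance-level features: each source sentence gets its own feature
# vector rather than a single corpus-level statistic.
import numpy as np

rng = np.random.default_rng(0)
dim = 16
general_domain_encodings = rng.normal(size=(100, dim))    # placeholder encoder outputs
centroid = general_domain_encodings.mean(axis=0)

def instance_features(sentence_encoding: np.ndarray) -> np.ndarray:
    """Per-sentence (instance-level) features, not corpus-level averages."""
    pooled = sentence_encoding.mean(axis=0)               # mean over tokens
    dist = np.linalg.norm(pooled - centroid)              # distance to general domain
    return np.concatenate([pooled, [dist]])

new_domain_sentence = rng.normal(size=(12, dim))          # 12 tokens, encoder dim 16
print(instance_features(new_domain_sentence).shape)       # (17,)
```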
This paper presents a summary of the findings we obtained in the shared task on machine translation of Dravidian languages. We placed first in three of the five sub-tasks assigned to us in the main shared task. We carried out neural machine translation for the following five language pairs: Kannada to Tamil, Kannada to Telugu, Kannada to Malayalam, Kannada to Sanskrit, and Kannada to Tulu. The datasets for each of the five language pairs were used to train various translation models, including sequence-to-sequence models such as LSTM, bidirectional LSTM, and Conv2Seq, as well as state-of-the-art Transformers trained from scratch and fine-tuned pre-trained models. For some models we also implemented back-translation using monolingual corpora. The models were then evaluated on a held-out portion of the same dataset using BLEU as the evaluation metric.
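BLEU evaluation of this kind is typically run with a standard library such as sacrebleu; a minimal sketch with placeholder sentences is shown below (the actual evaluation data and tokenization settings are not specified here).

```python
# Minimal sketch of corpus-level BLEU evaluation on a held-out split using
# sacrebleu (pip install sacrebleu); the sentences are placeholders.
import sacrebleu

hypotheses = ["the cat sat on the mat", "a dog barked loudly"]
# references is a list of reference sets: references[0][i] pairs with hypotheses[i]
references = [["the cat is sitting on the mat", "the dog barked loudly"]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")
```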
Despite the impressive results achieved by deep-learning-based 3D reconstruction, techniques for directly learning to model 4D human captures with detailed geometry have been less studied. This work presents a novel framework that can effectively learn a compact and compositional representation for dynamic humans by exploiting the human body prior from the widely used SMPL parametric model. In particular, our representation, named H4D, represents a dynamic 3D human over a temporal span with the SMPL parameters of shape and initial pose, together with latent codes encoding motion and auxiliary information. A simple yet effective linear motion model is proposed to provide a rough and regularized motion estimation, followed by per-frame compensation for pose and geometry details with the residual encoded in the auxiliary code. Technically, we introduce novel GRU-based architectures to facilitate learning and improve the representation capability. Extensive experiments demonstrate that our method not only effectively recovers dynamic humans with accurate motion and detailed geometry, but is also amenable to various 4D-human-related tasks, including motion retargeting, motion completion, and future prediction. Please check out the project page for video and code: //boyanjiang.github.io/H4D/.
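The general pattern of a coarse linear motion model plus learned per-frame residual compensation can be sketched as follows; the dimensions, module names, and wiring are illustrative assumptions, not the paper's actual H4D architecture.

```python
# Sketch: a linear motion model gives a coarse pose trajectory from a motion
# code, and a GRU predicts per-frame residual corrections on top of it.
import torch
import torch.nn as nn

pose_dim, motion_dim, frames = 72, 32, 8   # 72 = SMPL pose parameters

class CoarsePlusResidual(nn.Module):
    def __init__(self):
        super().__init__()
        self.velocity = nn.Linear(motion_dim, pose_dim)        # linear motion model
        self.gru = nn.GRU(motion_dim, pose_dim, batch_first=True)

    def forward(self, init_pose, motion_code):
        t = torch.arange(1, frames + 1, dtype=torch.float32).view(1, frames, 1)
        coarse = init_pose.unsqueeze(1) + t * self.velocity(motion_code).unsqueeze(1)
        residual, _ = self.gru(motion_code.unsqueeze(1).expand(-1, frames, -1))
        return coarse + residual                               # per-frame compensation

model = CoarsePlusResidual()
out = model(torch.zeros(2, pose_dim), torch.randn(2, motion_dim))
print(out.shape)  # (2, 8, 72)
```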
It is expensive to evaluate the results of Machine Translation (MT), which usually requires manual translation as a reference. Machine Translation Quality Estimation (QE) is the task of predicting the quality of machine translations without relying on any reference. Recently, the predictor-estimator framework, which trains a predictor as a feature extractor and an estimator as a QE predictor, together with pre-trained language models (PLMs), has achieved promising QE performance. However, we argue that there are still gaps between the predictor and the estimator in both data quality and training objectives, which prevent QE models from benefiting more directly from large parallel corpora. Building on previous work that has alleviated these gaps to some extent, we propose a novel framework that provides more accurate direct pretraining for QE tasks. In this framework, a generator is trained to produce pseudo data that is closer to real QE data, and an estimator is pretrained on these data with novel objectives that match the QE task. Experiments on widely used benchmarks show that our proposed framework outperforms existing methods, without using any pre-trained models such as BERT.
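The idea of turning parallel text into pseudo QE data can be sketched very simply: corrupt the reference translation to imitate MT output and derive a pseudo quality label from the amount of corruption. The corruption scheme and label below are crude illustrative stand-ins, not the trained generator described in the paper.

```python
# Sketch of pseudo QE data generation from parallel text: noisy "MT output"
# plus a rough HTER-like score based on the fraction of corrupted tokens.
import random

random.seed(0)

def make_pseudo_qe_example(source: str, reference: str, noise: float = 0.3):
    tokens = reference.split()
    corrupted, edits = [], 0
    for tok in tokens:
        if random.random() < noise:
            corrupted.append(random.choice(["<unk>", tok[::-1]]))  # crude corruption
            edits += 1
        else:
            corrupted.append(tok)
    pseudo_mt = " ".join(corrupted)
    pseudo_score = edits / max(len(tokens), 1)       # 0 = perfect, 1 = fully corrupted
    return source, pseudo_mt, pseudo_score

print(make_pseudo_qe_example("der hund bellt", "the dog barks"))
```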
Embedding matrices are key components in neural natural language processing (NLP) models, responsible for providing numerical representations of input tokens.\footnote{In this paper words and subwords are referred to as \textit{tokens} and the term \textit{embedding} only refers to embeddings of inputs.} In this paper, we analyze the impact and utility of such matrices in the context of neural machine translation (NMT). We show that removing syntactic and semantic information from word embeddings and running NMT systems with random embeddings is not as damaging as it initially sounds. We also show how incorporating only a limited amount of task-specific knowledge from fully-trained embeddings can boost the performance of NMT systems. Our findings demonstrate that, in exchange for negligible deterioration in performance, any NMT model can be run with partially random embeddings. Working with such structures means a minimal memory requirement, as there is no longer a need to store large embedding tables, which is a significant gain in industrial and on-device settings. We evaluated our embeddings in translating {English} into {German} and {French} and achieved a $5.3$x compression rate. Despite having a considerably smaller architecture, our models in some cases are even able to outperform state-of-the-art baselines.
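One way to picture "partially random" embeddings is sketched below: most of each token vector comes from a frozen random table that can be regenerated from a seed (and so never needs to be stored), while only a small trainable low-dimensional correction per token is kept. This construction is an illustrative assumption, not the paper's exact method.

```python
# Illustrative sketch of partially random embeddings with a small stored
# trainable component and a regenerable frozen random component.
import torch
import torch.nn as nn

class PartiallyRandomEmbedding(nn.Module):
    def __init__(self, vocab_size=32000, dim=512, trained_dim=32, seed=0):
        super().__init__()
        g = torch.Generator().manual_seed(seed)
        random_part = torch.randn(vocab_size, dim, generator=g)
        # persistent=False: excluded from the checkpoint, regenerated from the seed
        self.register_buffer("random_part", random_part, persistent=False)
        self.small = nn.Embedding(vocab_size, trained_dim)     # stored parameters
        self.up = nn.Linear(trained_dim, dim, bias=False)       # shared projection

    def forward(self, token_ids):
        return self.random_part[token_ids] + self.up(self.small(token_ids))

emb = PartiallyRandomEmbedding()
print(emb(torch.tensor([[1, 5, 7]])).shape)  # (1, 3, 512)
```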
We consider two problems of NMT domain adaptation using meta-learning. First, we want domain robustness, i.e., high quality on both the domains seen in the training data and unseen domains. Second, we want our systems to be adaptive, i.e., fine-tunable with just hundreds of in-domain parallel sentences. We study the domain adaptability of meta-learning when improving the domain robustness of the model. In this paper, we propose a novel approach, RMLNMT (Robust Meta-Learning Framework for Neural Machine Translation Domain Adaptation), which improves the robustness of existing meta-learning models. More specifically, we show how to use a domain classifier in curriculum learning, and we integrate the word-level domain-mixing model into the meta-learning framework with a balanced sampling strategy. Experiments on English$\rightarrow$German and English$\rightarrow$Chinese translation show that RMLNMT improves both domain robustness and domain adaptability in seen and unseen domains.
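The balanced sampling strategy can be illustrated with a small sketch: each meta-learning episode draws the same number of sentence pairs from every domain so that no single domain dominates. The corpora and episode size below are hypothetical placeholders.

```python
# Sketch of balanced sampling for meta-learning episodes across domains.
import random

random.seed(0)
corpora = {
    "news":    [("src_n%d" % i, "tgt_n%d" % i) for i in range(100)],
    "medical": [("src_m%d" % i, "tgt_m%d" % i) for i in range(20)],
    "it":      [("src_i%d" % i, "tgt_i%d" % i) for i in range(50)],
}

def balanced_episode(corpora, per_domain=4):
    """Draw the same number of pairs from every domain for one episode."""
    episode = []
    for domain, pairs in corpora.items():
        episode.extend((domain, pair) for pair in random.sample(pairs, per_domain))
    random.shuffle(episode)
    return episode

print(len(balanced_episode(corpora)))  # 12 examples, 4 from each domain
```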
In data-rich domains such as vision, language, and speech, deep learning prevails, delivering high-performance task-specific models and even learning general task-agnostic representations for efficient finetuning to downstream tasks. However, deep learning in resource-limited domains still faces challenges including (i) limited data, (ii) constrained model development cost, and (iii) a lack of adequate pre-trained models for effective finetuning. This paper introduces a technique called model reprogramming to bridge this gap. Model reprogramming enables resource-efficient cross-domain machine learning by repurposing and reusing a well-developed pre-trained model from a source domain to solve tasks in a target domain without model finetuning, where the source and target domains can be vastly different. In many applications, model reprogramming outperforms transfer learning and training from scratch. This paper elucidates the methodology of model reprogramming, summarizes existing use cases, provides a theoretical explanation of the success of model reprogramming, and concludes with a discussion of open-ended research questions and opportunities. A list of model reprogramming studies is actively maintained and updated at //github.com/IBM/model-reprogramming.
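A minimal sketch of the reprogramming pattern: the source model stays frozen, and only an input transformation plus a fixed source-to-target label mapping are introduced. The frozen "model", dimensions, and mapping below are illustrative stand-ins, not a specific published setup.

```python
# Minimal sketch of model reprogramming: trainable input perturbation +
# fixed output label mapping around a frozen pre-trained model.
import torch
import torch.nn as nn

frozen_source_model = nn.Linear(64, 10)            # pretend pre-trained classifier
for p in frozen_source_model.parameters():
    p.requires_grad = False

class Reprogrammer(nn.Module):
    def __init__(self, source_classes=10, target_classes=2):
        super().__init__()
        self.delta = nn.Parameter(torch.zeros(64))             # trainable input perturbation
        # many-to-one label mapping: source classes 0-4 -> target 0, 5-9 -> target 1
        mapping = torch.zeros(source_classes, target_classes)
        mapping[:5, 0] = 1.0
        mapping[5:, 1] = 1.0
        self.register_buffer("label_map", mapping)

    def forward(self, x):
        source_logits = frozen_source_model(x + self.delta)    # reprogrammed input
        return source_logits.softmax(-1) @ self.label_map      # target-domain probs

rep = Reprogrammer()
print(rep(torch.randn(3, 64)).shape)  # (3, 2)
```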
Machine learning plays a role in many deployed decision systems, often in ways that are difficult or impossible for human stakeholders to understand. Explaining, in a human-understandable way, the relationship between the input and output of machine learning models is essential to the development of trustworthy machine-learning-based systems. A burgeoning body of research seeks to define the goals and methods of explainability in machine learning. In this paper, we review and categorize research on counterfactual explanations, a specific class of explanation that describes how a model's output would have differed had its input been changed in a particular way. Modern approaches to counterfactual explainability in machine learning draw connections to established legal doctrine in many countries, making them appealing for fielded systems in high-impact areas such as finance and healthcare. We therefore design a rubric with desirable properties of counterfactual explanation algorithms and comprehensively evaluate all currently proposed algorithms against it. Our rubric provides easy comparison and comprehension of the advantages and disadvantages of different approaches and serves as an introduction to major research themes in this field. We also identify gaps and discuss promising research directions in the space of counterfactual explainability.
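A toy sketch of one common family of counterfactual methods, gradient-based search in the spirit of Wachter et al.: find an input close to the original that the model assigns to a different class. The model, data, and loss weights are placeholders, not any specific surveyed algorithm.

```python
# Sketch of a gradient-based counterfactual search on a toy classifier:
# minimize target-class loss plus an L1 penalty for staying near the original.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))

def counterfactual(x, target_class=1, steps=200, lam=0.1, lr=0.05):
    cf = x.clone().requires_grad_(True)
    opt = torch.optim.Adam([cf], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.cross_entropy(model(cf).unsqueeze(0),
                                           torch.tensor([target_class]))
        loss = loss + lam * torch.norm(cf - x, p=1)   # stay close to the original
        loss.backward()
        opt.step()
    return cf.detach()

x = torch.randn(4)
cf = counterfactual(x)
print("original class:", model(x).argmax().item(),
      "| counterfactual class:", model(cf).argmax().item())
```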
Neural machine translation (NMT) is a deep-learning-based approach to machine translation that yields state-of-the-art translation performance when large-scale parallel corpora are available. Although high-quality, domain-specific translation is crucial in the real world, domain-specific corpora are usually scarce or nonexistent, and vanilla NMT therefore performs poorly in such scenarios. Domain adaptation, which leverages both out-of-domain parallel corpora and monolingual corpora for in-domain translation, is thus very important for domain-specific translation. In this paper, we give a comprehensive survey of the state-of-the-art domain adaptation techniques for NMT.
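The most common adaptation recipe covered by such surveys is continued training ("fine-tuning") of a general-domain model on a small in-domain set with a reduced learning rate; the sketch below uses a trivial stand-in model and synthetic data purely to show the training-loop pattern, not a real NMT pipeline.

```python
# Sketch of continued training on in-domain data; model and data are tiny placeholders.
import torch
import torch.nn as nn

general_model = nn.Linear(8, 8)                      # stand-in for a trained NMT model
# ... assume general_model was trained on large out-of-domain parallel corpora ...

in_domain_pairs = [(torch.randn(8), torch.randn(8)) for _ in range(16)]
opt = torch.optim.SGD(general_model.parameters(), lr=1e-3)   # small LR for adaptation

for epoch in range(3):
    for src, tgt in in_domain_pairs:
        opt.zero_grad()
        loss = nn.functional.mse_loss(general_model(src), tgt)
        loss.backward()
        opt.step()
print("adapted on", len(in_domain_pairs), "in-domain examples")
```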