
Language modeling · CLUE · MoDELS · Performer · NLU

August 2, 2021

LICHEE: Improving Language Model Pre-training with Multi-grained Tokenization

Weidong Guo,Mingjun Zhao,Lusheng Zhang,Di Niu,Jinwen Luo,Zhenhua Liu,Zhenyang Li,Jianbo Tang

from arxiv, Accepted by ACL Findings 2021

Language model pre-training based on large corpora has achieved tremendous success in terms of constructing enriched contextual representations and has led to significant performance gains on a diverse range of Natural Language Understanding (NLU) tasks. Despite the success, most current pre-trained language models, such as BERT, are trained based on single-grained tokenization, usually with fine-grained characters or sub-words, making it hard for them to learn the precise meaning of coarse-grained words and phrases. In this paper, we propose a simple yet effective pre-training method named LICHEE to efficiently incorporate multi-grained information of input text. Our method can be applied to various pre-trained language models and improve their representation capability. Extensive experiments conducted on CLUE and SuperGLUE demonstrate that our method achieves comprehensive improvements on a wide variety of NLU tasks in both Chinese and English with little extra inference cost incurred, and that our best ensemble model achieves the state-of-the-art performance on CLUE benchmark competition.

Related content

Language modeling

Masked language modeling · Language modeling · MoDELS · Masking · state-of-the-art

September 30, 2021

SlovakBERT: Slovak Masked Language Model

Matúš Pikuliak,Štefan Grivalsky,Martin Konôpka,Miroslav Blšták,Martin Tamajka,Viktor Bachraty,Marián Šimko,Pavol Balážik,Michal Trnka,Filip Uhlárik

from arxiv, 22 pages, 2 figures

We introduce a new Slovak masked language model called SlovakBERT in this paper. It is the first Slovak-only transformers-based model trained on a sizeable corpus. We evaluate the model on several NLP tasks and achieve state-of-the-art results. We publish the masked language model, as well as the subsequently fine-tuned models for part-of-speech tagging, sentiment analysis and semantic textual similarity.

Unimodal · Understandability · Performer · Better · MoDELS

September 28, 2021

VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding

Hu Xu,Gargi Ghosh,Po-Yao Huang,Prahal Arora,Masoumeh Aminzadeh,Christoph Feichtenhofer,Florian Metze,Luke Zettlemoyer

from arxiv, 9 pages, ACL Findings 2021

We present a simplified, task-agnostic multi-modal pre-training approach that can accept either video or text input, or both, for a variety of end tasks. Existing pre-training approaches are task-specific: they adopt either a single cross-modal encoder that requires both modalities, limiting their use for retrieval-style end tasks, or more complex multitask learning with two unimodal encoders, limiting early cross-modal fusion. We instead introduce new pretraining masking schemes that better mix across modalities (e.g. by forcing masks for text to predict the closest video embeddings) while also maintaining separability (e.g. unimodal predictions are sometimes required, without using all the input). Experimental results show strong performance across a wider range of tasks than any previous methods, often outperforming task-specific pre-training. Code is made available at //github.com/pytorch/fairseq/examples/MMPT.

Language modeling · MoDELS · Masking · Tokenizer · Masked language modeling

June 11, 2021

Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment

Zewen Chi,Li Dong,Bo Zheng,Shaohan Huang,Xian-Ling Mao,Heyan Huang,Furu Wei

from arxiv, ACL-2021

The cross-lingual language models are typically pretrained with masked language modeling on multilingual text or parallel sentences. In this paper, we introduce denoising word alignment as a new cross-lingual pre-training task. Specifically, the model first self-labels word alignments for parallel sentences. Then we randomly mask tokens in a bitext pair. Given a masked token, the model uses a pointer network to predict the aligned token in the other language. We alternately perform the above two steps in an expectation-maximization manner. Experimental results show that our method improves cross-lingual transferability on various datasets, especially on the token-level tasks, such as question answering, and structured prediction. Moreover, the model can serve as a pretrained word aligner, which achieves reasonably low error rates on the alignment benchmarks. The code and pretrained parameters are available at //github.com/CZWin32768/XLM-Align.

Language modeling · MoDELS · IR · Likelihood · Masked language modeling

October 20, 2020

PROP: Pre-training with Representative Words Prediction for Ad-hoc Retrieval

Xinyu Ma,Jiafeng Guo,Ruqing Zhang,Yixing Fan,Xiang Ji,Xueqi Cheng

from arxiv, Accepted by WSDM2021

Recently pre-trained language representation models such as BERT have shown great success when fine-tuned on downstream tasks including information retrieval (IR). However, pre-training objectives tailored for ad-hoc retrieval have not been well explored. In this paper, we propose Pre-training with Representative wOrds Prediction (PROP) for ad-hoc retrieval. PROP is inspired by the classical statistical language model for IR, specifically the query likelihood model, which assumes that the query is generated as the piece of text representative of the "ideal" document. Based on this idea, we construct the representative words prediction (ROP) task for pre-training. Given an input document, we sample a pair of word sets according to the document language model, where the set with higher likelihood is deemed as more representative of the document. We then pre-train the Transformer model to predict the pairwise preference between the two word sets, jointly with the Masked Language Model (MLM) objective. By further fine-tuning on a variety of representative downstream ad-hoc retrieval tasks, PROP achieves significant improvements over baselines without pre-training or with other pre-training methods. We also show that PROP can achieve exciting performance under both the zero- and low-resource IR settings. The code and pre-trained models are available at //github.com/Albert-Ma/PROP.
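The pair construction behind the ROP task described above can be illustrated with a minimal sketch. It assumes an add-one smoothed unigram document language model and fixed-size word sets; the function names (`unigram_lm`, `make_rop_pair`) and the parameters `size` and `seed` are hypothetical, and the authors' released code may differ in sampling and smoothing details.

```python
import math
import random
from collections import Counter


def unigram_lm(tokens, mu=1.0):
    """Return P(word | document) under an add-mu smoothed unigram model.

    Assumption: the vocabulary is just the set of words in this document.
    """
    counts = Counter(tokens)
    total = len(tokens)
    vocab_size = len(counts)

    def prob(word):
        return (counts[word] + mu) / (total + mu * vocab_size)

    return prob


def log_likelihood(word_set, prob):
    """Log-likelihood of a word set under the document language model."""
    return sum(math.log(prob(w)) for w in word_set)


def make_rop_pair(tokens, size=3, seed=0):
    """Sample two word sets from the document and order them by likelihood.

    Returns (preferred, other): the first set has the higher (or equal)
    likelihood and is treated as more representative of the document.
    """
    rng = random.Random(seed)
    prob = unigram_lm(tokens)
    # Drawing uniformly over token positions samples words in proportion
    # to their frequency in the document.
    set_a = [rng.choice(tokens) for _ in range(size)]
    set_b = [rng.choice(tokens) for _ in range(size)]
    if log_likelihood(set_a, prob) >= log_likelihood(set_b, prob):
        return set_a, set_b
    return set_b, set_a


if __name__ == "__main__":
    doc = "the cat sat on the mat the cat".split()
    preferred, other = make_rop_pair(doc, size=3, seed=1)
    print("more representative:", preferred)
    print("less representative:", other)
```

A model pre-trained on such pairs would be asked to score the preferred set above the other one given the document; the pairwise loss and the joint MLM objective are left out of this sketch.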

Language modeling · MoDELS · Position embedding · Autoencoder · Masking

February 28, 2020

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training

Hangbo Bao,Li Dong,Furu Wei,Wenhui Wang,Nan Yang,Xiaodong Liu,Yu Wang,Songhao Piao,Jianfeng Gao,Ming Zhou,Hsiao-Wuen Hon

from arxiv, 11 pages

We propose to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks using a novel training procedure, referred to as a pseudo-masked language model (PMLM). Given an input text with masked tokens, we rely on conventional masks to learn inter-relations between corrupted tokens and context via autoencoding, and pseudo masks to learn intra-relations between masked spans via partially autoregressive modeling. With well-designed position embeddings and self-attention masks, the context encodings are reused to avoid redundant computation. Moreover, conventional masks used for autoencoding provide global masking information, so that all the position embeddings are accessible in partially autoregressive language modeling. In addition, the two tasks pre-train a unified language model as a bidirectional encoder and a sequence-to-sequence decoder, respectively. Our experiments show that the unified language models pre-trained using PMLM achieve new state-of-the-art results on a wide range of natural language understanding and generation tasks across several widely used benchmarks.

Language representation · BERT · Understandability · MoDELS · Machine reading comprehension

September 5, 2019

Semantics-aware BERT for Language Understanding

Zhuosheng Zhang,Yuwei Wu,Hai Zhao,Zuchao Li,Shuailiang Zhang,Xi Zhou,Xiang Zhou

The latest work on language representations carefully integrates contextualized features into language model training, which enables a series of success especially in various machine reading comprehension and natural language inference tasks. However, the existing language representation models including ELMo, GPT and BERT only exploit plain context-sensitive features such as character or word embeddings. They rarely consider incorporating structured semantic information which can provide rich semantics for language representation. To promote natural language understanding, we propose to incorporate explicit contextual semantics from pre-trained semantic role labeling, and introduce an improved language representation model, Semantics-aware BERT (SemBERT), which is capable of explicitly absorbing contextual semantics over a BERT backbone. SemBERT keeps the convenient usability of its BERT precursor in a light fine-tuning way without substantial task-specific modifications. Compared with BERT, semantics-aware BERT is as simple in concept but more powerful. It obtains new state-of-the-art or substantially improves results on ten reading comprehension and language inference tasks.

Language representation · Few-shot learning · Text classification · Learned · Performer

August 22, 2019

Improving Few-shot Text Classification via Pretrained Language Representations

Ningyu Zhang,Zhanlin Sun,Shumin Deng,Jiaoyan Chen,Huajun Chen

from arxiv, arXiv admin note: substantial text overlap with arXiv:1902.10482, arXiv:1803.02400 by other authors

Text classification tends to be difficult when the data is deficient or when it is required to adapt to unseen classes. In such challenging scenarios, recent studies have often used meta-learning to simulate the few-shot task, thus negating explicit common linguistic features across tasks. Deep language representations have proven to be very effective forms of unsupervised pretraining, yielding contextualized features that capture linguistic properties and benefit downstream natural language understanding tasks. However, the effect of pretrained language representation for few-shot learning on text classification tasks is still not well understood. In this study, we design a few-shot learning model with pretrained language representations and report the empirical results. We show that our approach is not only simple but also produces state-of-the-art performance on a well-studied sentiment classification dataset. It can thus be further suggested that pretraining could be a promising solution for few shot learning of many other NLP tasks. The code and the dataset to replicate the experiments are made available at //github.com/zxlzr/FewShotNLP.

Language representation · entity · INFORMS · MoDELS · Language modeling

May 17, 2019

ERNIE: Enhanced Language Representation with Informative Entities

Zhengyan Zhang,Xu Han,Zhiyuan Liu,Xin Jiang,Maosong Sun,Qun Liu

from arxiv, Accepted by ACL 2019

Neural language representation models such as BERT pre-trained on large-scale corpora can well capture rich semantic patterns from plain text, and be fine-tuned to consistently improve the performance of various NLP tasks. However, the existing pre-trained language models rarely consider incorporating knowledge graphs (KGs), which can provide rich structured knowledge facts for better language understanding. We argue that informative entities in KGs can enhance language representation with external knowledge. In this paper, we utilize both large-scale textual corpora and KGs to train an enhanced language representation model (ERNIE), which can take full advantage of lexical, syntactic, and knowledge information simultaneously. The experimental results have demonstrated that ERNIE achieves significant improvements on various knowledge-driven tasks, and meanwhile is comparable with the state-of-the-art model BERT on other common NLP tasks. The source code of this paper can be obtained from //github.com/thunlp/ERNIE.

Transformation · MoDELS · Performer · state-of-the-art · Transformer model

October 8, 2018

Improving the Transformer Translation Model with Document-Level Context

Jiacheng Zhang,Huanbo Luan,Maosong Sun,FeiFei Zhai,Jingfang Xu,Min Zhang,Yang Liu

from arxiv, EMNLP 2018

Although the Transformer translation model (Vaswani et al., 2017) has achieved state-of-the-art performance in a variety of translation tasks, how to use document-level context to deal with discourse phenomena problematic for Transformer still remains a challenge. In this work, we extend the Transformer model with a new context encoder to represent document-level context, which is then incorporated into the original encoder and decoder. As large-scale document-level parallel corpora are usually not available, we introduce a two-step training method to take full advantage of abundant sentence-level parallel corpora and limited document-level parallel corpora. Experiments on the NIST Chinese-English datasets and the IWSLT French-English datasets show that our approach improves over Transformer significantly.

Language modeling · Text classification · state-of-the-art · MoDELS · Reducible

January 18, 2018

Fine-tuned Language Models for Text Classification

Jeremy Howard,Sebastian Ruder

Transfer learning has revolutionized computer vision, but existing approaches in NLP still require task-specific modifications and training from scratch. We propose Fine-tuned Language Models (FitLaM), an effective transfer learning method that can be applied to any task in NLP, and introduce techniques that are key for fine-tuning a state-of-the-art language model. Our method significantly outperforms the state-of-the-art on five text classification tasks, reducing the error by 18-24% on the majority of datasets. We open-source our pretrained models and code to enable adoption by the community.


