99视频在线播放喷射,国产精品性爱视频亚洲国产黄片,亚洲欧美日韩3751色一区二区三区,国产99久久九九精品无,欧美精品AA片在线播放

BigScience Workshop, :,Teven Le Scao,Angela Fan,Christopher Akiki,Ellie Pavlick,Suzana Ili?,Daniel Hesslow,Roman Castagné,Alexandra Sasha Luccioni,Fran?ois Yvon,Matthias Gallé,Jonathan Tow,Alexander M. Rush,Stella Biderman,Albert Webson,Pawan Sasanka Ammanamanchi,Thomas Wang,Beno?t Sagot,Niklas Muennighoff,Albert Villanova del Moral,Olatunji Ruwase,Rachel Bawden,Stas Bekman,Angelina McMillan-Major,Iz Beltagy,Huu Nguyen,Lucile Saulnier,Samson Tan,Pedro Ortiz Suarez,Victor Sanh,Hugo Lauren?on,Yacine Jernite,Julien Launay,Margaret Mitchell,Colin Raffel,Aaron Gokaslan,Adi Simhi,Aitor Soroa,Alham Fikri Aji,Amit Alfassy,Anna Rogers,Ariel Kreisberg Nitzav,Canwen Xu,Chenghao Mou,Chris Emezue,Christopher Klamm,Colin Leong,Daniel van Strien,David Ifeoluwa Adelani,Dragomir Radev,Eduardo González Ponferrada,Efrat Levkovizh,Ethan Kim,Eyal Bar Natan,Francesco De Toni,Gérard Dupont,Germán Kruszewski,Giada Pistilli,Hady Elsahar,Hamza Benyamina,Hieu Tran,Ian Yu,Idris Abdulmumin,Isaac Johnson,Itziar Gonzalez-Dios,Javier de la Rosa,Jenny Chim,Jesse Dodge,Jian Zhu,Jonathan Chang,J?rg Frohberg,Joseph Tobing,Joydeep Bhattacharjee,Khalid Almubarak,Kimbo Chen,Kyle Lo,Leandro Von Werra,Leon Weber,Long Phan,Loubna Ben allal,Ludovic Tanguy,Manan Dey,Manuel Romero Mu?oz,Maraim Masoud,María Grandury,Mario ?a?ko,Max Huang,Maximin Coavoux,Mayank Singh,Mike Tian-Jian Jiang,Minh Chien Vu,Mohammad A. Jauhar,Mustafa Ghaleb,Nishant Subramani,Nora Kassner,Nurulaqilla Khamis,Olivier Nguyen,Omar Espejel,Ona de Gibert,Paulo Villegas,Peter Henderson,Pierre Colombo,Priscilla Amuok,Quentin Lhoest,Rheza Harliman,Rishi Bommasani,Roberto Luis López,Rui Ribeiro,Salomey Osei,Sampo Pyysalo,Sebastian Nagel,Shamik Bose,Shamsuddeen Hassan Muhammad,Shanya Sharma,Shayne Longpre,Somaieh Nikpoor,Stanislav Silberberg,Suhas Pai,Sydney Zink,Tiago Timponi Torrent,Timo Schick,Tristan Thrush,Valentin Danchev,Vassilina Nikoulina,Veronika Laippala,Violette Lepercq,Vrinda Prabhu,Zaid Alyafeai,Zeerak Talat,Arun Raja,Benjamin Heinzerling,Chenglei Si,Davut Emre Ta?ar,Elizabeth Salesky,Sabrina J. Mielke,Wilson Y. Lee,Abheesht Sharma,Andrea Santilli,Antoine Chaffin,Arnaud Stiegler,Debajyoti Datta,Eliza Szczechla,Gunjan Chhablani,Han Wang,Harshit Pandey,Hendrik Strobelt,Jason Alan Fries,Jos Rozen,Leo Gao,Lintang Sutawika,M Saiful Bari,Maged S. Al-shaibani,Matteo Manica,Nihal Nayak,Ryan Teehan,Samuel Albanie,Sheng Shen,Srulik Ben-David,Stephen H. Bach,Taewoon Kim,Tali Bers,Thibault Fevry,Trishala Neeraj,Urmish Thakker,Vikas Raunak,Xiangru Tang,Zheng-Xin Yong,Zhiqing Sun,Shaked Brody,Yallow Uri,Hadar Tojarieh,Adam Roberts,Hyung Won Chung,Jaesung Tae,Jason Phang,Ofir Press,Conglong Li,Deepak Narayanan,Hatim Bourfoune,Jared Casper,Jeff Rasley,Max Ryabinin,Mayank Mishra,Minjia Zhang,Mohammad Shoeybi,Myriam Peyrounette,Nicolas Patry,Nouamane Tazi,Omar Sanseviero,Patrick von Platen,Pierre Cornette,Pierre Fran?ois Lavallée,Rémi Lacroix,Samyam Rajbhandari,Sanchit Gandhi,Shaden Smith,Stéphane Requena,Suraj Patil,Tim Dettmers,Ahmed Baruwa,Amanpreet Singh,Anastasia Cheveleva,Anne-Laure Ligozat,Arjun Subramonian,Aurélie Névéol,Charles Lovering,Dan Garrette,Deepak Tunuguntla,Ehud Reiter,Ekaterina Taktasheva,Ekaterina Voloshina,Eli Bogdanov,Genta Indra Winata,Hailey Schoelkopf,Jan-Christoph Kalo,Jekaterina Novikova,Jessica Zosa Forde,Jordan Clive,Jungo Kasai,Ken Kawamura,Liam Hazan,Marine Carpuat,Miruna Clinciu,Najoung Kim,Newton Cheng,Oleg Serikov,Omer Antverg,Oskar van der Wal,Rui Zhang,Ruochen Zhang,Sebastian Gehrmann,Shachar Mirkin,Shani Pais,Tatiana Shavrina,Thomas Scialom,Tian Yun,Tomasz Limisiewicz,Verena Rieser,Vitaly Protasov,Vladislav Mikhailov,Yada Pruksachatkun,Yonatan Belinkov,Zachary Bamberger,Zdeněk Kasner,Alice Rueda,Amanda Pestana,Amir Feizpour,Ammar Khan,Amy Faranak,Ana Santos,Anthony Hevia,Antigona Unldreaj,Arash Aghagol,Arezoo Abdollahi,Aycha Tammour,Azadeh HajiHosseini,Bahareh Behroozi,Benjamin Ajibade,Bharat Saxena,Carlos Mu?oz Ferrandis,Danish Contractor,David Lansky,Davis David,Douwe Kiela,Duong A. Nguyen,Edward Tan,Emi Baylor,Ezinwanne Ozoani,Fatima Mirza,Frankline Ononiwu,Habib Rezanejad,Hessie Jones,Indrani Bhattacharya,Irene Solaiman,Irina Sedenko,Isar Nejadgholi,Jesse Passmore,Josh Seltzer,Julio Bonis Sanz,Livia Dutra,Mairon Samagaio,Maraim Elbadri,Margot Mieskes,Marissa Gerchick,Martha Akinlolu,Michael McKenna,Mike Qiu,Muhammed Ghauri,Mykola Burynok,Nafis Abrar,Nazneen Rajani,Nour Elkott,Nour Fahmy,Olanrewaju Samuel,Ran An,Rasmus Kromann,Ryan Hao,Samira Alizadeh,Sarmad Shubber,Silas Wang,Sourav Roy,Sylvain Viguier,Thanh Le,Tobi Oyebade,Trieu Le,Yoyo Yang,Zach Nguyen,Abhinav Ramesh Kashyap,Alfredo Palasciano,Alison Callahan,Anima Shukla,Antonio Miranda-Escalada,Ayush Singh,Benjamin Beilharz,Bo Wang,Caio Brito,Chenxi Zhou,Chirag Jain,Chuxin Xu,Clémentine Fourrier,Daniel León Peri?án,Daniel Molano,Dian Yu,Enrique Manjavacas,Fabio Barth,Florian Fuhrimann,Gabriel Altay,Giyaseddin Bayrak,Gully Burns,Helena U. Vrabec,Imane Bello,Ishani Dash,Jihyun Kang,John Giorgi,Jonas Golde,Jose David Posada,Karthik Rangasai Sivaraman,Lokesh Bulchandani,Lu Liu,Luisa Shinzato,Madeleine Hahn de Bykhovetz,Maiko Takeuchi,Marc Pàmies,Maria A Castillo,Marianna Nezhurina,Mario S?nger,Matthias Samwald,Michael Cullan,Michael Weinberg,Michiel De Wolf,Mina Mihaljcic,Minna Liu,Moritz Freidank,Myungsun Kang,Natasha Seelam,Nathan Dahlberg,Nicholas Michio Broad,Nikolaus Muellner,Pascale Fung,Patrick Haller,Ramya Chandrasekhar,Renata Eisenberg,Robert Martin,Rodrigo Canalli,Rosaline Su,Ruisi Su,Samuel Cahyawijaya,Samuele Garda,Shlok S Deshmukh,Shubhanshu Mishra,Sid Kiblawi,Simon Ott,Sinee Sang-aroonsiri,Srishti Kumar,Stefan Schweter,Sushil Bharati,Tanmay Laud,Théo Gigant,Tomoya Kainuma,Wojciech Kusa,Yanis Labrak,Yash Shailesh Bajaj,Yash Venkatraman,Yifan Xu,Yingxin Xu,Yu Xu,Zhe Tan,Zhongli Xie,Zifan Ye,Mathilde Bras,Younes Belkada,Thomas Wolf

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

相關內容

語言模型化

關注 9

INTERACT · 機器人 · 可約的 · Use Case · SCAN ·

2023 年 2 月 13 日

On the Importance of Patient Acceptance for Medical Robotic Imaging

Christine Eilers,Rob van Kemenade,Benjamin Busam,Nassir Navab

from arxiv, Under submission for IPCAI/IJCARS 2023

Purpose: Mutual acceptance is required for any human-to-human interaction. Therefore, one would assume that this also holds for robot-patient interactions. However, the medical robotic imaging field lacks research in the area of acceptance. This work, therefore, aims at analyzing the influence of robot-patient interactions on acceptance in an exemplary medical robotic imaging system. Methods: We designed an interactive human-robot scenario, including auditive and gestural cues, and compared this pipeline to a non-interactive scenario. Both scenarios were evaluated through a questionnaire to measure acceptance. Heart rate monitoring was also used to measure stress. The impact of the interaction was quantified in the use case of robotic ultrasound scanning of the neck. Results: We conducted the first user study on patient acceptance of robotic ultrasound. Results show that verbal interactions impacts trust more than gestural ones. Furthermore, through interaction, the robot is perceived to be friendlier. The heart rate data indicates that robot-patient interaction could reduce stress. Conclusion: Robot-patient interactions are crucial for improving acceptance in medical robotic imaging systems. While verbal interaction is most important, the preferred interaction type and content are participant-dependent. Heart rate values indicate that such interactions can also reduce stress. Overall, this initial work showed that interactions improve patient acceptance in medical robotic imaging, and other medical robot-patient systems can benefit from the design proposals to enhance acceptance in interactive scenarios.

縮放 · 變換 · Vision · Performer · MoDELS ·

2023 年 2 月 10 日

Scaling Vision Transformers to 22 Billion Parameters

Mostafa Dehghani,Josip Djolonga,Basil Mustafa,Piotr Padlewski,Jonathan Heek,Justin Gilmer,Andreas Steiner,Mathilde Caron,Robert Geirhos,Ibrahim Alabdulmohsin,Rodolphe Jenatton,Lucas Beyer,Michael Tschannen,Anurag Arnab,Xiao Wang,Carlos Riquelme,Matthias Minderer,Joan Puigcerver,Utku Evci,Manoj Kumar,Sjoerd van Steenkiste,Gamaleldin F. Elsayed,Aravindh Mahendran,Fisher Yu,Avital Oliver,Fantine Huot,Jasmijn Bastings,Mark Patrick Collier,Alexey Gritsenko,Vighnesh Birodkar,Cristina Vasconcelos,Yi Tay,Thomas Mensink,Alexander Kolesnikov,Filip Paveti?,Dustin Tran,Thomas Kipf,Mario Lu?i?,Xiaohua Zhai,Daniel Keysers,Jeremiah Harmsen,Neil Houlsby

The scaling of Transformers has driven breakthrough capabilities for language models. At present, the largest large language models (LLMs) contain upwards of 100B parameters. Vision Transformers (ViT) have introduced the same architecture to image and video modelling, but these have not yet been successfully scaled to nearly the same degree; the largest dense ViT contains 4B parameters (Chen et al., 2022). We present a recipe for highly efficient and stable training of a 22B-parameter ViT (ViT-22B) and perform a wide variety of experiments on the resulting model. When evaluated on downstream tasks (often with a lightweight linear model on frozen features), ViT-22B demonstrates increasing performance with scale. We further observe other interesting benefits of scale, including an improved tradeoff between fairness and performance, state-of-the-art alignment to human visual perception in terms of shape/texture bias, and improved robustness. ViT-22B demonstrates the potential for "LLM-like" scaling in vision, and provides key steps towards getting there.

推斷 · 知識 (knowledge) · 語言模型化 · MoDELS · 控制器 ·

2023 年 2 月 10 日

Adversarial Transformer Language Models for Contextual Commonsense Inference

Pedro Colon-Hernandez,Henry Lieberman,Yida Xin,Claire Yin,Cynthia Breazeal,Peter Chin

from arxiv, Submitted to Semantic Web Journal special edition. //semantic-web-journal.org/content/adversarial-transformer-language-models-contextual-commonsense-inference-1

Contextualized or discourse aware commonsense inference is the task of generating coherent commonsense assertions (i.e., facts) from a given story, and a particular sentence from that story. Some problems with the task are: lack of controllability for topics of the inferred facts; lack of commonsense knowledge during training; and, possibly, hallucinated or false facts. In this work, we utilize a transformer model for this task and develop techniques to address the aforementioned problems in the task. We control the inference by introducing a new technique we call "hinting". Hinting is a kind of language model prompting, that utilizes both hard prompts (specific words) and soft prompts (virtual learnable templates). This serves as a control signal to advise the language model "what to talk about". Next, we establish a methodology for performing joint inference with multiple commonsense knowledge bases. Joint inference of commonsense requires care, because it is imprecise and the level of generality is more flexible. You want to be sure that the results "still make sense" for the context. To this end, we align the textual version of assertions from three knowledge graphs (ConceptNet, ATOMIC2020, and GLUCOSE) with a story and a target sentence. This combination allows us to train a single model to perform joint inference with multiple knowledge graphs. We show experimental results for the three knowledge graphs on joint inference. Our final contribution is exploring a GAN architecture that generates the contextualized commonsense assertions and scores them as to their plausibility through a discriminator. The result is an integrated system for contextual commonsense inference in stories, that can controllably generate plausible commonsense assertions, and takes advantage of joint inference between multiple commonsense knowledge bases.

Performer · MoDELS · 語言模型化 · Better · CASE ·

2023 年 2 月 10 日

Translating Natural Language to Planning Goals with Large-Language Models

Yaqi Xie,Chen Yu,Tongyao Zhu,Jinbin Bai,Ze Gong,Harold Soh

Recent large language models (LLMs) have demonstrated remarkable performance on a variety of natural language processing (NLP) tasks, leading to intense excitement about their applicability across various domains. Unfortunately, recent work has also shown that LLMs are unable to perform accurate reasoning nor solve planning problems, which may limit their usefulness for robotics-related tasks. In this work, our central question is whether LLMs are able to translate goals specified in natural language to a structured planning language. If so, LLM can act as a natural interface between the planner and human users; the translated goal can be handed to domain-independent AI planners that are very effective at planning. Our empirical results on GPT 3.5 variants show that LLMs are much better suited towards translation rather than planning. We find that LLMs are able to leverage commonsense knowledge and reasoning to furnish missing details from under-specified goals (as is often the case in natural language). However, our experiments also reveal that LLMs can fail to generate goals in tasks that involve numerical or physical (e.g., spatial) reasoning, and that LLMs are sensitive to the prompts used. As such, these models are promising for translation to structured planning languages, but care should be taken in their use.

語言模型化 · TOOLS · MoDELS · Performer · SimPLe ·

2023 年 2 月 9 日

Toolformer: Language Models Can Teach Themselves to Use Tools

Timo Schick,Jane Dwivedi-Yu,Roberto Dessì,Roberta Raileanu,Maria Lomeli,Luke Zettlemoyer,Nicola Cancedda,Thomas Scialom

Language models (LMs) exhibit remarkable abilities to solve new tasks from just a few examples or textual instructions, especially at scale. They also, paradoxically, struggle with basic functionality, such as arithmetic or factual lookup, where much simpler and smaller models excel. In this paper, we show that LMs can teach themselves to use external tools via simple APIs and achieve the best of both worlds. We introduce Toolformer, a model trained to decide which APIs to call, when to call them, what arguments to pass, and how to best incorporate the results into future token prediction. This is done in a self-supervised way, requiring nothing more than a handful of demonstrations for each API. We incorporate a range of tools, including a calculator, a Q\&A system, two different search engines, a translation system, and a calendar. Toolformer achieves substantially improved zero-shot performance across a variety of downstream tasks, often competitive with much larger models, without sacrificing its core language modeling abilities.

MoDELS · 變換 · 知識 (knowledge) · Processing（編程語言） · Performer ·

2023 年 2 月 9 日

Lightweight Transformers for Clinical Natural Language Processing

Omid Rohanian,Mohammadmahdi Nouriborji,Hannah Jauncey,Samaneh Kouchaki,ISARIC Clinical Characterisation Group,Lei Clifton,Laura Merson,David A. Clifton

Specialised pre-trained language models are becoming more frequent in NLP since they can potentially outperform models trained on generic texts. BioBERT and BioClinicalBERT are two examples of such models that have shown promise in medical NLP tasks. Many of these models are overparametrised and resource-intensive, but thanks to techniques like Knowledge Distillation (KD), it is possible to create smaller versions that perform almost as well as their larger counterparts. In this work, we specifically focus on development of compact language models for processing clinical texts (i.e. progress notes, discharge summaries etc). We developed a number of efficient lightweight clinical transformers using knowledge distillation and continual learning, with the number of parameters ranging from 15 million to 65 million. These models performed comparably to larger models such as BioBERT and ClinicalBioBERT and significantly outperformed other compact models trained on general or biomedical data. Our extensive evaluation was done across several standard datasets and covered a wide range of clinical text-mining tasks, including Natural Language Inference, Relation Extraction, Named Entity Recognition, and Sequence Classification. To our knowledge, this is the first comprehensive study specifically focused on creating efficient and compact transformers for clinical NLP tasks. The models and code used in this study can be found on our Huggingface profile at //huggingface.co/nlpie and Github page at //github.com/nlpie-research/Lightweight-Clinical-Transformers, respectively, promoting reproducibility of our results.

MoDELS · 學成 · Next · Processing（編程語言） · Taxonomy ·

2021 年 8 月 12 日

AMMUS : A Survey of Transformer-based Pretrained Models in Natural Language Processing

Katikapalli Subramanyam Kalyan,Ajit Rajasekharan,Sivanesan Sangeetha

from arxiv, Preprint under review

Transformer-based pretrained language models (T-PTLMs) have achieved great success in almost every NLP task. The evolution of these models started with GPT and BERT. These models are built on the top of transformers, self-supervised learning and transfer learning. Transformed-based PTLMs learn universal language representations from large volumes of text data using self-supervised learning and transfer this knowledge to downstream tasks. These models provide good background knowledge to downstream tasks which avoids training of downstream models from scratch. In this comprehensive survey paper, we initially give a brief overview of self-supervised learning. Next, we explain various core concepts like pretraining, pretraining methods, pretraining tasks, embeddings and downstream adaptation methods. Next, we present a new taxonomy of T-PTLMs and then give brief overview of various benchmarks including both intrinsic and extrinsic. We present a summary of various useful libraries to work with T-PTLMs. Finally, we highlight some of the future research directions which will further improve these models. We strongly believe that this comprehensive survey paper will serve as a good reference to learn the core concepts as well as to stay updated with the recent happenings in T-PTLMs.

Prompt · 語言模型化 · MoDELS · Performer · Processing（編程語言） ·

2021 年 7 月 28 日

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing

Pengfei Liu,Weizhe Yuan,Jinlan Fu,Zhengbao Jiang,Hiroaki Hayashi,Graham Neubig

from arxiv, Website: //pretrain.nlpedia.ai/

This paper surveys and organizes research works in a new paradigm in natural language processing, which we dub "prompt-based learning". Unlike traditional supervised learning, which trains a model to take in an input x and predict an output y as P(y|x), prompt-based learning is based on language models that model the probability of text directly. To use these models to perform prediction tasks, the original input x is modified using a template into a textual string prompt x' that has some unfilled slots, and then the language model is used to probabilistically fill the unfilled information to obtain a final string x, from which the final output y can be derived. This framework is powerful and attractive for a number of reasons: it allows the language model to be pre-trained on massive amounts of raw text, and by defining a new prompting function the model is able to perform few-shot or even zero-shot learning, adapting to new scenarios with few or no labeled data. In this paper we introduce the basics of this promising paradigm, describe a unified set of mathematical notations that can cover a wide variety of existing work, and organize existing work along several dimensions, e.g.the choice of pre-trained models, prompts, and tuning strategies. To make the field more accessible to interested beginners, we not only make a systematic review of existing works and a highly structured typology of prompt-based concepts, but also release other resources, e.g., a website //pretrain.nlpedia.ai/ including constantly-updated survey, and paperlist.

聯邦學習 · 學成 · Processing（編程語言） · MoDELS · 語言模型化 ·

2021 年 7 月 27 日

Federated Learning Meets Natural Language Processing: A Survey

Ming Liu,Stella Ho,Mengqi Wang,Longxiang Gao,Yuan Jin,He Zhang

from arxiv, 19 pages

Federated Learning aims to learn machine learning models from multiple decentralized edge devices (e.g. mobiles) or servers without sacrificing local data privacy. Recent Natural Language Processing techniques rely on deep learning and large pre-trained language models. However, both big deep neural and language models are trained with huge amounts of data which often lies on the server side. Since text data is widely originated from end users, in this work, we look into recent NLP models and techniques which use federated learning as the learning framework. Our survey discusses major challenges in federated natural language processing, including the algorithm challenges, system challenges as well as the privacy issues. We also provide a critical review of the existing Federated NLP evaluation methods and tools. Finally, we highlight the current research gaps and future directions.

語言模型化 · MoDELS · 位置嵌入 · 自編碼器 · 掩碼 ·

2020 年 2 月 28 日

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training

Hangbo Bao,Li Dong,Furu Wei,Wenhui Wang,Nan Yang,Xiaodong Liu,Yu Wang,Songhao Piao,Jianfeng Gao,Ming Zhou,Hsiao-Wuen Hon

from arxiv, 11 pages

We propose to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks using a novel training procedure, referred to as a pseudo-masked language model (PMLM). Given an input text with masked tokens, we rely on conventional masks to learn inter-relations between corrupted tokens and context via autoencoding, and pseudo masks to learn intra-relations between masked spans via partially autoregressive modeling. With well-designed position embeddings and self-attention masks, the context encodings are reused to avoid redundant computation. Moreover, conventional masks used for autoencoding provide global masking information, so that all the position embeddings are accessible in partially autoregressive language modeling. In addition, the two tasks pre-train a unified language model as a bidirectional encoder and a sequence-to-sequence decoder, respectively. Our experiments show that the unified language models pre-trained using PMLM achieve new state-of-the-art results on a wide range of natural language understanding and generation tasks across several widely used benchmarks.