The Fon language, spoken by about 2 million people, is a truly low-resourced African language, with a limited online presence and few existing datasets, to name but two constraints. Multitask learning is a learning paradigm that aims to improve the generalization capacity of a model by sharing knowledge across different but related tasks, which could be particularly relevant in very data-scarce scenarios. In this paper, we present the first explorative approach to multitask learning for enhancing model capabilities in Natural Language Processing for the Fon language. Specifically, we explore the tasks of Named Entity Recognition (NER) and Part-of-Speech (POS) tagging for Fon. We leverage two language model heads as encoders to build shared representations for the inputs, and we use blocks of linear layers for classification on each task. Our results on the NER and POS tasks for Fon show competitive (or better) performance compared to several multilingual pretrained language models finetuned on single tasks. Additionally, we perform a few ablation studies to compare the efficiency of two different loss combination strategies and find that the equal loss weighting approach works best in our case. Our code is open-sourced at //github.com/bonaventuredossou/multitask_fon.
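
As a rough illustration of the equal loss weighting mentioned in the abstract, the sketch below combines a NER loss and a POS loss into one training objective. It is a minimal pure-Python example, not the authors' implementation: the function names, the logits, and the alternative (0.3, 0.7) weighting are all hypothetical, and the abstract does not say which second combination strategy was actually compared.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one example from raw logits (log-sum-exp for stability)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[target]


def combine_losses(ner_loss, pos_loss, weights=(1.0, 1.0)):
    """Weighted sum of the two task losses; (1.0, 1.0) is equal weighting."""
    w_ner, w_pos = weights
    return w_ner * ner_loss + w_pos * pos_loss


# Hypothetical logits for a single token from each task-specific head.
ner_loss = cross_entropy([2.0, 0.5, -1.0], target=0)
pos_loss = cross_entropy([0.1, 0.2, 3.0, -0.5], target=2)

# Equal weighting: both tasks contribute the same amount to the objective.
equal = combine_losses(ner_loss, pos_loss)

# An arbitrary unequal weighting, shown only for contrast (assumption).
skewed = combine_losses(ner_loss, pos_loss, weights=(0.3, 0.7))

print(f"equal={equal:.4f} skewed={skewed:.4f}")
```

With equal weights the combined objective is simply the sum of the per-task losses, so neither task dominates the gradient by construction.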

Related content

Analysis · Rust · Identifiable · MoDELS · Aliasing

October 12, 2023

Yuga: Automatically Detecting Lifetime Annotation Bugs in the Rust Language

Vikram Nitin, Anne Mulhern, Sanjay Arora, Baishakhi Ray

The Rust programming language is becoming increasingly popular among systems programmers due to its efficient performance and robust memory safety guarantees. Rust employs an ownership model to ensure this guarantee by allowing each value to be owned by only one identifier at a time. Additionally, it introduces the concept of borrowing and lifetimes to enable other variables to borrow the values under certain conditions temporarily. Despite its benefits, security vulnerabilities have been reported in Rust projects, often attributed to the use of "unsafe" Rust code. These vulnerabilities, in part, arise from incorrect lifetime annotations on function signatures. However, existing tools fail to detect these bugs, primarily because such bugs are rare, challenging to detect through dynamic analysis, and require explicit memory models. To overcome these limitations, first, we characterize incorrect lifetime annotations as a source of memory safety bugs and leverage this understanding to devise a novel static analysis tool, Yuga, to detect potential lifetime annotation bugs. Yuga uses a multi-phase analysis approach, starting with a quick pattern-matching algorithm to identify potential buggy components and then conducting a flow and field-sensitive alias analysis to confirm the bugs. We also curate new datasets of lifetime annotation bugs. Yuga successfully detects bugs with good precision on these datasets, and we make the code and datasets publicly available for review.

MoDELS · Language modeling · GPT-4 · state-of-the-art · HTTPS

October 12, 2023

AceGPT, Localizing Large Language Models in Arabic

Huang Huang, Fei Yu, Jianqing Zhu, Xuening Sun, Hao Cheng, Dingjie Song, Zhihong Chen, Abdulmohsen Alharthi, Bang An, Ziche Liu, Zhiyi Zhang, Junying Chen, Jianquan Li, Benyou Wang, Lian Zhang, Ruoyu Sun, Xiang Wan, Haizhou Li, Jinchao Xu

from arxiv, //github.com/FreedomIntelligence/AceGPT

This paper is devoted to the development of a localized Large Language Model (LLM) specifically for Arabic, a language imbued with unique cultural characteristics inadequately addressed by current mainstream models. Significant concerns emerge when addressing cultural sensitivity and local values. To address this, the paper proposes a comprehensive solution that includes further pre-training with Arabic texts, Supervised Fine-Tuning (SFT) utilizing native Arabic instructions, and GPT-4 responses in Arabic, alongside Reinforcement Learning with AI Feedback (RLAIF) employing a reward model attuned to local culture and values. The goal is to cultivate culturally cognizant and value-aligned Arabic LLMs capable of accommodating the diverse, application-specific needs of Arabic-speaking communities. Comprehensive evaluations reveal that the resulting model, dubbed 'AceGPT', sets the state-of-the-art standard for open Arabic LLMs across various benchmarks, including the instruction-following benchmark (i.e., Arabic Vicuna-80 and Arabic AlpacaEval), knowledge benchmark (i.e., Arabic MMLU and EXAMs), and the newly introduced Arabic Cultural and Value Alignment benchmark. Notably, AceGPT outperforms Turbo in the popular Vicuna-80 benchmark when evaluated with GPT-4, despite the benchmark's limited scale. Codes, data, and models are in //github.com/FreedomIntelligence/AceGPT.

WEB · Online · Extensibility · search engine · Pivotal (company)

October 11, 2023

Tag Your Fish in the Broken Net: A Responsible Web Framework for Protecting Online Privacy and Copyright

Dawen Zhang, Boming Xia, Yue Liu, Xiwei Xu, Thong Hoang, Zhenchang Xing, Mark Staples, Qinghua Lu, Liming Zhu

The World Wide Web, a ubiquitous source of information, serves as a primary resource for countless individuals, amassing a vast amount of data from global internet users. However, this online data, when scraped, indexed, and utilized for activities like web crawling, search engine indexing, and, notably, AI model training, often diverges from the original intent of its contributors. The ascent of Generative AI has accentuated concerns surrounding data privacy and copyright infringement. Regrettably, the web's current framework falls short in facilitating pivotal actions like consent withdrawal or data copyright claims. While some companies offer voluntary measures, such as crawler access restrictions, these often remain inaccessible to individual users. To empower online users to exercise their rights and enable companies to adhere to regulations, this paper introduces a user-controlled consent tagging framework for online data. It leverages the extensibility of HTTP and HTML in conjunction with the decentralized nature of distributed ledger technology. With this framework, users have the ability to tag their online data at the time of transmission, and subsequently, they can track and request the withdrawal of consent for their data from the data holders. A proof-of-concept system is implemented, demonstrating the feasibility of the framework. This work holds significant potential for contributing to the reinforcement of user consent, privacy, and copyright on the modern internet and lays the groundwork for future insights into creating a more responsible and user-centric web ecosystem.

Language modeling · MoDELS · Round · Representation · Vision

October 11, 2023

LangNav: Language as a Perceptual Representation for Navigation

Bowen Pan, Rameswar Panda, SouYoung Jin, Rogerio Feris, Aude Oliva, Phillip Isola, Yoon Kim

We explore the use of language as a perceptual representation for vision-and-language navigation. Our approach uses off-the-shelf vision systems (for image captioning and object detection) to convert an agent's egocentric panoramic view at each time step into natural language descriptions. We then finetune a pretrained language model to select an action, based on the current view and the trajectory history, that would best fulfill the navigation instructions. In contrast to the standard setup which adapts a pretrained language model to work directly with continuous visual features from pretrained vision models, our approach instead uses (discrete) language as the perceptual representation. We explore two use cases of our language-based navigation (LangNav) approach on the R2R vision-and-language navigation benchmark: generating synthetic trajectories from a prompted large language model (GPT-4) with which to finetune a smaller language model; and sim-to-real transfer where we transfer a policy learned on a simulated environment (ALFRED) to a real-world environment (R2R). Our approach is found to improve upon strong baselines that rely on visual features in settings where only a few gold trajectories (10-100) are available, demonstrating the potential of using language as a perceptual representation for navigation tasks.

Automator · Language modeling · Prompt · Neuron · MoDELS

October 11, 2023

The Importance of Prompt Tuning for Automated Neuron Explanations

Justin Lee, Tuomas Oikarinen, Arjun Chatha, Keng-Chi Chang, Yilan Chen, Tsui-Wei Weng

Recent advances have greatly increased the capabilities of large language models (LLMs), but our understanding of the models and their safety has not progressed as fast. In this paper we aim to understand LLMs deeper by studying their individual neurons. We build upon previous work showing large language models such as GPT-4 can be useful in explaining what each neuron in a language model does. Specifically, we analyze the effect of the prompt used to generate explanations and show that reformatting the explanation prompt in a more natural way can significantly improve neuron explanation quality and greatly reduce computational cost. We demonstrate the effects of our new prompts in three different ways, incorporating both automated and human evaluations.

Knowledge · Language modeling · MoDELS · Graph · Prompt

October 11, 2023

PHALM: Building a Knowledge Graph from Scratch by Prompting Humans and a Language Model

Tatsuya Ide, Eiki Murata, Daisuke Kawahara, Takato Yamazaki, Shengzhe Li, Kenta Shinzato, Toshinori Sato

Despite the remarkable progress in natural language understanding with pretrained Transformers, neural language models often do not handle commonsense knowledge well. Toward commonsense-aware models, there have been attempts to obtain knowledge, ranging from automatic acquisition to crowdsourcing. However, it is difficult to obtain a high-quality knowledge base at a low cost, especially from scratch. In this paper, we propose PHALM, a method of building a knowledge graph from scratch, by prompting both crowdworkers and a large language model (LLM). We used this method to build a Japanese event knowledge graph and trained Japanese commonsense generation models. Experimental results revealed the acceptability of the built graph and inferences generated by the trained models. We also report the difference in prompting humans and an LLM. Our code, data, and models are available at github.com/nlp-waseda/comet-atomic-ja.

Machine Translation · Model evaluation · Loss · Weight · Threshold

October 10, 2023

Crossing the Threshold: Idiomatic Machine Translation through Retrieval Augmentation and Loss Weighting

Emmy Liu, Aditi Chaudhary, Graham Neubig

from arxiv, EMNLP 2023

Idioms are common in everyday language, but often pose a challenge to translators because their meanings do not follow from the meanings of their parts. Despite significant advances, machine translation systems still struggle to translate idiomatic expressions. We provide a simple characterization of idiomatic translation and related issues. This allows us to conduct a synthetic experiment revealing a tipping point at which transformer-based machine translation models correctly default to idiomatic translations. To expand multilingual resources, we compile a dataset of ~4k natural sentences containing idiomatic expressions in French, Finnish, and Japanese. To improve translation of natural idioms, we introduce two straightforward yet effective techniques: the strategic upweighting of training loss on potentially idiomatic sentences, and using retrieval-augmented models. This not only improves the accuracy of a strong pretrained MT model on idiomatic sentences by up to 13% in absolute accuracy, but also holds potential benefits for non-idiomatic sentences.
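
The loss upweighting idea in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name, the `upweight` factor of 2.0, and the choice to normalize by the sum of weights are all hypothetical, since the abstract only says that the training loss on potentially idiomatic sentences is upweighted.

```python
def weighted_batch_loss(losses, is_idiomatic, upweight=2.0):
    """Average per-sentence losses, counting flagged sentences more heavily.

    losses:       per-sentence training losses (floats)
    is_idiomatic: flags marking potentially idiomatic sentences
    upweight:     hypothetical multiplier applied to flagged sentences
    """
    weights = [upweight if flag else 1.0 for flag in is_idiomatic]
    total = sum(w * loss for w, loss in zip(weights, losses))
    # Normalizing by the weight sum keeps the loss scale comparable
    # across batches with different numbers of flagged sentences.
    return total / sum(weights)


# Two sentences: the second is flagged as potentially idiomatic.
batch = weighted_batch_loss([1.0, 3.0], [False, True], upweight=2.0)
print(batch)
```

Setting `upweight=1.0` recovers the ordinary mean loss, which makes it easy to compare against an unweighted baseline.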

Language modeling · MoDELS · state-of-the-art · Data availability · Performance

October 10, 2023

BRAINTEASER: Lateral Thinking Puzzles for Large Language Models

Yifan Jiang, Filip Ilievski, Kaixin Ma, Zhivar Sourati

The success of language models has inspired the NLP community to attend to tasks that require implicit and complex reasoning, relying on human-like commonsense mechanisms. While such vertical thinking tasks have been relatively popular, lateral thinking puzzles have received little attention. To bridge this gap, we devise BRAINTEASER: a multiple-choice Question Answering task designed to test the model's ability to exhibit lateral thinking and defy default commonsense associations. We design a three-step procedure for creating the first lateral thinking benchmark, consisting of data collection, distractor generation, and generation of adversarial examples, leading to 1,100 puzzles with high-quality annotations. To assess the consistency of lateral reasoning by models, we enrich BRAINTEASER based on a semantic and contextual reconstruction of its questions. Our experiments with state-of-the-art instruction- and commonsense language models reveal a significant gap between human and model performance, which is further widened when consistency across adversarial formats is considered. We make all of our code and data available to stimulate work on developing and evaluating lateral thinking models.

MoDELS · Taxonomy · Language modeling · Understandability · Performance

September 2, 2023

Explainability for Large Language Models: A Survey

Haiyan Zhao, Hanjie Chen, Fan Yang, Ninghao Liu, Huiqi Deng, Hengyi Cai, Shuaiqiang Wang, Dawei Yin, Mengnan Du

Large language models (LLMs) have demonstrated impressive capabilities in natural language processing. However, their internal mechanisms are still unclear and this lack of transparency poses unwanted risks for downstream applications. Therefore, understanding and explaining these models is crucial for elucidating their behaviors, limitations, and social impacts. In this paper, we introduce a taxonomy of explainability techniques and provide a structured overview of methods for explaining Transformer-based language models. We categorize techniques based on the training paradigms of LLMs: traditional fine-tuning-based paradigm and prompting-based paradigm. For each paradigm, we summarize the goals and dominant approaches for generating local explanations of individual predictions and global explanations of overall model knowledge. We also discuss metrics for evaluating generated explanations, and discuss how explanations can be leveraged to debug models and improve performance. Lastly, we examine key challenges and emerging opportunities for explanation techniques in the era of LLMs in comparison to conventional machine learning models.

Loss function (machine learning) · Learning to learn · Learning · entity · Functional

September 9, 2019

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Jiawei Wu, Wenhan Xiong, William Yang Wang

from arxiv, 11 pages, 5 figures, accepted to EMNLP 2019

Many tasks in natural language processing can be viewed as multi-label classification problems. However, most of the existing models are trained with the standard cross-entropy loss function and use a fixed prediction policy (e.g., a threshold of 0.5) for all the labels, which completely ignores the complexity and dependencies among different labels. In this paper, we propose a meta-learning method to capture these complex label dependencies. More specifically, our method utilizes a meta-learner to jointly learn the training policies and prediction policies for different labels. The training policies are then used to train the classifier with the cross-entropy loss function, and the prediction policies are further implemented for prediction. Experimental results on fine-grained entity typing and text classification demonstrate that our proposed method can obtain more accurate multi-label classification results.
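
To make the "fixed prediction policy" in this abstract concrete, the sketch below contrasts a single 0.5 threshold applied to every label with per-label thresholds. It is only an illustration: the probabilities and the per-label thresholds are made-up values, and the paper learns its policies with a meta-learner rather than setting them by hand as done here.

```python
def predict_labels(probs, thresholds):
    """Turn per-label probabilities into 0/1 decisions, one threshold per label."""
    return [int(p >= t) for p, t in zip(probs, thresholds)]


# Hypothetical classifier outputs for three labels on one example.
probs = [0.62, 0.48, 0.30]

# Fixed policy: the same 0.5 threshold for every label.
fixed = predict_labels(probs, [0.5, 0.5, 0.5])

# Label-specific policy: hand-picked thresholds standing in for learned ones.
per_label = predict_labels(probs, [0.7, 0.4, 0.25])

print(fixed, per_label)
```

The two policies disagree on all three labels for this example, which is the kind of label-dependent behavior a fixed threshold cannot express.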