亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

<form id='c6Zqg'></form>

<bdo id='v9bjT'><sup id='mxXm6'><div id='58xY2'><bdo id='pcQSO'></bdo></div></sup></bdo>

·

Analysis · entity · Taxonomy · 命名實體識別 · 語言模型化 ·

2023 年 8 月 23 日

Evolution of ESG-focused DLT Research: An NLP Analysis of the Literature

Walter Hernandez,Kamil Tylinski,Alastair Moore,Niall Roche,Nikhil Vadgama,Horst Treiblmaier,Jiangbo Shangguan,Paolo Tasca,Jiahua Xu

Distributed Ledger Technologies (DLTs) have rapidly evolved, necessitating comprehensive insights into their diverse components. However, a systematic literature review that emphasizes the Environmental, Sustainability, and Governance (ESG) components of DLT remains lacking. To bridge this gap, we selected 107 seed papers to build a citation network of 63,083 references and refined it to a corpus of 24,539 publications for analysis. Then, we labeled the named entities in 46 papers according to twelve top-level categories derived from an established technology taxonomy and enhanced the taxonomy by pinpointing DLT's ESG elements. Leveraging transformer-based language models, we fine-tuned a pre-trained language model for a Named Entity Recognition (NER) task using our labeled dataset. We used our fine-tuned language model to distill the corpus to 505 key papers, facilitating a literature review via named entities and temporal graph analysis on DLT evolution in the context of ESG. Our contributions are a methodology to conduct a machine learning-driven systematic literature review in the DLT field, placing a special emphasis on ESG aspects. Furthermore, we present a first-of-its-kind NER dataset, composed of 54,808 named entities, designed for DLT and ESG-related explorations.

相關內容

Analysis

標注 · 相關系數 · 有偏 · 文本分類 · INFORMS ·

2023 年 10 月 11 日

Accurate Use of Label Dependency in Multi-Label Text Classification Through the Lens of Causality

Caoyun Fan,Wenqing Chen,Jidong Tian,Yitian Li,Hao He,Yaohui Jin

from arxiv, Applied Intelligence 2023

Multi-Label Text Classification (MLTC) aims to assign the most relevant labels to each given text. Existing methods demonstrate that label dependency can help to improve the model's performance. However, the introduction of label dependency may cause the model to suffer from unwanted prediction bias. In this study, we attribute the bias to the model's misuse of label dependency, i.e., the model tends to utilize the correlation shortcut in label dependency rather than fusing text information and label dependency for prediction. Motivated by causal inference, we propose a CounterFactual Text Classifier (CFTC) to eliminate the correlation bias, and make causality-based predictions. Specifically, our CFTC first adopts the predict-then-modify backbone to extract precise label information embedded in label dependency, then blocks the correlation shortcut through the counterfactual de-bias technique with the help of the human causal graph. Experimental results on three datasets demonstrate that our CFTC significantly outperforms the baselines and effectively eliminates the correlation bias in datasets.

entity · MoDELS · 可理解性 · 張成子空間 · 語言模型化 ·

2023 年 10 月 10 日

DocumentNet: Bridging the Data Gap in Document Pre-Training

Lijun Yu,Jin Miao,Xiaoyu Sun,Jiayi Chen,Alexander G. Hauptmann,Hanjun Dai,Wei Wei

from arxiv, EMNLP 2023

Document understanding tasks, in particular, Visually-rich Document Entity Retrieval (VDER), have gained significant attention in recent years thanks to their broad applications in enterprise AI. However, publicly available data have been scarce for these tasks due to strict privacy constraints and high annotation costs. To make things worse, the non-overlapping entity spaces from different datasets hinder the knowledge transfer between document types. In this paper, we propose a method to collect massive-scale and weakly labeled data from the web to benefit the training of VDER models. The collected dataset, named DocumentNet, does not depend on specific document types or entity sets, making it universally applicable to all VDER tasks. The current DocumentNet consists of 30M documents spanning nearly 400 document types organized in a four-level ontology. Experiments on a set of broadly adopted VDER tasks show significant improvements when DocumentNet is incorporated into the pre-training for both classic and few-shot learning settings. With the recent emergence of large language models (LLMs), DocumentNet provides a large data source to extend their multi-modal capabilities for VDER.

MoDELS · Analysis · INFORMS · Performer · 語言模型化 ·

2023 年 10 月 10 日

Evaluation and Analysis of Hallucination in Large Vision-Language Models

Junyang Wang,Yiyang Zhou,Guohai Xu,Pengcheng Shi,Chenlin Zhao,Haiyang Xu,Qinghao Ye,Ming Yan,Ji Zhang,Jihua Zhu,Jitao Sang,Haoyu Tang

from arxiv, 11 pages, 5 figures

Large Vision-Language Models (LVLMs) have recently achieved remarkable success. However, LVLMs are still plagued by the hallucination problem, which limits the practicality in many scenarios. Hallucination refers to the information of LVLMs' responses that does not exist in the visual input, which poses potential risks of substantial consequences. There has been limited work studying hallucination evaluation in LVLMs. In this paper, we propose Hallucination Evaluation based on Large Language Models (HaELM), an LLM-based hallucination evaluation framework. HaELM achieves an approximate 95% performance comparable to ChatGPT and has additional advantages including low cost, reproducibility, privacy preservation and local deployment. Leveraging the HaELM, we evaluate the hallucination in current LVLMs. Furthermore, we analyze the factors contributing to hallucination in LVLMs and offer helpful suggestions to mitigate the hallucination problem. Our training data and human annotation hallucination data will be made public soon.

上下文窗口 · Microsoft Windows · Extensibility · 有偏 · MoDELS ·

2023 年 10 月 10 日

PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training

Dawei Zhu,Nan Yang,Liang Wang,Yifan Song,Wenhao Wu,Furu Wei,Sujian Li

Large Language Models (LLMs) are trained with a pre-defined context length, restricting their use in scenarios requiring long inputs. Previous efforts for adapting LLMs to a longer length usually requires fine-tuning with this target length (Full-length fine-tuning), suffering intensive training cost. To decouple train length from target length for efficient context window extension, we propose Positional Skip-wisE (PoSE) training that smartly simulates long inputs using a fixed context window. This is achieved by first dividing the original context window into several chunks, then designing distinct skipping bias terms to manipulate the position indices of each chunk. These bias terms and the lengths of each chunk are altered for every training example, allowing the model to adapt to all positions within target length. Experimental results show that PoSE greatly reduces memory and time overhead compared with Full-length fine-tuning, with minimal impact on performance. Leveraging this advantage, we have successfully extended the LLaMA model to 128k tokens using a 2k training context window. Furthermore, we empirically confirm that PoSE is compatible with all RoPE-based LLMs and position interpolation strategies. Notably, our method can potentially support infinite length, limited only by memory usage in inference. With ongoing progress for efficient inference, we believe PoSE can further scale the context window beyond 128k.

Agent · 回合 · MoDELS · state-of-the-art · 語言模型化 ·

2023 年 10 月 9 日

Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction Arena

Jiangjie Chen,Siyu Yuan,Rong Ye,Bodhisattwa Prasad Majumder,Kyle Richardson

from arxiv, Preprint

Can Large Language Models (LLMs) simulate human behavior in complex environments? LLMs have recently been shown to exhibit advanced reasoning skills but much of NLP evaluation still relies on static benchmarks. Answering this requires evaluation environments that probe strategic reasoning in competitive, dynamic scenarios that involve long-term planning. We introduce AucArena, a novel simulation environment for evaluating LLMs within auctions, a setting chosen for being highly unpredictable and involving many skills related to resource and risk management, while also being easy to evaluate. We conduct several controlled simulations using state-of-the-art LLMs as bidding agents. We find that through simple prompting, LLMs do indeed demonstrate many of the skills needed for effectively engaging in auctions (e.g., managing budget, adhering to long-term goals and priorities), skills that we find can be sharpened by explicitly encouraging models to be adaptive and observe strategies in past auctions. These results are significant as they show the potential of using LLM agents to model intricate social dynamics, especially in competitive settings. However, we also observe considerable variability in the capabilities of individual LLMs. Notably, even our most advanced models (GPT-4) are occasionally surpassed by heuristic baselines and human agents, highlighting the potential for further improvements in the design of LLM agents and the important role that our simulation environment can play in further testing and refining agent architectures.

tuning · 多樣性 · Seven · MoDELS · 語言模型化 ·

2023 年 10 月 9 日

Multi-Task Instruction Tuning of LLaMa for Specific Scenarios: A Preliminary Study on Writing Assistance

Yue Zhang,Leyang Cui,Deng Cai,Xinting Huang,Tao Fang,Wei Bi

Proprietary Large Language Models (LLMs), such as ChatGPT, have garnered significant attention due to their exceptional capabilities in handling a diverse range of tasks. Recent studies demonstrate that open-sourced smaller foundational models, such as 7B-size LLaMA, can also display remarkable proficiency in tackling diverse tasks when fine-tuned using instruction-driven data. In this work, we investigate a practical problem setting where the primary focus is on one or a few particular tasks rather than general-purpose instruction following, and explore whether LLMs can be beneficial and further improved for such targeted scenarios. We choose the writing-assistant scenario as the testbed, which includes seven writing tasks. We collect training data for these tasks, reframe them in an instruction-following format, and subsequently refine the LLM, specifically LLaMA, via instruction tuning. Experimental results show that fine-tuning LLaMA on writing instruction data significantly improves its ability on writing tasks. We also conduct more experiments and analyses to offer insights for future work on effectively fine-tuning LLaMA for specific scenarios. Finally, we initiate a discussion regarding the necessity of employing LLMs for only one targeted task, taking into account the efforts required for tuning and the resources consumed during deployment.

Analysis · 情感分析 · MoDELS · 語言模型化 · AI ·

2023 年 10 月 9 日

Quality Assurance of A GPT-based Sentiment Analysis System: Adversarial Review Data Generation and Detection

Tinghui Ouyang,Hoang-Quoc Nguyen-Son,Huy H. Nguyen,Isao Echizen,Yoshiki Seo

Large Language Models (LLMs) have been garnering significant attention of AI researchers, especially following the widespread popularity of ChatGPT. However, due to LLMs' intricate architecture and vast parameters, several concerns and challenges regarding their quality assurance require to be addressed. In this paper, a fine-tuned GPT-based sentiment analysis model is first constructed and studied as the reference in AI quality analysis. Then, the quality analysis related to data adequacy is implemented, including employing the content-based approach to generate reasonable adversarial review comments as the wrongly-annotated data, and developing surprise adequacy (SA)-based techniques to detect these abnormal data. Experiments based on Amazon.com review data and a fine-tuned GPT model were implemented. Results were thoroughly discussed from the perspective of AI quality assurance to present the quality analysis of an LLM model on generated adversarial textual data and the effectiveness of using SA on anomaly detection in data quality assurance.

優化器 · 平穩的 · Networking · Neural Networks · 易處理的 ·

2023 年 10 月 8 日

Optimizing Solution-Samplers for Combinatorial Problems: The Landscape of Policy-Gradient Methods

Constantine Caramanis,Dimitris Fotakis,Alkis Kalavasis,Vasilis Kontonis,Christos Tzamos

Deep Neural Networks and Reinforcement Learning methods have empirically shown great promise in tackling challenging combinatorial problems. In those methods a deep neural network is used as a solution generator which is then trained by gradient-based methods (e.g., policy gradient) to successively obtain better solution distributions. In this work we introduce a novel theoretical framework for analyzing the effectiveness of such methods. We ask whether there exist generative models that (i) are expressive enough to generate approximately optimal solutions; (ii) have a tractable, i.e, polynomial in the size of the input, number of parameters; (iii) their optimization landscape is benign in the sense that it does not contain sub-optimal stationary points. Our main contribution is a positive answer to this question. Our result holds for a broad class of combinatorial problems including Max- and Min-Cut, Max-$k$-CSP, Maximum-Weight-Bipartite-Matching, and the Traveling Salesman Problem. As a byproduct of our analysis we introduce a novel regularization process over vanilla gradient descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.

噪聲 · 有偏 · 相互獨立的 · 模型評估 · 可約的 ·

2023 年 10 月 8 日

Evaluating Bias and Noise Induced by the U.S. Census Bureau's Privacy Protection Methods

Christopher T. Kenny,Shiro Kuriwaki,Cory McCartan,Tyler Simko,Kosuke Imai

from arxiv, 23 pages, 6 figures, 2 tables, plus appendices

The United States Census Bureau faces a difficult trade-off between the accuracy of Census statistics and the protection of individual information. We conduct the first independent evaluation of bias and noise induced by the Bureau's two main disclosure avoidance systems: the TopDown algorithm employed for the 2020 Census and the swapping algorithm implemented for the three previous Censuses. Our evaluation leverages the Noisy Measure File (NMF) as well as two independent runs of the TopDown algorithm applied to the 2010 decennial Census. We find that the NMF contains too much noise to be directly useful, especially for Hispanic and multiracial populations. TopDown's post-processing dramatically reduces the NMF noise and produces data whose accuracy is similar to that of swapping. While the estimated errors for both TopDown and swapping algorithms are generally no greater than other sources of Census error, they can be relatively substantial for geographies with small total populations.

TOOLS · 可辨認的 · INFORMS · 查準率/準確率 · 語言模型化 ·

2023 年 10 月 6 日

From Text to Self: Users' Perceptions of Potential of AI on Interpersonal Communication and Self

Yue Fu,Sami Foell,Xuhai Xu,Alexis Hiniker

In the rapidly evolving landscape of AI-mediated communication (AIMC), tools powered by Large Language Models (LLMs) are becoming integral to interpersonal communication. Employing a mixed-methods approach, we conducted a one-week diary and interview study to explore users' perceptions of these tools' ability to: 1) support interpersonal communication in the short-term, and 2) lead to potential long-term effects. Our findings indicate that participants view AIMC support favorably, citing benefits such as increased communication confidence, and finding precise language to express their thoughts, navigating linguistic and cultural barriers. However, the study also uncovers current limitations of AIMC tools, including verbosity, unnatural responses, and excessive emotional intensity. These shortcomings are further exacerbated by user concerns about inauthenticity and potential overreliance on the technology. Furthermore, we identified four key communication spaces delineated by communication stakes (high or low) and relationship dynamics (formal or informal) that differentially predict users' attitudes toward AIMC tools. Specifically, participants found the tool is more suitable for communicating in formal relationships than informal ones and more beneficial in high-stakes than low-stakes communication.

閱讀: 0 點贊: 0

小貼士

登錄享

相關主題

命名實體識別

語言模型化

北京阿比特科技有限公司

注冊地址：北京市海淀區羊坊店路18號2幢3層301-191

<tfoot id='dkvsq'></tfoot>

<legend id='dkvsq'><style id='dkvsq'><dir id='dkvsq'><q id='dkvsq'></q></dir></style></legend>

<i id='dkvsq'><tr id='dkvsq'><dt id='dkvsq'><q id='dkvsq'><span id='dkvsq'><b id='dkvsq'><form id='dkvsq'><ins id='dkvsq'></ins><ul id='dkvsq'></ul><sub id='dkvsq'></sub></form><legend id='dkvsq'></legend><bdo id='dkvsq'><pre id='dkvsq'><center id='dkvsq'></center></pre></bdo></b><th id='dkvsq'></th></span></q></dt></tr></i><div id='dkvsq'><tfoot id='dkvsq'></tfoot><dl id='dkvsq'><fieldset id='dkvsq'></fieldset></dl></div>