Chain-of-thought (CoT) prompting and tool augmentation have been validated in recent work as effective practices for improving the step-by-step reasoning of large language models (LLMs) on complex math-related tasks. However, most existing math reasoning datasets may not be able to fully evaluate and analyze the ability of LLMs to manipulate tools and perform reasoning, as they may require only a few tool invocations or lack annotations for evaluating intermediate reasoning steps. To address this issue, we construct CARP, a new Chinese dataset consisting of 4,886 computation-intensive algebra problems with formulated annotations on intermediate steps. On CARP, we test four LLMs with CoT prompting and find that they are all prone to making mistakes in the early steps of the solution, leading to wrong answers. Based on this finding, we propose DELI, a new approach that deliberates over the reasoning steps with tool interfaces. In DELI, we first initialize a step-by-step solution based on retrieved exemplars, then iterate two deliberation procedures that check and refine the intermediate steps of the generated solution, from the perspectives of tool manipulation and natural language reasoning, until the solution converges or the maximum number of turns is reached. Experimental results on CARP and six other datasets show that the proposed DELI mostly outperforms competitive baselines, and can further boost the performance of existing CoT methods. Our data and code are available at //github.com/RUCAIBox/CARP.
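The iterate-until-convergence loop described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `deliberate`, the two callables, the order in which the two passes run, and the default `max_turns` are all assumptions made for the example. In practice each callable would wrap an LLM call (with or without tool access).

```python
from typing import Callable, List

Solution = List[str]  # one string per intermediate reasoning step


def deliberate(
    initial: Solution,
    refine_with_tools: Callable[[Solution], Solution],     # hypothetical tool-based checker
    refine_with_language: Callable[[Solution], Solution],  # hypothetical language-based checker
    max_turns: int = 3,                                     # assumed default, not from the paper
) -> Solution:
    """Alternate two refinement passes until the solution stops changing."""
    solution = initial
    for _ in range(max_turns):
        # Assumed order: tool pass first, then natural language pass.
        revised = refine_with_language(refine_with_tools(solution))
        if revised == solution:
            # Converged: neither pass changed any step.
            break
        solution = revised
    return solution
```

A toy call with stand-in refiners, for example one that rewrites a wrong arithmetic step and one that returns its input unchanged, stops after the first turn that produces no change.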

Related content

July 25, 2023

Revision Transformers: Instructing Language Models to Change their Values

Felix Friedrich,Wolfgang Stammer,Patrick Schramowski,Kristian Kersting

Current transformer language models (LMs) are large-scale models with billions of parameters. They have been shown to achieve high performance on a variety of tasks but are also prone to shortcut learning and bias. Addressing such incorrect model behavior via parameter adjustments is very costly. This is particularly problematic for updating dynamic concepts, such as moral values, which vary culturally or interpersonally. In this work, we question the current common practice of storing all information in the model parameters and propose the Revision Transformer (RiT) to facilitate easy model updating. The specific combination of a large-scale pre-trained LM that inherently but also diffusely encodes world knowledge with a clearly structured revision engine makes it possible to update the model's knowledge with little effort and the help of user interaction. We exemplify RiT on a moral dataset and simulate user feedback, demonstrating strong performance in model revision even with small data. This way, users can easily design a model according to their preferences, paving the way for more transparent AI models.

July 25, 2023

Question Decomposition Improves the Faithfulness of Model-Generated Reasoning

Ansh Radhakrishnan,Karina Nguyen,Anna Chen,Carol Chen,Carson Denison,Danny Hernandez,Esin Durmus,Evan Hubinger,Jackson Kernion,Kamilė Lukošiūtė,Newton Cheng,Nicholas Joseph,Nicholas Schiefer,Oliver Rausch,Sam McCandlish,Sheer El Showk,Tamera Lanham,Tim Maxwell,Venkatesa Chandrasekaran,Zac Hatfield-Dodds,Jared Kaplan,Jan Brauner,Samuel R. Bowman,Ethan Perez

from arxiv, For few-shot examples and prompts, see //github.com/anthropics/DecompositionFaithfulnessPaper

As large language models (LLMs) perform more difficult tasks, it becomes harder to verify the correctness and safety of their behavior. One approach to help with this issue is to prompt LLMs to externalize their reasoning, e.g., by having them generate step-by-step reasoning as they answer a question (Chain-of-Thought; CoT). The reasoning may enable us to check the process that models use to perform tasks. However, this approach relies on the stated reasoning faithfully reflecting the model's actual reasoning, which is not always the case. To improve over the faithfulness of CoT reasoning, we have models generate reasoning by decomposing questions into subquestions. Decomposition-based methods achieve strong performance on question-answering tasks, sometimes approaching that of CoT while improving the faithfulness of the model's stated reasoning on several recently-proposed metrics. By forcing the model to answer simpler subquestions in separate contexts, we greatly increase the faithfulness of model-generated reasoning over CoT, while still achieving some of the performance gains of CoT. Our results show it is possible to improve the faithfulness of model-generated reasoning; continued improvements may lead to reasoning that enables us to verify the correctness and safety of LLM behavior.
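The idea of answering subquestions in separate contexts can be illustrated with a short sketch. It is only a schematic under stated assumptions, not the paper's pipeline: `answer_by_decomposition` and its three callables (`decompose`, `answer_in_fresh_context`, `recompose`) are hypothetical names, and each would wrap a model call in a real system.

```python
from typing import Callable, List, Tuple


def answer_by_decomposition(
    question: str,
    decompose: Callable[[str], List[str]],                    # hypothetical: question -> subquestions
    answer_in_fresh_context: Callable[[str], str],            # hypothetical: sees one subquestion only
    recompose: Callable[[str, List[Tuple[str, str]]], str],   # hypothetical: builds the final answer
) -> str:
    """Answer a question via subquestions that are each answered in isolation."""
    subquestions = decompose(question)
    # Each subquestion is answered with only its own text as input, so a
    # subanswer cannot silently depend on the original question or on the
    # answers to sibling subquestions.
    pairs = [(sq, answer_in_fresh_context(sq)) for sq in subquestions]
    # The final answer is produced from the original question plus the
    # explicit (subquestion, subanswer) pairs.
    return recompose(question, pairs)
```

The design choice being illustrated is the isolation step: because the final answer is assembled only from explicit subquestion and subanswer pairs, the stated reasoning is easier to check against what was actually used.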

July 23, 2023

Improving Generalizability of Graph Anomaly Detection Models via Data Augmentation

Shuang Zhou,Xiao Huang,Ninghao Liu,Fu-Lai Chung,Long-Kai Huang

from arxiv, The updated version is accepted by TKDE 2023. Please refer to arXiv:2306.10534v1

Graph anomaly detection (GAD) is a vital task since even a few anomalies can pose huge threats to benign users. Recent semi-supervised GAD methods, which can effectively leverage the available labels as prior knowledge, have outperformed unsupervised methods. In practice, people usually need to identify anomalies on new (sub)graphs to secure their business, but they may lack labels to train an effective detection model. One natural idea is to directly apply a trained GAD model to the new (sub)graph for testing. However, we find that existing semi-supervised GAD methods generalize poorly, i.e., well-trained models do not perform well on an unseen area (i.e., one not accessible in training) of the same graph, which can cause serious problems. In this paper, we build on this phenomenon and propose a general and novel research problem, generalized graph anomaly detection, which aims to effectively identify anomalies on both the training-domain graph and the unseen testing graph to eliminate potential dangers. Nevertheless, it is a challenging task since only limited labels are available, and the normal background may differ between training and testing data. Accordingly, we propose a data augmentation method named AugAN (Augmentation for Anomaly and Normal distributions) to enrich training data and boost the generalizability of GAD models. Experiments verify the effectiveness of our method in improving model generalizability.

July 23, 2023

CommonsenseVIS: Visualizing and Understanding Commonsense Reasoning Capabilities of Natural Language Models

Xingbo Wang,Renfei Huang,Zhihua Jin,Tianqing Fang,Huamin Qu

from arxiv, This paper is accepted by IEEE VIS, 2023. To appear in IEEE Transactions on Visualization and Computer Graphics (IEEE TVCG). 14 pages, 11 figures

Recently, large pretrained language models have achieved compelling performance on commonsense benchmarks. Nevertheless, it is unclear what commonsense knowledge the models learn and whether they solely exploit spurious patterns. Feature attributions are popular explainability techniques that identify important input concepts for model outputs. However, commonsense knowledge tends to be implicit and rarely explicitly presented in inputs. These methods cannot infer models' implicit reasoning over mentioned concepts. We present CommonsenseVIS, a visual explanatory system that utilizes external commonsense knowledge bases to contextualize model behavior for commonsense question-answering. Specifically, we extract relevant commonsense knowledge in inputs as references to align model behavior with human knowledge. Our system features multi-level visualization and interactive model probing and editing for different concepts and their underlying relations. Through a user study, we show that CommonsenseVIS helps NLP experts conduct a systematic and scalable visual analysis of models' relational reasoning over concepts in different situations.

July 23, 2023

Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation

Ruiyang Ren,Yuhao Wang,Yingqi Qu,Wayne Xin Zhao,Jing Liu,Hao Tian,Hua Wu,Ji-Rong Wen,Haifeng Wang

Knowledge-intensive tasks (e.g., open-domain question answering (QA)) require a substantial amount of factual knowledge and often rely on external information for assistance. Recently, large language models (LLMs) (e.g., ChatGPT) have demonstrated impressive prowess in solving a wide range of tasks with world knowledge, including knowledge-intensive tasks. However, it remains unclear how well LLMs are able to perceive their factual knowledge boundaries, particularly how they behave when incorporating retrieval augmentation. In this study, we present an initial analysis of the factual knowledge boundaries of LLMs and how retrieval augmentation affects LLMs on open-domain QA. Specifically, we focus on three primary research questions and analyze them by examining QA performance, priori judgement and posteriori judgement of LLMs. We show evidence that LLMs possess unwavering confidence in their capabilities to respond to questions and the accuracy of their responses. Furthermore, retrieval augmentation proves to be an effective approach in enhancing LLMs' awareness of knowledge boundaries, thereby improving their judgemental abilities. We also find that LLMs have a propensity to rely on the provided retrieval results when formulating answers, while the quality of these results significantly impacts their reliance. The code to reproduce this work is available at //github.com/RUCAIBox/LLM-Knowledge-Boundary.

July 23, 2023

Unravelling the Mechanics of Knitted Fabrics Through Hierarchical Geometric Representation

Xiaoxiao Ding,Vanessa Sanchez,Katia Bertoldi,Chris H. Rycroft

Knitting interloops one-dimensional yarns into three-dimensional fabrics that exhibit behaviours beyond their constitutive materials. How extensibility and anisotropy emerge from the hierarchical organization of yarns into knitted fabrics has long been unresolved. We sought to unravel the mechanical roles of tensile mechanics, assembly and dynamics arising from the yarn level on fabric nonlinearity by developing a yarn-based dynamical model. This physically validated model captures the fundamental mechanical response of knitted fabrics, analogous to flexible metamaterials and biological fiber networks due to geometric nonlinearity within such hierarchical systems. We identify the dictating factors of the mechanics of knitted fabrics, highlighting the previously overlooked but critical effect of pre-tension. Fabric anisotropy originates from observed yarn--yarn rearrangements during alignment dynamics and is topology-dependent. This yarn-based model also provides design flexibility for embedding functionalities in knitted fabrics by allowing variation in both geometric configuration and material property. Our hierarchical approach to building up knitted fabrics computationally modernizes an ancient craft and represents a first step towards mechanical programmability of knitted fabrics in wide engineering applications.

July 21, 2023

Advancing Ad Auction Realism: Practical Insights & Modeling Implications

Ming Chen,Sareh Nabi,Marciano Siniscalchi

This paper proposes a learning model of online ad auctions that allows for the following four key realistic characteristics of contemporary online auctions: (1) ad slots can have different values and click-through rates depending on users' search queries, (2) the number and identity of competing advertisers are unobserved and change with each auction, (3) advertisers only receive partial, aggregated feedback, and (4) payment rules are only partially specified. We model advertisers as agents governed by an adversarial bandit algorithm, independent of auction mechanism intricacies. Our objective is to simulate the behavior of advertisers for counterfactual analysis, prediction, and inference purposes. Our findings reveal that, in such richer environments, "soft floors" can enhance key performance metrics even when bidders are drawn from the same population. We further demonstrate how to infer advertiser value distributions from observed bids, thereby affirming the practical efficacy of our approach even in a more realistic auction setting.
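To make "agents governed by an adversarial bandit algorithm" concrete, the sketch below implements Exp3, one standard adversarial bandit algorithm. The abstract does not say which algorithm the authors use, so Exp3 is an assumption here, as are the function name `exp3`, the `reward` callback, and the reading of each arm as one candidate bid level. Rewards are assumed to lie in [0, 1].

```python
import math
import random
from typing import Callable, List


def exp3(
    n_arms: int,
    reward: Callable[[int, int], float],  # hypothetical: (round, arm) -> reward in [0, 1]
    rounds: int,
    gamma: float = 0.1,                   # exploration rate, assumed value
    seed: int = 0,
) -> List[float]:
    """Run Exp3 and return the final arm-selection probabilities."""
    rng = random.Random(seed)
    weights = [1.0] * n_arms

    def probabilities() -> List[float]:
        total = sum(weights)
        # Mix the weight-proportional distribution with uniform exploration.
        return [(1 - gamma) * w / total + gamma / n_arms for w in weights]

    for t in range(rounds):
        probs = probabilities()
        arm = rng.choices(range(n_arms), weights=probs)[0]
        # Only the chosen arm's reward is observed (bandit feedback), so it
        # is divided by its selection probability to keep the estimate unbiased.
        estimate = reward(t, arm) / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / n_arms)
        # Rescale so the largest weight is 1; this leaves probabilities
        # unchanged and avoids floating-point overflow on long runs.
        top = max(weights)
        weights = [w / top for w in weights]
    return probabilities()
```

In an auction simulation of the kind the abstract describes, `reward` would be supplied by the simulated auction (for instance a normalized payoff for the chosen bid level); the learner itself needs no knowledge of the auction rules.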

December 20, 2022

Towards Reasoning in Large Language Models: A Survey

Jie Huang,Kevin Chen-Chuan Chang

Reasoning is a fundamental aspect of human intelligence that plays a crucial role in activities such as problem solving, decision making, and critical thinking. In recent years, large language models (LLMs) have made significant progress in natural language processing, and it has been observed that these models may exhibit reasoning abilities when they are sufficiently large. However, it is not yet clear to what extent LLMs are capable of reasoning. This paper provides a comprehensive overview of the current state of knowledge on reasoning in LLMs, including techniques for improving and eliciting reasoning in these models, methods and benchmarks for evaluating reasoning abilities, findings and implications of previous research in this field, and suggestions on future directions. Our aim is to provide a detailed and up-to-date review of this topic and stimulate meaningful discussion and future work.

December 12, 2022

Reasoning over Different Types of Knowledge Graphs: Static, Temporal and Multi-Modal

Ke Liang,Lingyuan Meng,Meng Liu,Yue Liu,Wenxuan Tu,Siwei Wang,Sihang Zhou,Xinwang Liu,Fuchun Sun

from arxiv, This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Knowledge graph reasoning (KGR), which aims to deduce new facts from existing facts based on mined logic rules underlying knowledge graphs (KGs), has become a fast-growing research direction. It has been proven to significantly benefit the usage of KGs in many AI applications, such as question answering and recommendation systems. According to the graph types, the existing KGR models can be roughly divided into three categories, i.e., static models, temporal models, and multi-modal models. The early works in this domain mainly focus on static KGR and tend to directly apply general knowledge graph embedding models to the reasoning task. However, these models are not suitable for more complex but practical tasks, such as inductive static KGR, temporal KGR, and multi-modal KGR. To this end, multiple works have been developed recently, but no survey papers or open-source repositories comprehensively summarize and discuss models in this important direction. To fill the gap, we survey knowledge graph reasoning, tracing it from static to temporal and then to multi-modal KGs. Concretely, the preliminaries, summaries of KGR models, and typical datasets are introduced and discussed in turn. Moreover, we discuss the challenges and potential opportunities. The corresponding open-source repository is shared on GitHub: //github.com/LIANGKE23/Awesome-Knowledge-Graph-Reasoning.

January 21, 2021

KG-BART: Knowledge Graph-Augmented BART for Generative Commonsense Reasoning

Ye Liu,Yao Wan,Lifang He,Hao Peng,Philip S. Yu

from arxiv, 10 pages, 7 figures, Appear in AAAI 2021

Generative commonsense reasoning, which aims to empower machines to generate sentences with the capacity of reasoning over a set of concepts, is a critical bottleneck for text generation. Even the state-of-the-art pre-trained language generation models struggle at this task and often produce implausible and anomalous sentences. One reason is that they rarely consider incorporating the knowledge graph, which can provide rich relational information among the commonsense concepts. To promote the ability of commonsense reasoning for text generation, we propose a novel knowledge graph-augmented pre-trained language generation model, KG-BART, which encompasses the complex relations of concepts through the knowledge graph and produces more logical and natural sentences as output. Moreover, KG-BART can leverage graph attention to aggregate the rich concept semantics, which enhances model generalization on unseen concept sets. Experiments on the benchmark CommonGen dataset verify the effectiveness of our proposed approach in comparison with several strong pre-trained language generation models; in particular, KG-BART outperforms BART by 5.80 and 4.60 in terms of BLEU-3 and BLEU-4, respectively. Moreover, we also show that the context generated by our model can work as background scenarios to benefit downstream commonsense QA tasks.