
Given the staggering pace at which the capabilities of large language models (LLMs) are increasing, creating future-proof evaluation sets to assess their understanding becomes more and more challenging. In this paper, we propose a novel paradigm for evaluating LLMs which leverages the idea that correct world understanding should be consistent across different (Fregean) senses of the same meaning. Accordingly, we measure understanding not in terms of correctness but by evaluating consistency across multiple senses that are generated by the model itself. We showcase our approach by instantiating a test where the different senses are different languages, hence using multilingual self-consistency as a litmus test for the model's understanding and simultaneously addressing the important topic of multilinguality. Taking one of the latest versions of ChatGPT as our object of study, we evaluate multilingual consistency for two different tasks across three different languages. We show that its multilingual consistency is still lacking, and that its task and world understanding are thus not language-independent. As our approach does not require any static evaluation corpora in languages other than English, it can easily and cheaply be extended to different languages and tasks and could become an integral part of future benchmarking efforts.
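
As a rough illustration of the evaluation loop described above, the sketch below queries the same model in several languages, maps the answers back to a common language, and scores agreement rather than correctness. The helper functions and canned toy outputs are placeholders standing in for a real model call and a translation step; this is not the paper's actual pipeline.

```python
from collections import Counter

# Minimal sketch of a multilingual self-consistency test. ask_model and
# translate_to_english are hypothetical stand-ins for calls to the model
# under study and a translation step, shown here with canned toy outputs.
LANGUAGES = ["en", "de", "zh"]
_toy_outputs = {("capital of France?", "en"): "Paris",
                ("capital of France?", "de"): "Paris",
                ("capital of France?", "zh"): "Lyon"}   # one inconsistent answer

def ask_model(question, lang):
    return _toy_outputs[(question, lang)]               # replace with a real model call

def translate_to_english(answer, lang):
    return answer                                        # replace with real translation

def consistency(question):
    answers = [translate_to_english(ask_model(question, l), l) for l in LANGUAGES]
    top_count = Counter(answers).most_common(1)[0][1]
    return top_count / len(answers)                      # 1.0 = same answer in every language

print(consistency("capital of France?"))                 # ~0.67 for the toy example
```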

Related Content

A fundamental theme in automata theory is regular languages of words and trees, and their many equivalent definitions. Salvati has proposed a generalization to regular languages of simply typed $\lambda$-terms, defined using denotational semantics in finite sets. We provide here some evidence for its robustness. First, we give an equivalent syntactic characterization that naturally extends the seminal work of Hillebrand and Kanellakis connecting regular languages of words and syntactic $\lambda$-definability. Second, we show that any finitary extensional model of the simply typed $\lambda$-calculus, when used in Salvati's definition, recognizes exactly the same class of languages of $\lambda$-terms as the category of finite sets does. The proofs of these two results rely on logical relations and can be seen as instances of a more general construction of a categorical nature, inspired by previous categorical accounts of logical relations using the gluing construction.
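
To make the finite-set semantics concrete, here is a toy sketch (our own illustration, not taken from the paper): a Church-encoded word is evaluated in a model whose base set has two elements, and the resulting denotation decides a regular property, echoing the Hillebrand-Kanellakis connection between regular languages and λ-definability.

```python
# Church-encode a word over {a, b} as a higher-order function:
# word(fa, fb, x0) folds the letters over x0 from left to right.
def encode(word):
    def church(fa, fb, x0):
        x = x0
        for c in word:
            x = fa(x) if c == 'a' else fb(x)
        return x
    return church

# Interpret the letters as functions on the finite set {0, 1}:
# 'a' flips the state, 'b' leaves it unchanged.
flip = lambda s: 1 - s
ident = lambda s: s

def has_even_as(word):
    # Evaluating the Church word in this finite-set model yields 0
    # exactly when the number of 'a's is even -- a regular property.
    return encode(word)(flip, ident, 0) == 0

assert has_even_as("abba")      # two 'a's
assert not has_even_as("ab")    # one 'a'
```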

Instruction tuning of Large Vision-Language Models (LVLMs) has revolutionized the development of versatile models with zero-shot generalization across a wide range of downstream vision-language tasks. However, the diversity of training tasks from different sources and formats leads to inevitable task conflicts, where different tasks compete for the same set of model parameters, resulting in sub-optimal instruction-following abilities. To address this, we propose the Mixture of Cluster-conditional LoRA Experts (MoCLE), a novel Mixture of Experts (MoE) architecture designed to activate task-customized model parameters based on instruction clusters. A separate universal expert is further incorporated to improve the generalization capabilities of MoCLE for novel instructions. Extensive experiments on 10 zero-shot tasks demonstrate the effectiveness of MoCLE.
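
A minimal sketch of the routing idea, assuming nearest-centroid instruction clustering and a fixed mixing weight between the selected cluster expert and the universal expert; the dimensions, names, and mixing rule are illustrative choices and do not reproduce the MoCLE implementation.

```python
import numpy as np

# Route an instruction embedding to the LoRA expert of its nearest cluster
# centroid, then combine that expert's low-rank update with a universal
# expert shared across all instructions.
rng = np.random.default_rng(0)
d, r, n_clusters = 16, 4, 3

centroids = rng.normal(size=(n_clusters, d))            # instruction clusters
cluster_experts = [(rng.normal(size=(d, r)), rng.normal(size=(r, d)))
                   for _ in range(n_clusters)]          # per-cluster LoRA factors (A, B)
universal_expert = (rng.normal(size=(d, r)), rng.normal(size=(r, d)))
W = rng.normal(size=(d, d))                             # frozen base weight

def forward(x, instruction_emb, alpha=0.5):
    k = int(np.argmin(np.linalg.norm(centroids - instruction_emb, axis=1)))
    A, B = cluster_experts[k]
    Au, Bu = universal_expert
    delta = alpha * (A @ B) + (1 - alpha) * (Au @ Bu)   # mix task-specific and universal experts
    return x @ (W + delta)

print(forward(rng.normal(size=d), rng.normal(size=d)).shape)
```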

Understanding the mechanisms through which neural networks extract statistics from input-label pairs is one of the most important unsolved problems in supervised learning. Prior works have identified that the Gram matrices of the weights in trained neural networks of general architectures are proportional to the average gradient outer product of the model, in a statement known as the Neural Feature Ansatz (NFA). However, the reason these quantities become correlated during training is poorly understood. In this work, we explain the emergence of this correlation. We identify that the NFA is equivalent to alignment between the left singular structure of the weight matrices and a significant component of the empirical neural tangent kernels associated with those weights. We establish that the NFA introduced in prior works is driven by a centered NFA that isolates this alignment. We show that the speed of NFA development can be predicted analytically at early training times in terms of simple statistics of the inputs and labels. Finally, we introduce a simple intervention to increase NFA correlation at any given layer, which dramatically improves the quality of features learned.
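
The two quantities in the NFA can be written down directly for a toy one-hidden-layer network; the sketch below (using untrained random weights, purely for illustration) computes the first-layer weight Gram matrix and the average gradient outer product, and measures their alignment with a cosine similarity.

```python
import numpy as np

# Toy network f(x) = v^T relu(W x); compare the weight Gram matrix W^T W with
# the average gradient outer product (AGOP) of f with respect to the input.
rng = np.random.default_rng(0)
d, h, n = 10, 32, 200
W = rng.normal(size=(h, d)) / np.sqrt(d)
v = rng.normal(size=h) / np.sqrt(h)
X = rng.normal(size=(n, d))

def input_gradient(x):
    pre = W @ x
    return W.T @ (v * (pre > 0))          # d f / d x for relu activations

agop = np.zeros((d, d))
for x in X:
    g = input_gradient(x)
    agop += np.outer(g, g) / n            # average gradient outer product

gram = W.T @ W                            # weight Gram matrix

def cosine(A, B):
    return (A * B).sum() / (np.linalg.norm(A) * np.linalg.norm(B))

print("NFA correlation:", cosine(gram, agop))   # rises toward 1 during training per the NFA
```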

We perform a quantitative assessment of different strategies to compute the contribution due to surface tension in incompressible two-phase flows using a conservative level set (CLS) method. More specifically, we compare classical approaches, such as the direct computation of the curvature from the level set or the Laplace-Beltrami operator, with an evolution equation for the mean curvature recently proposed in the literature. We consider the test case of a static bubble, for which an exact solution for the pressure jump across the interface is available, and the test case of an oscillating bubble, showing the pros and cons of the different approaches.
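
As a point of reference for the "direct computation of the curvature from the level set" strategy, the sketch below (our own toy example, not the paper's CLS solver) evaluates kappa = div(grad(phi)/|grad(phi)|) with central differences on a circular signed-distance field and compares it with the exact value 1/R; by Young-Laplace, the exact pressure jump for the static bubble is then sigma * kappa.

```python
import numpy as np

# Compute curvature directly from a 2D level set field and check it against
# the exact value 1/R for a circle of radius R.
N, L, R = 200, 1.0, 0.25
x = np.linspace(-L/2, L/2, N)
X, Y = np.meshgrid(x, x, indexing="ij")
phi = np.sqrt(X**2 + Y**2) - R            # signed distance to the circular interface
dx = x[1] - x[0]

gx, gy = np.gradient(phi, dx)             # grad(phi)
norm = np.sqrt(gx**2 + gy**2) + 1e-12
dnx_dx, _ = np.gradient(gx / norm, dx)    # d/dx of the x-component of the unit normal
_, dny_dy = np.gradient(gy / norm, dx)    # d/dy of the y-component of the unit normal
kappa = dnx_dx + dny_dy                   # curvature = divergence of the unit normal

near = np.abs(phi) < dx                   # cells adjacent to the interface
print("mean curvature near interface:", kappa[near].mean(), "exact:", 1 / R)
```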

We introduce a method to construct a stochastic surrogate model from the results of dimensionality reduction in forward uncertainty quantification. The hypothesis is that the high-dimensional input augmented by the output of a computational model admits a low-dimensional representation. This assumption can be met by numerous uncertainty quantification applications with physics-based computational models. The proposed approach differs from a sequential application of dimensionality reduction followed by surrogate modeling, as we "extract" a surrogate model from the results of dimensionality reduction in the input-output space. This feature becomes desirable when the input space is genuinely high-dimensional. The proposed method also diverges from Probabilistic Learning on Manifolds, as a reconstruction mapping from the feature space to the input-output space is circumvented. The final product of the proposed method is a stochastic simulator that propagates a deterministic input into a stochastic output, preserving the convenience of a sequential "dimensionality reduction + Gaussian process regression" approach while overcoming some of its limitations. The proposed method is demonstrated through two uncertainty quantification problems characterized by high-dimensional input uncertainties.
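
For contrast, the sequential "dimensionality reduction + Gaussian process regression" pipeline mentioned above can be sketched in a few lines; the toy model, dimensions, and choice of PCA below are our own illustrative assumptions and do not reproduce the proposed method.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.gaussian_process import GaussianProcessRegressor

# Sequential baseline: reduce the high-dimensional input, then fit a GP
# surrogate in the reduced space. The "computational model" here is a toy.
rng = np.random.default_rng(0)
n, d_in = 300, 100
X = rng.normal(size=(n, d_in))                                    # high-dimensional uncertain input
y = np.sin(X[:, :5].sum(axis=1)) + 0.05 * rng.normal(size=n)      # toy model output

pca = PCA(n_components=5).fit(X)                                  # dimensionality reduction
Z = pca.transform(X)
gp = GaussianProcessRegressor().fit(Z, y)                         # surrogate in the reduced space

mean, std = gp.predict(pca.transform(X[:3]), return_std=True)     # predictive mean and uncertainty
print(mean, std)
```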

Leveraging large language models (LLMs), autonomous agents have significantly improved, gaining the ability to handle a variety of tasks. In open-ended settings, optimizing collaboration for efficiency and effectiveness demands flexible adjustments. Despite this, current research mainly emphasizes fixed, task-oriented workflows and overlooks agent-centric organizational structures. Drawing inspiration from human organizational behavior, we introduce a self-organizing agent system (S-Agents) with a "tree of agents" structure for dynamic workflow, an "hourglass agent architecture" for balancing information priorities, and a "non-obstructive collaboration" method to allow asynchronous task execution among agents. This structure can autonomously coordinate a group of agents, efficiently addressing the challenges of an open and dynamic environment without human intervention. Our experiments demonstrate that S-Agents proficiently execute collaborative building tasks and resource collection in the Minecraft environment, validating their effectiveness.
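
A toy rendering of the "tree of agents" and "non-obstructive collaboration" ideas, assuming an asyncio-style scheduler; the class, agent, and task names are invented for illustration and have no relation to the S-Agents codebase.

```python
import asyncio

# A root agent delegates subtasks to its children, and children execute
# concurrently, so a slow sibling does not block the others.
class Agent:
    def __init__(self, name, children=()):
        self.name, self.children = name, list(children)

    async def run(self, task):
        if not self.children:
            await asyncio.sleep(0.1)                      # stand-in for a tool call / LLM step
            return f"{self.name} finished '{task}'"
        subtasks = [f"{task}/part-{i}" for i, _ in enumerate(self.children)]
        results = await asyncio.gather(
            *(child.run(sub) for child, sub in zip(self.children, subtasks)))
        return {self.name: results}

root = Agent("leader", [Agent("builder"), Agent("collector", [Agent("scout")])])
print(asyncio.run(root.run("build shelter")))
```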

Advances in AI invite misuse of language models as replacements for human participants. We argue that treating their responses as glimpses into an average human mind fundamentally mischaracterizes these statistical algorithms and that language models should be embraced as flexible simulation tools, able to mimic diverse behaviors without possessing human traits themselves.

Researchers often wish to leverage data from a collection of sources (e.g., primary studies in a meta-analysis) to estimate causal effects in a target population of interest. However, traditional meta-analytic methods do not produce causally interpretable estimates for a well-defined target population. In this paper, we present the CausalMetaR R package, which implements efficient and robust methods to estimate causal effects in a given internal or external target population using multi-source data. The package includes estimators of average and subgroup treatment effects for the entire target population. To produce efficient and robust estimates of causal effects, the package implements doubly robust and non-parametric efficient estimators and supports flexible data-adaptive methods (e.g., machine learning techniques) and cross-fitting techniques for estimating the nuisance models (e.g., the treatment model, the outcome model). We describe the key features of the package and demonstrate how to use it through an example.
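
The doubly robust estimation idea at the core of such estimators can be illustrated with a generic AIPW estimator of the average treatment effect; this Python sketch is not the CausalMetaR R interface, and it omits the multi-source and target-population transport machinery entirely.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

# Doubly robust (AIPW) estimate of an average treatment effect on simulated
# data with a true effect of 2; the treatment and outcome models are the
# nuisance models referred to in the abstract.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
A = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))               # treatment assignment
Y = 2 * A + X[:, 1] + rng.normal(size=n)                      # outcome with true effect 2

e = LogisticRegression().fit(X, A).predict_proba(X)[:, 1]     # treatment (propensity) model
m1 = LinearRegression().fit(X[A == 1], Y[A == 1]).predict(X)  # outcome model under A = 1
m0 = LinearRegression().fit(X[A == 0], Y[A == 0]).predict(X)  # outcome model under A = 0

aipw = np.mean(m1 - m0 + A * (Y - m1) / e - (1 - A) * (Y - m0) / (1 - e))
print("doubly robust ATE estimate:", aipw)                    # should be close to 2
```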

We propose the first steps in the development of a tool to automate the translation of Redex models into a (hopefully) semantically equivalent model in Coq, and to provide tactics to help in the certification of fundamental properties of such models. The work is heavily based on a model of Redex's semantics developed by Klein et al. By means of a simple generalization of the matching problem in Redex, we obtain an algorithm suitable for mechanization in Coq, for which we prove soundness properties and its correspondence with the original solution proposed by Klein et al. In the process, we also adapt some parts of our mechanization to better prepare it for the future inclusion of Redex features absent from the present model, such as its Kleene-star operator. Finally, we discuss future avenues of development enabled by this work.
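
To give a flavour of the kind of matching problem being generalized, here is a minimal first-order matcher over tuple-shaped terms with pattern variables; this toy matcher is our own illustration and is unrelated to the Coq mechanization or to Klein et al.'s model.

```python
# Match a term against a pattern whose variables (strings starting with "?")
# bind arbitrary subterms; repeated variables must bind equal subterms.
def match(pattern, term, bindings=None):
    bindings = dict(bindings or {})
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in bindings:
            return bindings if bindings[pattern] == term else None
        bindings[pattern] = term
        return bindings
    if isinstance(pattern, tuple) and isinstance(term, tuple) and len(pattern) == len(term):
        for p, t in zip(pattern, term):
            bindings = match(p, t, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == term else None

print(match(("plus", "?e1", "?e2"), ("plus", ("num", 1), ("num", 2))))
```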

Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, and attempts to extend this knowledge without targeting the original task result in catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task-incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions concern 1) a taxonomy and extensive overview of the state-of-the-art, 2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner, and 3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks, considering Tiny ImageNet, the large-scale unbalanced iNaturalist dataset, and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time, and storage.
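
Comparisons like the one above are typically reported through an accuracy matrix over sequentially learned tasks; the sketch below shows the bookkeeping with a placeholder learner and the standard average-accuracy and forgetting summaries, which are not necessarily the exact metrics used in the paper.

```python
import numpy as np

# Task-incremental evaluation protocol: after learning each task, evaluate on
# all tasks seen so far; acc[t, j] is the accuracy on task j after training on
# task t. The accuracies here are fabricated placeholders for real training.
rng = np.random.default_rng(0)
T = 4
acc = np.full((T, T), np.nan)

for t in range(T):
    # train_on_task(model, task=t) would go here; we fake plausible accuracies
    for j in range(t + 1):
        acc[t, j] = 0.9 - 0.1 * (t - j) + 0.02 * rng.normal()

avg_acc = np.nanmean(acc[T - 1, :])                      # accuracy after the final task
forgetting = np.mean([np.nanmax(acc[:T - 1, j]) - acc[T - 1, j] for j in range(T - 1)])
print(f"average accuracy: {avg_acc:.3f}, average forgetting: {forgetting:.3f}")
```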
