男男网站网址视频免费观看,影视先锋AV中文字幕

Underlying data distributions of natural language, programming code, and mathematical symbols vary vastly, presenting a complex challenge for large language models (LLMs) that strive to achieve high performance across all three domains simultaneously. Achieving a very high level of proficiency for an LLM within a specific domain often requires extensive training with relevant corpora, which is typically accompanied by a sacrifice in performance in other domains. In this paper, we propose to fuse models that are already highly-specialized directly. The proposed fusing framework, UltraFuser, consists of three distinct specialists that are already sufficiently trained on language, coding, and mathematics. A token-level gating mechanism is introduced to blend the specialists' outputs. A two-stage training strategy accompanied by balanced sampling is designed to ensure stability. To effectively train the fused model, we further construct a high-quality supervised instruction tuning dataset, UltraChat 2, which includes text, code, and mathematical content. This dataset comprises approximately 300,000 instructions and covers a wide range of topics in each domain. Experiments show that our model could simultaneously achieve mastery of the three crucial domains.

相關內容

MoDELS

關注 43

ACM/IEEE第23屆模型驅動工程語言和系統國際會議，是模型驅動軟件和系統工程的首要會議系列，由ACM-SIGSOFT和IEEE-TCSE支持組織。自1998年以來，模型涵蓋了建模的各個方面，從語言和方法到工具和應用程序。模特的參加者來自不同的背景，包括研究人員、學者、工程師和工業專業人士。MODELS 2019是一個論壇，參與者可以圍繞建模和模型驅動的軟件和系統交流前沿研究成果和創新實踐經驗。今年的版本將為建模社區提供進一步推進建模基礎的機會，并在網絡物理系統、嵌入式系統、社會技術系統、云計算、大數據、機器學習、安全、開源等新興領域提出建模的創新應用以及可持續性。官網鏈接： · 等變 · Learning · FAST · 數據點 ·

2024 年 4 月 29 日

Predicting PDEs Fast and Efficiently with Equivariant Extreme Learning Machines

Hans Harder,Sebastian Peitz

We utilize extreme learning machines for the prediction of partial differential equations (PDEs). Our method splits the state space into multiple windows that are predicted individually using a single model. Despite requiring only few data points (in some cases, our method can learn from a single full-state snapshot), it still achieves high accuracy and can predict the flow of PDEs over long time horizons. Moreover, we show how additional symmetries can be exploited to increase sample efficiency and to enforce equivariance.

樣本 · state-of-the-art · Integration · 離散化 · prototype ·

2024 年 4 月 28 日

Symbolic Execution for Quantum Error Correction Programs

Wang Fang,Mingsheng Ying

from arxiv, 41pages, 11 figures. v2: fix inappropriate use of Stim. v3: Extended version of PLDI 2024 publication

We define QSE, a symbolic execution framework for quantum programs by integrating symbolic variables into quantum states and the outcomes of quantum measurements. The soundness of QSE is established through a theorem that ensures the correctness of symbolic execution within operational semantics. We further introduce symbolic stabilizer states, which symbolize the phases of stabilizer generators, for the efficient analysis of quantum error correction (QEC) programs. Within the QSE framework, we can use symbolic expressions to characterize the possible discrete Pauli errors in QEC, providing a significant improvement over existing methods that rely on sampling with simulators. We implement QSE with the support of symbolic stabilizer states in a prototype tool named QuantumSE.jl. Our experiments on representative QEC codes, including quantum repetition codes, Kitaev's toric codes, and quantum Tanner codes, demonstrate the efficiency of QuantumSE.jl for debugging QEC programs with over 1000 qubits. In addition, by substituting concrete values in symbolic expressions of measurement results, QuantumSE.jl is also equipped with a sampling feature for stabilizer circuits. Despite a longer initialization time than the state-of-the-art stabilizer simulator, Google's Stim, QuantumSE.jl offers a quicker sampling rate in the experiments.

不變 · 推斷 · 圖 · 縮放 · Automator ·

2024 年 4 月 28 日

Scalable, Interpretable Distributed Protocol Verification by Inductive Proof Slicing

William Schultz,Edward Ashton,Heidi Howard,Stavros Tripakis

Many techniques for automated inference of inductive invariants for distributed protocols have been developed over the past several years, but their performance can still be unpredictable and their failure modes opaque for large-scale verification tasks. In this paper, we present inductive proof slicing, a new automated, compositional technique for inductive invariant inference that scales effectively to large distributed protocol verification tasks. Our technique is built on a core, novel data structure, the inductive proof graph, which explicitly represents the lemma and action dependencies of an inductive invariant and is built incrementally during the inference procedure, backwards from a target safety property. We present an invariant inference algorithm that integrates localized syntax-guided lemma synthesis routines at nodes of this graph, which are accelerated by computation of localized grammar and state variable slices. Additionally, in the case of failure to produce a complete inductive invariant, maintenance of this proof graph structure allows failures to be localized to small sub-components of this graph, enabling fine-grained failure diagnosis and repair by a user. We evaluate our technique on several complex distributed and concurrent protocols, including a large scale specification of the Raft consensus protocol, which is beyond the capabilities of modern distributed protocol verification tools, and also demonstrate how its interpretability features allow effective diagnosis and repair in cases of initial failure.

估計/估計量 · Integration · 近似 · Continuity · UniFormer ·

2024 年 4 月 27 日

Modified Trapezoidal Product Cubature Rules. Definiteness, Monotonicity and a Posteriori Error Estimates

Geno Nikolov,Petar Nikolov

from arxiv, 18 pages

We study two modifications of the trapezoidal product cubature formulae, approximating double integrals over the square domain $[a,b]^2=[a,b]\times [a,b]$. Our modified cubature formulae use mixed type data: except evaluations of the integrand on the points forming a uniform grid on $[a,b]^2$, they involve two or four univariate integrals. An useful property of these cubature formulae is that they are definite of order $(2,2)$, that is, they provide one-sided approximation to the double integral for real-valued integrands from the class $$ \mathcal{C}^{2,2}[a,b]=\{f(x,y)\,:\,\frac{\partial^4 f}{\partial x^2\partial y^2}\ \text{continuous and does not change sign in}\ (a,b)^2\}. $$ For integrands from $\mathcal{C}^{2,2}[a,b]$ we prove monotonicity of the remainders and derive a-posteriori error estimates.

Analysis · 語言模型化 · MoDELS · Automator · 大語言模型 ·

2024 年 4 月 26 日

Enhancing Legal Compliance and Regulation Analysis with Large Language Models

Shabnam Hassani

from arxiv, to be published in 32nd IEEE International Requirements Engineering 2024 Conference (RE'24) - Doctoral Symposium. arXiv admin note: text overlap with arXiv:2404.14356

This research explores the application of Large Language Models (LLMs) for automating the extraction of requirement-related legal content in the food safety domain and checking legal compliance of regulatory artifacts. With Industry 4.0 revolutionizing the food industry and with the General Data Protection Regulation (GDPR) reshaping privacy policies and data processing agreements, there is a growing gap between regulatory analysis and recent technological advancements. This study aims to bridge this gap by leveraging LLMs, namely BERT and GPT models, to accurately classify legal provisions and automate compliance checks. Our findings demonstrate promising results, indicating LLMs' significant potential to enhance legal compliance and regulatory analysis efficiency, notably by reducing manual workload and improving accuracy within reasonable time and financial constraints.

Analysis · 可理解性 · contrastive · Performer · 輸入分布 ·

2024 年 4 月 26 日

Towards Understanding In-Context Learning with Contrastive Demonstrations and Saliency Maps

Fuxiao Liu,Paiheng Xu,Zongxia Li,Yue Feng,Hyemi Song

from arxiv, 10 pages, 5 figures

We investigate the role of various demonstration components in the in-context learning (ICL) performance of large language models (LLMs). Specifically, we explore the impacts of ground-truth labels, input distribution, and complementary explanations, particularly when these are altered or perturbed. We build on previous work, which offers mixed findings on how these elements influence ICL. To probe these questions, we employ explainable NLP (XNLP) methods and utilize saliency maps of contrastive demonstrations for both qualitative and quantitative analysis. Our findings reveal that flipping ground-truth labels significantly affects the saliency, though it's more noticeable in larger LLMs. Our analysis of the input distribution at a granular level reveals that changing sentiment-indicative terms in a sentiment analysis task to neutral ones does not have as substantial an impact as altering ground-truth labels. Finally, we find that the effectiveness of complementary explanations in boosting ICL performance is task-dependent, with limited benefits seen in sentiment analysis tasks compared to symbolic reasoning tasks. These insights are critical for understanding the functionality of LLMs and guiding the development of effective demonstrations, which is increasingly relevant in light of the growing use of LLMs in applications such as ChatGPT. Our research code is publicly available at //github.com/paihengxu/XICL.

Performer · Networking · 預測器/決策函數 · 圖 · 估計/估計量 ·

2024 年 4 月 25 日

Surprisingly Strong Performance Prediction with Neural Graph Features

Gabriela Kadlecová,Jovita Lukasik,Martin Pilát,Petra Vidnerová,Mahmoud Safari,Roman Neruda,Frank Hutter

from arxiv, 45 pages, 30 figures

Performance prediction has been a key part of the neural architecture search (NAS) process, allowing to speed up NAS algorithms by avoiding resource-consuming network training. Although many performance predictors correlate well with ground truth performance, they require training data in the form of trained networks. Recently, zero-cost proxies have been proposed as an efficient method to estimate network performance without any training. However, they are still poorly understood, exhibit biases with network properties, and their performance is limited. Inspired by the drawbacks of zero-cost proxies, we propose neural graph features (GRAF), simple to compute properties of architectural graphs. GRAF offers fast and interpretable performance prediction while outperforming zero-cost proxies and other common encodings. In combination with other zero-cost proxies, GRAF outperforms most existing performance predictors at a fraction of the cost.

CASE · 均方誤差 · 正則化項 · 離散化 · 方陣 ·

2024 年 4 月 25 日

Multilevel Particle Filters for Partially Observed McKean-Vlasov Stochastic Differential Equations

Elsiddig Awadelkarim,Ajay Jasra

from arxiv, 21 pages, 2 figures

In this paper we consider the filtering problem associated to partially observed McKean-Vlasov stochastic differential equations (SDEs). The model consists of data that are observed at regular and discrete times and the objective is to compute the conditional expectation of (functionals) of the solutions of the SDE at the current time. This problem, even the ordinary SDE case is challenging and requires numerical approximations. Based upon the ideas in [3, 12] we develop a new particle filter (PF) and multilevel particle filter (MLPF) to approximate the afore-mentioned expectations. We prove under assumptions that, for $\epsilon>0$, to obtain a mean square error of $\mathcal{O}(\epsilon^2)$ the PF has a cost per-observation time of $\mathcal{O}(\epsilon^{-5})$ and the MLPF costs $\mathcal{O}(\epsilon^{-4})$ (best case) or $\mathcal{O}(\epsilon^{-4}\log(\epsilon)^2)$ (worst case). Our theoretical results are supported by numerical experiments.

共軛梯度 · 共軛 · MIMO · Performer · 推斷 ·

2024 年 4 月 24 日

Fast and Robust Expectation Propagation MIMO Detection via Preconditioned Conjugated Gradient

Luca Schmid,Dominik Sulz,Laurent Schmalen

from arxiv, Submitted to IEEE

We study the expectation propagation (EP) algorithm for symbol detection in massive multiple-input multiple-output (MIMO) systems. The EP detector shows excellent performance but suffers from a high computational complexity due to the matrix inversion, required in each EP iteration to perform marginal inference on a Gaussian system. We propose an inversion-free variant of the EP algorithm by treating inference on the mean and variance as two separate and simpler subtasks: We study the preconditioned conjugate gradient algorithm for obtaining the mean, which can significantly reduce the complexity and increase stability by relying on the Jacobi preconditioner that proves to fit the EP characteristics very well. For the variance, we use a simple approximation based on linear regression of the Gram channel matrix. Numerical studies on the Rayleigh-fading channel and on a realistic 3GPP channel model reveal the efficiency of the proposed scheme, which offers an attractive performance-complexity tradeoff and even outperforms the original EP detector in high multi-user inference cases where the matrix inversion becomes numerically unstable.

INFORMS · 圖 · 可約的 · 知識圖譜 · 可辨認的 ·

2018 年 8 月 29 日

Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction

Yi Luan,Luheng He,Mari Ostendorf,Hannaneh Hajishirzi

We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks and develop a unified framework called Scientific Information Extractor (SciIE) for with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.