The inherent probabilistic nature of Large Language Models (LLMs) introduces an element of unpredictability, raising concerns about potential discrepancies in their output. This paper introduces an innovative approach that aims to generate correct and optimal robotic task plans for diverse real-world demands and scenarios. LLMs have been used to generate task plans, but they are unreliable and may produce plans that contain wrong, questionable, or high-cost steps. The proposed approach uses an LLM to generate a number of task plans as trees and amalgamates them into a graph by removing questionable paths. An optimal task tree can then be retrieved that circumvents questionable and high-cost nodes, thereby improving planning accuracy and execution efficiency. The approach is further improved by incorporating a large knowledge network. Leveraging GPT-4 further, the high-level task plan is converted into a low-level Planning Domain Definition Language (PDDL) plan executable by a robot. Evaluation results highlight the superior accuracy and efficiency of our approach compared to previous methodologies in the field of task planning.
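The merge-and-retrieve idea in this abstract can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the authors' method: each candidate plan is reduced to a linear sequence of `(state, next_state, cost)` steps rather than a tree, the set of questionable transitions is given up front, and the optimal plan is taken to be the cheapest path found by Dijkstra's algorithm. All state names, costs, and function names are hypothetical.

```python
import heapq


def merge_plans(plans, questionable):
    """Merge candidate plans into one graph, dropping questionable transitions."""
    graph = {}
    for plan in plans:
        for src, dst, cost in plan:
            if (src, dst) in questionable:
                continue  # remove questionable paths before retrieval
            edges = graph.setdefault(src, {})
            # Keep the cheapest cost when several plans share a transition.
            edges[dst] = min(cost, edges.get(dst, float("inf")))
            graph.setdefault(dst, {})
    return graph


def cheapest_path(graph, start, goal):
    """Dijkstra search; returns (total_cost, path) or None if unreachable."""
    queue = [(0, start, [start])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, step_cost in graph.get(node, {}).items():
            if nxt not in seen:
                heapq.heappush(queue, (cost + step_cost, nxt, path + [nxt]))
    return None


# Three hypothetical LLM-generated plans for the same goal.
plans = [
    [("raw_egg", "cracked_egg", 1), ("cracked_egg", "fried_egg", 5)],
    [("raw_egg", "cracked_egg", 1), ("cracked_egg", "whisked_egg", 2),
     ("whisked_egg", "fried_egg", 2)],
    [("raw_egg", "fried_egg", 1)],  # skips a required step
]
questionable = {("raw_egg", "fried_egg")}

graph = merge_plans(plans, questionable)
result = cheapest_path(graph, "raw_egg", "fried_egg")
print(result)
```

Because the questionable shortcut is removed before the search, the cheapest remaining route combines steps from the surviving plans instead of following any single plan verbatim.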

Related content

Large Language Models

Large language models are deep learning models trained on massive amounts of text data. They can not only generate natural language text but also understand its meaning in depth, handling a wide range of natural language tasks such as text summarization, question answering, and translation. In 2023, large language models and their applications in artificial intelligence became a focus of technology research worldwide. Their growth in scale has been especially striking: parameter counts have jumped from just over a billion at the outset to around one trillion today. More parameters allow a model to capture the subtleties of human language more precisely and to understand its complexity more deeply. Over the past year, large language models have improved markedly in absorbing new knowledge, decomposing complex tasks, and aligning images with text. As the technology matures, its range of applications will keep expanding, providing people with more intelligent and personalized services and further improving how they live and work.

Networking · Top-down · Neural Networks · Biological plausibility · Learning

February 28, 2024

Biologically Plausible Training of Deep Neural Networks Using a Top-down Credit Assignment Network

Jian-Hui Chen,Cheng-Lin Liu,Zuoren Wang

Despite the widespread adoption of Deep Neural Networks (DNNs) based on the Backpropagation (BP) algorithm, the biological infeasibility of BP could potentially limit the evolution of new DNN models. To find a biologically plausible algorithm to replace BP, we focus on the top-down mechanism inherent in the biological brain. Although top-down connections in the biological brain play crucial roles in high-level cognitive functions, their application to neural network learning remains unclear. This study proposes a two-level training framework designed to train a bottom-up network using a Top-Down Credit Assignment Network (TDCA-network). The TDCA-network serves as a substitute for the conventional loss function and the back-propagation algorithm, widely used in neural network training. We further introduce a brain-inspired credit diffusion mechanism, significantly reducing the TDCA-network's parameter complexity, thereby greatly accelerating training without compromising the network's performance. Our experiments involving non-convex function optimization, supervised learning, and reinforcement learning reveal that a well-trained TDCA-network outperforms back-propagation across various settings. The visualization of the update trajectories in the loss landscape indicates the TDCA-network's ability to bypass local minima where BP-based trajectories typically become trapped. The TDCA-network also excels in multi-task optimization, demonstrating robust generalizability across different datasets in supervised learning and unseen task settings in reinforcement learning. Moreover, the results indicate that the TDCA-network holds promising potential to train neural networks across diverse architectures.

Estimation/Estimator · binary · Regularization term · INFORMS · CASE

February 26, 2024

The Complexity of Algebraic Algorithms for LWE

Matthias Johann Steiner

Arora & Ge introduced a noise-free polynomial system to compute the secret of a Learning With Errors (LWE) instance via linearization. Albrecht et al. later utilized the Arora-Ge polynomial model to study the complexity of Gröbner basis computations on LWE polynomial systems under the assumption of semi-regularity. In this paper we revisit the Arora-Ge polynomial and prove that it satisfies a genericity condition recently introduced by Caminata & Gorla, called being in generic coordinates. For polynomial systems in generic coordinates one can always estimate the complexity of DRL Gröbner basis computations in terms of the Castelnuovo-Mumford regularity and hence also via the Macaulay bound. Moreover, we generalize the Gröbner basis algorithm of Semaev & Tenti to arbitrary polynomial systems with a finite degree of regularity. In particular, existence of this algorithm yields another approach to estimate the complexity of DRL Gröbner basis computations in terms of the degree of regularity. In practice, the degree of regularity of LWE polynomial systems is not known, though one can always estimate the lowest achievable degree of regularity. Consequently, from a designer's worst case perspective this approach yields sub-exponential complexity estimates for general, binary secret and binary error LWE. In recent works by Dachman-Soled et al. the hardness of LWE in the presence of side information was analyzed. Utilizing their framework we discuss how hints can be incorporated into LWE polynomial systems and how they affect the complexity of Gröbner basis computations.

Performer · Optimizer · Differential evolution · state-of-the-art · Better

February 26, 2024

Performance Comparison of Surrogate-Assisted Evolutionary Algorithms on Computational Fluid Dynamics Problems

Jakub Kudela,Ladislav Dobrovsky

Surrogate-assisted evolutionary algorithms (SAEAs) are recently among the most widely studied methods for their capability to solve expensive real-world optimization problems. However, the development of new methods and benchmarking with other techniques still relies almost exclusively on artificially created problems. In this paper, we use two real-world computational fluid dynamics problems to compare the performance of eleven state-of-the-art single-objective SAEAs. We analyze the performance by investigating the quality and robustness of the obtained solutions and the convergence properties of the selected methods. Our findings suggest that the more recently published methods, as well as the techniques that utilize differential evolution as one of their optimization mechanisms, perform significantly better than the other considered methods.

MoDELS · Language modeling · Large language models · Code · Weight

February 23, 2024

On Trojan Signatures in Large Language Models of Code

Aftab Hussain,Md Rafiqul Islam Rabin,Mohammad Amin Alipour

Trojan signatures, as described by Fields et al. (2021), are noticeable differences between the distribution of the trojaned class parameters (weights) and that of the non-trojaned class parameters of a trojaned model, which can be used to detect the trojaned model. Fields et al. (2021) found trojan signatures in computer vision classification tasks with image models such as ResNet, WideResNet, DenseNet, and VGG. In this paper, we investigate such signatures in the classifier layer parameters of large language models of source code. Our results suggest that trojan signatures may not generalize to LLMs of code. We found that trojaned code models are stubborn, even when the models were poisoned under more explicit settings (finetuned with pre-trained weights frozen). We analyzed nine trojaned models for two binary classification tasks: clone and defect detection. To the best of our knowledge, this is the first work to examine weight-based trojan signature revelation techniques for large language models of code and, furthermore, to demonstrate that detecting trojans only from the weights in such models is a hard problem.

Monte Carlo · Estimation/Estimator · SMC · MoDELS · Inference

February 23, 2024

Towards Improved Uncertainty Quantification of Stochastic Epidemic Models Using Sequential Monte Carlo

Arindam Fadikar,Abby Stevens,Nicholson Collier,Kok Ben Toh,Olga Morozova,Anna Hotton,Jared Clark,David Higdon,Jonathan Ozik

from arxiv, 10 pages, 5 figures

Sequential Monte Carlo (SMC) algorithms represent a suite of robust computational methodologies utilized for state estimation and parameter inference within dynamical systems, particularly in real-time or online environments where data arrives sequentially over time. In this research endeavor, we propose an integrated framework that combines a stochastic epidemic simulator with a sequential importance sampling (SIS) scheme to dynamically infer model parameters, which evolve due to social as well as biological processes throughout the progression of an epidemic outbreak and are also influenced by evolving data measurement bias. Through iterative updates of a set of weighted simulated trajectories based on observed data, this framework enables the estimation of posterior distributions for these parameters, thereby capturing their temporal variability and associated uncertainties. Through simulation studies, we showcase the efficacy of SMC in accurately tracking the evolving dynamics of epidemics while appropriately accounting for uncertainties. Moreover, we delve into practical considerations and challenges inherent in implementing SMC for parameter estimation within dynamic epidemiological settings, areas where the substantial computational capabilities of high-performance computing resources can be usefully brought to bear.
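The weighted-trajectory update described in this abstract can be illustrated with a toy sequential importance sampling loop. This is a minimal sketch under strong assumptions, not the authors' framework: the simulator is a hypothetical one-line growth model, the single parameter (a growth rate) is held fixed over time, the observation model is Gaussian with an assumed standard deviation, and no resampling step is included. The data are synthetic.

```python
import math
import random


def simulate_cases(rate, previous_cases, rng):
    """Toy stochastic simulator (hypothetical): noisy multiplicative growth."""
    return previous_cases * rate * math.exp(rng.gauss(0.0, 0.05))


def sis_update(particles, weights, previous_cases, observed_cases, rng, obs_sd=5.0):
    """Reweight each particle by the likelihood of the new observation."""
    new_weights = []
    for rate, weight in zip(particles, weights):
        predicted = simulate_cases(rate, previous_cases, rng)
        # Gaussian observation likelihood (assumed), up to a constant factor.
        likelihood = math.exp(-0.5 * ((observed_cases - predicted) / obs_sd) ** 2)
        new_weights.append(weight * likelihood)
    total = sum(new_weights)
    if total == 0.0:
        raise ValueError("all particle weights collapsed to zero")
    return [w / total for w in new_weights]


rng = random.Random(0)
# Prior: growth rates drawn uniformly from a plausible range.
particles = [rng.uniform(0.8, 1.6) for _ in range(500)]
weights = [1.0 / len(particles)] * len(particles)

# Synthetic case counts generated with a growth rate of 1.2 per step.
observations = [100.0, 120.0, 144.0, 172.8]
for previous_cases, observed_cases in zip(observations, observations[1:]):
    weights = sis_update(particles, weights, previous_cases, observed_cases, rng)

posterior_mean = sum(p * w for p, w in zip(particles, weights))
print(f"posterior mean growth rate: {posterior_mean:.3f}")
```

With each new observation the weights concentrate on particles whose simulated trajectories stay close to the data, so the weighted particle set approximates the posterior over the growth rate. A practical implementation would typically add resampling to counter weight degeneracy.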

Streaming · Random field · MoDELS · Learning · INFORMS

February 23, 2024

Streaming Gaussian Dirichlet Random Fields for Spatial Predictions of High Dimensional Categorical Observations

J. E. San Soucie,H. M. Sosik,Y. Girdhar

from arxiv, 10 pages, 5 figures. Published in Springer Proceedings of Advanced Robotics, ISER 2023 Conference Proceedings

We present the Streaming Gaussian Dirichlet Random Field (S-GDRF) model, a novel approach for modeling a stream of spatiotemporally distributed, sparse, high-dimensional categorical observations. The proposed approach efficiently learns global and local patterns in spatiotemporal data, allowing for fast inference and querying with a bounded time complexity. Using a high-resolution data series of plankton images classified with a neural network, we demonstrate the ability of the approach to make more accurate predictions compared to a Variational Gaussian Process (VGP), and to learn a predictive distribution of observations from streaming categorical data. S-GDRFs open the door to enabling efficient informative path planning over high-dimensional categorical observations, which until now has not been feasible.

MoDELS · Large language models · Language modeling · Knowledge · Distillation

February 23, 2024

A Survey on Knowledge Distillation of Large Language Models

Xiaohan Xu,Ming Li,Chongyang Tao,Tao Shen,Reynold Cheng,Jinyang Li,Can Xu,Dacheng Tao,Tianyi Zhou

from arxiv, 43 pages

In the era of Large Language Models (LLMs), Knowledge Distillation (KD) emerges as a pivotal methodology for transferring advanced capabilities from leading proprietary LLMs, such as GPT-4, to their open-source counterparts like LLaMA and Mistral. Additionally, as open-source LLMs flourish, KD plays a crucial role in both compressing these models and facilitating their self-improvement by employing themselves as teachers. This paper presents a comprehensive survey of KD's role within the realm of LLMs, highlighting its critical function in imparting advanced knowledge to smaller models and its utility in model compression and self-improvement. Our survey is meticulously structured around three foundational pillars (algorithm, skill, and verticalization), providing a comprehensive examination of KD mechanisms, the enhancement of specific cognitive abilities, and their practical implications across diverse fields. Crucially, the survey navigates the intricate interplay between data augmentation (DA) and KD, illustrating how DA emerges as a powerful paradigm within the KD framework to bolster LLMs' performance. By leveraging DA to generate context-rich, skill-specific training data, KD transcends traditional boundaries, enabling open-source models to approximate the contextual adeptness, ethical alignment, and deep semantic insights characteristic of their proprietary counterparts. This work aims to provide an insightful guide for researchers and practitioners, offering a detailed overview of current methodologies in KD and proposing future research directions. Importantly, we firmly advocate for compliance with the legal terms that regulate the use of LLMs, ensuring ethical and lawful application of KD of LLMs. An associated GitHub repository is available at //github.com/Tebmer/Awesome-Knowledge-Distillation-of-LLMs.

BERT · Performer · Extensibility · Attention mechanism · MoDELS

February 22, 2021

Using Prior Knowledge to Guide BERT's Attention in Semantic Textual Matching Tasks

Tingyu Xia,Yue Wang,Yuan Tian,Yi Chang

from arxiv, 10 pages, WWW'21, April 19-23, 2021, Ljubljana, Slovenia

We study the problem of incorporating prior knowledge into a deep Transformer-based model, i.e., Bidirectional Encoder Representations from Transformers (BERT), to enhance its performance on semantic textual matching tasks. By probing and analyzing what BERT already knows when solving this task, we obtain a better understanding of what task-specific knowledge BERT needs the most and where it is most needed. The analysis further motivates us to take a different approach than most existing works. Instead of using prior knowledge to create a new training task for fine-tuning BERT, we directly inject knowledge into BERT's multi-head attention mechanism. This leads us to a simple yet effective approach that enjoys a fast training stage, as it saves the model from training on additional data or tasks other than the main task. Extensive experiments demonstrate that the proposed knowledge-enhanced BERT is able to consistently improve semantic textual matching performance over the original BERT model, and the performance benefit is most salient when training data is scarce.

Optimizer · Graph · GPU · Neural Networks · Kernelization

January 28, 2021

Interpreting and Unifying Graph Neural Networks with An Optimization Framework

Meiqi Zhu,Xiao Wang,Chuan Shi,Houye Ji,Peng Cui

from arxiv, WWW2021, 12 pages

Graph Neural Networks (GNNs) have received considerable attention on graph-structured data learning for a wide variety of tasks. The well-designed propagation mechanism, which has been demonstrated to be effective, is the most fundamental part of GNNs. Although most GNNs basically follow a message passing manner, little effort has been made to discover and analyze their essential relations. In this paper, we establish a surprising connection between different propagation mechanisms with a unified optimization problem, showing that despite the proliferation of various GNNs, in fact, their proposed propagation mechanisms are the optimal solution optimizing a feature fitting function over a wide class of graph kernels with a graph regularization term. Our proposed unified optimization framework, summarizing the commonalities between several of the most representative GNNs, not only provides a macroscopic view on surveying the relations between different GNNs, but also further opens up new opportunities for flexibly designing new GNNs. With the proposed framework, we discover that existing works usually utilize naive graph convolutional kernels for the feature fitting function, and we further develop two novel objective functions considering adjustable graph kernels showing low-pass or high-pass filtering capabilities respectively. Moreover, we provide the convergence proofs and expressive power comparisons for the proposed models. Extensive experiments on benchmark datasets clearly show that the proposed GNNs not only outperform the state-of-the-art methods but also have good ability to alleviate over-smoothing, and further verify the feasibility of designing GNNs with our unified optimization framework.

Visual question answering · Question answering · MoDELS · Identifiable · Attention mechanism

February 15, 2018

Learning to Count Objects in Natural Images for Visual Question Answering

Yan Zhang,Jonathon Hare,Adam Prügel-Bennett

from arxiv, Published in ICLR 2018

Visual Question Answering (VQA) models have struggled with counting objects in natural images so far. We identify a fundamental problem due to soft attention in these models as a cause. To circumvent this problem, we propose a neural network component that allows robust counting from object proposals. Experiments on a toy task show the effectiveness of this component and we obtain state-of-the-art accuracy on the number category of the VQA v2 dataset without negatively affecting other categories, even outperforming ensemble models with our single model. On a difficult balanced pair metric, the component gives a substantial improvement in counting over a strong baseline by 6.6%.
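One common way to see why normalized soft attention can lose count information is sketched below. It is an illustration of the general issue, not necessarily the exact argument made in the paper: because softmax weights sum to one, the attended feature is a weighted average, so two identical objects yield the same output as one. The two-dimensional "cat" feature and the attention scores are hypothetical.

```python
import math


def soft_attention(features, scores):
    """Softmax-normalized weighted average of feature vectors."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]  # weights always sum to 1
    dim = len(features[0])
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]


cat = [1.0, 0.0]  # hypothetical feature vector for one object proposal

one_cat = soft_attention([cat], [2.0])
two_cats = soft_attention([cat, cat], [2.0, 2.0])

# Both outputs are the same averaged feature, so the count is not recoverable.
print(one_cat, two_cats)
```

Any downstream layer that sees only the attended feature receives identical input in both cases, which is why a separate component that works directly on object proposals can help with counting.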