
This paper comprehensively explores the ethical challenges arising from security threats to Large Language Models (LLMs). These intricate digital repositories are increasingly integrated into our daily lives, making them prime targets for attacks that can compromise their training data and the confidentiality of their data sources. The paper delves into the nuanced ethical repercussions of such security threats on society and individual privacy. We scrutinize five major threats--prompt injection, jailbreaking, Personally Identifiable Information (PII) exposure, sexually explicit content, and hate-based content--going beyond mere identification to assess their critical ethical consequences and the urgency they create for robust defensive strategies. The escalating reliance on LLMs underscores the crucial need to ensure that these systems operate within the bounds of ethical norms, particularly as their misuse can lead to significant societal and individual harm. We propose conceptualizing and developing an evaluative tool tailored for LLMs, which would serve a dual purpose: guiding developers and designers in preemptive fortification of backend systems and scrutinizing the ethical dimensions of LLM chatbot responses during the testing phase. By comparing LLM responses with those expected from humans in a moral context, we aim to discern the degree to which AI behaviors align with the ethical values held by a broader society. Ultimately, this paper not only underscores the ethical troubles presented by LLMs; it also highlights a path toward cultivating trust in these systems.

Related content

Large Language Models


Large language models are deep learning models trained on massive text corpora. They can not only generate natural-language text but also understand its meaning in depth and handle a variety of natural language tasks such as summarization, question answering, and translation. In 2023, large language models and their applications in artificial intelligence became a focus of technology research worldwide. Their growth in scale has been especially striking: parameter counts have jumped from just over a billion initially to one trillion today. Larger parameter counts let models capture the subtleties of human language more finely and understand its complexity more deeply. Over the past year, large language models have improved markedly in absorbing new knowledge, decomposing complex tasks, and aligning images with text. As the technology matures, its range of applications will keep expanding, offering people more intelligent and personalized services and further improving how they live and work.

Language modeling · Identifiable · Large language models · MoDELS · Comprehensibility

April 15, 2024

Foundational Challenges in Assuring Alignment and Safety of Large Language Models

Usman Anwar,Abulhair Saparov,Javier Rando,Daniel Paleka,Miles Turpin,Peter Hase,Ekdeep Singh Lubana,Erik Jenner,Stephen Casper,Oliver Sourbut,Benjamin L. Edelman,Zhaowei Zhang,Mario Günther,Anton Korinek,Jose Hernandez-Orallo,Lewis Hammond,Eric Bigelow,Alexander Pan,Lauro Langosco,Tomasz Korbak,Heidi Zhang,Ruiqi Zhong,Seán Ó hÉigeartaigh,Gabriel Recchia,Giulio Corsi,Alan Chan,Markus Anderljung,Lilian Edwards,Yoshua Bengio,Danqi Chen,Samuel Albanie,Tegan Maharaj,Jakob Foerster,Florian Tramer,He He,Atoosa Kasirzadeh,Yejin Choi,David Krueger

This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three different categories: scientific understanding of LLMs, development and deployment methods, and sociotechnical challenges. Based on the identified challenges, we pose 200+ concrete research questions.

MoDELS · Transformation · Optimizer · Taxonomy · HTTPS

November 21, 2023

Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey

Yunpeng Huang,Jingwei Xu,Zixu Jiang,Junyu Lai,Zenan Li,Yuan Yao,Taolue Chen,Lijuan Yang,Zhou Xin,Xiaoxing Ma

from arxiv, 35 pages, 3 figures, 4 tables

With the bomb ignited by ChatGPT, Transformer-based Large Language Models (LLMs) have paved a revolutionary path toward Artificial General Intelligence (AGI) and have been applied in diverse areas as knowledge bases, human interfaces, and dynamic agents. However, a prevailing limitation exists: many current LLMs, constrained by resources, are primarily pre-trained on shorter texts, rendering them less effective for the longer-context prompts commonly encountered in real-world settings. In this paper, we present a comprehensive survey focusing on the advancement of model architecture in Transformer-based LLMs to optimize long-context capabilities across all stages from pre-training to inference. We first delineate and analyze the problems of handling long-context input and output with the current Transformer-based models. Then, we offer a holistic taxonomy to navigate the landscape of Transformer architecture upgrades that address these problems. Afterward, we investigate widely used evaluation necessities tailored for long-context LLMs, including datasets, metrics, and baseline models, as well as optimization toolkits such as libraries, systems, and compilers that augment LLMs' efficiency and efficacy across different stages. Finally, we discuss the predominant challenges and potential avenues for future research in this domain. Additionally, we have established a repository where we curate relevant literature with real-time updates at //github.com/Strivin0311/long-llms-learning.

Knowledge · Graph · Knowledge graph · Dataset · Vine

May 22, 2023

LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities

Yuqi Zhu,Xiaohan Wang,Jing Chen,Shuofei Qiao,Yixin Ou,Yunzhi Yao,Shumin Deng,Huajun Chen,Ningyu Zhang

from arxiv, Work in progress

This paper presents an exhaustive quantitative and qualitative evaluation of Large Language Models (LLMs) for Knowledge Graph (KG) construction and reasoning. We employ eight distinct datasets that encompass aspects including entity, relation and event extraction, link prediction, and question answering. Empirically, our findings suggest that GPT-4 outperforms ChatGPT in the majority of tasks and even surpasses fine-tuned models in certain reasoning and question-answering datasets. Moreover, our investigation extends to the potential generalization ability of LLMs for information extraction, which culminates in the presentation of the Virtual Knowledge Extraction task and the development of the VINE dataset. Drawing on these empirical findings, we further propose AutoKG, a multi-agent-based approach employing LLMs for KG construction and reasoning, which aims to chart the future of this field and offer exciting opportunities for advancement. We anticipate that our research can provide invaluable insights for future undertakings of KG. Code and datasets will be available in //github.com/zjunlp/AutoKG.

CASES · Comprehensibility · MoDELS · Language modeling · Processing (programming language)

April 26, 2023

Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond

Jingfeng Yang,Hongye Jin,Ruixiang Tang,Xiaotian Han,Qizhang Feng,Haoming Jiang,Bing Yin,Xia Hu

This paper presents a comprehensive and practical guide for practitioners and end-users working with Large Language Models (LLMs) in their downstream natural language processing (NLP) tasks. We provide discussions and insights into the usage of LLMs from the perspectives of models, data, and downstream tasks. Firstly, we offer an introduction and brief summary of current GPT- and BERT-style LLMs. Then, we discuss the influence of pre-training data, training data, and test data. Most importantly, we provide a detailed discussion about the use and non-use cases of large language models for various natural language processing tasks, such as knowledge-intensive tasks, traditional natural language understanding tasks, natural language generation tasks, emergent abilities, and considerations for specific tasks. We present various use cases and non-use cases to illustrate the practical applications and limitations of LLMs in real-world scenarios. We also try to understand the importance of data and the specific challenges associated with each NLP task. Furthermore, we explore the impact of spurious biases on LLMs and delve into other essential considerations, such as efficiency, cost, and latency, to ensure a comprehensive understanding of deploying LLMs in practice. This comprehensive guide aims to provide researchers and practitioners with valuable insights and best practices for working with LLMs, thereby enabling the successful implementation of these models in a wide range of NLP tasks. A curated list of practical guide resources of LLMs, regularly updated, can be found at //github.com/Mooler0410/LLMsPracticalGuide.

Knowledge · MoDELS · Reviewer · Language modeling · Extensibility

March 14, 2023

The Life Cycle of Knowledge in Big Language Models: A Survey

Boxi Cao,Hongyu Lin,Xianpei Han,Le Sun

from arxiv, paperlist: //github.com/c-box/KnowledgeLifecycle

Knowledge plays a critical role in artificial intelligence. Recently, the extensive success of pre-trained language models (PLMs) has drawn significant attention to how knowledge can be acquired, maintained, updated and used by language models. Despite the enormous number of related studies, a unified view of how knowledge circulates within language models throughout the learning, tuning, and application processes is still lacking, which may prevent us from further understanding the connections between current progress or realizing existing limitations. In this survey, we revisit PLMs as knowledge-based systems by dividing the life cycle of knowledge in PLMs into five critical periods, and investigating how knowledge circulates when it is built, maintained and used. To this end, we systematically review existing studies of each period of the knowledge life cycle, summarize the main challenges and current limitations, and discuss future directions.

Generalization theory · INFORMS · Estimation/estimator · Mutual information · Generalization error

June 18, 2021

A Probabilistic Representation of DNNs: Bridging Mutual Information and Generalization

Xinjie Lan,Kenneth Barner

from arxiv, To appear in the ICML 2021 Workshop on Theoretic Foundation, Criticism, and Application Trend of Explainable AI

Recently, Mutual Information (MI) has attracted attention in bounding the generalization error of Deep Neural Networks (DNNs). However, it is intractable to accurately estimate the MI in DNNs, thus most previous works have to relax the MI bound, which in turn weakens the information theoretic explanation for generalization. To address the limitation, this paper introduces a probabilistic representation of DNNs for accurately estimating the MI. Leveraging the proposed MI estimator, we validate the information theoretic explanation for generalization, and derive a tighter generalization bound than the state-of-the-art relaxations.

Skip connections · Neural Networks · Optimizer · Linear · Graph

May 10, 2021

Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth

Keyulu Xu,Mozhi Zhang,Stefanie Jegelka,Kenji Kawaguchi

Graph Neural Networks (GNNs) have been studied from the lens of expressive power and generalization. However, their optimization properties are less well understood. We take the first step towards analyzing GNN training by studying the gradient dynamics of GNNs. First, we analyze linearized GNNs and prove that despite the non-convexity of training, convergence to a global minimum at a linear rate is guaranteed under mild assumptions that we validate on real-world graphs. Second, we study what may affect the GNNs' training speed. Our results show that the training of GNNs is implicitly accelerated by skip connections, more depth, and/or a good label distribution. Empirical results confirm that our theoretical results for linearized GNNs align with the training behavior of nonlinear GNNs. Our results provide the first theoretical support for the success of GNNs with skip connections in terms of optimization, and suggest that deep GNNs with skip connections would be promising in practice.

Performer · Predictor/decision function · Dataset · Better · Estimation/estimator

March 10, 2021

ReNAS: Relativistic Evaluation of Neural Architecture Search

Yixing Xu,Yunhe Wang,Kai Han,Yehui Tang,Shangling Jui,Chunjing Xu,Chang Xu

An effective and efficient architecture performance evaluation scheme is essential for the success of Neural Architecture Search (NAS). To save computational cost, most existing NAS algorithms train and evaluate intermediate neural architectures on a small proxy dataset with limited training epochs. But it is difficult to expect an accurate performance estimation of an architecture from such a coarse evaluation. This paper advocates a new neural architecture evaluation scheme, which aims to determine which architecture would perform better instead of accurately predicting the absolute architecture performance. Therefore, we propose a relativistic architecture performance predictor in NAS (ReNAS). We encode neural architectures into feature tensors and further refine the representations with the predictor. The proposed relativistic performance predictor can be deployed in discrete searching methods to search for the desired architectures without additional evaluation. Experimental results on the NAS-Bench-101 dataset suggest that sampling 424 (0.1% of the entire search space) neural architectures and their corresponding validation performance is already enough for learning an accurate architecture performance predictor. The accuracies of our searched neural architectures on the NAS-Bench-101 and NAS-Bench-201 datasets are higher than those of the state-of-the-art methods and show the superiority of the proposed method.
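To make the idea of a relative (rather than absolute) predictor concrete, here is a minimal sketch of a pairwise ranking objective. It is an assumption for illustration only: the abstract does not give the loss ReNAS actually uses, so a generic hinge-style pairwise loss stands in, and the names `pairwise_ranking_loss` and `margin` are hypothetical.

```python
from itertools import combinations


def pairwise_ranking_loss(predicted, actual, margin=0.1):
    """Mean hinge loss over all architecture pairs with different true scores.

    The loss is zero when every pair is ordered correctly by at least
    `margin`; the absolute scale of `predicted` does not matter otherwise.
    This is a generic ranking loss, not necessarily the one used by ReNAS.
    """
    total, count = 0.0, 0
    for i, j in combinations(range(len(actual)), 2):
        if actual[i] == actual[j]:
            continue  # ties carry no ordering information
        sign = 1.0 if actual[i] > actual[j] else -1.0
        total += max(0.0, margin - sign * (predicted[i] - predicted[j]))
        count += 1
    return total / count if count else 0.0


# Hypothetical validation accuracies of three architectures.
actual = [0.90, 0.92, 0.95]
# Scores on a different scale but in the right order incur no loss.
well_ordered = [0.2, 0.5, 0.9]
# Reversed scores are penalized on every pair.
reversed_scores = [0.9, 0.5, 0.2]

print(pairwise_ranking_loss(well_ordered, actual))
print(pairwise_ranking_loss(reversed_scores, actual))
```

A predictor trained with an objective of this kind only has to get the ordering right, which is the property the abstract emphasizes.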

Vision · Model evaluation · Reducible · Computer vision · DNN

March 24, 2020

A Survey of Methods for Low-Power Deep Learning and Computer Vision

Abhinav Goel,Caleb Tung,Yung-Hsiang Lu,George K. Thiruvathukal

from arxiv, Accepted for publication at 2020 IEEE 6th World Forum on Internet of Things (WF-IoT), New Orleans, LA, USA 2020

Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy, computation and memory intensive. This impedes the deployment of large DNNs in low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirement, energy consumption, and number of operations without significantly decreasing the accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically with regard to inference, and discusses the methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. We analyze the accuracy, advantages, disadvantages, and potential solutions to the problems with the techniques in each category. We also discuss new evaluation metrics as a guideline for future research.
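As a rough illustration of the first category, parameter quantization, the sketch below maps floating-point weights to 8-bit integer codes with a uniform affine scheme and then restores them. It is one simple scheme chosen for illustration, not a method taken from the survey, and the function names `quantize` and `dequantize` are hypothetical.

```python
def quantize(weights, bits=8):
    """Map floats to integer codes in [0, 2**bits - 1] with a uniform step."""
    levels = 2 ** bits - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels
    if scale == 0.0:
        scale = 1.0  # all weights equal; any step size works
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate float weights from integer codes."""
    return [lo + c * scale for c in codes]


# Hypothetical weights of a small layer.
weights = [-0.8, -0.1, 0.0, 0.35, 1.2]
codes, scale, lo = quantize(weights)
restored = dequantize(codes, scale, lo)

# Each weight now needs 8 bits instead of 32; the rounding error per
# weight is bounded by half a quantization step.
worst_error = max(abs(a - b) for a, b in zip(weights, restored))
print(codes, worst_error <= scale / 2 + 1e-12)
```

The trade-off is the one the abstract describes: storage and arithmetic get cheaper, at the cost of a bounded approximation error in each parameter.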

3D reconstruction · 3D · Networks · Networking · Neural Networks

December 10, 2018

Occupancy Networks: Learning 3D Reconstruction in Function Space

Lars Mescheder,Michael Oechsle,Michael Niemeyer,Sebastian Nowozin,Andreas Geiger

With the advent of deep neural networks, learning-based approaches for 3D reconstruction have gained popularity. However, unlike for images, in 3D there is no canonical representation which is both computationally and memory efficient yet allows for representing high-resolution geometry of arbitrary topology. Many of the state-of-the-art learning-based 3D reconstruction approaches can hence only represent very coarse 3D geometry or are limited to a restricted domain. In this paper, we propose occupancy networks, a new representation for learning-based 3D reconstruction methods. Occupancy networks implicitly represent the 3D surface as the continuous decision boundary of a deep neural network classifier. In contrast to existing approaches, our representation encodes a description of the 3D output at infinite resolution without excessive memory footprint. We validate that our representation can efficiently encode 3D structure and can be inferred from various kinds of input. Our experiments demonstrate competitive results, both qualitatively and quantitatively, for the challenging tasks of 3D reconstruction from single images, noisy point clouds and coarse discrete voxel grids. We believe that occupancy networks will become a useful tool in a wide variety of learning-based 3D tasks.
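The idea of representing a surface as the decision boundary of a classifier can be sketched without any trained network. In the example below, a hand-written occupancy function for a unit sphere stands in for the learned classifier (an assumption for illustration; a real occupancy network would be a neural network conditioned on the input). The names `occupancy` and `surface_along_ray` are hypothetical.

```python
import math


def occupancy(point, radius=1.0, sharpness=20.0):
    """Stand-in for a learned classifier: probability that `point` is inside a sphere."""
    distance = math.sqrt(sum(c * c for c in point))
    return 1.0 / (1.0 + math.exp(-sharpness * (radius - distance)))


def surface_along_ray(direction, far=4.0, threshold=0.5, steps=40):
    """Bisect along a ray from the origin to where occupancy crosses `threshold`.

    Assumes the origin is inside the shape and the point at distance `far`
    is outside it.
    """
    norm = math.sqrt(sum(c * c for c in direction))
    unit = [c / norm for c in direction]
    inside, outside = 0.0, far
    for _ in range(steps):
        mid = (inside + outside) / 2.0
        if occupancy([mid * c for c in unit]) > threshold:
            inside = mid
        else:
            outside = mid
    return (inside + outside) / 2.0


# The function can be queried at any continuous location, so the surface
# can be located to arbitrary precision without storing a voxel grid.
print(surface_along_ray([1.0, 2.0, 2.0]))
```

Because the representation is a function rather than a grid, memory use does not grow with the resolution at which the surface is extracted, which is the property the abstract highlights.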