
Seungjae Moon,Jung-Hoon Kim,Junsoo Kim,Seongmin Hong,Junseo Cha,Minsu Kim,Sukbin Lim,Gyubin Choi,Dongjin Seo,Jongho Kim,Hunjong Lee,Hyunjun Park,Ryeowook Ko,Soongyu Choi,Jongse Park,Jinwon Lee,Joo-Young Kim

The explosive arrival of OpenAI's ChatGPT has fueled the globalization of large language models (LLMs), which consist of billions of pretrained parameters that embody aspects of syntax and semantics. HyperAccel introduces the latency processing unit (LPU), a latency-optimized and highly scalable processor architecture for the acceleration of LLM inference. The LPU balances memory bandwidth and compute logic with a streamlined dataflow to maximize performance and efficiency. It is equipped with an expandable synchronization link (ESL) that hides data synchronization latency between multiple LPUs. HyperDex complements the LPU as an intuitive software framework for running LLM applications. The LPU achieves 1.25 ms/token and 20.9 ms/token for 1.3B and 66B models, respectively, which is 2.09x and 1.37x faster than the GPU. The LPU, synthesized using a Samsung 4nm process, has a total area of 0.824 mm² and a power consumption of 284.31 mW. LPU-based servers achieve 1.33x and 1.32x energy efficiency over NVIDIA H100 and L4 servers, respectively.
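The latency figures in this abstract can be turned into throughput with simple arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the implied GPU latency assumes that "Nx faster" means the ratio of per-token latencies, which the abstract does not spell out.

```python
def tokens_per_second(ms_per_token: float) -> float:
    """Convert a per-token latency in milliseconds to tokens per second."""
    return 1000.0 / ms_per_token


def implied_baseline_ms(ms_per_token: float, speedup: float) -> float:
    """Baseline latency implied by a speedup factor.

    Assumption: speedup = baseline latency / accelerator latency.
    """
    return ms_per_token * speedup


# (latency in ms/token, speedup over the GPU) as quoted in the abstract.
REPORTED = {"1.3B": (1.25, 2.09), "66B": (20.9, 1.37)}

for model, (latency_ms, speedup) in REPORTED.items():
    throughput = tokens_per_second(latency_ms)
    baseline = implied_baseline_ms(latency_ms, speedup)
    print(f"{model}: {throughput:.1f} tokens/s, implied GPU latency {baseline:.3f} ms/token")
```

Under that reading, 1.25 ms/token corresponds to 800 tokens per second for the 1.3B model.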

Related content

Large Language Models


Large language models are deep learning models trained on massive amounts of text data. They can not only generate natural language text but also understand the meaning of text in depth and handle a variety of natural language tasks, such as text summarization, question answering, and translation. In 2023, large language models and their applications in artificial intelligence became a focus of global technology research. Their growth in scale has been especially striking, with parameter counts rising from a little over one billion initially to one trillion today. The increase in parameters lets models capture the subtleties of human language more precisely and understand its complexity more deeply. Over the past year, large language models have improved markedly at absorbing new knowledge, decomposing complex tasks, and aligning images with text. As the technology matures, its range of applications will keep expanding, offering people more intelligent and personalized services and further improving how people live and work.

entity · MoDELS · knowledge · orthogonal · graph

October 2, 2024

Block-Diagonal Orthogonal Relation and Matrix Entity for Knowledge Graph Embedding

Yihua Zhu,Hidetoshi Shimodaira

from arxiv, EMNLP 2024 Findings (Long)

The primary aim of Knowledge Graph embeddings (KGE) is to learn low-dimensional representations of entities and relations for predicting missing facts. While rotation-based methods like RotatE and QuatE perform well in KGE, they face two challenges: limited model flexibility requiring proportional increases in relation size with entity dimension, and difficulties in generalizing the model for higher-dimensional rotations. To address these issues, we introduce OrthogonalE, a novel KGE model employing matrices for entities and block-diagonal orthogonal matrices with Riemannian optimization for relations. This approach enhances the generality and flexibility of KGE models. The experimental results indicate that our new KGE model, OrthogonalE, is both general and flexible, significantly outperforming state-of-the-art KGE models while substantially reducing the number of relation parameters.
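To make the idea of a block-diagonal orthogonal relation acting on a matrix entity concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the 2x2 rotation blocks are just the simplest orthogonal blocks, the distance-based `score` is an assumed scoring function that the abstract does not specify, and the Riemannian optimization step is not shown. All function names are hypothetical.

```python
import math


def rotation_block(theta):
    """A 2x2 rotation matrix, the simplest orthogonal block."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def block_diagonal(blocks):
    """Place square blocks along the diagonal of a zero matrix."""
    n = sum(len(b) for b in blocks)
    out = [[0.0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        k = len(b)
        for i in range(k):
            for j in range(k):
                out[offset + i][offset + j] = b[i][j]
        offset += k
    return out


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def frobenius_norm(a):
    return math.sqrt(sum(x * x for row in a for x in row))


def frobenius_distance(a, b):
    return math.sqrt(sum((x - y) ** 2
                         for ra, rb in zip(a, b) for x, y in zip(ra, rb)))


def score(relation, head, tail):
    """Assumed score: negative distance between the transformed head and the tail."""
    return -frobenius_distance(matmul(relation, head), tail)


# A 4x4 relation built from two rotation blocks, and a 4x2 matrix entity.
relation = block_diagonal([rotation_block(0.3), rotation_block(-1.1)])
head = [[1.0, 0.5], [0.0, 2.0], [-1.0, 0.25], [3.0, 1.0]]
tail = matmul(relation, head)  # a tail that matches the relation exactly
print(score(relation, head, tail))
```

In this toy setup the relation is described by one angle per 2x2 block, so an n-dimensional relation needs n/2 numbers instead of n² matrix entries, and because the relation is orthogonal it leaves the norm of the entity matrix unchanged.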

language modeling · datasets · MoDELS · knowledge · large language models

September 30, 2024

LexEval: A Comprehensive Chinese Legal Benchmark for Evaluating Large Language Models

Haitao Li,You Chen,Qingyao Ai,Yueyue Wu,Ruizhe Zhang,Yiqun Liu

from arxiv, NeurIPS 2024

Large language models (LLMs) have made significant progress in natural language processing tasks and demonstrate considerable potential in the legal domain. However, legal applications demand high standards of accuracy, reliability, and fairness. Applying existing LLMs to legal systems without careful evaluation of their potential and limitations could pose significant risks in legal practice. To this end, we introduce LexEval, a standardized and comprehensive Chinese legal benchmark. This benchmark is notable in the following three aspects: (1) Ability Modeling: We propose a new taxonomy of legal cognitive abilities to organize different tasks. (2) Scale: To our knowledge, LexEval is currently the largest Chinese legal evaluation dataset, comprising 23 tasks and 14,150 questions. (3) Data: We use formatted existing datasets, exam datasets, and datasets newly annotated by legal experts to comprehensively evaluate the various capabilities of LLMs. LexEval not only focuses on the ability of LLMs to apply fundamental legal knowledge but also dedicates effort to examining the ethical issues involved in their application. We evaluated 38 open-source and commercial LLMs and obtained some interesting findings. The experiments and findings offer valuable insights into the challenges and potential solutions for developing Chinese legal systems and LLM evaluation pipelines. The LexEval dataset and leaderboard are publicly available at //github.com/CSHaitao/LexEval and will be continuously updated.

distillation · knowledge · robustness · Learning · reducible

September 30, 2024

HYDRA-FL: Hybrid Knowledge Distillation for Robust and Accurate Federated Learning

Momin Ahmad Khan,Yasra Chandio,Fatima Muhammad Anwar

Data heterogeneity among Federated Learning (FL) users poses a significant challenge, resulting in reduced global model performance. The community has designed various techniques to tackle this issue, among which Knowledge Distillation (KD)-based techniques are common. While these techniques effectively improve performance under high heterogeneity, they inadvertently cause higher accuracy degradation under model poisoning attacks (known as attack amplification). This paper presents a case study to reveal this critical vulnerability in KD-based FL systems. We show why KD causes this issue through empirical evidence and use it as motivation to design a hybrid distillation technique. We introduce a novel algorithm, Hybrid Knowledge Distillation for Robust and Accurate FL (HYDRA-FL), which reduces the impact of attacks in attack scenarios by offloading some of the KD loss to a shallow layer via an auxiliary classifier. We model HYDRA-FL as a generic framework and adapt it to two KD-based FL algorithms, FedNTD and MOON. Using these two as case studies, we demonstrate that our technique outperforms baselines in attack settings while maintaining comparable performance in benign settings.
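The idea of offloading part of the distillation loss to a shallow layer can be sketched as a weighted sum of two distillation terms. This is a hedged illustration only: the exact loss in HYDRA-FL is not given in the abstract, so the weighting scheme, the `shallow_share` and `kd_weight` parameters, and the use of temperature-scaled KL divergence are all assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def hybrid_kd_loss(final_logits, shallow_logits, teacher_logits, label,
                   kd_weight=1.0, shallow_share=0.5, temperature=2.0):
    """Cross-entropy plus a distillation term split between two heads.

    shallow_share (hypothetical) is the fraction of the distillation loss
    moved from the final layer to the auxiliary classifier on a shallow layer.
    """
    teacher = softmax(teacher_logits, temperature)
    kd_final = kl_divergence(teacher, softmax(final_logits, temperature))
    kd_shallow = kl_divergence(teacher, softmax(shallow_logits, temperature))
    kd = (1.0 - shallow_share) * kd_final + shallow_share * kd_shallow
    return cross_entropy(final_logits, label) + kd_weight * kd


# Toy logits: the teacher stands in for the global model in federated learning.
teacher_logits = [2.0, 0.5, -1.0]
loss = hybrid_kd_loss([1.5, 0.2, -0.3], [0.4, 0.1, 0.0], teacher_logits, label=0)
print(loss)
```

With `shallow_share=0.0` the sketch reduces to ordinary distillation on the final layer, which makes it easy to compare the two settings side by side.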

contrastive · MoDELS · Learning · language modeling · contrastive learning

September 30, 2024

RouterDC: Query-Based Router by Dual Contrastive Learning for Assembling Large Language Models

Shuhao Chen,Weisen Jiang,Baijiong Lin,James T. Kwok,Yu Zhang

from arxiv, Accepted by NeurIPS 2024

Recent works show that assembling multiple off-the-shelf large language models (LLMs) can harness their complementary abilities. To achieve this, routing is a promising method, which learns a router to select the most suitable LLM for each query. However, existing routing models are ineffective when multiple LLMs perform well for a query. To address this problem, in this paper, we propose a method called query-based Router by Dual Contrastive learning (RouterDC). The RouterDC model consists of an encoder and LLM embeddings, and we propose two contrastive learning losses to train the RouterDC model. Experimental results show that RouterDC is effective in assembling LLMs and largely outperforms individual top-performing LLMs as well as existing routing methods on both in-distribution (+2.76%) and out-of-distribution (+1.90%) tasks. Source code is available at //github.com/shuhao02/RouterDC.
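A router built from an encoder and per-LLM embeddings can be pictured as a nearest-embedding lookup. The sketch below assumes the query has already been encoded into a vector and that routing picks the LLM whose embedding has the highest cosine similarity; the abstract does not state the selection rule, so treat this as one plausible reading. The LLM names and vectors are made up, and the contrastive training losses are not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def route(query_embedding, llm_embeddings):
    """Return the best-matching LLM name and the score of every candidate."""
    scores = {name: cosine(query_embedding, emb)
              for name, emb in llm_embeddings.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical learned embeddings for two candidate LLMs.
llm_embeddings = {"math_llm": [0.9, 0.0], "code_llm": [0.0, 1.0]}

best, scores = route([1.0, 0.1], llm_embeddings)
print(best, scores)
```

When several LLMs score almost equally for a query, the ranking among them is what training has to get right, which is the situation the abstract identifies as hard for existing routers.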

Prompt · MoDELS · datasets · language modeling · range

September 29, 2024

GenTel-Safe: A Unified Benchmark and Shielding Framework for Defending Against Prompt Injection Attacks

Rongchang Li,Minjie Chen,Chang Hu,Han Chen,Wenpeng Xing,Meng Han

Large Language Models (LLMs) like GPT-4, LLaMA, and Qwen have demonstrated remarkable success across a wide range of applications. However, these models remain inherently vulnerable to prompt injection attacks, which can bypass existing safety mechanisms, highlighting the urgent need for more robust attack detection methods and comprehensive evaluation benchmarks. To address these challenges, we introduce GenTel-Safe, a unified framework that includes a novel prompt injection attack detection method, GenTel-Shield, along with a comprehensive evaluation benchmark, GenTel-Bench, which comprises 84,812 prompt injection attacks spanning 3 major categories and 28 security scenarios. To demonstrate the effectiveness of GenTel-Shield, we evaluate it together with vanilla safety guardrails against the GenTel-Bench dataset. Empirically, GenTel-Shield achieves state-of-the-art attack detection success rates, which reveals the critical weakness of existing safeguarding techniques against harmful prompts. For reproducibility, we have made the code and benchmarking dataset available on the project page at //gentellab.github.io/gentel-safe.github.io/.

Agent · large language models · Cognition · language modeling · entity

September 27, 2024

Exploring Prosocial Irrationality for LLM Agents: A Social Cognition View

Xuan Liu,Jie Zhang,Song Guo,Haoyang Shang,Chengxu Yang,Quanyan Zhu

Large language models (LLMs) have been shown to face hallucination issues because the data they are trained on often contains human bias; whether this is reflected in the decision-making process of LLM Agents remains under-explored. As LLM Agents are increasingly employed in intricate social environments, a pressing and natural question emerges: Can we utilize LLM Agents' systematic hallucinations to mirror human cognitive biases, thus exhibiting irrational social intelligence? In this paper, we probe the irrational behavior among contemporary LLM Agents by melding practical social science experiments with theoretical insights. Specifically, we propose CogMir, an open-ended Multi-LLM Agents framework that utilizes hallucination properties to assess and enhance LLM Agents' social intelligence through cognitive biases. Experimental results on CogMir subsets show that LLM Agents and humans exhibit high consistency in irrational and prosocial decision-making under uncertain conditions, underscoring the prosociality of LLM Agents as social entities and highlighting the significance of hallucination properties. Additionally, the CogMir framework demonstrates its potential as a valuable platform for encouraging more research into the social intelligence of LLM Agents.

knowledge · MoDELS · language modeling · PromptKD · distillation

September 27, 2024

PromptKD: Distilling Student-Friendly Knowledge for Generative Language Models via Prompt Tuning

Gyeongman Kim,Doohyuk Jang,Eunho Yang

from arxiv, EMNLP 2024 Findings. Our project page: //promptkd.github.io

Recent advancements in large language models (LLMs) have raised concerns about inference costs, increasing the need for research into model compression. While knowledge distillation (KD) is a prominent method for this, research on KD for generative language models like LLMs is relatively sparse, and the approach of distilling student-friendly knowledge, which has shown promising performance in KD for classification models, remains unexplored in generative language models. To explore this approach, we propose PromptKD, a simple yet effective method that utilizes prompt tuning - for the first time in KD - to enable generative language models to transfer student-friendly knowledge. Unlike previous works in classification that require fine-tuning the entire teacher model for extracting student-friendly knowledge, PromptKD achieves similar effects by adding a small number of prompt tokens and tuning only the prompt with student guidance. Extensive experiments on instruction-following datasets show that PromptKD achieves state-of-the-art performance while adding only 0.0007% of the teacher's parameters as prompts. Further analysis suggests that distilling student-friendly knowledge alleviates exposure bias effectively throughout the entire training process, leading to performance enhancements.

MoDELS · language modeling · Performer · ACID · large language models

September 27, 2024

SciDFM: A Large Language Model with Mixture-of-Experts for Science

Liangtai Sun,Danyu Luo,Da Ma,Zihan Zhao,Baocai Chen,Zhennan Shen,Su Zhu,Lu Chen,Xin Chen,Kai Yu

from arxiv, 12 pages, 1 figure, 9 tables. Technical Report, Under Review

Recently, there has been a significant upsurge of interest in leveraging large language models (LLMs) to assist scientific discovery. However, most LLMs focus only on general science and lack domain-specific knowledge, such as chemical molecules and amino acid sequences. To bridge these gaps, we introduce SciDFM, a mixture-of-experts LLM, which is trained from scratch and is able to conduct college-level scientific reasoning and understand molecules and amino acid sequences. We collect a large-scale training corpus containing numerous scientific papers and books from different disciplines as well as data from domain-specific databases. We further fine-tune the pre-trained model on a large amount of instruction data to improve performance on downstream benchmarks. Experimental results show that SciDFM achieves strong performance on general scientific benchmarks such as SciEval and SciQ, and that it reaches SOTA performance on domain-specific benchmarks among models of similar size. We further analyze the expert layers and show that the results of expert selection vary with data from different disciplines. To benefit the broader research community, we open-source SciDFM at //huggingface.co/OpenDFM/SciDFM-MoE-A5.6B-v1.0.

Prompt · MoDELS · TOOLS · Continuity · INTERACT

November 21, 2023

Prompting Frameworks for Large Language Models: A Survey

Xiaoxia Liu,Jingyi Wang,Jun Sun,Xiaohan Yuan,Guoliang Dong,Peng Di,Wenhai Wang,Dongxia Wang

Since the launch of ChatGPT, a powerful AI chatbot developed by OpenAI, large language models (LLMs) have made significant advancements in both academia and industry, bringing about a fundamental engineering paradigm shift in many areas. While LLMs are powerful, it is also crucial to make the best use of their power, where the "prompt" plays a core role. However, the booming LLMs themselves, including excellent APIs like ChatGPT, have several inherent limitations: 1) temporal lag of training data, and 2) the lack of physical capabilities to perform external actions. Recently, we have observed a trend of using prompt-based tools to better harness the power of LLMs for downstream tasks, but there is a lack of systematic literature and standardized terminology, partly due to the rapid evolution of this field. Therefore, in this work, we survey related prompting tools and promote the concept of the "Prompting Framework" (PF), i.e. the framework for managing, simplifying, and facilitating interaction with large language models. We define the lifecycle of the PF as a hierarchical structure, from bottom to top, namely: Data Level, Base Level, Execute Level, and Service Level. We also systematically depict the overall landscape of the emerging PF field and discuss potential future research and challenges. To continuously track the developments in this area, we maintain a repository at //github.com/lxx0628/Prompting-Framework-Survey, which can be a useful resource-sharing platform for both academia and industry in this field.

language modeling · Performer · Agent · MoDELS · Learning

May 19, 2023

Introspective Tips: Large Language Model for In-Context Decision Making

Liting Chen,Lu Wang,Hang Dong,Yali Du,Jie Yan,Fangkai Yang,Shuang Li,Pu Zhao,Si Qin,Saravan Rajmohan,Qingwei Lin,Dongmei Zhang

from arxiv, 22 pages, 4 figures

The emergence of large language models (LLMs) has substantially influenced natural language processing, demonstrating exceptional results across various tasks. In this study, we employ "Introspective Tips" to help LLMs self-optimize their decision-making. By introspectively examining trajectories, the LLM refines its policy by generating succinct and valuable tips. Our method enhances the agent's performance in both few-shot and zero-shot learning situations by considering three essential scenarios: learning from the agent's past experiences, integrating expert demonstrations, and generalizing across diverse games. Importantly, we accomplish these improvements without fine-tuning the LLM parameters; rather, we adjust the prompt to generalize insights from the three aforementioned situations. Our framework not only supports but also emphasizes the advantage of employing LLMs in in-context decision-making. Experiments involving over 100 games in TextWorld illustrate the superior performance of our approach.