
The inference process in Large Language Models (LLMs) is often limited due to the absence of parallelism in the auto-regressive decoding process, resulting in most operations being restricted by the memory bandwidth of accelerators. While methods such as speculative decoding have been suggested to address this issue, their implementation is impeded by the challenges associated with acquiring and maintaining a separate draft model. In this paper, we present Medusa, an efficient method that augments LLM inference by adding extra decoding heads to predict multiple subsequent tokens in parallel. Using a tree-based attention mechanism, Medusa constructs multiple candidate continuations and verifies them simultaneously in each decoding step. By leveraging parallel processing, Medusa introduces only minimal overhead in terms of single-step latency while substantially reducing the number of decoding steps required. We present two levels of fine-tuning procedures for Medusa to meet the needs of different use cases: Medusa-1: Medusa is directly fine-tuned on top of a frozen backbone LLM, enabling lossless inference acceleration. Medusa-2: Medusa is fine-tuned together with the backbone LLM, enabling better prediction accuracy of Medusa heads and higher speedup but needing a special training recipe that preserves the backbone model's capabilities. Moreover, we propose several extensions that improve or expand the utility of Medusa, including a self-distillation to handle situations where no training data is available and a typical acceptance scheme to boost the acceptance rate while maintaining generation quality. We evaluate Medusa on models of various sizes and training procedures. Our experiments demonstrate that Medusa-1 can achieve over 2.2x speedup without compromising generation quality, while Medusa-2 further improves the speedup to 2.3-3.6x.
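The candidate-and-verify loop described above can be illustrated with a small sketch. This is a toy under stated assumptions, not the authors' implementation: `toy_next_token` is a hypothetical stand-in for the backbone's greedy prediction, the per-head guesses are hard-coded, acceptance is plain greedy matching rather than the typical acceptance scheme, and candidates are checked one by one instead of in a single tree-attention forward pass, so the sketch shows the acceptance logic only and gives no speedup.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

Token = int
NextTokenFn = Callable[[List[Token]], Token]


def build_candidates(head_topk: Sequence[Sequence[Token]]) -> List[Tuple[Token, ...]]:
    """Combine each head's top-k guesses into full candidate continuations."""
    return list(product(*head_topk))


def longest_accepted_prefix(
    prefix: Sequence[Token], candidate: Sequence[Token], next_token: NextTokenFn
) -> List[Token]:
    """Accept candidate tokens for as long as they match the backbone's greedy choice."""
    context = list(prefix)
    accepted: List[Token] = []
    for tok in candidate:
        if next_token(context) != tok:
            break
        accepted.append(tok)
        context.append(tok)
    return accepted


def medusa_style_step(
    prefix: Sequence[Token], head_topk: Sequence[Sequence[Token]], next_token: NextTokenFn
) -> List[Token]:
    """One decoding step: keep the longest verified candidate, then add one backbone token."""
    best: List[Token] = []
    for cand in build_candidates(head_topk):
        accepted = longest_accepted_prefix(prefix, cand, next_token)
        if len(accepted) > len(best):
            best = accepted
    # The step always emits at least one token, the backbone's own prediction.
    return best + [next_token(list(prefix) + best)]


def toy_next_token(context: List[Token]) -> Token:
    """Hypothetical backbone: always predicts the previous token plus one (mod 10)."""
    return (context[-1] + 1) % 10


# Three hypothetical heads, each proposing two guesses for its position.
print(medusa_style_step([1, 2, 3], [[4, 7], [5, 9], [0, 6]], toy_next_token))  # [4, 5, 6, 7]
```

In this toy run a single step emits four tokens instead of one, which is the source of the reduction in decoding steps; when no head guess is verified, the step falls back to one token.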

Related Content

Large Language Models


Large language models are deep learning models trained on massive amounts of text data. They can not only generate natural-language text but also understand its meaning in depth and handle a variety of natural-language tasks such as text summarization, question answering, and translation. In 2023, large language models and their applications in artificial intelligence became a focus of technology research worldwide. Their growth in scale has been especially striking, with parameter counts rising from just over a billion at the outset to one trillion today. More parameters let models capture the subtleties of human language more finely and understand its complexity more deeply. Over the past year, large language models have improved markedly at absorbing new knowledge, decomposing complex tasks, and aligning images with text. As the technology matures, they will keep expanding their range of applications, providing more intelligent and personalized services and further improving how people live and work.

Extensibility · Reducible · Federated Learning · Learning · Performer

March 3, 2024

HyperFedNet: Communication-Efficient Personalized Federated Learning Via Hypernetwork

Xingyun Chen,Yan Huang,Zhenzhen Xie,Junjie Pang

In response to the challenges posed by non-independent and identically distributed (non-IID) data and the escalating threat of privacy attacks in Federated Learning (FL), we introduce HyperFedNet (HFN), a novel architecture that incorporates hypernetworks to revolutionize parameter aggregation and transmission in FL. Traditional FL approaches, characterized by the transmission of extensive parameters, not only incur significant communication overhead but also present vulnerabilities to privacy breaches through gradient analysis. HFN addresses these issues by transmitting a concise set of hypernetwork parameters, thereby reducing communication costs and enhancing privacy protection. Upon deployment, the HFN algorithm enables the dynamic generation of parameters for the basic layer of the FL main network, utilizing local database features quantified by embedding vectors as input. Through extensive experimentation, HFN demonstrates superior performance in reducing communication overhead and improving model accuracy compared to conventional FL methods. By integrating the HFN algorithm into the FL framework, HFN offers a solution to the challenges of non-IID data and privacy threats.
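The idea of generating layer parameters from a client embedding can be sketched as follows. Everything here is an illustrative assumption rather than the HFN algorithm itself: the hypernetwork is a single linear map, the server combines client hypernetworks by plain averaging, and the embeddings and weights are made-up numbers. The sketch shows only how shared hypernetwork parameters plus a local embedding yield client-specific layer weights.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def generate_layer_weights(hyper_w: Matrix, hyper_b: Vector, embedding: Vector) -> Vector:
    """Linear hypernetwork: flat layer weights = hyper_w @ embedding + hyper_b."""
    return [
        sum(w * e for w, e in zip(row, embedding)) + b
        for row, b in zip(hyper_w, hyper_b)
    ]


def average_matrices(mats: List[Matrix]) -> Matrix:
    """Element-wise mean of equally shaped matrices (assumed aggregation rule)."""
    n = len(mats)
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(m[i][j] for m in mats) / n for j in range(cols)] for i in range(rows)]


# Hypothetical hypernetwork weights uploaded by two clients.
client_a_w = [[1.0, 0.0], [0.0, 1.0]]
client_b_w = [[3.0, 0.0], [0.0, 3.0]]
hyper_b = [0.0, 0.0]

# Only hypernetwork parameters are exchanged; embeddings stay on the clients.
avg_w = average_matrices([client_a_w, client_b_w])

# Each client feeds its own local embedding to obtain personalized layer weights.
weights_a = generate_layer_weights(avg_w, hyper_b, [1.0, 0.0])
weights_b = generate_layer_weights(avg_w, hyper_b, [0.0, 1.0])
print(weights_a, weights_b)  # [2.0, 0.0] [0.0, 2.0]
```

The same shared hypernetwork produces different weights for the two clients because their embeddings differ, which is the personalization mechanism the abstract describes.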

Controller · MoDELS · state-of-the-art · Precision/Accuracy · Separated

March 2, 2024

TCIG: Two-Stage Controlled Image Generation with Quality Enhancement through Diffusion

Salaheldin Mohamed

In recent years, significant progress has been made in the development of text-to-image generation models. However, these models still face limitations when it comes to achieving full controllability during the generation process. Often, specific training or the use of limited models is required, and even then, they have certain restrictions. To address these challenges, a two-stage method that effectively combines controllability and high quality in the generation of images is proposed. This approach leverages the expertise of pre-trained models to achieve precise control over the generated images, while also harnessing the power of diffusion models to achieve state-of-the-art quality. By separating controllability from high quality, this method achieves outstanding results. It is compatible with both latent and image space diffusion models, ensuring versatility and flexibility. Moreover, this approach consistently produces outcomes comparable to the current state-of-the-art methods in the field. Overall, the proposed method represents a significant advancement in text-to-image generation, enabling improved controllability without compromising the quality of the generated images.

INFORMS · Few-shot Learning · Performer · Feature Extractor · entity

March 1, 2024

Few-Shot Relation Extraction with Hybrid Visual Evidence

Jiaying Gong,Hoda Eldardiry

from arxiv, 16 pages, 5 figures

The goal of few-shot relation extraction is to predict relations between named entities in a sentence when only a few labeled instances are available for training. Existing few-shot relation extraction methods focus on uni-modal information such as text only. This reduces performance when there are no clear contexts between the named entities described in text. We propose a multi-modal few-shot relation extraction model (MFS-HVE) that leverages both textual and visual semantic information to learn a multi-modal representation jointly. The MFS-HVE includes semantic feature extractors and multi-modal fusion components. The MFS-HVE semantic feature extractors are developed to extract both textual and visual features. The visual features include global image features and local object features within the image. The MFS-HVE multi-modal fusion unit integrates information from various modalities using image-guided attention, object-guided attention, and hybrid feature attention to fully capture the semantic interaction between visual regions of images and relevant texts. Extensive experiments conducted on two public datasets demonstrate that semantic visual information significantly improves the performance of few-shot relation prediction.

Question Answering · Automator · MoDELS · Large Language Models · Correlation Coefficient

March 1, 2024

CFMatch: Aligning Automated Answer Equivalence Evaluation with Expert Judgments For Open-Domain Question Answering

Zongxia Li,Ishani Mondal,Yijun Liang,Huy Nghiem,Jordan Boyd-Graber

from arxiv, See arXiv:2402.11161

Question answering (QA) can only make progress if we know whether an answer is correct, but for many of the most challenging and interesting QA examples, current evaluation metrics for determining answer equivalence (AE) often do not align with human judgments, particularly for more verbose, free-form answers from large language models (LLMs). There are two challenges: a lack of data and models that are too big. LLM-based scorers can correlate better with human judges, but this task has only been tested on limited QA datasets, and even when data are available, updating the model is limited because LLMs are large and often expensive. We rectify both of these issues by providing clear and consistent guidelines for evaluating AE in machine QA adopted from professional human QA contests. We also introduce a combination of standard evaluation and a more efficient, robust, and lightweight discriminative AE classifier-based matching method (CFMatch, smaller than 1 MB), trained and validated to more accurately evaluate answer correctness in accordance with the adopted expert AE rules that are more aligned with human judgments.

Large Language Models · Language Modeling · MoDELS · Performer · Correlation Coefficient

March 1, 2024

Crimson: Empowering Strategic Reasoning in Cybersecurity through Large Language Models

Jiandong Jin,Bowen Tang,Mingxuan Ma,Xiao Liu,Yunfei Wang,Qingnan Lai,Jia Yang,Changling Zhou

from arxiv, 9 pages, 7 figures

We introduce Crimson, a system that enhances the strategic reasoning capabilities of Large Language Models (LLMs) within the realm of cybersecurity. By correlating CVEs with MITRE ATT&CK techniques, Crimson advances threat anticipation and strategic defense efforts. Our approach includes defining and evaluating cybersecurity strategic tasks, alongside implementing a comprehensive human-in-the-loop data-synthetic workflow to develop the CVE-to-ATT&CK Mapping (CVEM) dataset. We further enhance LLMs' reasoning abilities through a novel Retrieval-Aware Training (RAT) process and its refined iteration, RAT-R. Our findings demonstrate that an LLM fine-tuned with our techniques, possessing 7 billion parameters, approaches the performance level of GPT-4, showing markedly lower rates of hallucination and errors, and surpassing other models in strategic reasoning tasks. Moreover, domain-specific fine-tuning of embedding models significantly improves performance within cybersecurity contexts, underscoring the efficacy of our methodology. By leveraging Crimson to convert raw vulnerability data into structured and actionable insights, we bolster proactive cybersecurity defenses.

ASAP · Automator · Feasible · Robot · Reducible

February 29, 2024

ASAP: Automated Sequence Planning for Complex Robotic Assembly with Physical Feasibility

Yunsheng Tian,Karl D. D. Willis,Bassel Al Omari,Jieliang Luo,Pingchuan Ma,Yichen Li,Farhad Javid,Edward Gu,Joshua Jacob,Shinjiro Sueda,Hui Li,Sachin Chitta,Wojciech Matusik

from arxiv, ICRA 2024

The automated assembly of complex products requires a system that can automatically plan a physically feasible sequence of actions for assembling many parts together. In this paper, we present ASAP, a physics-based planning approach for automatically generating such a sequence for general-shaped assemblies. ASAP accounts for gravity to design a sequence where each sub-assembly is physically stable with a limited number of parts being held and a support surface. We apply efficient tree search algorithms to reduce the combinatorial complexity of determining such an assembly sequence. The search can be guided by either geometric heuristics or graph neural networks trained on data with simulation labels. Finally, we show the superior performance of ASAP at generating physically realistic assembly sequence plans on a large dataset of hundreds of complex product assemblies. We further demonstrate the applicability of ASAP on both simulation and real-world robotic setups. Project website: asap.csail.mit.edu
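The search for a physically feasible assembly order can be pictured with a small depth-first tree search. This is a minimal sketch under loud assumptions, not the ASAP planner: the `SUPPORTS` table and `toy_is_stable` are hypothetical stand-ins for a physics-based stability check, the part names are invented, and no geometric heuristic or learned guidance is used to order the branches.

```python
from typing import Callable, Dict, FrozenSet, List, Optional, Sequence

StabilityFn = Callable[[FrozenSet[str]], bool]


def plan_sequence(parts: Sequence[str], is_stable: StabilityFn) -> Optional[List[str]]:
    """Depth-first search over assembly orders, pruning unstable sub-assemblies."""

    def dfs(placed: List[str]) -> Optional[List[str]]:
        if len(placed) == len(parts):
            return list(placed)
        for part in parts:
            if part in placed:
                continue
            candidate = placed + [part]
            # Expand a branch only if the partial assembly passes the stability check.
            if is_stable(frozenset(candidate)):
                result = dfs(candidate)
                if result is not None:
                    return result
        return None

    return dfs([])


# Hypothetical rule replacing simulation: a part is stable only if
# everything it rests on has already been placed.
SUPPORTS: Dict[str, List[str]] = {
    "base": [],
    "column": ["base"],
    "beam": ["column"],
    "cover": ["base"],
}


def toy_is_stable(assembly: FrozenSet[str]) -> bool:
    return all(support in assembly for part in assembly for support in SUPPORTS[part])


plan = plan_sequence(["cover", "beam", "column", "base"], toy_is_stable)
print(plan)  # ['base', 'cover', 'column', 'beam']
```

Pruning unstable partial assemblies early is what keeps the combinatorial search manageable; a guided search would additionally reorder the candidate parts at each level.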

MoDELS · AIM · Reviewer · Language Modeling · Knowledge

December 20, 2022

Towards Reasoning in Large Language Models: A Survey

Jie Huang,Kevin Chen-Chuan Chang

Reasoning is a fundamental aspect of human intelligence that plays a crucial role in activities such as problem solving, decision making, and critical thinking. In recent years, large language models (LLMs) have made significant progress in natural language processing, and it has been observed that these models may exhibit reasoning abilities when they are sufficiently large. However, it is not yet clear to what extent LLMs are capable of reasoning. This paper provides a comprehensive overview of the current state of knowledge on reasoning in LLMs, including techniques for improving and eliciting reasoning in these models, methods and benchmarks for evaluating reasoning abilities, findings and implications of previous research in this field, and suggestions on future directions. Our aim is to provide a detailed and up-to-date review of this topic and stimulate meaningful discussion and future work.

Discriminator · Semantic Similarity · state-of-the-art · Similarity · MoDELS

September 15, 2019

Emu: Enhancing Multilingual Sentence Embeddings with Semantic Specialization

Wataru Hirota,Yoshihiko Suhara,Behzad Golshan,Wang-Chiew Tan

We present Emu, a system that semantically enhances multilingual sentence embeddings. Our framework fine-tunes pre-trained multilingual sentence embeddings using two main components: a semantic classifier and a language discriminator. The semantic classifier improves the semantic similarity of related sentences, whereas the language discriminator enhances the multilinguality of the embeddings via multilingual adversarial training. Our experimental results based on several language pairs show that our specialized embeddings outperform the state-of-the-art multilingual sentence embedding model on the task of cross-lingual intent classification using only monolingual labeled data.

Estimation/Estimator · Orthogonal · Functional · MoDELS · Biased

January 20, 2018

IEOPF: An Active Contour Model for Image Segmentation with Inhomogeneities Estimated by Orthogonal Primary Functions

Chaolu Feng

from arxiv, 27 pages, 14 figures

Image segmentation is still an open problem, especially when the intensities of the objects of interest overlap due to the presence of intensity inhomogeneity (also known as bias field). To segment images with intensity inhomogeneities, a level set model with embedded bias correction is proposed in which Inhomogeneities are Estimated by Orthogonal Primary Functions (IEOPF). In the proposed model, the smoothly varying bias is estimated by a linear combination of a given set of orthogonal primary functions. An inhomogeneous intensity clustering energy is then defined, and membership functions of the clusters described by the level set function are introduced to rewrite the energy as a data term of the proposed model. Similar to popular level set methods, a regularization term and an arc length term are also included to regularize and smooth the level set function, respectively. The proposed model is then extended to multichannel and multiphase patterns to segment colour images and images with multiple objects, respectively. It has been extensively tested on both synthetic and real images that are widely used in the literature and on the public BrainWeb and IBSR datasets. Experimental results and comparison with state-of-the-art methods demonstrate the advantages of the proposed model in terms of bias correction and segmentation accuracy.
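Representing a smooth bias as a linear combination of orthogonal functions can be illustrated in one dimension. The sketch below is an assumption-laden toy, not the IEOPF model: it uses the first three Legendre polynomials as the orthogonal functions, works on the interval [-1, 1] instead of an image, and recovers the weights by projecting a known bias `toy_bias`, whereas the paper estimates the bias inside a level set energy minimization.

```python
from typing import Callable, List

Basis = Callable[[float], float]

# First three Legendre polynomials, mutually orthogonal on [-1, 1].
LEGENDRE: List[Basis] = [
    lambda x: 1.0,
    lambda x: x,
    lambda x: 0.5 * (3.0 * x * x - 1.0),
]


def inner(f: Basis, g: Basis, n: int = 2000) -> float:
    """Midpoint-rule approximation of the integral of f*g over [-1, 1]."""
    h = 2.0 / n
    points = (-1.0 + (i + 0.5) * h for i in range(n))
    return sum(f(x) * g(x) for x in points) * h


def project(field: Basis, basis: List[Basis]) -> List[float]:
    """Orthogonality lets each weight be computed independently: w_k = <b, g_k> / <g_k, g_k>."""
    return [inner(field, g) / inner(g, g) for g in basis]


def evaluate(weights: List[float], basis: List[Basis], x: float) -> float:
    """Bias value at x as the weighted sum of basis functions."""
    return sum(w * g(x) for w, g in zip(weights, basis))


def toy_bias(x: float) -> float:
    """Hypothetical smoothly varying bias field."""
    return 1.0 + 0.5 * x


weights = project(toy_bias, LEGENDRE)
print([round(w, 3) for w in weights])
```

Because the basis is orthogonal, adding or removing a basis function does not change the other weights, which keeps the bias estimate compact and well conditioned.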

Question Answering · MoDELS · Networking · Processing (programming language) · state-of-the-art

January 15, 2018

An Interpretable Reasoning Network for Multi-Relation Question Answering

Mantong Zhou,Minlie Huang,Xiaoyan Zhu

Multi-relation Question Answering is a challenging task, due to the requirement of elaborate analysis of questions and reasoning over multiple fact triples in a knowledge base. In this paper, we present a novel model called Interpretable Reasoning Network that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis.
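The hop-by-hop loop described above can be mimicked with a purely symbolic toy. This is not the neural Interpretable Reasoning Network: the knowledge base `KB`, the entity names, and the assumption that the question has already been reduced to a list of relation keywords are all hypothetical, and relation "prediction" is a simple lookup. The sketch shows only the control flow: pick a relation, update the question, advance the state, and record a traceable intermediate step.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical toy knowledge base: (subject, relation) -> object.
KB: Dict[Tuple[str, str], str] = {
    ("Alice", "mother"): "Beth",
    ("Beth", "birthplace"): "Oslo",
}


def hop_by_hop(
    entity: str, question_relations: List[str], max_hops: int = 5
) -> Tuple[str, List[Triple]]:
    """Follow one relation per hop and return the final entity plus the hop trace."""
    remaining = list(question_relations)
    state = entity
    trace: List[Triple] = []
    for _ in range(max_hops):
        # Decide which part of the question applies at the current state.
        relation = next((r for r in remaining if (state, r) in KB), None)
        if relation is None:
            break
        next_state = KB[(state, relation)]
        trace.append((state, relation, next_state))
        remaining.remove(relation)  # update the question representation
        state = next_state          # update the reasoning state
    return state, trace


# "Where was Alice's mother born?" reduced to unordered relation keywords.
answer, trace = hop_by_hop("Alice", ["birthplace", "mother"])
print(answer)  # Oslo
print(trace)   # [('Alice', 'mother', 'Beth'), ('Beth', 'birthplace', 'Oslo')]
```

The recorded trace plays the role of the traceable intermediate predictions: a wrong answer can be diagnosed by inspecting which hop chose the wrong relation.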