免费在线黄色电影_国产午夜免费视频_免费观看国产日韩在线_国产又色又爽又黄的精品_一级黄色片在线苴播_亚洲国产亚洲国产丝袜在线不卡_天天躁日日躁久久

We present a novel inference scheme, self-speculative decoding, for accelerating Large Language Models (LLMs) without the need for an auxiliary model. This approach is characterized by a two-stage process: drafting and verification. The drafting stage generates draft tokens at a slightly lower quality but more quickly, which is achieved by selectively skipping certain intermediate layers during drafting. Subsequently, the verification stage employs the original LLM to validate those draft output tokens in one forward pass. This process ensures the final output remains identical to that produced by the unaltered LLM. Moreover, the proposed method requires no additional neural network training and no extra memory footprint, making it a plug-and-play and cost-effective solution for inference acceleration. Benchmarks with LLaMA-2 and its variants demonstrated a speedup up to 1.99$\times$.

相關內容

大語(yu)言模(mo)型

關注 56

大語(yu)(yu)言模(mo)型(xing)(xing)(xing)是基于海量文(wen)本數(shu)據訓練的(de)(de)(de)(de)深度學習模(mo)型(xing)(xing)(xing)。它不(bu)僅能(neng)夠生成自(zi)然語(yu)(yu)言文(wen)本，還能(neng)夠深入(ru)理(li)解文(wen)本含義，處(chu)理(li)各種自(zi)然語(yu)(yu)言任(ren)務(wu)，如(ru)(ru)文(wen)本摘要、問答、翻譯(yi)等(deng)。2023年，大語(yu)(yu)言模(mo)型(xing)(xing)(xing)及(ji)其(qi)在(zai)(zai)人(ren)(ren)工(gong)智能(neng)領域的(de)(de)(de)(de)應用已成為全球科技(ji)研究的(de)(de)(de)(de)熱點，其(qi)在(zai)(zai)規模(mo)上的(de)(de)(de)(de)增長尤為引人(ren)(ren)注目，參(can)(can)數(shu)量已從(cong)最初(chu)的(de)(de)(de)(de)十幾億躍升到如(ru)(ru)今的(de)(de)(de)(de)一(yi)萬億。參(can)(can)數(shu)量的(de)(de)(de)(de)提升使得模(mo)型(xing)(xing)(xing)能(neng)夠更加(jia)精細地(di)捕捉人(ren)(ren)類語(yu)(yu)言微妙之處(chu)，更加(jia)深入(ru)地(di)理(li)解人(ren)(ren)類語(yu)(yu)言的(de)(de)(de)(de)復(fu)(fu)雜(za)性。在(zai)(zai)過去(qu)的(de)(de)(de)(de)一(yi)年里(li)，大語(yu)(yu)言模(mo)型(xing)(xing)(xing)在(zai)(zai)吸納新知(zhi)識、分解復(fu)(fu)雜(za)任(ren)務(wu)以及(ji)圖(tu)文(wen)對(dui)齊等(deng)多方面都(dou)有顯著提升。隨著技(ji)術的(de)(de)(de)(de)不(bu)斷成熟，它將(jiang)不(bu)斷拓展(zhan)其(qi)應用范圍，為人(ren)(ren)類提供更加(jia)智能(neng)化和個性化的(de)(de)(de)(de)服務(wu)，進一(yi)步改善人(ren)(ren)們的(de)(de)(de)(de)生活和生產(chan)方式。

INTERACT · 大語言模型 · 生成式人工智能 · MoDELS · 服務器 ·

2024 年 6 月 30 日

PANGeA: Procedural Artificial Narrative using Generative AI for Turn-Based Video Games

Steph Buongiorno,Lawrence Jake Klinkert,Tanishq Chawla,Zixin Zhuang,Corey Clark

This research introduces Procedural Artificial Narrative using Generative AI (PANGeA), a structured approach for leveraging large language models (LLMs), guided by a game designer's high-level criteria, to generate narrative content for turn-based role-playing video games (RPGs). Distinct from prior applications of LLMs used for video game design, PANGeA innovates by not only generating game level data (which includes, but is not limited to, setting, key items, and non-playable characters (NPCs)), but by also fostering dynamic, free-form interactions between the player and the environment that align with the procedural game narrative. The NPCs generated by PANGeA are personality-biased and express traits from the Big 5 Personality Model in their generated responses. PANGeA addresses challenges behind ingesting free-form text input, which can prompt LLM responses beyond the scope of the game narrative. A novel validation system that uses the LLM's intelligence evaluates text input and aligns generated responses with the unfolding narrative. Making these interactions possible, PANGeA is supported by a server that hosts a custom memory system that supplies context for augmenting generated responses thus aligning them with the procedural narrative. For its broad application, the server has a REST interface enabling any game engine to integrate directly with PANGeA, as well as an LLM interface adaptable with local or private LLMs. PANGeA's ability to foster dynamic narrative generation by aligning responses with the procedural narrative is demonstrated through an empirical study and ablation test of two versions of a demo game. These are, a custom, browser-based GPT and a Unity demo. As the results show, PANGeA holds potential to assist game designers in using LLMs to generate narrative-consistent content even when provided varied and unpredictable, free-form text input.

MoDELS · 3D · Extensibility · HTTPS · RGB-D ·

2024 年 6 月 28 日

HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Model

Hieu T. Nguyen,Yiwen Chen,Vikram Voleti,Varun Jampani,Huaizu Jiang

We introduce HouseCrafter, a novel approach that can lift a floorplan into a complete large 3D indoor scene (e.g., a house). Our key insight is to adapt a 2D diffusion model, which is trained on web-scale images, to generate consistent multi-view color (RGB) and depth (D) images across different locations of the scene. Specifically, the RGB-D images are generated autoregressively in a batch-wise manner along sampled locations based on the floorplan, where previously generated images are used as condition to the diffusion model to produce images at nearby locations. The global floorplan and attention design in the diffusion model ensures the consistency of the generated images, from which a 3D scene can be reconstructed. Through extensive evaluation on the 3D-Front dataset, we demonstrate that HouseCraft can generate high-quality house-scale 3D scenes. Ablation studies also validate the effectiveness of different design choices. We will release our code and model weights. Project page: //neu-vi.github.io/houseCrafter/

有向 · Learning · INTERACT · Extensibility · Processing（編程語言） ·

2024 年 6 月 28 日

HumanVLA: Towards Vision-Language Directed Object Rearrangement by Physical Humanoid

Xinyu Xu,Yizheng Zhang,Yong-Lu Li,Lei Han,Cewu Lu

Physical Human-Scene Interaction (HSI) plays a crucial role in numerous applications. However, existing HSI techniques are limited to specific object dynamics and privileged information, which prevents the development of more comprehensive applications. To address this limitation, we introduce HumanVLA for general object rearrangement directed by practical vision and language. A teacher-student framework is utilized to develop HumanVLA. A state-based teacher policy is trained first using goal-conditioned reinforcement learning and adversarial motion prior. Then, it is distilled into a vision-language-action model via behavior cloning. We propose several key insights to facilitate the large-scale learning process. To support general object rearrangement by physical humanoid, we introduce a novel Human-in-the-Room dataset encompassing various rearrangement tasks. Through extensive experiments and analysis, we demonstrate the effectiveness of the proposed approach.

代碼 · MoDELS · 可辨認的 · 可理解性 · 集成學習 ·

2024 年 6 月 28 日

AuthAttLyzer-V2: Unveiling Code Authorship Attribution using Enhanced Ensemble Learning Models & Generating Benchmark Dataset

Bhaskar Joshi,Sepideh HajiHossein Khani,Arash HabibiLashkari

Source Code Authorship Attribution (SCAA) is crucial for software classification because it provides insights into the origin and behavior of software. By accurately identifying the author or group behind a piece of code, experts can better understand the motivations and techniques of developers. In the cybersecurity era, this attribution helps trace the source of malicious software, identify patterns in the code that may indicate specific threat actors or groups, and ultimately enhance threat intelligence and mitigation strategies. This paper presents AuthAttLyzer-V2, a new source code feature extractor for SCAA, focusing on lexical, semantic, syntactic, and N-gram features. Our research explores author identification in C++ by examining 24,000 source code samples from 3,000 authors. Our methodology integrates Random Forest, Gradient Boosting, and XGBoost models, enhanced with SHAP for interpretability. The study demonstrates how ensemble models can effectively discern individual coding styles, offering insights into the unique attributes of code authorship. This approach is pivotal in understanding and interpreting complex patterns in authorship attribution, especially for malware classification.

tuning · 在線 · 優化器 · 大語言模型 · Principle ·

2024 年 6 月 27 日

AutoRAG-HP: Automatic Online Hyper-Parameter Tuning for Retrieval-Augmented Generation

Jia Fu,Xiaoting Qin,Fangkai Yang,Lu Wang,Jue Zhang,Qingwei Lin,Yubo Chen,Dongmei Zhang,Saravan Rajmohan,Qi Zhang

Recent advancements in Large Language Models have transformed ML/AI development, necessitating a reevaluation of AutoML principles for the Retrieval-Augmented Generation (RAG) systems. To address the challenges of hyper-parameter optimization and online adaptation in RAG, we propose the AutoRAG-HP framework, which formulates the hyper-parameter tuning as an online multi-armed bandit (MAB) problem and introduces a novel two-level Hierarchical MAB (Hier-MAB) method for efficient exploration of large search spaces. We conduct extensive experiments on tuning hyper-parameters, such as top-k retrieved documents, prompt compression ratio, and embedding methods, using the ALCE-ASQA and Natural Questions datasets. Our evaluation from jointly optimization all three hyper-parameters demonstrate that MAB-based online learning methods can achieve Recall@5 $\approx 0.8$ for scenarios with prominent gradients in search space, using only $\sim20\%$ of the LLM API calls required by the Grid Search approach. Additionally, the proposed Hier-MAB approach outperforms other baselines in more challenging optimization scenarios. The code will be made available at //aka.ms/autorag.

MoDELS · 優化器 · 數據集 · Extensibility · 可約的 ·

2024 年 6 月 27 日

BISeizuRe: BERT-Inspired Seizure Data Representation to Improve Epilepsy Monitoring

Luca Benfenati,Thorir Mar Ingolfsson,Andrea Cossettini,Daniele Jahier Pagliari,Alessio Burrello,Luca Benini

from arxiv, 4 pages, 2 tables, 2 figures

This study presents a novel approach for EEG-based seizure detection leveraging a BERT-based model. The model, BENDR, undergoes a two-phase training process. Initially, it is pre-trained on the extensive Temple University Hospital EEG Corpus (TUEG), a 1.5 TB dataset comprising over 10,000 subjects, to extract common EEG data patterns. Subsequently, the model is fine-tuned on the CHB-MIT Scalp EEG Database, consisting of 664 EEG recordings from 24 pediatric patients, of which 198 contain seizure events. Key contributions include optimizing fine-tuning on the CHB-MIT dataset, where the impact of model architecture, pre-processing, and post-processing techniques are thoroughly examined to enhance sensitivity and reduce false positives per hour (FP/h). We also explored custom training strategies to ascertain the most effective setup. The model undergoes a novel second pre-training phase before subject-specific fine-tuning, enhancing its generalization capabilities. The optimized model demonstrates substantial performance enhancements, achieving as low as 0.23 FP/h, 2.5$\times$ lower than the baseline model, with a lower but still acceptable sensitivity rate, showcasing the effectiveness of applying a BERT-based approach on EEG-based seizure detection.

剪枝 · MoDELS · Learning · Performer · Performance ·

2024 年 6 月 27 日

FedMap: Iterative Magnitude-Based Pruning for Communication-Efficient Federated Learning

Alexander Herzog,Robbie Southam,Ioannis Mavromatis,Aftab Khan

from arxiv, Submitted to IEEE Transactions on Neural Networks and Learning Systems

Federated Learning (FL) is a distributed machine learning approach that enables training on decentralized data while preserving privacy. However, FL systems often involve resource-constrained client devices with limited computational power, memory, storage, and bandwidth. This paper introduces FedMap, a novel method that aims to enhance the communication efficiency of FL deployments by collaboratively learning an increasingly sparse global model through iterative, unstructured pruning. Importantly, FedMap trains a global model from scratch, unlike other methods reported in the literature, making it ideal for privacy-critical use cases such as in the medical and finance domains, where suitable pre-training data is often limited. FedMap adapts iterative magnitude-based pruning to the FL setting, ensuring all clients prune and refine the same subset of the global model parameters, therefore gradually reducing the global model size and communication overhead. The iterative nature of FedMap, forming subsequent models as subsets of predecessors, avoids parameter reactivation issues seen in prior work, resulting in stable performance. In this paper we provide an extensive evaluation of FedMap across diverse settings, datasets, model architectures, and hyperparameters, assessing performance in both IID and non-IID environments. Comparative analysis against the baseline approach demonstrates FedMap's ability to achieve more stable client model performance. For IID scenarios, FedMap achieves over $90$\% pruning without significant performance degradation. In non-IID settings, it achieves at least $~80$\% pruning while maintaining accuracy. FedMap offers a promising solution to alleviate communication bottlenecks in FL systems while retaining model accuracy.

entity · 小樣本學習 · 注意力機制 · 圖 · Networking ·

2020 年 10 月 19 日

Adaptive Attentional Network for Few-Shot Knowledge Graph Completion

Jiawei Sheng,Shu Guo,Zhenyu Chen,Juwei Yue,Lihong Wang,Tingwen Liu,Hongbo Xu

from arxiv, 11 pages, 3 figures

Few-shot Knowledge Graph (KG) completion is a focus of current research, where each task aims at querying unseen facts of a relation given its few-shot reference entity pairs. Recent attempts solve this problem by learning static representations of entities and references, ignoring their dynamic properties, i.e., entities may exhibit diverse roles within task relations, and references may make different contributions to queries. This work proposes an adaptive attentional network for few-shot KG completion by learning adaptive entity and reference representations. Specifically, entities are modeled by an adaptive neighbor encoder to discern their task-oriented roles, while references are modeled by an adaptive query-aware aggregator to differentiate their contributions. Through the attention mechanism, both entities and references can capture their fine-grained semantic meanings, and thus render more expressive representations. This will be more predictive for knowledge acquisition in the few-shot scenario. Evaluation in link prediction on two public datasets shows that our approach achieves new state-of-the-art results with different few-shot sizes.

圖 · 知識圖譜 · 鏈路預測 · Extensibility · entity ·

2020 年 10 月 6 日

CoDEx: A Comprehensive Knowledge Graph Completion Benchmark

Tara Safavi,Danai Koutra

from arxiv, EMNLP 2020

We present CoDEx, a set of knowledge graph completion datasets extracted from Wikidata and Wikipedia that improve upon existing knowledge graph completion benchmarks in scope and level of difficulty. In terms of scope, CoDEx comprises three knowledge graphs varying in size and structure, multilingual descriptions of entities and relations, and tens of thousands of hard negative triples that are plausible but verified to be false. To characterize CoDEx, we contribute thorough empirical analyses and benchmarking experiments. First, we analyze each CoDEx dataset in terms of logical relation patterns. Next, we report baseline link prediction and triple classification results on CoDEx for five extensively tuned embedding models. Finally, we differentiate CoDEx from the popular FB15K-237 knowledge graph completion dataset by showing that CoDEx covers more diverse and interpretable content, and is a more difficult link prediction benchmark. Data, code, and pretrained models are available at //bit.ly/2EPbrJs.

圖 · 學成 · 知識圖譜 · FreeBASIC · 強化學習 ·

2018 年 1 月 8 日

DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning

Wenhan Xiong,Thien Hoang,William Yang Wang

We study the problem of learning to reason in large scale knowledge graphs (KGs). More specifically, we describe a novel reinforcement learning framework for learning multi-hop relational paths: we use a policy-based agent with continuous states based on knowledge graph embeddings, which reasons in a KG vector space by sampling the most promising relation to extend its path. In contrast to prior work, our approach includes a reward function that takes the accuracy, diversity, and efficiency into consideration. Experimentally, we show that our proposed method outperforms a path-ranking based algorithm and knowledge graph embedding methods on Freebase and Never-Ending Language Learning datasets.