Zhihang Yuan,Yuzhang Shang,Yang Zhou,Zhen Dong,Zhe Zhou,Chenhao Xue,Bingzhe Wu,Zhikai Li,Qingyi Gu,Yong Jae Lee,Yan Yan,Beidi Chen,Guangyu Sun,Kurt Keutzer

The field of efficient Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges. Although the field has expanded and is vibrant, there hasn't been a concise framework that analyzes the various methods of LLM inference to provide a clear understanding of this domain. Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on the roofline model for systematic analysis of LLM inference techniques. This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems, such as why LLMs are memory-bound, how much memory and computation they need, and how to choose the right hardware. We systematically collate the latest advancements in efficient LLM inference, covering crucial areas such as model compression (e.g., Knowledge Distillation and Quantization), algorithm improvements (e.g., Early Exit and Mixture-of-Experts), and both hardware and system-level enhancements. Our survey stands out by analyzing these methods with the roofline model, helping us understand their impact on memory access and computation. This distinctive approach not only showcases the current research landscape but also delivers valuable insights for practical implementation, positioning our work as an indispensable resource for researchers new to the field as well as for those seeking to deepen their understanding of efficient LLM deployment. The analysis tool, LLM-Viewer, is open-sourced.
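The roofline reasoning mentioned in the abstract can be illustrated with a minimal sketch. This is not LLM-Viewer's code: the `Hardware` numbers, the helper names, and the simplified cost model for a single linear layer (weights read once, activations read and written once, 2 bytes per value) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hardware:
    peak_flops: float     # peak compute throughput, FLOP/s (hypothetical value below)
    mem_bandwidth: float  # peak memory bandwidth, bytes/s (hypothetical value below)


def roofline(hw, flops, bytes_moved):
    """Return (arithmetic intensity, attainable FLOP/s, bound) for one operator."""
    intensity = flops / bytes_moved            # FLOP per byte of memory traffic
    ridge = hw.peak_flops / hw.mem_bandwidth   # intensity where the two roofs meet
    attainable = min(hw.peak_flops, intensity * hw.mem_bandwidth)
    bound = "memory-bound" if intensity < ridge else "compute-bound"
    return intensity, attainable, bound


def linear_layer_cost(tokens, d_in, d_out, bytes_per_value=2):
    """Assumed cost model for y = x @ W with x of shape (tokens, d_in)."""
    flops = 2 * tokens * d_in * d_out          # one multiply and one add per weight per token
    values = d_in * d_out + tokens * d_in + tokens * d_out  # weights + inputs + outputs
    return flops, bytes_per_value * values


# Hypothetical accelerator: 100 TFLOP/s and 1 TB/s, so the ridge point is 100 FLOP/byte.
hw = Hardware(peak_flops=100e12, mem_bandwidth=1e12)

decode = roofline(hw, *linear_layer_cost(tokens=1, d_in=4096, d_out=4096))
prefill = roofline(hw, *linear_layer_cost(tokens=2048, d_in=4096, d_out=4096))

print("decode :", decode[2], f"(intensity {decode[0]:.2f} FLOP/byte)")
print("prefill:", prefill[2], f"(intensity {prefill[0]:.2f} FLOP/byte)")
```

Under these assumptions, single-token decoding performs roughly one FLOP per byte moved and falls far below the ridge point, which is one way to see why LLM decoding tends to be memory-bound, while a long prefill reuses each weight across many tokens and reaches the compute roof.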

Related content

Large Language Models

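Large language models are deep learning models trained on massive amounts of text data. They can not only generate natural-language text but also understand its meaning in depth and handle a variety of natural language tasks, such as text summarization, question answering, and translation. In 2023, large language models and their applications in artificial intelligence became a global research hotspot. Their growth in scale has been especially striking, with parameter counts rising from just over a billion initially to one trillion today. The larger parameter counts let models capture the subtleties of human language more finely and understand its complexity more deeply. Over the past year, large language models have improved markedly in absorbing new knowledge, decomposing complex tasks, and aligning images with text. As the technology matures, its range of applications will keep expanding, providing more intelligent and personalized services and further improving how people live and work.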

Multimodal · MoDELS · Performer · Task-oriented dialogue systems · HTTPS

June 13, 2024

AlignMMBench: Evaluating Chinese Multimodal Alignment in Large Vision-Language Models

Yuhang Wu,Wenmeng Yu,Yean Cheng,Yan Wang,Xiaohan Zhang,Jiazheng Xu,Ming Ding,Yuxiao Dong

Evaluating the alignment capabilities of large Vision-Language Models (VLMs) is essential for determining their effectiveness as helpful assistants. However, existing benchmarks primarily focus on basic abilities using nonverbal methods, such as yes-no and multiple-choice questions. In this paper, we address this gap by introducing AlignMMBench, a comprehensive alignment benchmark specifically designed for emerging Chinese VLMs. This benchmark is meticulously curated from real-world scenarios and Chinese Internet sources, encompassing thirteen specific tasks across three categories, and includes both single-turn and multi-turn dialogue scenarios. Incorporating a prompt rewrite strategy, AlignMMBench encompasses 1,054 images and 4,978 question-answer pairs. To facilitate the evaluation pipeline, we propose CritiqueVLM, a rule-calibrated evaluator that exceeds GPT-4's evaluation ability. Finally, we report the performance of representative VLMs on AlignMMBench, offering insights into the capabilities and limitations of different VLM architectures. All evaluation code and data are available at //alignmmbench.github.io.

MoDELS · Weight · IR · CASES · Performer

June 12, 2024

Diffusion Soup: Model Merging for Text-to-Image Diffusion Models

Benjamin Biggs,Arjun Seshadri,Yang Zou,Achin Jain,Aditya Golatkar,Yusheng Xie,Alessandro Achille,Ashwin Swaminathan,Stefano Soatto

We present Diffusion Soup, a compartmentalization method for Text-to-Image Generation that averages the weights of diffusion models trained on sharded data. By construction, our approach enables training-free continual learning and unlearning with no additional memory or inference costs, since models corresponding to data shards can be added or removed by re-averaging. We show that Diffusion Soup samples from a point in weight space that approximates the geometric mean of the distributions of constituent datasets, which offers anti-memorization guarantees and enables zero-shot style mixing. Empirically, Diffusion Soup outperforms a paragon model trained on the union of all data shards and achieves a 30% improvement in Image Reward (.34 $\to$ .44) on domain sharded data, and a 59% improvement in IR (.37 $\to$ .59) on aesthetic data. In both cases, souping also prevails in TIFA score (respectively, 85.5 $\to$ 86.5 and 85.6 $\to$ 86.8). We demonstrate robust unlearning -- removing any individual domain shard only lowers performance by 1% in IR (.45 $\to$ .44) -- and validate our theoretical insights on anti-memorization using real data. Finally, we showcase Diffusion Soup's ability to blend the distinct styles of models finetuned on different shards, resulting in the zero-shot generation of hybrid styles.
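The re-averaging idea described above can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: it assumes uniform averaging, represents each model as a plain dictionary mapping a parameter name to a flat list of floats, and uses the hypothetical helper name `soup`.

```python
def soup(shard_models):
    """Uniformly average parameters of models that share the same keys and shapes."""
    if not shard_models:
        raise ValueError("need at least one model to average")
    n = len(shard_models)
    averaged = {}
    for name in shard_models[0]:
        # zip(*...) walks the same parameter position across all models.
        columns = zip(*(model[name] for model in shard_models))
        averaged[name] = [sum(values) / n for values in columns]
    return averaged


# Toy "models", one per data shard (values are made up for illustration).
shard_a = {"w": [1.0, 2.0]}
shard_b = {"w": [3.0, 4.0]}
shard_c = {"w": [5.0, 9.0]}

full_soup = soup([shard_a, shard_b, shard_c])   # all three shards included
without_c = soup([shard_a, shard_b])            # "unlearn" shard C by re-averaging

print(full_soup)
print(without_c)
```

Adding or removing a shard only changes which models enter the average, so no retraining is involved, and the merged model has the same size as any single constituent, which matches the abstract's claim of no additional memory or inference cost.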

Functional · Learning · INFORMS · contrastive · Code

June 12, 2024

Statement-Level Vulnerability Detection: Learning Vulnerability Patterns Through Information Theory and Contrastive Learning

Van Nguyen,Trung Le,Chakkrit Tantithamthavorn,Michael Fu,John Grundy,Hung Nguyen,Seyit Camtepe,Paul Quirk,Dinh Phung

Software vulnerabilities are a serious and crucial concern. Typically, in a program or function consisting of hundreds or thousands of source code statements, there are only a few statements causing the corresponding vulnerabilities. Most current approaches to vulnerability labelling are done on a function or program level by experts with the assistance of machine learning tools. Extending this approach to the code statement level is much more costly and time-consuming and remains an open problem. In this paper, we propose a novel end-to-end deep learning-based approach to identify the vulnerability-relevant code statements of a specific function. Inspired by the specific structures observed in real-world vulnerable code, we first leverage mutual information for learning a set of latent variables representing the relevance of the source code statements to the corresponding function's vulnerability. We then propose novel clustered spatial contrastive learning in order to further improve the representation learning and the robust selection process of vulnerability-relevant code statements. Experimental results on real-world datasets of 200k+ C/C++ functions show the superiority of our method over other state-of-the-art baselines. In general, our method obtains a higher performance in VCP, VCA, and Top-10 ACC measures of between 3% and 14% over the baselines when running on real-world datasets in an unsupervised setting. Our released source code samples are publicly available at //github.com/vannguyennd/livuitcl.

Performer · Networking · Optimizer · Performance · Scaling

June 11, 2024

Cooperative ISAC Networks: Performance Analysis, Scaling Laws and Optimization

Kaitao Meng,Christos Masouros,Athina P. Petropulu,Lajos Hanzo

from arxiv, 13 pages, 10 figures, this work has been submitted to IEEE for possible publication. arXiv admin note: text overlap with arXiv:2403.20228

Integrated sensing and communication (ISAC) networks are investigated with the objective of effectively balancing the sensing and communication (S&C) performance at the network level. Through the simultaneous utilization of coordinated multi-point (CoMP) joint transmission and distributed multiple-input multiple-output (MIMO) radar techniques, we propose an innovative networked ISAC scheme, where multiple transceivers are employed for collaboratively enhancing the S&C services. Then, the potent tool of stochastic geometry is exploited for characterizing the S&C performance, which allows us to illuminate the key cooperative dependencies in the ISAC network and optimize salient network-level parameters. Remarkably, the derived Cramér-Rao lower bound (CRLB) expression of the localization accuracy unveils a significant finding: deploying $N$ ISAC transceivers yields an enhanced average cooperative sensing performance across the entire network, in accordance with the $\ln^2 N$ scaling law. Crucially, this scaling law is less pronounced than the $N^2$ performance enhancement achieved when the transceivers are equidistant from the target. The gap arises primarily from the substantial path loss of the distant base stations (BSs), which reduces their contribution to the sensing performance gain. Moreover, we derive a tight expression of the communication rate, and present a low-complexity algorithm to determine the optimal cooperative cluster size. Based on our expression derived for the S&C performance, we formulate the optimization problem of maximizing the network performance in terms of two joint S&C metrics. To this end, we jointly optimize the cooperative BS cluster sizes and the transmit power to strike a flexible tradeoff between the S&C performance.
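To give a feel for how far apart the two scaling laws named in the abstract are, the sketch below tabulates the growth terms only. The proportionality constants from the paper's CRLB analysis are not given in the abstract and are deliberately omitted, so the numbers are relative growth factors, not actual sensing gains; the function names are hypothetical.

```python
import math


def log_squared_growth(n):
    """Growth term for the networked case named in the abstract: (ln N)^2."""
    return math.log(n) ** 2


def equidistant_growth(n):
    """Growth term when all transceivers are equidistant from the target: N^2."""
    return float(n) ** 2


# Compare the two growth terms for a few cooperative cluster sizes.
rows = [(n, log_squared_growth(n), equidistant_growth(n)) for n in (2, 4, 16, 64)]
for n, networked, equidistant in rows:
    print(f"N={n:3d}  ln^2 N = {networked:7.3f}   N^2 = {equidistant:7.0f}")
```

The table makes the abstract's point concrete: adding distant transceivers still helps, but with sharply diminishing returns compared with the idealized equidistant case.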

Multimodal · MoDELS · Language modeling · Large language models · Notability

June 9, 2024

LLMs Meet Multimodal Generation and Editing: A Survey

Yingqing He,Zhaoyang Liu,Jingye Chen,Zeyue Tian,Hongyu Liu,Xiaowei Chi,Runtao Liu,Ruibin Yuan,Yazhou Xing,Wenhai Wang,Jifeng Dai,Yong Zhang,Wei Xue,Qifeng Liu,Yike Guo,Qifeng Chen

from arxiv, 52 Pages with 16 Figures, 12 Tables, and 545 References. GitHub Repository at: //github.com/YingqingHe/Awesome-LLMs-meet-Multimodal-Generation

With the recent advancement in large language models (LLMs), there is a growing interest in combining LLMs with multimodal learning. Previous surveys of multimodal large language models (MLLMs) mainly focus on multimodal understanding. This survey elaborates on multimodal generation and editing across various domains, comprising image, video, 3D, and audio. Specifically, we summarize the notable advancements with milestone works in these fields and categorize these studies into LLM-based and CLIP/T5-based methods. Then, we summarize the various roles of LLMs in multimodal generation and exhaustively investigate the critical technical components behind these methods and the multimodal datasets utilized in these studies. Additionally, we dig into tool-augmented multimodal agents that can leverage existing generative models for human-computer interaction. Lastly, we discuss the advancements in the generative AI safety field, investigate emerging applications, and discuss future prospects. Our work provides a systematic and insightful overview of multimodal generation and processing, which is expected to advance the development of Artificial Intelligence for Generative Content (AIGC) and world models. A curated list of all related papers can be found at //github.com/YingqingHe/Awesome-LLMs-meet-Multimodal-Generation

Link prediction · Networking · GPU · MoDELS · Neural Networks

June 7, 2024

GENIE: Watermarking Graph Neural Networks for Link Prediction

Venkata Sai Pranav Bachina,Ankit Gangwal,Aaryan Ajay Sharma,Charu Sharma

from arxiv, 20 pages, 12 figures

Graph Neural Networks (GNNs) have advanced the field of machine learning by utilizing graph-structured data, which is ubiquitous in the real world. GNNs have applications in various fields, ranging from social network analysis to drug discovery. GNN training is strenuous, requiring significant computational resources and human expertise. This makes a trained GNN an indispensable Intellectual Property (IP) for its owner. Recent studies have shown GNNs to be vulnerable to model-stealing attacks, which raises concerns over IP rights protection. Watermarking has been shown to be effective at protecting the IP of a GNN model. Existing efforts to develop a watermarking scheme for GNNs have only focused on the node classification and the graph classification tasks. To the best of our knowledge, we introduce the first-ever watermarking scheme for GNNs tailored to the Link Prediction (LP) task. We call our proposed watermarking scheme GENIE (watermarking Graph nEural Networks for lInk prEdiction). We design GENIE using a novel backdoor attack to create a trigger set for two key methods of LP: (1) node representation-based and (2) subgraph-based. In GENIE, the watermark is embedded into the GNN model by training it on both the trigger set and a modified training set, resulting in a watermarked GNN model. To assess a suspect model, we verify the watermark against the trigger set. We extensively evaluate GENIE across 3 model architectures (i.e., SEAL, GCN, and GraphSAGE) and 7 real-world datasets. Furthermore, we validate the robustness of GENIE against 11 state-of-the-art watermark removal techniques and 3 model extraction attacks. We also demonstrate that GENIE is robust against ownership piracy attacks. Our ownership demonstration scheme statistically guarantees both False Positive Rate (FPR) and False Negative Rate (FNR) to be less than $10^{-6}$.

Robustness · FAST · Rank · Optimizer · Approximation

June 6, 2024

Robust Blockwise Random Pivoting: Fast and Accurate Adaptive Interpolative Decomposition

Yijun Dong,Chao Chen,Per-Gunnar Martinsson,Katherine Pearce

The interpolative decomposition (ID) aims to construct a low-rank approximation formed by a basis consisting of row/column skeletons in the original matrix and a corresponding interpolation matrix. This work explores fast and accurate ID algorithms from five essential perspectives for empirical performance: (a) skeleton complexity that measures the minimum possible ID rank for a given low-rank approximation error, (b) asymptotic complexity in FLOPs, (c) parallelizability of the computational bottleneck as matrix-matrix multiplications, (d) error-revealing property that enables automatic rank detection for given error tolerances without prior knowledge of target ranks, (e) ID-revealing property that ensures efficient construction of the optimal interpolation matrix after selecting the skeletons. While a broad spectrum of algorithms have been developed to optimize parts of the aforementioned perspectives, practical ID algorithms proficient in all perspectives remain absent. To fill in the gap, we introduce robust blockwise random pivoting (RBRP) that is parallelizable, error-revealing, and exactly ID-revealing, with comparable skeleton and asymptotic complexities to the best existing ID algorithms in practice. Through extensive numerical experiments on various synthetic and natural datasets, we demonstrate the appealing empirical performance of RBRP from the five perspectives above, as well as the robustness of RBRP to adversarial inputs.

Comprehensibility · Identifiable · TOOLS · state-of-the-art · HTTPS

June 5, 2024

Multi-Modal and Multi-Agent Systems Meet Rationality: A Survey

Bowen Jiang,Yangxinyu Xie,Xiaomeng Wang,Weijie J. Su,Camillo J. Taylor,Tanwi Mallick

Rationality is the quality of being guided by reason, characterized by logical thinking and decision-making that align with evidence and logical rules. This quality is essential for effective problem-solving, as it ensures that solutions are well-founded and systematically derived. Despite the advancements of large language models (LLMs) in generating human-like text with remarkable accuracy, they present biases inherited from the training data, inconsistency across different contexts, and difficulty understanding complex scenarios involving multiple layers of context. Therefore, recent research attempts to leverage the strength of multiple agents working collaboratively with various types of data and tools for enhanced consistency and reliability. To that end, this paper aims to understand whether multi-modal and multi-agent systems are advancing toward rationality by surveying the state-of-the-art works, identifying advancements over single-agent and single-modal systems in terms of rationality, and discussing open problems and future directions. We maintain an open repository at //github.com/bowen-upenn/MMMA_Rationality.

Graph · Knowledge graph · Knowledge representation · Machine Learning · Processing (programming language)

December 31, 2021

What is Event Knowledge Graph: A Survey

Saiping Guan,Xueqi Cheng,Long Bai,Fujun Zhang,Zixuan Li,Yutao Zeng,Xiaolong Jin,Jiafeng Guo

Besides entity-centric knowledge, usually organized as a Knowledge Graph (KG), events are also an essential kind of knowledge in the world, which has triggered the emergence of event-centric knowledge representation forms such as the Event KG (EKG). EKG plays an increasingly important role in many machine learning and artificial intelligence applications, such as intelligent search, question-answering, recommendation, and text generation. This paper provides a comprehensive survey of EKG from history, ontology, instance, and application views. Specifically, to characterize EKG thoroughly, we focus on its history, definitions, schema induction, acquisition, related representative graphs/systems, and applications. The development processes and trends are studied therein. We further summarize prospective directions to facilitate future research on EKG.

Learned · Deep learning · Identifiable · MoDELS · Object tracking

July 31, 2019

Deep Learning in Video Multi-Object Tracking: A Survey

Gioele Ciaparrone,Francisco Luque Sánchez,Siham Tabik,Luigi Troiano,Roberto Tagliaferri,Francisco Herrera

from arxiv, New in v2: corrected typos and various minor mistakes. Submitted to Neurocomputing. Main text: 25 pages, 5 figures, 6 tables. Summary table in appendix at the end of the paper

The problem of Multiple Object Tracking (MOT) consists in following the trajectory of different objects in a sequence, usually a video. In recent years, with the rise of Deep Learning, the algorithms that provide a solution to this problem have benefited from the representational power of deep models. This paper provides a comprehensive survey on works that employ Deep Learning models to solve the task of MOT on single-camera videos. Four main steps in MOT algorithms are identified, and an in-depth review of how Deep Learning was employed in each one of these stages is presented. A complete experimental comparison of the presented works on the three MOTChallenge datasets is also provided, identifying a number of similarities among the top-performing methods and presenting some possible future research directions.