
Hector A. Gonzalez,Jiaxin Huang,Florian Kelber,Khaleelulla Khan Nazeer,Tim Langer,Chen Liu,Matthias Lohrmann,Amirhossein Rostami,Mark Schöne,Bernhard Vogginger,Timo C. Wunderlich,Yexin Yan,Mahmoud Akl,Christian Mayr

from arxiv, Submitted at the Workshop on Machine Learning with New Compute Paradigms at NeurIPS 2023 (MLNPCP 2023)

The joint progress of artificial neural networks (ANNs) and domain-specific hardware accelerators such as GPUs and TPUs has taken over many domains of machine learning research. This development is accompanied by a rapid growth of the required computational demands for larger models and more data. Concurrently, emerging properties of foundation models such as in-context learning drive new opportunities for machine learning applications. However, the computational cost of such applications is a limiting factor of the technology in data centers, and more importantly in mobile devices and edge systems. To mitigate the energy footprint and non-trivial latency of contemporary systems, neuromorphic computing systems deeply integrate computational principles of neurobiological systems by leveraging low-power analog and digital technologies. SpiNNaker2 is a digital neuromorphic chip developed for scalable machine learning. The event-based and asynchronous design of SpiNNaker2 allows the composition of large-scale systems involving thousands of chips. This work features the operating principles of SpiNNaker2 systems, outlining the prototype of novel machine learning applications. These applications range from ANNs over bio-inspired spiking neural networks to generalized event-based neural networks. With the successful development and deployment of SpiNNaker2, we aim to facilitate the advancement of event-based and asynchronous algorithms for future generations of machine learning systems.

Related content

Machine Learning

Machine Learning is an international forum for research on computational approaches to learning. The journal publishes articles reporting substantive results on a wide range of learning methods applied to a variety of learning problems. It features papers that describe research on problems and methods, applications research, and issues of research methodology. Papers making claims about learning problems or methods provide solid support through empirical studies, theoretical analysis, or comparison to psychological phenomena. Applications papers show how to apply learning methods to solve important application problems. Research methodology papers improve how machine learning research is conducted. All papers describe the supporting evidence in ways that can be verified or replicated by other researchers. The papers also detail the learning component clearly and discuss assumptions regarding knowledge representation and the performance task. Official website:

Controller · Networking · Functional · massive MIMO · QoS

February 22, 2024

Joint AP-UE Association and Power Factor Optimization for Distributed Massive MIMO

Mohd Saif Ali Khan,Samar Agnihotri,Karthik R. M

The uplink sum-throughput of distributed massive multiple-input-multiple-output (mMIMO) networks depends largely on access point (AP)-user equipment (UE) association and power control. Both are important problems in their own right in distributed mMIMO networks: AP-UE association improves scalability and reduces the front-haul load of the network, while power control enhances system performance by mitigating interference and boosting the desired signals. Unlike previous studies, which focused primarily on addressing the AP-UE association or power control problems separately, this work addresses the uplink sum-throughput maximization problem in distributed mMIMO networks by solving the joint AP-UE association and power control problem, while maintaining Quality-of-Service (QoS) requirements for each UE. To improve scalability, we present an l1-penalty function that delicately balances the trade-off between spectral efficiency (SE) and front-haul signaling load. Our proposed methodology leverages fractional programming, Lagrangian dual formation, and penalty functions to provide an elegant and effective iterative solution with guaranteed convergence while meeting strict QoS criteria. Extensive numerical simulations validate the efficacy of the proposed technique for maximizing sum-throughput while considering the joint AP-UE association and power control problem, demonstrating its superiority over approaches that address these problems individually. Furthermore, the results show that the introduced penalty function can help us effectively control the maximum front-haul load for uplink distributed mMIMO systems.

Question Answering · Dataset · Reducible · Performer · state-of-the-art

February 22, 2024

PolQA: Polish Question Answering Dataset

Piotr Rybak,Piotr Przybyła,Maciej Ogrodniczuk

Recently proposed systems for open-domain question answering (OpenQA) require large amounts of training data to achieve state-of-the-art performance. However, data annotation is known to be time-consuming and therefore expensive to acquire. As a result, the appropriate datasets are available only for a handful of languages (mainly English and Chinese). In this work, we introduce and publicly release PolQA, the first Polish dataset for OpenQA. It consists of 7,000 questions, 87,525 manually labeled evidence passages, and a corpus of over 7,097,322 candidate passages. Each question is classified according to its formulation, type, as well as entity type of the answer. This resource allows us to evaluate the impact of different annotation choices on the performance of the QA system and propose an efficient annotation strategy that increases the passage retrieval accuracy@10 by 10.55 p.p. while reducing the annotation cost by 82%.
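The accuracy@10 figure quoted above is a retrieval metric. As a minimal sketch, assuming the common top-k definition (a question counts as a hit when at least one relevant passage appears among its first k retrieved passages), it could be computed as follows. The function name and the passage IDs are hypothetical and are not taken from the PolQA release.

```python
def accuracy_at_k(ranked_passages, relevant_passages, k=10):
    """Fraction of questions with at least one relevant passage in the top k.

    ranked_passages: one ranked list of passage IDs per question.
    relevant_passages: one set of relevant passage IDs per question.
    """
    hits = 0
    for ranking, relevant in zip(ranked_passages, relevant_passages):
        # A question is a hit if any of its top-k passages is relevant.
        if any(pid in relevant for pid in ranking[:k]):
            hits += 1
    return hits / len(ranked_passages)


# Toy data (hypothetical IDs): three questions, three retrieved passages each.
rankings = [["p3", "p7", "p1"], ["p9", "p2", "p4"], ["p5", "p6", "p8"]]
relevant = [{"p1"}, {"p4"}, {"p0"}]

top2 = accuracy_at_k(rankings, relevant, k=2)
top3 = accuracy_at_k(rankings, relevant, k=3)
```

On this toy data no question has a relevant passage in its top 2, while two of the three do in their top 3, so the metric rises as k grows.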

Optimizer · Networking · Neural Networks · GPU · Kernelization

February 22, 2024

MaxK-GNN: Towards Theoretical Speed Limits for Accelerating Graph Neural Networks Training

Hongwu Peng,Xi Xie,Kaustubh Shivdikar,MD Amit Hasan,Jiahui Zhao,Shaoyi Huang,Omer Khan,David Kaeli,Caiwen Ding

from arxiv, ASPLOS 2024 accepted publication

In the acceleration of deep neural network training, the GPU has become the mainstream platform. GPUs face substantial challenges on GNNs, such as workload imbalance and memory access irregularities, leading to underutilized hardware. Existing solutions such as PyG, DGL with cuSPARSE, and GNNAdvisor frameworks partially address these challenges but memory traffic is still significant. We argue that drastic performance improvements can only be achieved by the vertical optimization of algorithm and system innovations, rather than treating the speedup optimization as an "after-thought" (i.e., (i) given a GNN algorithm, designing an accelerator, or (ii) given hardware, mainly optimizing the GNN algorithm). In this paper, we present MaxK-GNN, an advanced high-performance GPU training system integrating algorithm and system innovation. (i) We introduce the MaxK nonlinearity and provide a theoretical analysis of MaxK nonlinearity as a universal approximator, and present the Compressed Balanced Sparse Row (CBSR) format, designed to store the data and index of the feature matrix after nonlinearity; (ii) We design a coalescing enhanced forward computation with row-wise product-based SpGEMM Kernel using CBSR for input feature matrix fetching and strategic placement of a sparse output accumulation buffer in shared memory; (iii) We develop an optimized backward computation with outer product-based and SSpMM Kernel. We conduct extensive evaluations of MaxK-GNN and report the end-to-end system run-time. Experiments show that MaxK-GNN system could approach the theoretical speedup limit according to Amdahl's law. We achieve comparable accuracy to SOTA GNNs, but at a significantly increased speed: 3.22/4.24 times speedup (vs. theoretical limits, 5.52/7.27 times) on Reddit compared to DGL and GNNAdvisor implementations.
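The abstract measures its speedups against limits derived from Amdahl's law. The sketch below states the law in its general form and then computes, from the four numbers quoted above, how much of each theoretical limit was reached. The function names are illustrative, and the sketch does not reproduce how the paper derived its limits.

```python
def amdahl_speedup(accelerated_fraction, local_speedup):
    """Overall speedup when a fraction of the runtime is sped up by local_speedup."""
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / local_speedup)


def amdahl_limit(accelerated_fraction):
    """Upper bound on the overall speedup as local_speedup grows without bound."""
    return 1.0 / (1.0 - accelerated_fraction)


# (achieved speedup, theoretical limit) pairs as quoted in the abstract.
reported = [(3.22, 5.52), (4.24, 7.27)]
fraction_of_limit = [achieved / limit for achieved, limit in reported]

# Generic illustration: speeding up half of the runtime by 2x.
example = amdahl_speedup(0.5, 2.0)
```

Both quoted pairs work out to roughly 58% of the stated limit, and the generic example shows that doubling the speed of half the runtime gives an overall speedup of only 4/3.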

NLP · Reducible · LLaMA · GPT-3.5 · Reviewer

February 21, 2024

SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization

Prakamya Mishra,Zonghai Yao,Parth Vashisht,Feiyun Ouyang,Beining Wang,Vidhi Dhaval Mody,Hong Yu

from arxiv, Equal contribution for the first two authors. arXiv admin note: text overlap with arXiv:2310.20033

Large Language Models (LLMs) such as GPT and Llama have demonstrated significant achievements in summarization tasks but struggle with factual inaccuracies, a critical issue in clinical NLP applications where errors could lead to serious consequences. To counter the high costs and limited availability of expert-annotated data for factual alignment, this study introduces an innovative pipeline that utilizes GPT-3.5 and GPT-4 to generate high-quality feedback aimed at enhancing factual consistency in clinical note summarization. Our research primarily focuses on edit feedback, mirroring the practical scenario in which medical professionals refine AI system outputs without the need for additional annotations. Despite GPT's proven expertise in various clinical NLP tasks, such as the Medical Licensing Examination, there is scant research on its capacity to deliver expert-level edit feedback for improving weaker LMs or LLMs generation quality. This work leverages GPT's advanced capabilities in clinical NLP to offer expert-level edit feedback. Through the use of two distinct alignment algorithms (DPO and SALT) based on GPT edit feedback, our goal is to reduce hallucinations and align closely with medical facts, endeavoring to narrow the divide between AI-generated content and factual accuracy. This highlights the substantial potential of GPT edits in enhancing the alignment of clinical factuality.

LORA · Agent · INTERACT · Performer · Task-Oriented Dialogue System

February 21, 2024

Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent

Xiaoyan Yu,Tongxu Luo,Yifan Wei,Fangyu Lei,Yiming Huang,Peng Hao,Liehuang Zhu

Large Language Models (LLMs) have revolutionized open-domain dialogue agents but encounter challenges in multi-character role-playing (MCRP) scenarios. To address the issue, we present Neeko, an innovative framework designed for efficient multiple characters imitation. Unlike existing methods, Neeko employs a dynamic low-rank adapter (LoRA) strategy, enabling it to adapt seamlessly to diverse characters. Our framework breaks down the role-playing process into agent pre-training, multiple characters playing, and character incremental learning, effectively handling both seen and unseen roles. This dynamic approach, coupled with distinct LoRA blocks for each character, enhances Neeko's adaptability to unique attributes, personalities, and speaking patterns. As a result, Neeko demonstrates superior performance in MCRP over most existing methods, offering more engaging and versatile user interaction experiences. Code and data are available at //github.com/weiyifan1023/Neeko.
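To make the idea of a separate LoRA block per character concrete, here is a minimal sketch of a generic low-rank adapter: a frozen weight matrix W is shared, and each character adds its own low-rank update B·A. This is the standard LoRA formulation and not Neeko's implementation. The character names, matrix values, and the `lora_forward` helper are all hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, scale=1.0):
    """Compute W x + scale * B (A x), the usual low-rank adapter update."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


# Shared frozen weight (2x2 identity) and one rank-1 adapter per character.
W = [[1.0, 0.0], [0.0, 1.0]]
adapters = {
    "character_a": {"A": [[1.0, 1.0]], "B": [[0.5], [0.0]]},
    "character_b": {"A": [[1.0, -1.0]], "B": [[0.0], [2.0]]},
}

x = [2.0, 3.0]
out_a = lora_forward(W, adapters["character_a"]["A"], adapters["character_a"]["B"], x)
out_b = lora_forward(W, adapters["character_b"]["A"], adapters["character_b"]["B"], x)
```

Switching characters only swaps the small A and B matrices, so the same input yields different outputs while the shared weight stays untouched.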

MoDELS · Integration · Prompt · Pivotal (company) · Performer

February 21, 2024

BBA: Bi-Modal Behavioral Alignment for Reasoning with Large Vision-Language Models

Xueliang Zhao,Xinting Huang,Tingchen Fu,Qintong Li,Shansan Gong,Lemao Liu,Wei Bi,Lingpeng Kong

from arxiv, Preprint

Multimodal reasoning stands as a pivotal capability for large vision-language models (LVLMs). The integration with Domain-Specific Languages (DSL), offering precise visual representations, equips these models with the opportunity to execute more accurate reasoning in complex and professional domains. However, the vanilla Chain-of-Thought (CoT) prompting method faces challenges in effectively leveraging the unique strengths of visual and DSL representations, primarily due to their differing reasoning mechanisms. Additionally, it often falls short in addressing critical steps in multi-step reasoning tasks. To mitigate these challenges, we introduce the Bi-Modal Behavioral Alignment (BBA) prompting method, designed to maximize the potential of DSL in augmenting complex multi-modal reasoning tasks. This method begins by guiding LVLMs to create separate reasoning chains for visual and DSL representations. Subsequently, it aligns these chains by addressing any inconsistencies, thus achieving a cohesive integration of behaviors from different modalities. Our experiments demonstrate that BBA substantially improves the performance of GPT-4V(ision) on geometry problem solving (28.34% → 34.22%), chess positional advantage prediction (42.08% → 46.99%) and molecular property prediction (77.47% → 83.52%).

Graph · SimPLe · Dataset · Performer · Networking

February 20, 2024

Hybrid Graph: A Unified Graph Representation with Datasets and Benchmarks for Complex Graphs

Zehui Li,Xiangyu Zhao,Mingzhu Shen,Guy-Bart Stan,Pietro Liò,Yiren Zhao

from arxiv, 16 pages, 5 figures, 11 tables

Graphs are widely used to encapsulate a variety of data formats, but real-world networks often involve complex node relations beyond only being pairwise. While hypergraphs and hierarchical graphs have been developed and employed to account for the complex node relations, they cannot fully represent these complexities in practice. Additionally, though many Graph Neural Networks (GNNs) have been proposed for representation learning on higher-order graphs, they are usually only evaluated on simple graph datasets. Therefore, there is a need for a unified modelling of higher-order graphs, and a collection of comprehensive datasets with an accessible evaluation framework to fully understand the performance of these algorithms on complex graphs. In this paper, we introduce the concept of hybrid graphs, a unified definition for higher-order graphs, and present the Hybrid Graph Benchmark (HGB). HGB contains 23 real-world hybrid graph datasets across various domains such as biology, social media, and e-commerce. Furthermore, we provide an extensible evaluation framework and a supporting codebase to facilitate the training and evaluation of GNNs on HGB. Our empirical study of existing GNNs on HGB reveals various research opportunities and gaps, including (1) evaluating the actual performance improvement of hypergraph GNNs over simple graph GNNs; (2) comparing the impact of different sampling strategies on hybrid graph learning methods; and (3) exploring ways to integrate simple graph and hypergraph information. We make our source code and full datasets publicly available at //zehui127.github.io/hybrid-graph-benchmark/.

Image Captioning · Optimizer · MoDELS · Identifiable · Differential Evolution

February 20, 2024

AICAttack: Adversarial Image Captioning Attack with Attention-Based Optimization

Jiyao Li,Mingze Ni,Yifei Dong,Tianqing Zhu,Wei Liu

Recent advances in deep learning research have shown remarkable achievements across many tasks in computer vision (CV) and natural language processing (NLP). At the intersection of CV and NLP is the problem of image captioning, where the related models' robustness against adversarial attacks has not been well studied. In this paper, we present a novel adversarial attack strategy, which we call AICAttack (Attention-based Image Captioning Attack), designed to attack image captioning models through subtle perturbations on images. Operating within a black-box attack scenario, our algorithm requires no access to the target model's architecture, parameters, or gradient information. We introduce an attention-based candidate selection mechanism that identifies the optimal pixels to attack, followed by Differential Evolution (DE) for perturbing pixels' RGB values. We demonstrate AICAttack's effectiveness through extensive experiments on benchmark datasets with multiple victim models. The experimental results demonstrate that our method surpasses current leading-edge techniques by effectively distributing the alignment and semantics of words in the output.
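The abstract names Differential Evolution (DE) as the black-box optimizer for pixel RGB values. Below is a minimal sketch of the classic DE/rand/1/bin scheme, which needs only fitness evaluations and no gradients. It is not AICAttack itself: the attention-based pixel selection is omitted, and `toy_fitness` is a hypothetical stand-in for the score a captioning model would return.

```python
import random


def differential_evolution(fitness, bounds, pop_size=20, generations=50,
                           f=0.5, cr=0.9, seed=0):
    """Minimize fitness over box bounds with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate is mutated
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == forced:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the valid range
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = fitness(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_fitness(rgb):
    """Hypothetical objective: squared distance to an arbitrary target colour."""
    target = (10.0, 200.0, 30.0)
    return sum((x - t) ** 2 for x, t in zip(rgb, target))


best_rgb, best_score = differential_evolution(toy_fitness, [(0.0, 255.0)] * 3)
```

In an actual attack the fitness would query the victim model with the perturbed image, which is why a gradient-free method like DE suits the black-box setting described above.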

cache · Language Modeling · Weight · Large Language Model · MoDELS

February 20, 2024

WKVQuant: Quantizing Weight and Key/Value Cache for Large Language Models Gains More

Yuxuan Yue,Zhihang Yuan,Haojie Duanmu,Sifan Zhou,Jianlong Wu,Liqiang Nie

from arxiv, First work to exclusively quantize weight and Key/Value cache for large language models

Large Language Models (LLMs) face significant deployment challenges due to their substantial memory requirements and the computational demands of the auto-regressive text generation process. This paper addresses these challenges by focusing on the quantization of LLMs, a technique that reduces memory consumption by converting model parameters and activations into low-bit integers. We critically analyze the existing quantization approaches, identifying their limitations in balancing the accuracy and efficiency of the quantized LLMs. To advance beyond these limitations, we propose WKVQuant, a PTQ framework especially designed for quantizing weights and the key/value (KV) cache of LLMs. Specifically, we incorporate past-only quantization to improve the computation of attention. Additionally, we introduce a two-dimensional quantization strategy to handle the distribution of the KV cache, along with a cross-block reconstruction regularization for parameter optimization. Experiments show that WKVQuant achieves almost comparable memory savings to weight-activation quantization, while also approaching the performance of weight-only quantization.
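As a rough illustration of what converting values into low-bit integers means, the sketch below applies plain min-max uniform quantization to a short list of floats and then restores them. This is a generic textbook scheme, not WKVQuant's past-only or two-dimensional strategy, and the helper names and sample values are hypothetical.

```python
def quantize(values, bits=4):
    """Map floats to integer codes in [0, 2**bits - 1] using min-max scaling."""
    lo, hi = min(values), max(values)
    levels = 2 ** bits - 1
    # Guard against a constant input, where hi == lo would give a zero scale.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate floats from integer codes."""
    return [c * scale + lo for c in codes]


weights = [-0.8, -0.1, 0.0, 0.3, 1.2]
codes, scale, offset = quantize(weights, bits=4)
restored = dequantize(codes, scale, offset)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each value is stored as a 4-bit code plus a shared scale and offset, and the rounding error per value is bounded by half the scale. That bound is the accuracy cost traded for the memory savings the abstract refers to.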

Large Language Model · Language Modeling · MoDELS · Scenario · Reviewer

February 20, 2024

MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning

Xiangru Tang,Anni Zou,Zhuosheng Zhang,Ziming Li,Yilun Zhao,Xingyao Zhang,Arman Cohan,Mark Gerstein

Large language models (LLMs), despite their remarkable progress across various general domains, encounter significant barriers in medicine and healthcare. This field faces unique challenges such as domain-specific terminologies and reasoning over specialized knowledge. To address these issues, we propose a novel Multi-disciplinary Collaboration (MC) framework for the medical domain that leverages LLM-based agents in a role-playing setting that participate in a collaborative multi-round discussion, thereby enhancing LLM proficiency and reasoning capabilities. This training-free framework encompasses five critical steps: gathering domain experts, proposing individual analyses, summarising these analyses into a report, iterating over discussions until a consensus is reached, and ultimately making a decision. Our work focuses on the zero-shot setting, which is applicable in real-world scenarios. Experimental results on nine datasets (MedQA, MedMCQA, PubMedQA, and six subtasks from MMLU) establish that our proposed MC framework excels at mining and harnessing the medical expertise within LLMs, as well as extending its reasoning abilities. Our code can be found at //github.com/gersteinlab/MedAgents.