爱琴海论坛视频播放三免费-动漫AV观看网站不卡无码

Large Language Models (LLMs) have a natural role in answering complex queries about data streams, but the high computational cost of LLM inference makes them infeasible in many such tasks. We propose online cascade learning, the first approach to addressing this challenge. The objective here is to learn a "cascade" of models, starting with lower-capacity models (such as logistic regressors) and ending with a powerful LLM, along with a deferral policy that determines the model that is used on a given input. We formulate the task of learning cascades online as an imitation-learning problem and give a no-regret algorithm for the problem. Experimental results across four benchmarks show that our method parallels LLMs in accuracy while cutting down inference costs by as much as 90%, underscoring its efficacy and adaptability in stream processing.

相關內容

級聯

關注 0

層 · Learning · MoDELS · 聯邦學習 · AIM ·

2024 年 3 月 20 日

Exploring the Privacy-Energy Consumption Tradeoff for Split Federated Learning

Joohyung Lee,Mohamed Seif,Jungchan Cho,H. Vincent Poor

from arxiv, 10 pages, 5 figures

Split Federated Learning (SFL) has recently emerged as a promising distributed learning technology, leveraging the strengths of both federated and split learning. It emphasizes the advantages of rapid convergence while addressing privacy concerns. As a result, this innovation has received significant attention from both industry and academia. However, since the model is split at a specific layer, known as a cut layer, into both client-side and server-side models for the SFL, the choice of the cut layer in SFL can have a substantial impact on the energy consumption of clients and their privacy, as it influences the training burden and the output of the client-side models. In this article, we provide a comprehensive overview of the SFL process and thoroughly analyze energy consumption and privacy. This analysis considers the influence of various system parameters on the cut layer selection strategy. Additionally, we provide an illustrative example of the cut layer selection, aiming to minimize clients' risk of reconstructing the raw data at the server while sustaining energy consumption within the required energy budget, which involves trade-offs. Finally, we address open challenges in this field. These directions represent promising avenues for future research and development.

Processing（編程語言） · 估計/估計量 · 統計量 · Networking · 訓練數據 ·

2024 年 3 月 19 日

Regularised Spectral Estimation for High Dimensional Point Processes

Carla Pinkney,Carolina Euan,Alex Gibberd,Ali Shojaie

Advances in modern technology have enabled the simultaneous recording of neural spiking activity, which statistically can be represented by a multivariate point process. We characterise the second order structure of this process via the spectral density matrix, a frequency domain equivalent of the covariance matrix. In the context of neuronal analysis, statistics based on the spectral density matrix can be used to infer connectivity in the brain network between individual neurons. However, the high-dimensional nature of spike train data mean that it is often difficult, or at times impossible, to compute these statistics. In this work, we discuss the importance of regularisation-based methods for spectral estimation, and propose novel methodology for use in the point process setting. We establish asymptotic properties for our proposed estimators and evaluate their performance on synthetic data simulated from multivariate Hawkes processes. Finally, we apply our methodology to neuroscience spike train data in order to illustrate its ability to infer connectivity in the brain network.

Performer · 分解的 · 矩陣乘積 · 可約的 · AIM ·

2024 年 3 月 19 日

Enhanced Scalability in Assessing Quantum Integer Factorization Performance

Junseo Lee

from arxiv, 13 pages, 3 figures, 2 tables

With the advancement of quantum technologies, there is a potential threat to traditional encryption systems based on integer factorization. Therefore, developing techniques for accurately measuring the performance of associated quantum algorithms is crucial, as it can provide insights into the practical feasibility from the current perspective. In this chapter, we aim to analyze the time required for integer factorization tasks using Shor's algorithm within a gate-based quantum circuit simulator of the matrix product state type. Additionally, we observe the impact of parameter pre-selection in Shor's algorithm. Specifically, this pre-selection is expected to increase the success rate of integer factorization by reducing the number of iterations and facilitating performance measurement under fixed conditions, thus enabling scalable performance evaluation even on real quantum hardware.

大語言模型 · Agent · TEAM · Learning · Extensibility ·

2024 年 3 月 19 日

Embodied LLM Agents Learn to Cooperate in Organized Teams

Xudong Guo,Kaixuan Huang,Jiale Liu,Wenhui Fan,Natalia Vélez,Qingyun Wu,Huazheng Wang,Thomas L. Griffiths,Mengdi Wang

Large Language Models (LLMs) have emerged as integral tools for reasoning, planning, and decision-making, drawing upon their extensive world knowledge and proficiency in language-related tasks. LLMs thus hold tremendous potential for natural language interaction within multi-agent systems to foster cooperation. However, LLM agents tend to over-report and comply with any instruction, which may result in information redundancy and confusion in multi-agent cooperation. Inspired by human organizations, this paper introduces a framework that imposes prompt-based organization structures on LLM agents to mitigate these problems. Through a series of experiments with embodied LLM agents and human-agent collaboration, our results highlight the impact of designated leadership on team efficiency, shedding light on the leadership qualities displayed by LLM agents and their spontaneous cooperative behaviors. Further, we harness the potential of LLMs to propose enhanced organizational prompts, via a Criticize-Reflect process, resulting in novel organization structures that reduce communication costs and enhance team efficiency.

多峰值 · INFORMS · Extensibility · Integration · 目標函數 ·

2024 年 3 月 19 日

An Aligning and Training Framework for Multimodal Recommendations

Yifan Liu,Kangning Zhang,Xiangyuan Ren,Yanhua Huang,Jiarui Jin,Yingjie Qin,Ruilong Su,Ruiwen Xu,Weinan Zhang

from arxiv, 11 pages

With the development of multimedia applications, multimodal recommendations are playing an essential role, as they can leverage rich contexts beyond user interactions. Existing methods mainly regard multimodal information as an auxiliary, using them to help learn ID features; however, there exist semantic gaps among multimodal content features and ID features, for which directly using multimodal information as an auxiliary would lead to misalignment in representations of users and items. In this paper, we first systematically investigate the misalignment issue in multimodal recommendations, and propose a solution named AlignRec. In AlignRec, the recommendation objective is decomposed into three alignments, namely alignment within contents, alignment between content and categorical ID, and alignment between users and items. Each alignment is characterized by a specific objective function and is integrated into our multimodal recommendation framework. To effectively train our AlignRec, we propose starting from pre-training the first alignment to obtain unified multimodal features and subsequently training the following two alignments together with these features as input. As it is essential to analyze whether each multimodal feature helps in training, we design three new classes of metrics to evaluate intermediate performance. Our extensive experiments on three real-world datasets consistently verify the superiority of AlignRec compared to nine baselines. We also find that the multimodal features generated by AlignRec are better than currently used ones, which are to be open-sourced.

分解的 · 散度 · MoDELS · 方陣 · 層 ·

2024 年 3 月 18 日

Deep Nonnegative Matrix Factorization with Beta Divergences

Valentin Leplat,Le Thi Khanh Hien,Akwum Onwunta,Nicolas Gillis

from arxiv, 34 pages. We have improved the presentation of the paper, corrected a few typoes, and added the MU for beta=1/2. Accepted in Neural Computation

Deep Nonnegative Matrix Factorization (deep NMF) has recently emerged as a valuable technique for extracting multiple layers of features across different scales. However, all existing deep NMF models and algorithms have primarily centered their evaluation on the least squares error, which may not be the most appropriate metric for assessing the quality of approximations on diverse datasets. For instance, when dealing with data types such as audio signals and documents, it is widely acknowledged that $\beta$-divergences offer a more suitable alternative. In this paper, we develop new models and algorithms for deep NMF using some $\beta$-divergences, with a focus on the Kullback-Leibler divergence. Subsequently, we apply these techniques to the extraction of facial features, the identification of topics within document collections, and the identification of materials within hyperspectral images.

圖 · 標注 · Learning · MoDELS · 訓練實例 ·

2024 年 3 月 18 日

Graph Partial Label Learning with Potential Cause Discovering

Hang Gao,Jiaguo Yuan,Jiangmeng Li,Chengyu Yao,Fengge Wu,Junsuo Zhao,Changwen Zheng

Graph Neural Networks (GNNs) have gained considerable attention for their potential in addressing challenges posed by complex graph-structured data in diverse domains. However, accurately annotating graph data for training is difficult due to the inherent complexity and interconnectedness of graphs. To tackle this issue, we propose a novel graph representation learning method that enables GNN models to effectively learn discriminative information even in the presence of noisy labels within the context of Partially Labeled Learning (PLL). PLL is a critical weakly supervised learning problem, where each training instance is associated with a set of candidate labels, including both the true label and additional noisy labels. Our approach leverages potential cause extraction to obtain graph data that exhibit a higher likelihood of possessing a causal relationship with the labels. By incorporating auxiliary training based on the extracted graph data, our model can effectively filter out the noise contained in the labels. We support the rationale behind our approach with a series of theoretical analyses. Moreover, we conduct extensive evaluations and ablation studies on multiple datasets, demonstrating the superiority of our proposed method.

樣本 · 分離的 · 相似度 · 知識神經元 · UniFormer ·

2024 年 3 月 16 日

Unconditional Quantum Advantage for Sampling with Shallow Circuits

Adam Bene Watts,Natalie Parham

from arxiv, 54 pages, 12 figures. The new version improves the result with proofs in the appendices, and includes a new proof overview in the introduction

Recent work by Bravyi, Gosset, and Koenig showed that there exists a search problem that a constant-depth quantum circuit can solve, but that any constant-depth classical circuit with bounded fan-in cannot. They also pose the question: Can we achieve a similar proof of separation for an input-independent sampling task? In this paper, we show that the answer to this question is yes when the number of random input bits given to the classical circuit is bounded. We introduce a distribution $D_{n}$ over $\{0,1\}^n$ and construct a constant-depth uniform quantum circuit family $\{C_n\}_n$ such that $C_n$ samples from a distribution close to $D_{n}$ in total variation distance. For any $\delta < 1$ we also prove, unconditionally, that any classical circuit with bounded fan-in gates that takes as input $kn + n^\delta$ i.i.d. Bernouli random variables with entropy $1/k$ and produces output close to $D_{n}$ in total variation distance has depth $\Omega(\log \log n)$. This gives an unconditional proof that constant-depth quantum circuits can sample from distributions that can't be reproduced by constant-depth bounded fan-in classical circuits, even up to additive error. We also show a similar separation between constant-depth quantum circuits with advice and classical circuits with bounded fan-in and fan-out, but access to an unbounded number of i.i.d random inputs. The distribution $D_n$ and classical circuit lower bounds are inspired by work of Viola, in which he shows a different (but related) distribution cannot be sampled from approximately by constant-depth bounded fan-in classical circuits.

簇 · 圖 · SC · 圖形處理器 · 匯聚 ·

2020 年 6 月 3 日

Spectral Clustering with Graph Neural Networks for Graph Pooling

Filippo Maria Bianchi,Daniele Grattarola,Cesare Alippi

Spectral clustering (SC) is a popular clustering technique to find strongly connected communities on a graph. SC can be used in Graph Neural Networks (GNNs) to implement pooling operations that aggregate nodes belonging to the same cluster. However, the eigendecomposition of the Laplacian is expensive and, since clustering results are graph-specific, pooling methods based on SC must perform a new optimization for each new sample. In this paper, we propose a graph clustering approach that addresses these limitations of SC. We formulate a continuous relaxation of the normalized minCUT problem and train a GNN to compute cluster assignments that minimize this objective. Our GNN-based implementation is differentiable, does not require to compute the spectral decomposition, and learns a clustering function that can be quickly evaluated on out-of-sample graphs. From the proposed clustering method, we design a graph pooling operator that overcomes some important limitations of state-of-the-art graph pooling techniques and achieves the best performance in several supervised and unsupervised tasks.

entity · 圖 · 知識圖譜 · Extensibility · MoDELS ·

2018 年 11 月 12 日

Explainable Reasoning over Knowledge Graphs for Recommendation

Xiang Wang,Dingxian Wang,Canran Xu,Xiangnan He,Yixin Cao,Tat-Seng Chua

from arxiv, 8 pages, 5 figures, AAAI-2019

Incorporating knowledge graph into recommender systems has attracted increasing attention in recent years. By exploring the interlinks within a knowledge graph, the connectivity between users and items can be discovered as paths, which provide rich and complementary information to user-item interactions. Such connectivity not only reveals the semantics of entities and relations, but also helps to comprehend a user's interest. However, existing efforts have not fully explored this connectivity to infer user preferences, especially in terms of modeling the sequential dependencies within and holistic semantics of a path. In this paper, we contribute a new model named Knowledge-aware Path Recurrent Network (KPRN) to exploit knowledge graph for recommendation. KPRN can generate path representations by composing the semantics of both entities and relations. By leveraging the sequential dependencies within a path, we allow effective reasoning on paths to infer the underlying rationale of a user-item interaction. Furthermore, we design a new weighted pooling operation to discriminate the strengths of different paths in connecting a user with an item, endowing our model with a certain level of explainability. We conduct extensive experiments on two datasets about movie and music, demonstrating significant improvements over state-of-the-art solutions Collaborative Knowledge Base Embedding and Neural Factorization Machine.