This paper proposes a distributed learning-based framework to tackle the sum ergodic rate maximization problem in cell-free massive multiple-input multiple-output (MIMO) systems by utilizing graph neural networks (GNNs). Unlike centralized schemes, which gather all the channel state information (CSI) at the central processing unit (CPU) to compute the resource allocation, the proposed distributed GNN-based framework exploits the local resources of access points (APs) to allocate transmit powers. Specifically, all APs use a common GNN model to allocate their powers based on local CSI. The GNN model is trained at the CPU using the local CSI of one AP, with information partially exchanged from other APs to calculate a loss function that reflects system characteristics, thereby capturing comprehensive network information while avoiding excessive computational burden. Numerical results show that the proposed distributed learning-based approach achieves a sum ergodic rate close to that of centralized learning while outperforming model-based optimization.
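As an illustration of the per-AP inference step described above, the following PyTorch sketch maps local CSI features for the K served users to fractions of the AP's power budget; the message-passing architecture, feature dimension, and softmax power normalization are assumptions made for this sketch, not the paper's exact model.

```python
import torch
import torch.nn as nn

class PowerAllocGNN(nn.Module):
    """Illustrative per-AP GNN: the K users served by an AP are graph nodes,
    with node features given by local CSI statistics (e.g., large-scale fading)."""
    def __init__(self, feat_dim=2, hidden=64):
        super().__init__()
        self.msg = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU())
        self.upd = nn.Sequential(
            nn.Linear(feat_dim + hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):                                   # x: (K, feat_dim) local CSI
        m = self.msg(x).mean(dim=0, keepdim=True)           # permutation-invariant aggregation
        h = torch.cat([x, m.expand(x.size(0), -1)], dim=1)  # node feature + graph context
        logits = self.upd(h).squeeze(-1)                    # (K,)
        return torch.softmax(logits, dim=0)                 # fractions of the AP power budget

# At deployment every AP evaluates the same trained weights on its own CSI:
# p = P_max * PowerAllocGNN()(local_csi)   # no instantaneous CSI exchange needed
```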
This paper studies decentralized bilevel optimization, in which multiple agents collaborate to solve problems involving nested optimization structures via neighborhood communications. Most existing literature primarily utilizes gradient tracking to mitigate the influence of data heterogeneity, without exploring other well-known heterogeneity-correction techniques such as EXTRA or Exact Diffusion. Additionally, these studies often employ identical decentralized strategies for both upper- and lower-level problems, neglecting to leverage distinct mechanisms across different levels. To address these limitations, this paper proposes SPARKLE, a unified Single-loop Primal-dual AlgoRithm frameworK for decentraLized bilEvel optimization. SPARKLE offers the flexibility to incorporate various heterogeneity-correction strategies into the algorithm. Moreover, SPARKLE allows different strategies to be used for the upper- and lower-level problems. We present a unified convergence analysis for SPARKLE, applicable to all its variants, with state-of-the-art convergence rates compared to existing decentralized bilevel algorithms. Our results further reveal that EXTRA and Exact Diffusion are more suitable for decentralized bilevel optimization, and that using mixed strategies in bilevel algorithms brings more benefits than relying solely on gradient tracking.
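For concreteness, the two heterogeneity-correction primitives contrasted in the abstract look as follows on a single-level decentralized problem; this is a generic sketch (row-stacked agent iterates `x`, doubly stochastic mixing matrix `W`, user-supplied `grad_fn`), not SPARKLE's primal-dual bilevel recursion itself.

```python
import numpy as np

def gradient_tracking_step(W, x, y, grad_fn, lr):
    """DIGing-style gradient tracking: the auxiliary variable y tracks the
    network-average gradient across agents."""
    g_old = grad_fn(x)
    x_new = W @ x - lr * y
    y_new = W @ y + grad_fn(x_new) - g_old
    return x_new, y_new

def extra_step(W, x, x_prev, g_prev, grad_fn, lr):
    """EXTRA: x_{k+1} = (I + W) x_k - ((I + W)/2) x_{k-1} - lr * (g_k - g_{k-1});
    the difference of consecutive mixed iterates corrects data heterogeneity."""
    n = W.shape[0]
    g = grad_fn(x)
    x_new = (np.eye(n) + W) @ x - ((np.eye(n) + W) / 2.0) @ x_prev - lr * (g - g_prev)
    return x_new, g
```

In the spirit of the abstract's mixed-strategy finding, a framework like SPARKLE could, for example, run an EXTRA-style correction on the lower-level variable and gradient tracking on the upper-level variable within one single-loop iteration.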
This paper proposes a novel federated algorithm that leverages momentum-based variance reduction with adaptive learning to address non-convex settings across heterogeneous data. We intend to minimize communication and computation overhead, thereby fostering a sustainable federated learning system. We aim to overcome challenges related to gradient variance, which hinders the model's efficiency, and the slow convergence resulting from learning rate adjustments with heterogeneous data. Experimental results on image classification tasks (MNIST, CIFAR-10) with heterogeneous data demonstrate the effectiveness of our proposed algorithms in non-convex settings, with an improved communication complexity of $\mathcal{O}(\epsilon^{-1})$ to converge to an $\epsilon$-stationary point, compared to the $\mathcal{O}(\epsilon^{-2})$ communication complexity of most prior works. The proposed federated version maintains the trade-off between the convergence rate, number of communication rounds, and test accuracy while mitigating client drift in heterogeneous settings.
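The abstract does not name its estimator, so as one plausible instantiation we sketch a STORM-style momentum-based variance-reduction direction with an adaptive step size; the constants `a`, `k`, and `w` are tuning parameters assumed for illustration, not the paper's settings.

```python
def storm_direction(g_t, g_prev, d_prev, a):
    """Momentum-based variance reduction (STORM-style):
    d_t = g(x_t; xi_t) + (1 - a) * (d_{t-1} - g(x_{t-1}; xi_t)),
    where both stochastic gradients use the same fresh minibatch xi_t."""
    return g_t + (1.0 - a) * (d_prev - g_prev)

def adaptive_lr(k, w, grad_norm_sq_sum):
    """Step size that shrinks with the accumulated squared gradient norms,
    as in adaptive variance-reduced methods."""
    return k / (w + grad_norm_sq_sum) ** (1.0 / 3.0)

# Client-side sketch: each local step evaluates the stochastic gradient at the
# current and previous iterates on the same minibatch, forms d_t, and descends
# with the adaptive learning rate; the server then averages the client models.
```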
This study examines the effectiveness of traditional machine learning classifiers versus deep learning models for detecting imagined speech from electroencephalogram (EEG) data. Specifically, we evaluated conventional machine learning techniques such as CSP-SVM and LDA-SVM classifiers alongside deep learning architectures such as EEGNet, ShallowConvNet, and DeepConvNet. The machine learning classifiers exhibited significantly lower precision and recall, indicating limited feature extraction capabilities and poor generalization between imagined speech and idle states. In contrast, the deep learning models, particularly EEGNet, achieved the highest accuracy of 0.7080 and an F1 score of 0.6718, demonstrating their enhanced ability in automatic feature extraction and representation learning, which is essential for capturing complex neurophysiological patterns. These findings highlight the limitations of conventional machine learning approaches in brain-computer interface (BCI) applications and advocate adopting deep learning methodologies to achieve more precise and reliable detection of imagined speech. This foundational research contributes to the development of imagined speech-based BCI systems.
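A minimal sketch of the kind of CSP-SVM baseline evaluated here, using mne and scikit-learn and assuming epoched EEG arrays `X` (trials x channels x samples) with binary labels `y`; the study's actual preprocessing and hyperparameters may differ.

```python
from mne.decoding import CSP
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

def csp_svm_f1(X, y):
    """X: (n_trials, n_channels, n_samples) epoched EEG; y: binary labels
    (imagined speech vs. idle). Returns the mean cross-validated F1 score."""
    clf = Pipeline([
        ("csp", CSP(n_components=4, log=True)),  # spatial filters + log-variance features
        ("svm", SVC(kernel="rbf", C=1.0)),
    ])
    return cross_val_score(clf, X, y, cv=5, scoring="f1").mean()
```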
This paper presents a simplified weak Galerkin (WG) finite element method for solving biharmonic equations that avoids the use of traditional stabilizers. The proposed WG method supports both convex and non-convex polytopal elements in finite element partitions, utilizing bubble functions as a critical analytical tool. The simplified WG method is symmetric and positive definite. Optimal-order error estimates are established for the WG approximations in both the discrete $H^2$ norm and the $L^2$ norm.
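For context, the model problem and the generic form of a stabilizer-free WG scheme can be written as follows; the precise discrete spaces $V_h$, $V_h^0$ and the bubble-function construction of the weak Laplacian are as defined in the paper, so this is only a schematic.

```latex
% Biharmonic model problem with clamped boundary data
\[
\Delta^2 u = f \ \text{in } \Omega, \qquad
u = g, \quad \frac{\partial u}{\partial \mathbf{n}} = \phi \ \text{on } \partial\Omega.
\]
% Generic stabilizer-free WG scheme (no stabilization term)
\[
\text{Find } u_h \in V_h \text{ with prescribed boundary values such that }
(\Delta_w u_h, \Delta_w v)_{\mathcal{T}_h} = (f, v_0)
\quad \forall\, v = \{v_0, v_b, v_n\} \in V_h^0,
\]
% where \Delta_w denotes the discrete weak Laplacian computed locally
% on each polytopal element of the partition \mathcal{T}_h.
```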
This paper presents a novel approach to enhancing communication efficiency in federated learning through clipped uniform quantization. By leveraging optimal clipping thresholds and client-specific adaptive quantization schemes, the proposed method significantly reduces the bandwidth and memory required for transmitting model weights between clients and the server while maintaining competitive accuracy. We investigate the effects of symmetric clipping and uniform quantization on model performance, emphasizing the role of stochastic quantization in mitigating artifacts and improving robustness. Extensive simulations demonstrate that the method achieves near-full-precision performance with substantial communication savings. Moreover, the proposed approach facilitates efficient weight averaging based on the inverse of the mean squared quantization errors, effectively balancing the trade-off between communication efficiency and model accuracy. Furthermore, in contrast to federated averaging, this design obviates the need to disclose client-specific data volumes to the server, thereby enhancing client privacy. Comparative analysis with conventional quantization methods further confirms the efficacy of the proposed scheme.
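A minimal NumPy sketch of the two ingredients described above, symmetric clipping with stochastic uniform quantization and inverse-MSE weight averaging; the paper's optimal clipping-threshold selection and client-specific bit allocation are not reproduced here.

```python
import numpy as np

def clipped_stochastic_quantize(w, clip, bits=8, rng=None):
    """Symmetric clipping to [-clip, clip], then bits-bit stochastic uniform
    quantization (round up with probability equal to the fractional remainder)."""
    rng = np.random.default_rng() if rng is None else rng
    levels = 2 ** bits - 1
    step = 2.0 * clip / levels
    k = (np.clip(w, -clip, clip) + clip) / step          # fractional level index
    lower = np.floor(k)
    q = lower + (rng.random(w.shape) < (k - lower))      # stochastic rounding
    return q * step - clip                               # dequantized weights

def inverse_mse_average(client_weights, client_mses):
    """Server-side aggregation weighted by inverse quantization MSE; unlike
    FedAvg, no client data volumes are disclosed (illustrative weighting)."""
    coeffs = 1.0 / np.asarray(client_mses, dtype=float)
    coeffs /= coeffs.sum()
    return sum(c * w for c, w in zip(coeffs, client_weights))
```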
The ability of large language models (LLMs) to transform, interpret, and comprehend vast quantities of heterogeneous data presents a significant opportunity to enhance data-driven care delivery. However, the sensitive nature of protected health information (PHI) raises valid concerns about data privacy and trust in remote LLM platforms. In addition, the cost associated with cloud-based artificial intelligence (AI) services continues to impede widespread adoption. To address these challenges, we propose shifting the LLM execution environment from opaque, centralized cloud providers to a decentralized and dynamic fog computing architecture. By executing open-weight LLMs in more trusted environments, such as the user's edge device or a fog layer within a local network, we aim to mitigate the privacy, trust, and financial challenges associated with cloud-based LLMs. We further present SpeziLLM, an open-source framework designed to facilitate rapid and seamless integration of different LLM execution layers and to lower barriers to LLM adoption in digital health applications. We demonstrate SpeziLLM's broad applicability across six digital health applications, showcasing its versatility in various healthcare settings.
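SpeziLLM itself targets the Swift-based Spezi ecosystem, so the following Python fragment is only a language-agnostic illustration of layered dispatch across edge, fog, and cloud execution environments; all names and the routing policy are hypothetical, not the SpeziLLM API.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ExecutionLayer:
    name: str
    phi_allowed: bool                     # may this layer process protected health info?
    available: Callable[[], bool]         # e.g., local model loaded, fog node reachable
    run: Callable[[str], str]             # executes the LLM call on this layer

def dispatch(prompt: str, has_phi: bool, layers: list[ExecutionLayer]) -> str:
    """Try layers in order of decreasing trust (edge, then fog, then cloud)."""
    for layer in layers:
        if has_phi and not layer.phi_allowed:
            continue                      # never route PHI to an untrusted layer
        if layer.available():
            return layer.run(prompt)
    raise RuntimeError("no eligible LLM execution layer available")
```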
This paper introduces a novel approach to creating a high-resolution "map" for physics learning: an "atomic" learning objectives (LOs) system designed to capture the detailed cognitive processes and concepts required for problem solving in a college-level introductory physics course. Our method leverages Large Language Models (LLMs) for automated labeling of physics questions and introduces a comprehensive set of metrics to evaluate the quality of the labeling outcomes. The atomic LO system, covering nine chapters of an introductory physics course, uses a "subject-verb-object" structure to represent specific cognitive processes. We apply this system to 131 questions from expert-curated question banks and the OpenStax University Physics textbook, labeling each question with 1-8 atomic LOs across three chapters. Through extensive experiments using various prompting strategies and LLMs, we compare automated LO labeling results against human expert labeling. Our analysis reveals both the strengths and limitations of LLMs, providing insight into LLMs' reasoning processes for labeling LOs and identifying areas for improvement in LO system design. Our work contributes to the field of learning analytics by proposing a more granular approach to mapping learning objectives to questions. Our findings have significant implications for the development of intelligent tutoring systems and personalized learning pathways in STEM education, paving the way for more effective "learning GPS" systems.
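A hypothetical sketch of the automated labeling step: a zero-shot prompt asking an LLM to select 1-8 atomic LOs for a question. The LO identifiers below are invented for illustration, and the paper's prompting strategies and evaluation metrics are richer than this.

```python
# Invented LO inventory entries; each follows the subject-verb-object structure.
ATOMIC_LOS = [
    "LO-3.2: student applies Newton's second law to a single object",
    "LO-3.5: student decomposes a force vector into components",
    # ... the full inventory spans nine chapters
]

def build_labeling_prompt(question: str) -> str:
    """Assemble a zero-shot labeling prompt for one physics question."""
    lo_list = "\n".join(ATOMIC_LOS)
    return (
        "You are labeling physics questions with atomic learning objectives.\n"
        "Each LO has a subject-verb-object structure.\n"
        f"Candidate LOs:\n{lo_list}\n\n"
        f"Question:\n{question}\n\n"
        "Return the 1-8 LO identifiers required to solve this question, "
        "as a comma-separated list."
    )
```

The returned identifiers can then be compared against human expert labels with set-overlap metrics (e.g., precision and recall over LO sets per question).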
Graph representation learning methods are highly effective in handling complex non-Euclidean data by capturing intricate relationships and features within graph structures. However, traditional methods face challenges with heterogeneous graphs that contain various types of nodes and edges, due to the diverse sources and complex nature of the data. Existing Heterogeneous Graph Neural Networks (HGNNs) have shown promising results but require prior knowledge of node and edge types and unified node feature formats, which limits their applicability. Recent advancements in graph representation learning using Large Language Models (LLMs) offer new solutions by integrating LLMs' data processing capabilities, enabling the alignment of various graph representations. Nevertheless, these methods often overlook heterogeneous graph data and require extensive preprocessing. To address these limitations, we propose a novel method that leverages the strengths of both LLMs and GNNs, allowing for the processing of graph data in any format and with any type of nodes and edges, without the need for type information or special preprocessing. Our method employs an LLM to automatically summarize and classify different data formats and types, aligns node features, and uses a specialized GNN for targeted learning, thus obtaining effective graph representations for downstream tasks. Theoretical analysis and experimental validation demonstrate the effectiveness of our method.
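A rough sketch of the pipeline shape suggested by the abstract: LLM-produced embeddings align arbitrary node descriptions into one feature space before a GNN aggregates them. Here `embed_with_llm` is a hash-based placeholder that keeps the snippet self-contained; in practice it would be replaced by a real LLM summarization and text-embedding call.

```python
import torch
import torch.nn as nn

def embed_with_llm(texts, dim=64):
    """Placeholder: deterministic hash features so the snippet runs stand-alone.
    Replace with an LLM call that summarizes/classifies each node's raw
    description and returns its text embedding."""
    rows = [[(hash((t, i)) % 1000) / 1000.0 for i in range(dim)] for t in texts]
    return torch.tensor(rows, dtype=torch.float32)

class AlignedGNNLayer(nn.Module):
    """One aggregation layer operating on the aligned feature space."""
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, h, adj):                  # h: (N, dim), adj: (N, N) normalized
        return torch.relu(self.lin(adj @ h))    # neighborhood aggregation via matmul

def encode_graph(node_texts, adj, num_layers=2, dim=64):
    h = embed_with_llm(node_texts, dim)         # LLM aligns heterogeneous node features
    for layer in [AlignedGNNLayer(dim) for _ in range(num_layers)]:
        h = layer(h, adj)
    return h                                    # node representations for downstream tasks
```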
Retrieval-Augmented Generation (RAG) merges retrieval methods with deep learning advancements to address the static limitations of large language models (LLMs) by enabling the dynamic integration of up-to-date external information. This methodology, focusing primarily on the text domain, provides a cost-effective remedy for the tendency of LLMs to generate plausible but incorrect responses, thereby enhancing the accuracy and reliability of their outputs through the use of real-world data. As RAG grows in complexity and incorporates multiple concepts that can influence its performance, this paper organizes the RAG paradigm into four categories: pre-retrieval, retrieval, post-retrieval, and generation, offering a detailed perspective from the retrieval viewpoint. It outlines RAG's evolution and discusses the field's progression through the analysis of significant studies. Additionally, the paper introduces evaluation methods for RAG, addressing the challenges faced and proposing future research directions. By offering an organized framework and categorization, the study aims to consolidate existing research on RAG, clarify its technological underpinnings, and highlight its potential to broaden the adaptability and applications of LLMs.
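The four-stage decomposition used by the survey can be expressed as a toy pipeline; the lexical scorer and template-based generator below are placeholders standing in for real retrievers and LLM calls.

```python
def pre_retrieval(query: str) -> str:
    return query.strip()                      # stand-in for query rewriting/expansion

def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
    overlap = lambda doc: len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(corpus, key=overlap, reverse=True)[:k]   # toy lexical retriever

def post_retrieval(docs: list[str]) -> str:
    return "\n".join(docs)                    # stand-in for reranking/compression

def generate(query: str, context: str) -> str:
    return f"Answer to {query!r} grounded in:\n{context}"  # stand-in for the LLM call

def rag(query: str, corpus: list[str]) -> str:
    q = pre_retrieval(query)
    return generate(q, post_retrieval(retrieve(q, corpus)))
```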
We introduce a generic framework that reduces the computational cost of object detection while retaining accuracy for scenarios where objects of varied sizes appear in high-resolution images. Detection progresses in a coarse-to-fine manner, first on a down-sampled version of the image and then on a sequence of higher-resolution regions identified as likely to improve detection accuracy. Built upon reinforcement learning, our approach consists of a model (R-net) that uses coarse detection results to predict the potential accuracy gain of analyzing a region at a higher resolution, and another model (Q-net) that sequentially selects regions to zoom in on. Experiments on the Caltech Pedestrians dataset show that our approach reduces the number of processed pixels by over 50% without a drop in detection accuracy. The merits of our approach become more significant on a high-resolution test set collected from the YFCC100M dataset, where our approach maintains high detection performance while reducing the number of processed pixels by about 70% and the detection time by over 50%.
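Schematically, the coarse-to-fine procedure reduces to the following loop, where `detector`, `r_net`, and `q_net` are hypothetical callables standing in for the base detector and the paper's R-net and Q-net; stopping when the expected gain is low is our simplification of the learned Q-net policy.

```python
def coarse_to_fine_detect(image, detector, r_net, q_net, downscale=2, max_zooms=4):
    """image: 2D array; detector returns a list of detections for a crop."""
    coarse = image[::downscale, ::downscale]      # cheap low-resolution pass
    detections = list(detector(coarse))
    gain_map = r_net(coarse, detections)          # predicted accuracy gain per region
    for _ in range(max_zooms):
        region = q_net(gain_map)                  # next region to analyze at full resolution
        if region is None:                        # stop once the expected gain is negligible
            break
        y0, y1, x0, x1 = region
        detections += detector(image[y0:y1, x0:x1])   # high-resolution pass on the crop
        gain_map = r_net(coarse, detections)          # re-estimate after new evidence
    return detections
```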