Digital acquisition of high-bandwidth signals is particularly challenging when Nyquist-rate sampling is impractical. This has led to extensive research in sub-Nyquist sampling methods, primarily for spectral and sinusoidal frequency estimation. However, these methods struggle with high-dynamic-range (HDR) signals that can saturate analog-to-digital converters (ADCs). To address this, we introduce a novel sub-Nyquist spectral estimation method, driven by the Unlimited Sensing Framework (USF) and utilizing a multi-channel system. The sub-Nyquist USF method aliases samples in both the amplitude and frequency domains, rendering the inverse problem particularly challenging. To this end, our exact recovery theorem establishes that $K$ sinusoids of arbitrary amplitudes and frequencies can be recovered from $6K + 4$ modulo samples, remarkably, independent of the sampling rate or folding threshold. In the true spirit of sub-Nyquist sampling, via modulo ADC hardware experiments, we demonstrate successful spectrum estimation of HDR signals in the kHz range using Hz-range sampling rates (0.078\% of the Nyquist rate). Our experiments also reveal up to a 33-fold improvement in frequency estimation accuracy using one less bit compared to conventional ADCs. These findings open new avenues in spectral estimation applications, e.g., radars, direction-of-arrival (DoA) estimation, and cognitive radio, showcasing the potential of USF.
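To make the amplitude folding concrete, the following is a minimal Python sketch of the centered modulo nonlinearity underlying the USF; the folding threshold \texttt{lam}, the sampling rate, and the test sinusoid are illustrative choices, not the experimental values reported above.
\begin{verbatim}
import numpy as np

def modulo_fold(x, lam):
    """Centered modulo nonlinearity: folds x into [-lam, lam)."""
    return np.mod(x + lam, 2 * lam) - lam

fs = 100.0                               # illustrative sampling rate (Hz)
t = np.arange(0, 1, 1 / fs)
x = 3.0 * np.sin(2 * np.pi * 7.0 * t)    # HDR sinusoid, amplitude >> lam
y = modulo_fold(x, lam=1.0)              # folded samples stay in ADC range
print(y.min(), y.max())                  # within [-1, 1)
\end{verbatim}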
Generative diffusion models (GDMs) have recently shown great success in synthesizing multimedia signals with high perceptual quality, enabling highly efficient semantic communications in future wireless networks. In this paper, we develop an intent-aware generative semantic multicasting framework utilizing pre-trained diffusion models. In the proposed framework, the transmitter decomposes the source signal into multiple semantic classes based on the multi-user intent, i.e., each user is assumed to be interested in the details of only a subset of the semantic classes. The transmitter then sends each user only its intended classes, and multicasts a highly compressed semantic map to all users over shared wireless resources, allowing them to locally synthesize the remaining, non-intended classes using pre-trained diffusion models. The signal retrieved at each user is thereby partially reconstructed and partially synthesized from the received semantic map. This improves the utilization of the wireless resources while better preserving the privacy of the non-intended classes. We design a communication/computation-aware scheme for per-class adaptation of the communication parameters, such as the transmission power and compression rate, to minimize the total latency of retrieving signals at multiple receivers, tailored to the prevailing channel conditions as well as the users' reconstruction/synthesis distortion/perception requirements. The simulation results demonstrate significantly reduced per-user latency compared with non-generative and intent-unaware multicasting benchmarks, while maintaining high perceptual quality of the signals retrieved at the users.
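The following toy Python sketch illustrates the payload split described above, with each user receiving raw content only for its intended classes and a compressed semantic map multicast to everyone; the class ids, masking, and \texttt{zlib} compression are placeholder assumptions, not the framework's actual codec or diffusion pipeline.
\begin{verbatim}
import zlib
import numpy as np

def build_payloads(image, seg_map, user_intents):
    """Unicast intended classes per user; multicast a compressed map."""
    shared = zlib.compress(seg_map.astype(np.uint8).tobytes())  # multicast
    unicast = {}
    for user, classes in user_intents.items():
        mask = np.isin(seg_map, classes)      # pixels of intended classes
        unicast[user] = image * mask          # only intended content sent
    return shared, unicast

image = np.random.rand(64, 64)
seg_map = np.random.randint(0, 4, size=(64, 64))   # 4 semantic classes
shared, unicast = build_payloads(image, seg_map, {"u1": [0, 1], "u2": [2]})
print(len(shared), {u: v.shape for u, v in unicast.items()})
\end{verbatim}
The non-intended classes would then be synthesized locally at each user by conditioning a pre-trained diffusion model on the decompressed semantic map.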
A non-uniform implicit-explicit L1 mixed finite element method (IMEX-L1-MFEM) is investigated for a class of time-fractional partial integro-differential equations (PIDEs) with space-time dependent coefficients and a non-self-adjoint elliptic part. The proposed fully discrete method combines an IMEX-L1 method on a graded mesh in the temporal variable with a mixed finite element method in the spatial variables. The focus of the study is to analyze stability and to establish optimal error estimates, up to a logarithmic factor, for both the solution and the flux in the $L^2$-norm when the initial data $u_0\in H_0^1(\Omega)\cap H^2(\Omega)$. Additionally, an error estimate in the $L^\infty$-norm is derived for 2D problems. All the derived estimates and bounds in this article remain valid as $\alpha\to 1^{-}$, where $\alpha$ is the order of the Caputo fractional derivative. Finally, the results of several numerical experiments confirm our theoretical findings.
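As a concrete illustration of the temporal discretization, the Python sketch below implements the standard non-uniform L1 approximation of the Caputo derivative on a graded mesh $t_j = T(j/N)^r$; the grading exponent and the test function are illustrative choices, not the settings analyzed in the paper.
\begin{verbatim}
import numpy as np
from math import gamma

def caputo_L1(u, t, alpha):
    """Non-uniform L1 approximation of the Caputo derivative at t[-1]."""
    n = len(t) - 1
    acc = 0.0
    for j in range(1, n + 1):
        w = ((t[n] - t[j-1])**(1-alpha) - (t[n] - t[j])**(1-alpha)) \
            / (gamma(2-alpha) * (t[j] - t[j-1]))
        acc += w * (u[j] - u[j-1])
    return acc

alpha, r, T, N = 0.5, 2.0, 1.0, 200
t = T * (np.arange(N+1)/N)**r    # graded mesh clusters nodes near t = 0
u = t**alpha                     # exact derivative at t=1: gamma(1+alpha)
print(caputo_L1(u, t, alpha), gamma(1+alpha))
\end{verbatim}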
In this work, we introduce an algorithm that produces what appears to be a new mathematical object by projecting the \( n \)-dimensional \( Z \)-curve onto an \( n \)-dimensional sphere. The first part presents the algorithm that enables this transformation, and the second part focuses on studying the properties of the resulting object.
In previous work, Edelman and Dumitriu describe the result of applying the Householder tridiagonalization algorithm to a G$\beta$E random matrix. The resulting tridiagonal ensemble makes sense for all $\beta>0$ and has spectrum given by the $\beta$-ensemble. Moreover, the tridiagonal model admits useful stochastic operator limits, introduced by Edelman and Sutton and subsequently analyzed in work by Ramirez, Rider, and Vir\'ag. In this work, we analogously study the result of applying the Householder tridiagonalization algorithm to a G$\beta$E process whose eigenvalues are governed by $\beta$-Dyson Brownian motion. We propose an explicit limit of the upper-left $k \times k$ minor of the $n \times n$ tridiagonal process as $n \to \infty$ with $k$ fixed. We prove the result for $\beta=1$ and provide numerical evidence for $\beta=1,2,4$. This leads us to conjecture the form of a dynamical $\beta$-stochastic Airy operator whose smallest $k$ eigenvalues evolve according to the $n \to \infty$ limit of the largest $k$ eigenvalues, centered and re-scaled, of $\beta$-Dyson Brownian motion.
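For $\beta=1$, the static version of this statement can be checked numerically; the Python sketch below Householder-tridiagonalizes a GOE matrix and compares its sub-diagonal entries with the $\chi$-distributed off-diagonals of the Dumitriu--Edelman model. The normalization (diagonal $N(0,2)$, off-diagonal $N(0,1)$) is one common convention, assumed here for illustration.
\begin{verbatim}
import numpy as np
from scipy.linalg import hessenberg

rng = np.random.default_rng(0)
n = 500
A = rng.standard_normal((n, n))
G = (A + A.T) / np.sqrt(2)        # GOE: diag N(0,2), off-diag N(0,1)

T = hessenberg(G)                 # Householder reduction; symmetric input
off = np.abs(np.diag(T, -1))      # yields a tridiagonal matrix

# Dumitriu-Edelman model (beta = 1): entry k ~ chi with df = n-1-k
chi_model = np.sqrt(rng.chisquare(df=np.arange(n - 1, 0, -1)))
print(off[:4])                    # sub-diagonal of the tridiagonalized GOE
print(chi_model[:4])              # one draw from the tridiagonal model
\end{verbatim}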
Mixture-of-Experts (MoE) has emerged as a practical approach to scaling up the parameters of Transformer models to achieve better generalization while maintaining a sub-linear increase in computation overhead. Current MoE models are mainly built with expert parallelism on distributed devices. However, this approach usually depends on homogeneous devices for deployment and suffers from heavy communication overhead and computation redundancy. In this paper, we explore developing a \texttt{H}eterogeneous-aware \texttt{EX}pert \texttt{A}llocation framework, \textbf{\texttt{HEXA-MoE}}, with significantly enhanced computing efficiency. It contains two components: ($1$) \textit{Expert-Specific Operators}. We replace the typical general matrix multiplication or grouped matrix multiplication interfaces with our operators, which allow the computation to be performed in an in-place manner with \textbf{ZERO} redundancy. ($2$) \textit{Adaptive Data- and Model-Centric Configurations} for different workload scales. Specifically, we introduce a pipeline-shared cache on each device to tackle the heavy memory consumption in the existing data-centric MoE library. Comprehensive experiments on the Swin-MoE benchmark consistently reveal the effectiveness of our \texttt{HEXA-MoE} framework, \textit{i.e.}, reducing memory consumption by $10\%\sim48\%$ and achieving a $0.5\sim4.3\times$ speedup compared to current state-of-the-art MoE libraries. Furthermore, we examine \texttt{HEXA-MoE} with heterogeneous devices in both data- and model-centric settings. Promising results show that employing the optimal parallel configuration with \texttt{HEXA-MoE} on heterogeneous devices can substantially minimize overall latency. Code is available \href{https://github.com/UNITES-Lab/HEXA-MoE}{\underline{here}}.
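To illustrate the redundancy that expert-specific, in-place computation removes, here is a hedged PyTorch sketch of top-1 MoE dispatch that processes each expert's tokens without padded or duplicated buffers; this is a conceptual illustration only, not \texttt{HEXA-MoE}'s actual operators.
\begin{verbatim}
import torch

def moe_forward(x, router, experts):
    """Top-1 routing; each expert computes only on its own tokens."""
    expert_ids = router(x).argmax(dim=-1)   # (tokens,) routing decisions
    out = torch.empty_like(x)
    for e, expert in enumerate(experts):
        idx = (expert_ids == e).nonzero(as_tuple=True)[0]
        if idx.numel():                     # no zero-padding redundancy:
            out[idx] = expert(x[idx])       # compute once per token
    return out

x = torch.randn(16, 32)
router = torch.nn.Linear(32, 4)
experts = torch.nn.ModuleList(torch.nn.Linear(32, 32) for _ in range(4))
print(moe_forward(x, router, experts).shape)   # torch.Size([16, 32])
\end{verbatim}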
The fairness of clustering algorithms has gained widespread attention across various areas, including machine learning. In this paper, we study fair $k$-means clustering in Euclidean space. Given a dataset comprising several groups, the fairness constraint requires that each cluster should contain a proportion of points from each group within specified lower and upper bounds. Due to these fairness constraints, determining the optimal locations of the $k$ centers is quite a challenging task. We propose a novel ``Relax and Merge'' framework that returns a $(1+4\rho + O(\epsilon))$-approximate solution, where $\rho$ is the approximation ratio of an off-the-shelf vanilla $k$-means algorithm and $O(\epsilon)$ can be an arbitrarily small positive number. If equipped with a PTAS for $k$-means, our solution can achieve an approximation ratio of $(5+O(\epsilon))$ with only a slight violation of the fairness constraints, which improves the current state-of-the-art approximation guarantee. Furthermore, using our framework, we can also obtain a $(1+4\rho +O(\epsilon))$-approximate solution for the $k$-sparse Wasserstein Barycenter problem, a fundamental optimization problem in the field of optimal transport, and a $(2+6\rho)$-approximate solution for strictly fair $k$-means clustering with no violation, both of which are better than the current state-of-the-art methods. In addition, the empirical results demonstrate that our proposed algorithm can significantly outperform baseline approaches in terms of clustering cost.
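The Python sketch below sets up the ingredients of the problem: it runs an off-the-shelf vanilla $k$-means (the $\rho$-approximate subroutine the framework builds on) and then checks the per-group proportion bounds in each cluster. The merge/reassignment step of ``Relax and Merge'' itself is not reproduced here, and the data and bounds are hypothetical.
\begin{verbatim}
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 2))
group = rng.integers(0, 2, size=300)     # two demographic groups
lower, upper = 0.3, 0.7                  # hypothetical proportion bounds

labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
for c in range(4):
    frac = group[labels == c].mean()     # fraction of group 1 in cluster c
    print(f"cluster {c}: group-1 fraction {frac:.2f}, "
          f"fair={lower <= frac <= upper}")
\end{verbatim}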
This paper employs a localized orthogonal decomposition (LOD) method with $H^1$ interpolation for solving multiscale elliptic problems. The method does not need any assumptions on scale separation. We give an a priori error estimate for the proposed method. The theoretical results are confirmed by various numerical experiments.
Large Language Models (LLMs) have shown remarkable reasoning capabilities on complex tasks, but they still suffer from out-of-date knowledge, hallucinations, and opaque decision-making. In contrast, Knowledge Graphs (KGs) can provide explicit and editable knowledge for LLMs to alleviate these issues. The existing paradigm of KG-augmented LLMs manually predefines the breadth of the exploration space and requires flawless navigation in KGs. However, this paradigm cannot adaptively explore reasoning paths in KGs based on the question semantics or self-correct erroneous reasoning paths, resulting in a bottleneck in efficiency and effectiveness. To address these limitations, we propose a novel self-correcting adaptive planning paradigm for KG-augmented LLMs named Plan-on-Graph (PoG), which first decomposes the question into several sub-objectives and then repeats the process of adaptively exploring reasoning paths, updating memory, and reflecting on the need to self-correct erroneous reasoning paths until arriving at the answer. Specifically, three important mechanisms, Guidance, Memory, and Reflection, are designed to work together to guarantee the adaptive breadth of self-correcting planning for graph reasoning. Finally, extensive experiments on three real-world datasets demonstrate the effectiveness and efficiency of PoG.
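A schematic of this self-correcting loop, with the three mechanisms marked in comments, might look like the following Python pseudocode; \texttt{llm}, \texttt{kg}, and every helper method are hypothetical placeholders, not PoG's actual interface.
\begin{verbatim}
def plan_on_graph(question, kg, llm, max_steps=10):
    # Guidance: decompose the question into sub-objectives that steer
    # the adaptive breadth of exploration.
    sub_objectives = llm.decompose(question)
    memory = {"paths": [], "notes": []}      # Memory: explored paths
    for _ in range(max_steps):
        # Adaptively expand reasoning paths in the KG for each
        # sub-objective, conditioned on what is already in memory.
        paths = llm.explore(kg, sub_objectives, memory)
        memory["paths"].extend(paths)
        answer = llm.try_answer(question, memory)
        if answer is not None:
            return answer
        # Reflection: decide whether an erroneous path should be
        # self-corrected by backtracking.
        if llm.reflect(question, memory) == "backtrack":
            memory["paths"].pop()
    return None
\end{verbatim}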
Graph Convolutional Networks (GCNs) have been widely applied in various fields due to their significant power in processing graph-structured data. Typical GCNs and their variants work under a homophily assumption (i.e., nodes with the same class tend to connect to each other), while ignoring the heterophily that exists in many real-world networks (i.e., nodes with different classes tend to form edges). Existing methods deal with heterophily mainly by aggregating higher-order neighborhoods or combining intermediate representations, which introduces noise and irrelevant information into the result. But these methods do not change the propagation mechanism itself, which works under the homophily assumption and is a fundamental part of GCNs. This makes it difficult to distinguish the representations of nodes from different classes. To address this problem, we design a novel propagation mechanism that can automatically change the propagation and aggregation process according to the homophily or heterophily between node pairs. To adaptively learn the propagation process, we introduce two measurements of the homophily degree between node pairs, learned from topological and attribute information, respectively. We then incorporate the learnable homophily degree into the graph convolution framework, which is trained in an end-to-end manner, enabling it to go beyond the assumption of homophily. More importantly, we theoretically prove that our model can constrain the similarity of representations between nodes according to their homophily degree. Experiments on seven real-world datasets demonstrate that this new approach outperforms state-of-the-art methods under heterophily or low homophily, and achieves competitive performance under homophily.
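As a minimal illustration of a learnable homophily degree re-weighting propagation, consider the PyTorch sketch below; the attribute-based scoring network, signed edge weights, and residual connection are simplifying assumptions, not the paper's exact architecture.
\begin{verbatim}
import torch
import torch.nn as nn

class HomophilyAwareConv(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # Attribute-based homophily score in [-1, 1] for each edge.
        self.score = nn.Sequential(nn.Linear(2 * dim, 1), nn.Tanh())
        self.lin = nn.Linear(dim, dim)

    def forward(self, x, edge_index):
        src, dst = edge_index                 # shape: (2, num_edges)
        h = self.score(torch.cat([x[src], x[dst]], dim=-1))
        # Positive h aggregates as under homophily; negative h pushes
        # representations of heterophilous neighbors apart.
        msg = h * x[src]
        out = torch.zeros_like(x).index_add_(0, dst, msg)
        return self.lin(out + x)              # residual keeps ego features

x = torch.randn(5, 8)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])
print(HomophilyAwareConv(8)(x, edge_index).shape)   # torch.Size([5, 8])
\end{verbatim}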
Autonomous driving has achieved a significant milestone in research and development over the last decade. There is increasing interest in the field as the deployment of self-operating vehicles promises safer and more ecologically friendly transportation systems. With the rise of computationally powerful artificial intelligence (AI) techniques, autonomous vehicles can sense their environment with high precision, make safe real-time decisions, and operate reliably without human intervention. However, intelligent decision-making in autonomous cars is not generally understandable by humans in the current state of the art, and this deficiency hinders the technology from being socially acceptable. Hence, aside from making safe real-time decisions, the AI systems of autonomous vehicles also need to explain how these decisions are constructed in order to be regulatory compliant across many jurisdictions. Our study sheds comprehensive light on the development of explainable artificial intelligence (XAI) approaches for autonomous vehicles. In particular, we make the following contributions. First, we provide a thorough overview of the present gaps with respect to explanations in the state-of-the-art autonomous vehicle industry. Second, we present a taxonomy of explanations and explanation receivers in this field. Third, we propose a framework for the architecture of end-to-end autonomous driving systems and justify the role of XAI in both debugging and regulating such systems. Finally, as future research directions, we provide a field guide on XAI approaches for autonomous driving that can improve operational safety and transparency towards achieving public approval from regulators, manufacturers, and all engaged stakeholders.