As wireless networks progress towards a new era of Artificial Intelligence (AI)-enabled operation, concerns regarding the environmental impact of AI have been raised in both industry and academia. Federated Learning (FL) has emerged as a key privacy-preserving decentralized AI technique. Despite ongoing efforts in FL, its environmental impact remains an open problem. Targeting the minimization of the overall energy consumption of an FL process, we propose orchestrating the computational and communication resources of the involved devices while guaranteeing a certain performance of the model. To this end, we propose a Soft Actor-Critic Deep Reinforcement Learning (DRL) solution, in which a penalty function introduced during training penalizes strategies that violate the constraints of the environment, contributing towards a safe RL process. A device-level synchronization method and a computationally cost-effective FL environment are also proposed, with the goal of further reducing the energy consumption and communication overhead. Evaluation results show the effectiveness and robustness of the proposed scheme compared to four state-of-the-art baseline solutions across different network environments and FL architectures, achieving a decrease of up to 94% in the total energy consumption.
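A minimal sketch of the penalty-based reward shaping described above, assuming a hypothetical per-round interface that reports the devices' energy consumption and the slack of each environment constraint (e.g., a round deadline and a target model accuracy); the paper's actual state and action design is not reproduced here.

```python
import numpy as np

def penalized_reward(energy_j, constraint_slacks, penalty_weight=10.0):
    """Reward for an energy-minimizing RL agent with soft constraint penalties.

    energy_j          : total device energy spent in this FL round (Joules)
    constraint_slacks : iterable of g_i(s, a); positive values mean the
                        corresponding constraint (e.g., round deadline,
                        minimum model accuracy) is violated
    penalty_weight    : lambda, trading off energy against constraint violation
    """
    violation = np.sum(np.maximum(constraint_slacks, 0.0))
    return -energy_j - penalty_weight * violation

# Example round: 12.5 J consumed, deadline exceeded by 0.3 s, accuracy target met.
r = penalized_reward(12.5, [0.3, -0.02])
print(r)  # -12.5 - 10.0 * 0.3 = -15.5
```

The penalty term steers the Soft Actor-Critic agent away from constraint-violating resource allocations without hard-masking the action space, which is one common way to approximate a safe RL objective.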
In recent years, the evolution of Telecom towards intelligent, autonomous, and open networks has led to increasingly complex Telecom software systems that support heterogeneous deployment scenarios with multi-standard and multi-vendor support. As a result, it becomes a challenge for large-scale Telecom software companies to develop and test software for all deployment scenarios. To address these challenges, we propose a framework for automated test generation for large-scale Telecom software systems. We begin by generating test case input data for test scenarios observed during field trials, using a time-series generative model trained on historical Telecom network data. The time-series generative model also helps preserve the privacy of Telecom data. The generated time-series software performance data are then combined with test descriptions written in natural language to generate test scripts using a generative Large Language Model. Our comprehensive experiments on public datasets and Telecom datasets obtained from operational Telecom networks demonstrate that the framework can effectively generate comprehensive test case input data and useful test code.
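A schematic of the two-stage pipeline described above, with placeholder components: a per-feature Gaussian sampler stands in for the learned time-series generative model, and a string template stands in for the LLM-based script generation; the function names (`fit_timeseries_generator`, `generate_test_script`) are illustrative, not the framework's API.

```python
import numpy as np

def fit_timeseries_generator(history: np.ndarray):
    """Placeholder for the time-series generative model: a per-feature Gaussian
    fit standing in for a learned generator trained on field-trial data."""
    mu, sigma = history.mean(axis=0), history.std(axis=0) + 1e-9
    def sample(n_steps: int) -> np.ndarray:
        return np.random.normal(mu, sigma, size=(n_steps, history.shape[1]))
    return sample

def generate_test_script(description: str, kpi_names, data_path: str) -> str:
    """Placeholder for the LLM step: a template standing in for prompted code generation."""
    checks = "\n".join(f"    assert df['{k}'].notna().all()" for k in kpi_names)
    return (f"# {description}\nimport pandas as pd\n"
            f"def test_kpis():\n    df = pd.read_csv('{data_path}')\n{checks}\n")

# Historical KPI traces (synthetic here): throughput and latency columns.
history = np.column_stack([np.random.gamma(5, 20, 500), np.random.gamma(2, 5, 500)])
sampler = fit_timeseries_generator(history)
synthetic_trace = sampler(n_steps=100)        # privacy-preserving test input data
script = generate_test_script("Verify KPI sanity under peak load",
                              ["throughput_mbps", "latency_ms"], "trace.csv")
print(script)
```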
Deep neural networks (DNNs) have achieved tremendous success in making accurate predictions in computer vision, natural language processing, and science and engineering domains. However, it is also well recognized that DNNs sometimes make unexpected, incorrect, but overconfident predictions. This can cause serious consequences in high-stakes applications, such as autonomous driving, medical diagnosis, and disaster response. Uncertainty quantification (UQ) aims to estimate the confidence of DNN predictions beyond prediction accuracy. In recent years, many UQ methods have been developed for DNNs. It is of great practical value to systematically categorize these UQ methods and compare their advantages and disadvantages. However, existing surveys mostly focus on categorizing UQ methodologies from a neural network architecture perspective or a Bayesian perspective and ignore the source of uncertainty that each methodology can incorporate, making it difficult to select an appropriate UQ method in practice. To fill the gap, this paper presents a systematic taxonomy of UQ methods for DNNs based on the types of uncertainty sources (data uncertainty versus model uncertainty). We summarize the advantages and disadvantages of the methods in each category. We show how our taxonomy of UQ methodologies can help guide the choice of UQ method in different machine learning problems (e.g., active learning, robustness, and reinforcement learning). We also identify current research gaps and propose several future research directions.
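A toy numpy illustration of the distinction the taxonomy is built on, not any specific method from the survey: model (epistemic) uncertainty estimated from the spread of a bootstrap ensemble, versus data (aleatoric) uncertainty estimated from the irreducible residual noise; the regression setup and estimators are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
y = 2.0 * x + rng.normal(0.0, 0.3, 200)        # noisy linear toy data

# Model (epistemic) uncertainty: prediction spread across a bootstrap ensemble.
preds = []
for _ in range(20):
    idx = rng.integers(0, len(x), len(x))
    w, b = np.polyfit(x[idx], y[idx], deg=1)   # refit on a resampled dataset
    preds.append(w * 0.5 + b)                  # prediction at test point x* = 0.5
epistemic = np.var(preds)

# Data (aleatoric) uncertainty: residual noise variance the model cannot reduce.
w, b = np.polyfit(x, y, deg=1)
aleatoric = np.var(y - (w * x + b))

print(f"epistemic ~ {epistemic:.4f}, aleatoric ~ {aleatoric:.4f}")
```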
In recent studies, linear recurrent neural networks (LRNNs) have achieved Transformer-level performance in natural language and long-range modeling, while offering rapid parallel training and constant inference cost. With the resurgence of interest in LRNNs, we study whether they can learn the hidden rules in training sequences, such as the grammatical structure of regular languages. We theoretically analyze several existing LRNNs and uncover their limitations in modeling regular languages. Motivated by this analysis, we propose a new LRNN equipped with a block-diagonal and input-dependent transition matrix. Experiments suggest that the proposed model is the only LRNN capable of length extrapolation on regular language tasks such as Sum, Even Pair, and Modular Arithmetic. The code is released at \url{//github.com/tinghanf/RegluarLRNN}.
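A minimal numpy sketch of a linear recurrence with a block-diagonal, input-dependent transition, h_t = A(x_t) h_{t-1} + B x_t, where A(x_t) consists of small blocks whose entries are functions of the current input; the block size, the tanh parameterization, and all dimensions are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, n_blocks, blk = 8, 4, 2          # hidden size = n_blocks * blk
W_a = rng.normal(0, 0.1, (n_blocks, blk, blk, d_in))   # maps x_t to each 2x2 block
B = rng.normal(0, 0.1, (n_blocks * blk, d_in))

def step(h, x):
    """One recurrence step with a block-diagonal, input-dependent transition matrix."""
    h = h.reshape(n_blocks, blk)
    blocks = np.tanh(W_a @ x)          # (n_blocks, blk, blk), bounded for stability
    h_new = np.einsum('nij,nj->ni', blocks, h).reshape(-1) + B @ x
    return h_new

h = np.zeros(n_blocks * blk)
for x in rng.normal(size=(16, d_in)):  # unroll over a length-16 input sequence
    h = step(h, x)
print(h.shape)                          # (8,)
```

Because each block acts on its own slice of the hidden state, the recurrence stays cheap while the input dependence lets the transition implement state-machine-like behavior that a fixed diagonal transition cannot.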
The advancement of wireless communication systems toward 5G and beyond is spurred by the demand for high data rates, ultra-reliable low-latency communication, and extensive connectivity, alongside sensing requirements such as advanced high-resolution sensing and target detection. Consequently, embedding sensing into communication has gained considerable attention. In this work, we propose an alternative approach for optimizing an integrated sensing and communication (ISAC) waveform for target detection by concurrently maximizing the power of the communication signal at an intended user and minimizing the multi-user and sensing interference. We formulate the problem as a non-disciplined convex programming (NDCP) optimization and use a distribution-based approach for interference cancellation. Precisely, we establish the distributions of the communication signal and of the multi-user communication interference received by the intended user, and we then show that the sensing interference follows a central Chi-squared distribution if the sensing covariance matrix is idempotent. We design such a matrix based on the symmetric idempotent property. Additionally, we propose a disciplined convex programming (DCP) form of the problem and, using successive convex approximation (SCA), show that the solutions converge to a stable waveform for efficient target detection. Furthermore, we compare the proposed waveform with state-of-the-art radar-communication waveform designs and demonstrate its superior performance through computer simulations.
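A small numpy check of the distributional property invoked above: a symmetric idempotent matrix is an orthogonal projector, P = U U^T with orthonormal columns U, and the quadratic form of standard Gaussian noise in P is central Chi-squared with rank(P) degrees of freedom. The dimensions are illustrative; the waveform optimization itself is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 8, 3                                   # ambient dimension, projector rank

# Symmetric idempotent matrix: orthogonal projector onto an r-dimensional subspace.
U, _ = np.linalg.qr(rng.normal(size=(n, r)))  # orthonormal columns
P = U @ U.T
assert np.allclose(P @ P, P) and np.allclose(P, P.T)

# Quadratic form z^T P z of standard Gaussian noise is central chi-squared, r dof.
z = rng.normal(size=(200000, n))
q = np.einsum('bi,ij,bj->b', z, P, z)
print(q.mean(), q.var())                      # ~ r and ~ 2r, the chi^2_r moments
```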
In a feedforward network, Transfer Entropy (TE) can be used to measure the influence that one layer has on another by quantifying the information transfer between them during training. According to the Information Bottleneck principle, a neural model's internal representation should compress the input data as much as possible while still retaining sufficient information about the output. Information Plane analysis is a visualization technique for studying this trade-off between compression and information preservation: it plots the mutual information between the input and the internal representation against the mutual information between the representation and the output. The claim that there is a causal link between information-theoretic compression and generalization, as measured by mutual information, is plausible, but results from different studies are conflicting. In contrast to mutual information, TE can capture temporal relationships between variables. To explore such links, our novel approach uses TE to quantify information transfer between neural layers and to perform Information Plane analysis. We obtained encouraging experimental results, opening the possibility for further investigations.
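A minimal binned estimator of transfer entropy, TE(Y -> X) = sum p(x_{t+1}, x_t, y_t) log [ p(x_{t+1} | x_t, y_t) / p(x_{t+1} | x_t) ], applied to two scalar traces; the discretization, lag of one step, and synthetic data are assumptions for illustration, not the estimator or layer statistics used in the paper.

```python
import numpy as np

def transfer_entropy(y, x, bins=8):
    """Binned TE(Y -> X): information y_t adds about x_{t+1} beyond x_t (in bits)."""
    xb = np.digitize(x, np.histogram_bin_edges(x, bins))
    yb = np.digitize(y, np.histogram_bin_edges(y, bins))
    triples = np.stack([xb[1:], xb[:-1], yb[:-1]], axis=1)   # (x_{t+1}, x_t, y_t)
    vals, counts = np.unique(triples, axis=0, return_counts=True)
    p_xyz = counts / counts.sum()
    te = 0.0
    for (x1, x0, y0), p in zip(vals, p_xyz):
        p_x0y0 = p_xyz[(vals[:, 1] == x0) & (vals[:, 2] == y0)].sum()
        p_x0 = p_xyz[vals[:, 1] == x0].sum()
        p_x1x0 = p_xyz[(vals[:, 0] == x1) & (vals[:, 1] == x0)].sum()
        te += p * np.log2((p / p_x0y0) * p_x0 / p_x1x0)
    return te

rng = np.random.default_rng(0)
y = rng.normal(size=2000)
x = np.roll(y, 1) + 0.5 * rng.normal(size=2000)   # x_{t+1} is driven by y_t
print(transfer_entropy(y, x), transfer_entropy(x, y))  # the first should be larger
```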
Satellite-terrestrial integrated networks (STINs) are a promising architecture for providing global coverage. In STINs, full frequency reuse between a satellite and a terrestrial base station (BS) is encouraged for aggressive spectrum reuse, which induces a non-negligible amount of interference. To address the interference management problem in STINs, this paper proposes a novel distributed precoding method. The key features of our method are: i) a rate-splitting (RS) strategy is incorporated for efficient interference management, and ii) the precoders are designed in a distributed way, without sharing channel state information between the satellite and the terrestrial BS. Specifically, to design the precoders in a distributed fashion, we put forth a spectral efficiency decoupling technique that disentangles the total spectral efficiency into two distinct terms, one depending solely on the satellite's precoder and the other solely on the terrestrial BS's precoder. Then, to resolve the non-smoothness introduced by the RS strategy, we approximate the spectral efficiency expression by a smooth function using the LogSumExp technique; thereafter, we develop a generalized power-iteration-inspired optimization algorithm built on the first-order optimality condition. Simulation results demonstrate that the proposed method offers considerable spectral efficiency gains compared to existing methods.
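The non-smoothness introduced by the RS strategy typically enters as a minimum over the users' rates for decoding the common stream; a standard smooth surrogate, which is the kind of approximation the LogSumExp technique refers to, is min_k r_k ~= -alpha * log(sum_k exp(-r_k / alpha)). The sketch below illustrates only this approximation, under that assumed form, not the full precoder optimization.

```python
import numpy as np

def smooth_min(rates, alpha=0.1):
    """LogSumExp surrogate of min_k rates_k; a smooth lower bound, exact as alpha -> 0."""
    r = np.asarray(rates, dtype=float)
    m = r.min()
    # Shift by the minimum for numerical stability before exponentiating.
    return m - alpha * np.log(np.sum(np.exp(-(r - m) / alpha)))

rates = np.array([2.3, 1.7, 2.9])   # per-user rates for decoding the common RS stream
print(min(rates), smooth_min(rates, alpha=0.1), smooth_min(rates, alpha=0.01))
```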
In this paper, we propose a new six-dimensional (6D) movable antenna (6DMA) system for future wireless networks to improve communication performance. Unlike traditional fixed-position antenna (FPA) systems and existing fluid antenna/two-dimensional (2D) movable antenna (FA/2DMA) systems, which adjust only the positions of antennas, the proposed 6DMA system consists of distributed antenna surfaces with independently adjustable three-dimensional (3D) positions as well as 3D rotations within a given space. In particular, this paper applies the 6DMA to the base station (BS) in wireless networks to provide full degrees of freedom (DoFs) for the BS to adapt to the dynamic user spatial distribution in the network. However, a challenging new problem arises: how to optimally control the 6D positions and rotations of all 6DMA surfaces at the BS to maximize the network capacity based on the user spatial distribution, subject to practical constraints on the antennas' 6D movement. To tackle this problem, we first model the 6DMA-enabled BS and the users' channels with the BS in terms of the 6D positions and rotations of all 6DMA surfaces. Next, we propose an efficient alternating optimization algorithm that searches for the best 6D positions and rotations of all 6DMA surfaces by leveraging the Monte Carlo simulation technique. Specifically, we sequentially optimize the 3D position/3D rotation of each 6DMA surface with those of the other surfaces fixed, in an iterative manner. Numerical results show that our proposed 6DMA-BS can significantly improve the network capacity compared to benchmark BS architectures with FPAs or with 6DMAs of limited/partial movability, especially when the user distribution is more spatially non-uniform.
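A schematic of the alternating, Monte-Carlo-based search structure described above: one surface's 6D pose is updated at a time while the others are held fixed, with candidate poses scored against user locations drawn at random. The capacity evaluation `network_capacity` is a hypothetical placeholder and the random candidate sampling is an assumption; neither the paper's channel model nor its movement constraints are reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
n_surfaces, n_users_mc, n_candidates, n_rounds = 4, 200, 32, 5

def network_capacity(poses, users):
    """Hypothetical stand-in for the capacity metric: favors surfaces whose
    positions are close to the sampled user cluster (purely illustrative)."""
    d = np.linalg.norm(poses[:, None, :3] - users[None, :, :], axis=-1)
    return np.mean(np.log2(1.0 + 1.0 / (1e-3 + d.min(axis=0))))

users = rng.normal([20, 0, 0], 5, size=(n_users_mc, 3))      # Monte Carlo user draws
poses = np.hstack([rng.uniform(-10, 10, (n_surfaces, 3)),    # 3D positions
                   rng.uniform(0, np.pi, (n_surfaces, 3))])  # 3D rotations

for _ in range(n_rounds):                       # alternating optimization rounds
    for s in range(n_surfaces):                 # update one surface, others fixed
        best_pose, best_cap = poses[s].copy(), network_capacity(poses, users)
        for _ in range(n_candidates):
            poses[s, :3] = rng.uniform(-10, 10, 3)
            poses[s, 3:] = rng.uniform(0, np.pi, 3)
            cap = network_capacity(poses, users)
            if cap > best_cap:
                best_pose, best_cap = poses[s].copy(), cap
        poses[s] = best_pose
print(network_capacity(poses, users))
```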
Within the realm of rapidly advancing wireless sensor networks (WSNs), distributed detection plays a significant role in various practical applications. However, a critical challenge lies in maintaining robust detection performance while operating within the constraints of limited bandwidth and energy resources. This paper introduces a novel approach that combines model-driven deep learning (DL) with binary quantization to strike a balance between communication overhead and detection performance in WSNs. We begin by establishing the lower bound of the detection error probability for distributed detection under the maximum a posteriori (MAP) criterion. Furthermore, we prove the global optimality of employing identical local quantizers across sensors, thereby maximizing the corresponding Chernoff information. Subsequently, we derive the minimum MAP detection error probability (MAPDEP) obtained by implementing identical binary probabilistic quantizers across the sensors. Moreover, we establish the equivalence between using all quantized data and using their average as the input to the detector at the fusion center (FC). In particular, we derive the Kullback-Leibler (KL) divergence, which measures the difference between the true posterior probability and the output of the proposed detector. Leveraging the MAPDEP and the KL divergence as loss functions, we propose a model-driven DL method to separately train the probability controller module in the quantizer and the detector module at the FC. Numerical results validate the convergence and effectiveness of the proposed method, which achieves near-optimal performance with reduced complexity for Gaussian hypothesis testing.
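A minimal illustration of identical binary probabilistic quantizers at the sensors and an averaged-bit input at the FC. The sigmoid probability controller, the toy threshold detector, and the Gaussian observation model are assumptions for illustration; the MAPDEP/KL training of the two modules is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sensors, theta = 50, 1.0                       # number of sensors, signal under H1

def probability_controller(x, w=2.0, b=0.0):
    """Assumed parameterization: maps a local observation to P(bit = 1)."""
    return 1.0 / (1.0 + np.exp(-(w * x + b)))

# Each sensor observes the signal in unit-variance noise and sends one random bit.
obs = theta + rng.normal(0, 1, n_sensors)        # H1: signal present
p1 = probability_controller(obs)
bits = rng.binomial(1, p1)                       # identical quantizer at every sensor

# Fusion center: the average of the bits summarizes all quantized data.
avg_bit = bits.mean()
decision = int(avg_bit > 0.5)                    # toy threshold detector
print(avg_bit, "decide H1" if decision else "decide H0")
```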
As the evolution of wireless communication progresses towards 6G networks, extreme bandwidth communication (EBC) emerges as a key enabler to meet the ambitious key performance indicators set for this next-generation technology. 6G aims for peak data rates of 1 Tb/s, peak spectral efficiency of 60 b/s/Hz, maximum bandwidth of 100 GHz, and mobility support up to 1000 km/h, while maintaining a high level of security. The capability of 6G to manage enormous data volumes introduces heightened security vulnerabilities, such as jamming attacks, highlighting the critical need for in-depth research into jamming in EBC. Understanding these attacks is vital for developing robust countermeasures and ensuring that 6G networks can maintain their integrity and reliability amidst these advanced threats. Recognizing the paramount importance of security in 6G applications, this survey explores prevalent jamming attacks and the corresponding countermeasures in EBC technologies such as millimeter wave, terahertz, free-space optical, and visible light communications. By comprehensively reviewing the literature on jamming in EBC, this survey aims to provide a valuable resource for researchers, engineers, and policymakers involved in the development and deployment of 6G networks. Understanding the nuances of jamming in the different EBC technologies is essential for devising robust security mechanisms and ensuring the success of 6G communication systems in the face of emerging threats.
The vast amount of data generated by networks of sensors, wearables, and Internet of Things (IoT) devices underscores the need for advanced modeling techniques that leverage the spatio-temporal structure of data that must remain decentralized, owing to edge-computation requirements and licensing (data access) issues. While federated learning (FL) has emerged as a framework for model training without direct data sharing and exchange, effectively modeling complex spatio-temporal dependencies to improve forecasting capabilities remains an open problem. On the other hand, state-of-the-art spatio-temporal forecasting models assume unfettered access to the data, neglecting constraints on data sharing. To bridge this gap, we propose a federated spatio-temporal model -- Cross-Node Federated Graph Neural Network (CNFGNN) -- which explicitly encodes the underlying graph structure using a graph neural network (GNN)-based architecture under the constraint of cross-node federated learning, which requires that data in a network of nodes be generated locally on each node and remain decentralized. CNFGNN operates by disentangling temporal dynamics modeling on the devices from spatial dynamics modeling on the server, and uses alternating optimization to reduce the communication cost and facilitate computation on the edge devices. Experiments on a traffic flow forecasting task show that CNFGNN achieves the best forecasting performance in both transductive and inductive learning settings with no extra computation cost on edge devices, while incurring a modest communication cost.
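A schematic of the device/server split described above, under toy assumptions: a simple exponential-moving-average feature extractor stands in for the on-device temporal encoder, and one round of normalized-adjacency aggregation stands in for the server-side GNN. This mirrors only the disentangled computation (raw series never leave the nodes), not the actual CNFGNN architecture or its alternating training.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, T, d = 5, 24, 4

# --- On each node (device): temporal modeling of the local sensor series only. ---
series = rng.normal(size=(n_nodes, T))
def local_temporal_encoder(x, d=4, decay=0.8):
    """Stand-in for the on-device recurrent encoder: EMA features of the local series."""
    h = np.zeros(d)
    for t, v in enumerate(x):
        h = decay * h + (1 - decay) * v * np.cos(np.arange(d) * (t + 1))
    return h
embeddings = np.stack([local_temporal_encoder(s) for s in series])  # sent to the server

# --- On the server: spatial modeling over the sensor graph; no raw data needed. ---
A = (rng.random((n_nodes, n_nodes)) < 0.4).astype(float)
A = np.maximum(A, A.T); np.fill_diagonal(A, 1.0)
A_norm = A / A.sum(axis=1, keepdims=True)
spatial_embeddings = A_norm @ embeddings          # one graph-propagation round
print(spatial_embeddings.shape)                   # (5, 4), returned to the nodes
```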