In this work, we consider a remote monitoring scenario in which multiple sensors share a wireless channel to deliver their status updates to a process monitor via an access point (AP). Moreover, we consider that the sensors randomly arrive and depart from the network as they become active and inactive. The goal of the sensors is to devise a medium access strategy to collectively minimize the long-term mean network \ac{AoI} of their respective processes at the remote monitor. For this purpose, we propose specific modifications to ALOHA-QT algorithm, a distributed medium access algorithm that employs a policy tree (PT) and reinforcement learning (RL) to achieve high throughput. We provide the upper bound on the mean network Age of Information (AoI) for the proposed algorithm along with pointers for selecting its key parameter. The results reveal that the proposed algorithm reduces mean network \ac{AoI} by more than 50 percent for state of the art stationary randomized policies while successfully adjusting to a changing number of active users in the network. The algorithm needs less memory and computation than ALOHA-QT while performing better in terms of AoI.
Emerging distributed cloud architectures, e.g., fog and mobile edge computing, are playing an increasingly important role in the efficient delivery of real-time stream-processing applications such as augmented reality, multiplayer gaming, and industrial automation. While such applications require processed streams to be shared and simultaneously consumed by multiple users/devices, existing technologies lack efficient mechanisms to deal with their inherent multicast nature, leading to unnecessary traffic redundancy and network congestion. In this paper, we establish a unified framework for distributed cloud network control with generalized (mixed-cast) traffic flows that allows optimizing the distributed execution of the required packet processing, forwarding, and replication operations. We first characterize the enlarged multicast network stability region under the new control framework (with respect to its unicast counterpart). We then design a novel queuing system that allows scheduling data packets according to their current destination sets, and leverage Lyapunov drift-plus-penalty theory to develop the first fully decentralized, throughput- and cost-optimal algorithm for multicast cloud network flow control. Numerical experiments validate analytical results and demonstrate the performance gain of the proposed design over existing cloud network control techniques.
We apply a reinforcement meta-learning framework to optimize an integrated and adaptive guidance and flight control system for an air-to-air missile. The system is implemented as a policy that maps navigation system outputs directly to commanded rates of change for the missile's control surface deflections. The system induces intercept trajectories against a maneuvering target that satisfy control constraints on fin deflection angles, and path constraints on look angle and load. We test the optimized system in a six degrees-of-freedom simulator that includes a non-linear radome model and a strapdown seeker model, and demonstrate that the system adapts to both a large flight envelope and off-nominal flight conditions including perturbation of aerodynamic coefficient parameters and center of pressure locations, and flexible body dynamics. Moreover, we find that the system is robust to the parasitic attitude loop induced by radome refraction and imperfect seeker stabilization. We compare our system's performance to a longitudinal model of proportional navigation coupled with a three loop autopilot, and find that our system outperforms this benchmark by a large margin. Additional experiments investigate the impact of removing the recurrent layer from the policy and value function networks, performance with an infrared seeker, and flexible body dynamics.
The naive importance sampling (IS) estimator generally does not work well in examples involving simultaneous inference on several targets, as the importance weights can take arbitrarily large values, making the estimator highly unstable. In such situations, alternative multiple IS estimators involving samples from multiple proposal distributions are preferred. Just like the naive IS, the success of these multiple IS estimators crucially depends on the choice of the proposal distributions. The selection of these proposal distributions is the focus of this article. We propose three methods: (i) a geometric space filling approach, (ii) a minimax variance approach, and (iii) a maximum entropy approach. The first two methods are applicable to any IS estimator, whereas the third approach is described in the context of Doss's (2010) two-stage IS estimator. For the first method, we propose a suitable measure of 'closeness' based on the symmetric Kullback-Leibler divergence, while the second and third approaches use estimates of asymptotic variances of Doss's (2010) IS estimator and Geyer's (1994) reverse logistic regression estimator, respectively. Thus, when samples from the proposal distributions are obtained by running Markov chains, we provide consistent spectral variance estimators for these asymptotic variances. The proposed methods for selecting proposal densities are illustrated using various detailed examples.
The concept of federated learning (FL) was first proposed by Google in 2016. Thereafter, FL has been widely studied for the feasibility of application in various fields due to its potential to make full use of data without compromising the privacy. However, limited by the capacity of wireless data transmission, the employment of federated learning on mobile devices has been making slow progress in practical. The development and commercialization of the 5th generation (5G) mobile networks has shed some light on this. In this paper, we analyze the challenges of existing federated learning schemes for mobile devices and propose a novel cross-device federated learning framework, which utilizes the anonymous communication technology and ring signature to protect the privacy of participants while reducing the computation overhead of mobile devices participating in FL. In addition, our scheme implements a contribution-based incentive mechanism to encourage mobile users to participate in FL. We also give a case study of autonomous driving. Finally, we present the performance evaluation of the proposed scheme and discuss some open issues in federated learning.
Split learning (SL) is a collaborative learning framework, which can train an artificial intelligence (AI) model between a device and an edge server by splitting the AI model into a device-side model and a server-side model at a cut layer. The existing SL approach conducts the training process sequentially across devices, which incurs significant training latency especially when the number of devices is large. In this paper, we design a novel SL scheme to reduce the training latency, named Cluster-based Parallel SL (CPSL) which conducts model training in a "first-parallel-then-sequential" manner. Specifically, the CPSL is to partition devices into several clusters, parallelly train device-side models in each cluster and aggregate them, and then sequentially train the whole AI model across clusters, thereby parallelizing the training process and reducing training latency. Furthermore, we propose a resource management algorithm to minimize the training latency of CPSL considering device heterogeneity and network dynamics in wireless networks. This is achieved by stochastically optimizing the cut layer selection, real-time device clustering, and radio spectrum allocation. The proposed two-timescale algorithm can jointly make the cut layer selection decision in a large timescale and device clustering and radio spectrum allocation decisions in a small timescale. Extensive simulation results on non-independent and identically distributed data demonstrate that the proposed solutions can greatly reduce the training latency as compared with the existing SL benchmarks, while adapting to network dynamics.
Federated learning (FL) has been recognized as a viable distributed learning paradigm which trains a machine learning model collaboratively with massive mobile devices in the wireless edge while protecting user privacy. Although various communication schemes have been proposed to expedite the FL process, most of them have assumed ideal wireless channels which provide reliable and lossless communication links between the server and mobile clients. Unfortunately, in practical systems with limited radio resources such as constraint on the training latency and constraints on the transmission power and bandwidth, transmission of a large number of model parameters inevitably suffers from quantization errors (QE) and transmission outage (TO). In this paper, we consider such non-ideal wireless channels, and carry out the first analysis showing that the FL convergence can be severely jeopardized by TO and QE, but intriguingly can be alleviated if the clients have uniform outage probabilities. These insightful results motivate us to propose a robust FL scheme, named FedTOE, which performs joint allocation of wireless resources and quantization bits across the clients to minimize the QE while making the clients have the same TO probability. Extensive experimental results are presented to show the superior performance of FedTOE for deep learning-based classification tasks with transmission latency constraints.
In large scale dynamic wireless networks, the amount of overhead caused by channel estimation (CE) is becoming one of the main performance bottlenecks. This is due to the large number users whose channels should be estimated, the user mobility, and the rapid channel change caused by the usage of the high-frequency spectrum (e.g. millimeter wave). In this work, we propose a new hybrid channel estimation/prediction (CEP) scheme to reduce overhead in time-division duplex (TDD) wireless cell-free massive multiple-input-multiple-output (mMIMO) systems. The scheme proposes sending a pilot signal from each user only once in a given number (window) of coherence intervals (CIs). Then minimum mean-square error (MMSE) estimation is used to estimate the channel of this CI, while a deep neural network (DNN) is used to predict the channels of the remaining CIs in the window. The DNN exploits the temporal correlation between the consecutive CIs and the received pilot signals to improve the channel prediction accuracy. By doing so, CE overhead is reduced by at least 50 percent at the expense of negligible CE error for practical user mobility settings. Consequently, the proposed CEP scheme improves the spectral efficiency compared to the conventional MMSE CE approach, especially when the number of users is large, which is demonstrated numerically.
We demonstrate that merely analog transmissions and match filtering can realize the function of an edge server in federated learning (FL). Therefore, a network with massively distributed user equipments (UEs) can achieve large-scale FL without an edge server. We also develop a training algorithm that allows UEs to continuously perform local computing without being interrupted by the global parameter uploading, which exploits the full potential of UEs' processing power. We derive convergence rates for the proposed schemes to quantify their training efficiency. The analyses reveal that when the interference obeys a Gaussian distribution, the proposed algorithm retrieves the convergence rate of a server-based FL. But if the interference distribution is heavy-tailed, then the heavier the tail, the slower the algorithm converges. Nonetheless, the system run time can be largely reduced by enabling computation in parallel with communication, whereas the gain is particularly pronounced when communication latency is high. These findings are corroborated via excessive simulations.
With the increasing penetration of distributed energy resources, distributed optimization algorithms have attracted significant attention for power systems applications due to their potential for superior scalability, privacy, and robustness to a single point-of-failure. The Alternating Direction Method of Multipliers (ADMM) is a popular distributed optimization algorithm; however, its convergence performance is highly dependent on the selection of penalty parameters, which are usually chosen heuristically. In this work, we use reinforcement learning (RL) to develop an adaptive penalty parameter selection policy for the AC optimal power flow (ACOPF) problem solved via ADMM with the goal of minimizing the number of iterations until convergence. We train our RL policy using deep Q-learning, and show that this policy can result in significantly accelerated convergence (up to a 59% reduction in the number of iterations compared to existing, curvature-informed penalty parameter selection methods). Furthermore, we show that our RL policy demonstrates promise for generalizability, performing well under unseen loading schemes as well as under unseen losses of lines and generators (up to a 50% reduction in iterations). This work thus provides a proof-of-concept for using RL for parameter selection in ADMM for power systems applications.
The Evidential regression network (ENet) estimates a continuous target and its predictive uncertainty without costly Bayesian model averaging. However, it is possible that the target is inaccurately predicted due to the gradient shrinkage problem of the original loss function of the ENet, the negative log marginal likelihood (NLL) loss. In this paper, the objective is to improve the prediction accuracy of the ENet while maintaining its efficient uncertainty estimation by resolving the gradient shrinkage problem. A multi-task learning (MTL) framework, referred to as MT-ENet, is proposed to accomplish this aim. In the MTL, we define the Lipschitz modified mean squared error (MSE) loss function as another loss and add it to the existing NLL loss. The Lipschitz modified MSE loss is designed to mitigate the gradient conflict with the NLL loss by dynamically adjusting its Lipschitz constant. By doing so, the Lipschitz MSE loss does not disturb the uncertainty estimation of the NLL loss. The MT-ENet enhances the predictive accuracy of the ENet without losing uncertainty estimation capability on the synthetic dataset and real-world benchmarks, including drug-target affinity (DTA) regression. Furthermore, the MT-ENet shows remarkable calibration and out-of-distribution detection capability on the DTA benchmarks.