Recent years have witnessed the emergence of mobile edge computing (MEC), on the premise of a cost-effective enhancement in the computational ability of hardware-constrained wireless devices (WDs) comprising the Internet of Things (IoT). In a general multi-server multi-user MEC system, each WD has a computational task to execute and has to select binary (off)loading decisions, along with the analog-amplitude resource allocation variables, in an online manner, with the goal of minimizing the overall energy-delay cost (EDC) under dynamic system states. While past works typically rely on the explicit expression of the EDC function, the present contribution considers a practical setting where, in the absence of system state information, the EDC function is not available in analytical form, and instead only the function values at queried points are revealed. To tackle this challenging online combinatorial problem with only bandit information, novel Bayesian optimization (BO) based approaches are put forth by leveraging the multi-armed bandit (MAB) framework. Per time slot, the discrete offloading decisions are first obtained via the MAB method, and the analog resource allocation variables are subsequently optimized using the BO selection rule. By exploiting both temporal and contextual information, two novel BO approaches, termed time-varying BO and contextual time-varying BO, are developed. Numerical tests validate the merits of the proposed BO approaches compared with contemporary benchmarks under different MEC network sizes.
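As a rough illustration of the discrete step described above, the following sketch runs a generic UCB1-style bandit (written for cost minimization) over a handful of candidate offloading decisions, observing only the cost value of the queried decision. It is not the authors' algorithm: the function names, the toy cost table, and the noise model are assumptions made for illustration, and the subsequent BO step for the analog variables is omitted.

```python
import math
import random


def lcb_bandit(cost_fn, n_arms, n_rounds, seed=0):
    """UCB1-style bandit for cost minimization (lower confidence bound rule).

    Only the cost of the arm actually played is observed (bandit feedback).
    """
    rng = random.Random(seed)
    counts = [0] * n_arms    # number of times each arm was played
    means = [0.0] * n_arms   # running average cost of each arm
    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            arm = t - 1      # play every arm once to initialize
        else:
            # Choose the arm with the smallest optimistic cost estimate.
            arm = min(
                range(n_arms),
                key=lambda a: means[a] - math.sqrt(2 * math.log(t) / counts[a]),
            )
        cost = cost_fn(arm, rng)
        counts[arm] += 1
        means[arm] += (cost - means[arm]) / counts[arm]
    return counts, means


def toy_cost(arm, rng):
    """Hypothetical noisy energy-delay cost of each discrete decision."""
    base = [0.9, 0.2, 0.6, 0.7]  # assumed values, not taken from the paper
    return base[arm] + rng.uniform(-0.1, 0.1)


counts, means = lcb_bandit(toy_cost, n_arms=4, n_rounds=2000)
best_arm = max(range(4), key=lambda a: counts[a])
```

In this toy setting the decision with the lowest assumed cost ends up being played most often, which is the behavior a bandit method is meant to provide when the cost function is only available through queries.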
Optimal resource allocation in modern communication networks calls for the optimization of objective functions that are only accessible via costly separate evaluations for each candidate solution. The conventional approach carries out the optimization of resource-allocation parameters for each system configuration, characterized, e.g., by topology and traffic statistics, using global search methods such as Bayesian optimization (BO). These methods tend to require a large number of iterations, and hence a large number of key performance indicator (KPI) evaluations. In this paper, we propose the use of meta-learning to transfer knowledge from data collected from related, but distinct, configurations in order to speed up optimization on new network configurations. Specifically, we combine meta-learning with BO, as well as with multi-armed bandit (MAB) optimization, with the latter having the potential advantage of operating directly on a discrete search space. Furthermore, we introduce novel contextual meta-BO and meta-MAB algorithms, in which transfer of knowledge across configurations occurs at the level of a mapping from graph-based contextual information to resource-allocation parameters. Experiments for the problem of open loop power control (OLPC) parameter optimization for the uplink of multi-cell multi-antenna systems provide insights into the potential benefits of meta-learning and contextual optimization.
The coordination of robotic swarms and the remote wireless control of industrial systems are among the major use cases for 5G and beyond systems: in these cases, the massive amount of sensory information that needs to be shared over the wireless medium can overload even high-capacity connections. Consequently, solving the effective communication problem by optimizing the transmission strategy to discard irrelevant information can provide a significant advantage, but is often a very complex task. In this work, we consider a prototypal system in which an observer must communicate its sensory data to an actor controlling a task (e.g., a mobile robot in a factory). We then model it as a remote Partially Observable Markov Decision Process (POMDP), considering the effect of adopting semantic and effective communication-oriented solutions on the overall system performance. We split the communication problem by considering an ensemble Vector Quantized Variational Autoencoder (VQ-VAE) encoding, and train a Deep Reinforcement Learning (DRL) agent to dynamically adapt the quantization level, considering both the current state of the environment and the memory of past messages. We tested the proposed approach on the well-known CartPole reference control problem, obtaining a significant performance increase over traditional approaches.
Polynomial chaos expansion (PCE) is a versatile tool widely used in uncertainty quantification and machine learning, but its successful application depends strongly on the accuracy and reliability of the resulting PCE-based response surface. High accuracy typically requires high polynomial degrees, which demand many training points, especially in high-dimensional problems, owing to the curse of dimensionality. So-called sparse PCE concepts work with a much smaller selection of basis polynomials than conventional PCE approaches and can overcome the curse of dimensionality very efficiently, but they have to pay specific attention to their strategies for choosing training points. Furthermore, the approximation error represents an uncertainty that most existing PCE-based methods do not estimate. In this study, we develop and evaluate a fully Bayesian approach to establish the PCE representation via joint shrinkage priors and Markov chain Monte Carlo. The suggested Bayesian PCE model directly aims to solve the two challenges named above: achieving a sparse PCE representation and estimating the uncertainty of the PCE itself. The embedded Bayesian regularization via the joint shrinkage prior allows using higher polynomial degrees for given training points due to its ability to handle underdetermined situations, where the number of considered PCE coefficients could be much larger than the number of available training points. We also explore multiple variable selection methods to construct sparse PCE expansions based on the established Bayesian representations, while globally selecting the most meaningful orthonormal polynomials given the available training data. We demonstrate the advantages of our Bayesian PCE and the corresponding sparsity-inducing methods on several benchmarks.
This paper analyzes wireless network control for remote estimation of linear time-invariant dynamical systems under various Hybrid Automatic Repeat Request (HARQ) packet retransmission schemes. In conventional HARQ, packet reliability increases gradually with additional packets; however, each retransmission maximally increases the Age of Information (AoI) and causes severe degradation in estimation mean squared error (MSE) performance. We optimize standard HARQ schemes by allowing partial retransmissions to increase the packet reliability gradually and limit the AoI growth. In incremental redundancy HARQ, we optimize the retransmission time to enable the early arrival of the next status updates. In Chase combining HARQ, since packet length remains fixed, we allow retransmission and new updates in a single time slot using non-orthogonal signaling. Non-orthogonal retransmissions increase packet reliability without delaying the fresh updates. We formulate a bi-objective optimization that combines the proposed cost function, based on the variance of the MSE, with the standard long-term average MSE cost function to guarantee short-term performance stability. Using the Markov decision process formulation, we find the optimal static and dynamic policies under the proposed HARQ schemes to improve MSE performance further. The simulation results show that the proposed HARQ-based policies are more robust and achieve significantly better and more stable MSE performance than standard HARQ-based policies.
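To make the link between Age of Information and estimation error concrete, the sketch below uses a simplified setting that is assumed here and not taken from the paper: a scalar system x[k+1] = a*x[k] + w[k] with process-noise variance noise_var, perfect state measurements, and a remote estimator that propagates the last delivered state. Under these assumptions the error variance at age d is noise_var * (1 + a^2 + ... + a^(2(d-1))). The function names and the slot convention (age resets to 1 on a successful delivery) are illustrative choices.

```python
def mse_at_age(a, noise_var, age):
    """Error variance of the open-loop prediction after `age` slots
    without a fresh update, for the assumed scalar system."""
    return noise_var * sum(a ** (2 * i) for i in range(age))


def simulate_age(successes):
    """Age of Information per slot: reset to 1 on a successful delivery,
    otherwise grow by one slot (assumed convention)."""
    age, trace = 1, []
    for ok in successes:
        age = 1 if ok else age + 1
        trace.append(age)
    return trace


# Two failed transmissions in a row for an unstable system (a > 1).
ages = simulate_age([True, False, False, True])
errors = [mse_at_age(1.2, 1.0, d) for d in ages]
```

For an unstable system the error grows geometrically with every slot in which no fresh update arrives, which is why a full-length retransmission that delays the next update can be costly.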
This work proposes a novel model-free Reinforcement Learning (RL) agent that is able to learn how to complete an unknown task while having access to only a part of the input observation. We take inspiration from the concepts of visual attention and active perception that are characteristic of humans and apply them to our agent, creating a hard attention mechanism. In this mechanism, the model first decides which region of the input image it should look at, and only afterwards does it gain access to the pixels of that region. Current RL agents do not follow this principle, and we have not seen these mechanisms applied to the same purpose as in this work. In our architecture, we adapt an existing model called the recurrent attention model (RAM) and combine it with the proximal policy optimization (PPO) algorithm. We investigate whether a model with these characteristics is capable of achieving performance similar to that of state-of-the-art model-free RL agents that access the full input observation. This analysis is carried out on two Atari games, Pong and SpaceInvaders, which have a discrete action space, and on CarRacing, which has a continuous action space. Besides assessing its performance, we also analyze the movement of the attention of our model and compare it with an example of human behavior. Even with such a visual limitation, we show that our model matches the performance of PPO+LSTM in two of the three games tested.
We consider, in discrete time, a general class of sequential stochastic dynamic games with asymmetric information with the following features. The underlying system has Markovian dynamics controlled by the agents' joint actions. Each agent's instantaneous utility depends on the current system state and the agents' joint actions. At each time instant, each agent makes a private noisy observation of the current system state and the agents' actions in the previous time instant. In addition, at each time instant, all agents have a common noisy observation of the current system state and their actions in the previous time instant. Each agent's actions are part of his private information. The objective is to determine Bayesian Nash Equilibrium (BNE) strategy profiles that are based on a compressed version of the agents' information and can be sequentially computed; such BNE strategy profiles may not always exist. We present a methodology that achieves the above-stated objective, along with an instance of a game where BNE strategy profiles with the above-mentioned characteristics exist. We show that the methodology also works for the case where the agents have no common observations.
The adaptive processing of structured data is a long-standing research topic in machine learning that investigates how to automatically learn a mapping from a structured input to outputs of various nature. Recently, there has been an increasing interest in the adaptive processing of graphs, which led to the development of different neural network-based methodologies. In this thesis, we take a different route and develop a Bayesian Deep Learning framework for graph learning. The dissertation begins with a review of the principles on which most of the methods in the field are built, followed by a study on graph classification reproducibility issues. We then proceed to bridge the basic ideas of deep learning for graphs with the Bayesian world, by building our deep architectures in an incremental fashion. This framework allows us to consider graphs with discrete and continuous edge features, producing unsupervised embeddings rich enough to reach the state of the art on several classification tasks. Our approach is also amenable to a Bayesian nonparametric extension that automates the choice of almost all of the model's hyper-parameters. Two real-world applications demonstrate the efficacy of deep learning for graphs. The first concerns the prediction of information-theoretic quantities for molecular simulations with supervised neural models. After that, we exploit our Bayesian models to solve a malware-classification task while being robust to intra-procedural code obfuscation techniques. We conclude the dissertation with an attempt to blend the best of the neural and Bayesian worlds together. The resulting hybrid model is able to predict multimodal distributions conditioned on input graphs, with the consequent ability to model stochasticity and uncertainty better than most works. Overall, we aim to provide a Bayesian perspective into the articulated research field of deep learning for graphs.
This manuscript portrays optimization as a process. In many practical applications the environment is so complex that it is infeasible to lay out a comprehensive theoretical model and use classical algorithmic theory and mathematical optimization. It is necessary as well as beneficial to take a robust approach, by applying an optimization method that learns as one goes along, learning from experience as more aspects of the problem are observed. This view of optimization as a process has become prominent in varied fields and has led to some spectacular success in modeling and systems that are now part of our daily lives.
Dynamic neural networks are an emerging research topic in deep learning. Compared to static models, which have fixed computational graphs and parameters at the inference stage, dynamic networks can adapt their structures or parameters to different inputs, leading to notable advantages in terms of accuracy, computational efficiency, adaptiveness, etc. In this survey, we comprehensively review this rapidly developing area by dividing dynamic networks into three main categories: 1) instance-wise dynamic models that process each instance with data-dependent architectures or parameters; 2) spatial-wise dynamic networks that conduct adaptive computation with respect to different spatial locations of image data; and 3) temporal-wise dynamic models that perform adaptive inference along the temporal dimension for sequential data such as videos and texts. The important research problems of dynamic networks, e.g., architecture design, decision making scheme, optimization technique and applications, are reviewed systematically. Finally, we discuss the open problems in this field together with interesting future research directions.
Driven by the visions of the Internet of Things and 5G communications, edge computing systems integrate computing, storage and network resources at the edge of the network to provide computing infrastructure, enabling developers to quickly develop and deploy edge applications. Edge computing systems have received widespread attention in both industry and academia. To explore new research opportunities and assist users in selecting suitable edge computing systems for specific applications, this survey paper provides a comprehensive overview of the existing edge computing systems and introduces representative projects. A comparison of open source tools is presented according to their applicability. Finally, we highlight energy efficiency and deep learning optimization of edge computing systems. Open issues for analyzing and designing an edge computing system are also studied in this survey.