There is a continuous growth in demand for time sensitive applications which has shifted the cloud paradigm from a centralized computing architecture towards distributed heterogeneous computing platforms where resources located at the edge of the network are used to provide cloud-like services. This paradigm is widely known as fog computing. Virtual machines (VMs) have been widely utilized in both paradigms to enhance the network scalability, improve resource utilization, and energy efficiency. Moreover, Passive Optical Networks (PONs) are a technology suited to handling the enormous volumes of data generated in the access network due to their energy efficiency and large bandwidth. In this paper, we utilize a PON to provide the connectivity between multiple distributed fog units to achieve federated (i.e. cooperative) computing units in the access network to serve intensive demands. We propose a mixed integer linear program (MILP) to optimize the VM placement in the federated fog computing units with the objective of minimizing the total power consumption while considering inter-VM traffic. The results show a significant power saving as a result of the proposed optimization model by up to 52%, in the VM-allocation compared to a baseline approach that allocates the VM requests while neglecting the power consumption and inter-VMs traffic in the optimization framework.
Integrated sensing and communication enables sensing capability for wireless networks. However, the interference management and resource allocation between sensing and communication have not been fully studied. In this paper, we consider the design of perceptive mobile networks (PMNs) by adding sensing capability to current cellular networks. To avoid the full-duplex operation and reduce interference, we propose the PMN with distributed target monitoring terminals (TMTs) where passive TMTs are deployed over wireless networks to locate the sensing target (ST). We then jointly optimize the transmit and receive beamformers towards the communication user terminals (UEs) and the ST by alternating-optimization (AO) and prove its convergence. To reduce computation complexity and obtain physical insights, we further investigate the use of linear transceivers, including zero forcing and beam synthesis (B-syn), and show that B-syn can achieve comparable sensing performance as AO especially when the communication requirement is high. Some interesting physical insights are also revealed. For example, instead of forming a dedicated sensing signal, it is more efficient to jointly design the communication signals for different UEs such that they ``collaboratively leak" energy to the ST. Furthermore, the amount of energy leakage from one UE to the ST depends on their relative locations.
Training a machine learning model with federated edge learning (FEEL) is typically time-consuming due to the constrained computation power of edge devices and limited wireless resources in edge networks. In this paper, the training time minimization problem is investigated in a quantized FEEL system, where the heterogeneous edge devices send quantized gradients to the edge server via orthogonal channels. In particular, a stochastic quantization scheme is adopted for compression of uploaded gradients, which can reduce the burden of per-round communication but may come at the cost of increasing number of communication rounds. The training time is modeled by taking into account the communication time, computation time and the number of communication rounds. Based on the proposed training time model, the intrinsic trade-off between the number of communication rounds and per-round latency is characterized. Specifically, we analyze the convergence behavior of the quantized FEEL in terms of the optimality gap. Further, a joint data-and-model-driven fitting method is proposed to obtain the exact optimality gap, based on which the closed-form expressions for the number of communication rounds and the total training time are obtained. Constrained by total bandwidth, the training time minimization problem is formulated as a joint quantization level and bandwidth allocation optimization problem. To this end, an algorithm based on alternating optimization is proposed, which alternatively solves the subproblem of quantization optimization via successive convex approximation and the subproblem of bandwidth allocation via bisection search. With different learning tasks and models, the validation of our analysis and the near-optimal performance of the proposed optimization algorithm are demonstrated by the experimental results.
Vehicular edge computing (VEC) is envisioned as a promising approach to process the explosive computation tasks of vehicular user (VU). In the VEC system, each VU allocates power to process partial tasks through offloading and the remaining tasks through local execution. During the offloading, each VU adopts the multi-input multi-out and non-orthogonal multiple access (MIMO-NOMA) channel to improve the channel spectrum efficiency and capacity. However, the channel condition is uncertain due to the channel interference among VUs caused by the MIMO-NOMA channel and the time-varying path-loss caused by the mobility of each VU. In addition, the task arrival of each VU is stochastic in the real world. The stochastic task arrival and uncertain channel condition affect greatly on the power consumption and latency of tasks for each VU. It is critical to design an optimal power allocation scheme considering the stochastic task arrival and channel variation to optimize the long-term reward including the power consumption and latency in the MIMO-NOMA VEC. Different from the traditional centralized deep reinforcement learning (DRL)-based scheme, this paper constructs a decentralized DRL framework to formulate the power allocation optimization problem, where the local observations are selected as the state. The deep deterministic policy gradient (DDPG) algorithm is adopted to learn the optimal power allocation scheme based on the decentralized DRL framework. Simulation results demonstrate that our proposed power allocation scheme outperforms the existing schemes.
We initiate a systematic study on $\mathit{dynamic}$ $\mathit{influence}$ $\mathit{maximization}$ (DIM). In the DIM problem, one maintains a seed set $S$ of at most $k$ nodes in a dynamically involving social network, with the goal of maximizing the expected influence spread while minimizing the amortized updating cost. We consider two evolution models. In the $\mathit{incremental}$ model, the social network gets enlarged over time and one only introduces new users and establishes new social links, we design an algorithm that achieves $(1-1/e-\epsilon)$-approximation to the optimal solution and has $k \cdot\mathsf{poly}(\log n, \epsilon^{-1})$ amortized running time, which matches the state-of-art offline algorithm with only poly-logarithmic overhead. In the $\mathit{fully}$ $\mathit{dynamic}$ model, users join in and leave, influence propagation gets strengthened or weakened in real time, we prove that under the Strong Exponential Time Hypothesis (SETH), no algorithm can achieve $2^{-(\log n)^{1-o(1)}}$-approximation unless the amortized running time is $n^{1-o(1)}$. On the technical side, we exploit novel adaptive sampling approaches that reduce DIM to the dynamic MAX-k coverage problem, and design an efficient $(1-1/e-\epsilon)$-approximation algorithm for it. Our lower bound leverages the recent developed distributed PCP framework.
Federated learning (FL) has emerged as a popular methodology for distributing machine learning across wireless edge devices. In this work, we consider optimizing the tradeoff between model performance and resource utilization in FL, under device-server communication delays and device computation heterogeneity. Our proposed StoFedDelAv algorithm incorporates a local-global model combiner into the FL synchronization step. We theoretically characterize the convergence behavior of StoFedDelAv and obtain the optimal combiner weights, which consider the global model delay and expected local gradient error at each device. We then formulate a network-aware optimization problem which tunes the minibatch sizes of the devices to jointly minimize energy consumption and machine learning training loss, and solve the non-convex problem through a series of convex approximations. Our simulations reveal that StoFedDelAv outperforms the current art in FL in terms of model convergence speed and network resource utilization when the minibatch size and the combiner weights are adjusted. Additionally, our method can reduce the number of uplink communication rounds required during the model training period to reach the same accuracy.
With growing deployment of Internet of Things (IoT) and machine learning (ML) applications, which need to leverage computation on edge and cloud resources, it is important to develop algorithms and tools to place these distributed computations to optimize their performance. We address the problem of optimally placing computations (described as directed acyclic graphs (DAGs)) on a set of machines to maximize the steady-state throughput for pipelined inputs. Traditionally, such optimization has focused on a different metric, minimizing single-shot makespan, and a well-known algorithm is the Heterogeneous Earliest Finish Time (HEFT) algorithm. Maximizing throughput however, is more suitable for many real-time, edge, cloud and IoT applications, we present a different scheduling algorithm, namely Throughput HEFT (TPHEFT). Further, we present two throughput-oriented enhancements which can be applied to any baseline schedule, that we refer to as "node splitting" (SPLIT) and "task duplication" (DUP). In order to implement and evaluate these algorithms, we built new subsystems and plugins for an open-source dispersed computing framework called Jupiter. Experiments with varying DAG structures indicate that: 1) TPHEFT can significantly improve throughput performance compared to HEFT (up to 2.3 times in our experiments), with greater gains when there is less degree of parallelism in the DAG, 2) Node splitting can potentially improve performance over a baseline schedule, with greater gains when there's an imbalanced allocation of computation or inter-task communication, and 3) Task duplication generally gives improvements only when running upon a baseline that places communication over slow links. To our knowledge, this is the first study to present a systematic experimental implementation and exploration of throughput-enhancing techniques for dispersed computing on real testbeds.
Federated learning has been showing as a promising approach in paving the last mile of artificial intelligence, due to its great potential of solving the data isolation problem in large scale machine learning. Particularly, with consideration of the heterogeneity in practical edge computing systems, asynchronous edge-cloud collaboration based federated learning can further improve the learning efficiency by significantly reducing the straggler effect. Despite no raw data sharing, the open architecture and extensive collaborations of asynchronous federated learning (AFL) still give some malicious participants great opportunities to infer other parties' training data, thus leading to serious concerns of privacy. To achieve a rigorous privacy guarantee with high utility, we investigate to secure asynchronous edge-cloud collaborative federated learning with differential privacy, focusing on the impacts of differential privacy on model convergence of AFL. Formally, we give the first analysis on the model convergence of AFL under DP and propose a multi-stage adjustable private algorithm (MAPA) to improve the trade-off between model utility and privacy by dynamically adjusting both the noise scale and the learning rate. Through extensive simulations and real-world experiments with an edge-could testbed, we demonstrate that MAPA significantly improves both the model accuracy and convergence speed with sufficient privacy guarantee.
This paper addresses the problem of formally verifying desirable properties of neural networks, i.e., obtaining provable guarantees that neural networks satisfy specifications relating their inputs and outputs (robustness to bounded norm adversarial perturbations, for example). Most previous work on this topic was limited in its applicability by the size of the network, network architecture and the complexity of properties to be verified. In contrast, our framework applies to a general class of activation functions and specifications on neural network inputs and outputs. We formulate verification as an optimization problem (seeking to find the largest violation of the specification) and solve a Lagrangian relaxation of the optimization problem to obtain an upper bound on the worst case violation of the specification being verified. Our approach is anytime i.e. it can be stopped at any time and a valid bound on the maximum violation can be obtained. We develop specialized verification algorithms with provable tightness guarantees under special assumptions and demonstrate the practical significance of our general verification approach on a variety of verification tasks.
In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.
Network Virtualization is one of the most promising technologies for future networking and considered as a critical IT resource that connects distributed, virtualized Cloud Computing services and different components such as storage, servers and application. Network Virtualization allows multiple virtual networks to coexist on same shared physical infrastructure simultaneously. One of the crucial keys in Network Virtualization is Virtual Network Embedding, which provides a method to allocate physical substrate resources to virtual network requests. In this paper, we investigate Virtual Network Embedding strategies and related issues for resource allocation of an Internet Provider(InP) to efficiently embed virtual networks that are requested by Virtual Network Operators(VNOs) who share the same infrastructure provided by the InP. In order to achieve that goal, we design a heuristic Virtual Network Embedding algorithm that simultaneously embeds virtual nodes and virtual links of each virtual network request onto physic infrastructure. Through extensive simulations, we demonstrate that our proposed scheme improves significantly the performance of Virtual Network Embedding by enhancing the long-term average revenue as well as acceptance ratio and resource utilization of virtual network requests compared to prior algorithms.