Softwarization and virtualization are key concepts for emerging industries that require ultra-low latency. This is only possible if computing resources, traditionally centralized at the core of communication networks, are moved closer to the user, to the network edge. However, the realization of Edge Computing (EC) in the sixth generation (6G) of mobile networks requires efficient resource allocation mechanisms for the placement of the Virtual Network Functions (VNFs). Machine learning (ML) methods, and more specifically, Reinforcement Learning (RL), are a promising approach to solve this problem. The main contributions of this work are twofold: first, we obtain the theoretical performance bound for VNF placement in EC-enabled 6G networks by formulating the problem mathematically as a finite Markov Decision Process (MDP) and solving it using a dynamic programming method called Policy Iteration (PI). Second, we develop a practical solution to the problem using RL, where the problem is treated with Q-Learning that considers both computational and communication resources when placing VNFs in the network. The simulation results under different settings of the system parameters show that the performance of the Q-Learning approach is close to that of the optimal PI algorithm (without having its restrictive assumptions on service statistics). This is particularly interesting when the EC resources are scarce and efficient management of these resources is required.
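As a rough illustration of the comparison described above, the following Python sketch runs Policy Iteration and tabular Q-Learning on a hypothetical three-state placement MDP. The state (number of free edge units), the rewards, the release probability, and the discount factor are all assumptions made for this example and are not taken from the paper; the paper's model also accounts for communication resources, which this toy omits.

```python
import random

# Hypothetical toy model: state = free edge units (0..2); all numbers are assumptions.
N_STATES, CLOUD, EDGE = 3, 0, 1
GAMMA, P_RELEASE = 0.8, 0.5

def transitions(s, a):
    """Return [(prob, next_state, reward)] for placing one VNF in state s."""
    if a == EDGE and s > 0:
        reward, base = 2.0, s - 1      # low-latency placement uses one edge unit
    elif a == EDGE:
        reward, base = -1.0, s         # no free edge unit: request is rejected
    else:
        reward, base = 1.0, s          # cloud placement: lower reward, no edge usage
    freed = min(base + 1, N_STATES - 1)  # a running VNF may finish and free a unit
    return [(1 - P_RELEASE, base, reward), (P_RELEASE, freed, reward)]

def q_value(V, s, a):
    """Expected one-step return of action a in state s under value estimate V."""
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in transitions(s, a))

def policy_iteration(tol=1e-10):
    """Model-based solution: needs the transition probabilities."""
    policy, V = [CLOUD] * N_STATES, [0.0] * N_STATES
    while True:
        while True:  # iterative policy evaluation
            delta = 0.0
            for s in range(N_STATES):
                v = q_value(V, s, policy[s])
                delta = max(delta, abs(v - V[s]))
                V[s] = v
            if delta < tol:
                break
        # greedy policy improvement
        improved = [max((CLOUD, EDGE), key=lambda a: q_value(V, s, a))
                    for s in range(N_STATES)]
        if improved == policy:
            return policy, V
        policy = improved

def q_learning(steps=200_000, alpha=0.01, seed=0):
    """Model-free solution: learns from sampled transitions only."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    s = N_STATES - 1
    for _ in range(steps):
        a = rng.choice((CLOUD, EDGE))  # uniform exploration (off-policy)
        outcomes = transitions(s, a)   # used here only as a simulator
        _, s2, r = rng.choices(outcomes, weights=[p for p, _, _ in outcomes])[0]
        Q[s][a] += alpha * (r + GAMMA * max(Q[s2]) - Q[s][a])
        s = s2
    return [max((CLOUD, EDGE), key=lambda a: Q[s][a]) for s in range(N_STATES)], Q

pi_policy, V = policy_iteration()
ql_policy, Q = q_learning()
print("PI policy:", pi_policy, "Q-Learning policy:", ql_policy)
```

On this toy model both methods are expected to choose the cloud when no edge unit is free and the edge otherwise. Policy Iteration reads the probabilities returned by `transitions` directly, whereas Q-Learning only samples from them.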

Related content

Networking

Networking: IFIP International Conferences on Networking. Explanation: international networking conference. Publisher: IFIP. SIT:

CASES · CASE · Examples · Design · Statistical methods

January 18, 2022

Fragility Measures For Typical Cases

Benjamin R. Baer, Stephen E. Fremes, Mary Charlson, Mario Gaudino, Martin T. Wells

from arxiv, 30 pages, 3 figures

The fragility index is a clinically motivated metric designed to supplement the $p$ value during hypothesis testing. The measure relies on two pillars: selecting cases to have their outcome modified and modifying the outcomes. The measure is interesting but the case selection suffers from a drawback which can hamper its interpretation. This work presents the drawback and a method, the stochastic generalized fragility indices, designed to remedy it. Two examples concerning electoral outcomes and the causal effect of smoking cessation illustrate the method.

Constrained optimization · Optimizer · Networks · IEEE · Computational statistics

January 17, 2022

Convergence Analysis of Fixed Point Chance Constrained Optimal Power Flow Problems

Johannes J. Brust, Mihai Anitescu

For optimal power flow problems with chance constraints, a particularly effective method is based on a fixed point iteration applied to a sequence of deterministic power flow problems. However, a priori, the convergence of such an approach is not necessarily guaranteed. This article analyses the convergence conditions for this fixed point approach, and reports numerical experiments, including on large IEEE networks.
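To make the convergence concern concrete, here is a minimal scalar sketch of a fixed point iteration x_{k+1} = g(x_k). It is only a toy: the function names, tolerance, and example maps are assumptions for illustration and have nothing to do with the power flow formulation in the paper. It shows that the same loop converges when g is a contraction near the fixed point and fails to converge otherwise.

```python
import math

def fixed_point(g, x0, tol=1e-10, max_iter=200):
    """Iterate x <- g(x); return (x, iterations, converged)."""
    x = x0
    for k in range(1, max_iter + 1):
        x_next = g(x)
        if abs(x_next - x) < tol:   # successive iterates agree: stop
            return x_next, k, True
        x = x_next
    return x, max_iter, False       # iteration budget exhausted

# |g'(x)| = |sin x| < 1 near the fixed point, so the iteration contracts.
x_cos, iters_cos, ok_cos = fixed_point(math.cos, 1.0)

# |g'(x)| = 2 > 1 everywhere, so the iterates move away from the fixed point x = -1.
x_lin, iters_lin, ok_lin = fixed_point(lambda x: 2.0 * x + 1.0, 0.0)

print(ok_cos, x_cos, iters_cos)
print(ok_lin, iters_lin)
```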

Performer · Controller · Neural Networks · Functional · tuning

January 17, 2022

Optimisation of Structured Neural Controller Based on Continuous-Time Policy Gradient

Namhoon Cho, Hyo-Sang Shin

from arxiv, 19 pages

This study presents a policy optimisation framework for structured nonlinear control of continuous-time (deterministic) dynamic systems. The proposed approach prescribes a structure for the controller based on relevant scientific knowledge (such as Lyapunov stability theory or domain experience) while treating the tunable elements inside the given structure as the point of parametrisation with neural networks. To optimise a cost represented as a function of the neural network weights, the proposed approach utilises the continuous-time policy gradient method based on adjoint sensitivity analysis as a means for correct and performant computation of the cost gradient. This enables combining the stability, robustness, and physical interpretability of an analytically derived structure for the feedback controller with the representational flexibility and optimised performance provided by machine learning techniques. Such a hybrid paradigm for fixed-structure control synthesis is particularly useful for optimising adaptive nonlinear controllers to achieve improved performance in online operation, an area where existing theory dictates the design of the structure but lacks a clear analytical understanding of how to tune the gains and the uncertainty model basis functions that govern the performance characteristics. Numerical experiments on aerospace applications illustrate the utility of the structured nonlinear controller optimisation framework.

Trace · MINE · Processing (programming language) · Cluster · Optimizer

January 16, 2022

On the Potential of Execution Traces for Batch Processing Workload Optimization in Public Clouds

Dominik Scheinert, Alireza Alamgiralem, Jonathan Bader, Jonathan Will, Thorsten Wittkopp, Lauritz Thamsen

from arxiv, 6 pages, 5 figures, 1 table

With the growing amount of data, data processing workloads and the management of their resource usage become increasingly important. Since managing a dedicated infrastructure is in many situations infeasible or uneconomical, users progressively execute their respective workloads in the cloud. As the configuration of workloads and resources is often challenging, various methods have been proposed that either quickly profile towards a good configuration or determine one based on data from previous runs. Still, performance data to train such methods is often lacking and is costly to collect. In this paper, we propose a collaborative approach for sharing anonymized workload execution traces among users, mining them for general patterns, and exploiting clusters of historical workloads for future optimizations. We evaluate our prototype implementation for mining workload execution graphs on a publicly available trace dataset and demonstrate the predictive value of workload clusters determined through traces only.

Underestimation · Biased · Estimation/estimator · Extensibility · Reducible

January 14, 2022

On the Estimation Bias in Double Q-Learning

Zhizhou Ren, Guangxiang Zhu, Hao Hu, Beining Han, Jianglun Chen, Chongjie Zhang

from arxiv, Thirty-Fifth Conference on Neural Information Processing Systems (NeurIPS 2021)

Double Q-learning is a classical method for reducing overestimation bias, which is caused by taking maximum estimated values in the Bellman operation. Its variants in the deep Q-learning paradigm have shown great promise in producing reliable value prediction and improving learning performance. However, as shown by prior work, double Q-learning is not fully unbiased and suffers from underestimation bias. In this paper, we show that such underestimation bias may lead to multiple non-optimal fixed points under an approximate Bellman operator. To address the concerns of converging to non-optimal stationary solutions, we propose a simple but effective approach as a partial fix for the underestimation bias in double Q-learning. This approach leverages approximate dynamic programming to bound the target value. We extensively evaluate our proposed method in the Atari benchmark tasks and demonstrate its significant improvement over baseline algorithms.
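The two biases mentioned above can be reproduced in a small Monte Carlo sketch that is independent of any particular RL environment. The action means, noise level, sample counts, and function name below are assumptions chosen for illustration; the sketch covers only the estimator-level effect, not the paper's fixed-point analysis or its proposed fix. The single estimator takes the maximum over one set of noisy action-value estimates, as in the Q-learning target. The double estimator selects the action with one set and evaluates it with an independent second set, as in double Q-learning.

```python
import random

def estimator_bias(true_means, n_samples=10, trials=2000, seed=0):
    """Average single- and double-estimator values of max_a E[X_a]."""
    rng = random.Random(seed)

    def sample_means():
        # Independent noisy estimate of each action's mean (unit-variance noise).
        return [sum(rng.gauss(m, 1.0) for _ in range(n_samples)) / n_samples
                for m in true_means]

    single_total = double_total = 0.0
    for _ in range(trials):
        est_a, est_b = sample_means(), sample_means()
        # Single estimator: max over one noisy set (maximization bias).
        single_total += max(est_a)
        # Double estimator: choose the action with set A, evaluate it with set B.
        best_a = max(range(len(est_a)), key=est_a.__getitem__)
        double_total += est_b[best_a]
    return single_total / trials, double_total / trials

# Hypothetical setting: nine actions with mean 0 and one slightly better action.
true_means = [0.0] * 9 + [0.2]
single, double = estimator_bias(true_means)
print(f"true max = {max(true_means):.2f}, single = {single:.3f}, double = {double:.3f}")
```

With these settings the single estimator should land above the true maximum of 0.2 and the double estimator below it, because the action selected from set A is often not the truly best one.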

Trace · Greedy layer-wise pretraining · Greedy · INFORMS · Processing (programming language)

January 14, 2022

A Markov Decision Process Framework for Efficient and Implementable Contact Tracing and Isolation

George Li, Arash Haddadan, Ann Li, Madhav Marathe, Aravind Srinivasan, Anil Vullikanti, Zeyu Zhao

from arxiv, 24 pages, 11 figures, to be published in AAMAS-22 as a 2-page extended abstract. This version has additional fairness guarantees, and fixes some typos.

Efficient contact tracing and isolation is an effective strategy to control epidemics. It was used effectively during the Ebola epidemic and successfully implemented in several parts of the world during the ongoing COVID-19 pandemic. An important consideration in contact tracing is the budget on the number of individuals asked to quarantine -- the budget is limited for socioeconomic reasons. In this paper, we present a Markov Decision Process (MDP) framework to formulate the problem of using contact tracing to reduce the size of an outbreak while asking a limited number of people to quarantine. We formulate each step of the MDP as a combinatorial problem, MinExposed, which we demonstrate is NP-Hard; as a result, we develop an LP-based approximation algorithm. Though this algorithm directly solves MinExposed, it is often impractical in the real world due to information constraints. To this end, we develop a greedy approach based on insights from the analysis of the previous algorithm, which we show is more interpretable. A key feature of the greedy algorithm is that it does not need complete information of the underlying social contact network. This makes the heuristic implementable in practice and is an important consideration. Finally, we carry out experiments on simulations of the MDP run on real-world networks, and show how the algorithms can help in bending the epidemic curve while limiting the number of isolated individuals. Our experimental results demonstrate that the greedy algorithm and its variants are especially effective, robust, and practical in a variety of realistic scenarios, such as when the contact graph and specific transmission probabilities are not known. All code can be found in our GitHub repository: //github.com/gzli929/ContactTracing.

Optimizer · Reducible · Approximation · Controller · Principle

June 29, 2020

Differential Dynamic Programming Neural Optimizer

Guan-Horng Liu, Tianrong Chen, Evangelos A. Theodorou

Interpretation of Deep Neural Networks (DNNs) training as an optimal control problem with nonlinear dynamical systems has received considerable attention recently, yet the algorithmic development remains relatively limited. In this work, we make an attempt along this line by reformulating the training procedure from the trajectory optimization perspective. We first show that most widely-used algorithms for training DNNs can be linked to Differential Dynamic Programming (DDP), a celebrated second-order trajectory optimization algorithm rooted in Approximate Dynamic Programming. In this vein, we propose a new variant of DDP that can accept batch optimization for training feedforward networks, while integrating naturally with the recent progress in curvature approximation. The resulting algorithm features layer-wise feedback policies which improve the convergence rate and reduce sensitivity to hyper-parameters compared with existing methods. We show that the algorithm is competitive against state-of-the-art first and second order methods. Our work opens up new avenues for principled algorithmic design built upon optimal control theory.

tuning · Learned · Deep reinforcement learning · Hyperparameter · Performer

December 26, 2018

Learning to Walk via Deep Reinforcement Learning

Tuomas Haarnoja, Aurick Zhou, Sehoon Ha, Jie Tan, George Tucker, Sergey Levine

from arxiv, Videos: //sites.google.com/view/minitaur-locomotion/ . arXiv admin note: substantial text overlap with arXiv:1812.05905

Deep reinforcement learning suggests the promise of fully automated learning of robotic control policies that directly map sensory inputs to low-level actions. However, applying deep reinforcement learning methods on real-world robots is exceptionally difficult, due both to the sample complexity and, just as importantly, the sensitivity of such methods to hyperparameters. While hyperparameter tuning can be performed in parallel in simulated domains, it is usually impractical to tune hyperparameters directly on real-world robotic platforms, especially legged platforms like quadrupedal robots that can be damaged through extensive trial-and-error learning. In this paper, we develop a stable variant of the soft actor-critic deep reinforcement learning algorithm that requires minimal hyperparameter tuning, while also requiring only a modest number of trials to learn multilayer neural network policies. This algorithm is based on the framework of maximum entropy reinforcement learning, and automatically trades off exploration against exploitation by dynamically and automatically tuning a temperature parameter that determines the stochasticity of the policy. We show that this method achieves state-of-the-art performance on four standard benchmark environments. We then demonstrate that it can be used to learn quadrupedal locomotion gaits on a real-world Minitaur robot, learning to walk from scratch directly in the real world in two hours of training.

Optimizer · Lipschitz continuity · Regularization term · Continuity · Lipschitz

June 1, 2018

Optimal Algorithms for Non-Smooth Distributed Optimization in Networks

Kevin Scaman, Francis Bach, Sébastien Bubeck, Yin Tat Lee, Laurent Massoulié

from arxiv, 17 pages

In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.

Networking · Network embedding · ReQuEST · Cloud computing services, products, and companies · Extensibility

January 30, 2018

Towards Efficient Dynamic Virtual Network Embedding Strategy for Cloud IoT Networks

Duc-Lam Nguyen, HyungHo Byun, Naeon Kim, Chong-Kwon Kim

from arxiv, 12 pages, 10 figures, Preprint submitted to International Journal of Distributed Sensor Networks

Network Virtualization is one of the most promising technologies for future networking and is considered a critical IT resource that connects distributed, virtualized Cloud Computing services and different components such as storage, servers, and applications. Network Virtualization allows multiple virtual networks to coexist on the same shared physical infrastructure simultaneously. One of the crucial keys in Network Virtualization is Virtual Network Embedding, which provides a method to allocate physical substrate resources to virtual network requests. In this paper, we investigate Virtual Network Embedding strategies and related issues for resource allocation of an Internet Provider (InP) to efficiently embed virtual networks that are requested by Virtual Network Operators (VNOs) who share the same infrastructure provided by the InP. To achieve that goal, we design a heuristic Virtual Network Embedding algorithm that simultaneously embeds virtual nodes and virtual links of each virtual network request onto the physical infrastructure. Through extensive simulations, we demonstrate that our proposed scheme significantly improves the performance of Virtual Network Embedding by enhancing the long-term average revenue as well as the acceptance ratio and resource utilization of virtual network requests compared to prior algorithms.