
We study the joint active/passive beamforming and channel blocklength (CBL) allocation in a non-ideal reconfigurable intelligent surface (RIS)-aided ultra-reliable and low-latency communication (URLLC) system. The considered scenario is the finite blocklength (FBL) regime, and the problem is solved by leveraging a novel deep reinforcement learning (DRL) algorithm named twin-delayed deep deterministic policy gradient (TD3). First, assuming an industrial automation system with multiple actuators, the signal-to-interference-plus-noise ratio and achievable rate in the FBL regime are identified for each actuator in terms of the phase shift configuration matrix at the RIS. Next, the joint active/passive beamforming and CBL optimization problem is formulated, where the objective is to maximize the total achievable FBL rate over all actuators, subject to the non-linear amplitude response at the RIS elements, the BS transmit power budget, and the total available CBL. Since the amplitude response equality constraint is highly non-convex and non-linear, we resort to an actor-critic policy gradient DRL algorithm based on TD3. The method relies on the interaction between the RIS and the industrial automation environment: the actions, namely the phase shifts at the RIS elements, the CBL variables, and the BS beamforming, are taken to maximize the expected observed reward, i.e., the total FBL rate. We assess the performance loss of the system when the RIS is non-ideal, i.e., has a non-linear amplitude response, and compare it with an ideal RIS without impairments. The numerical results show that optimizing the RIS phase shifts, BS beamforming, and CBL variables via the proposed TD3 method is highly beneficial for improving the network total FBL rate, as the proposed method with a deterministic policy outperforms conventional methods.
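The abstract gives no implementation details; the following is a minimal numpy sketch of the TD3 machinery it names (twin critics, clipped double-Q targets, target policy smoothing, and delayed policy updates), using linear function approximators and a placeholder reward in place of the paper's FBL-rate objective. All dimensions, hyperparameters, and the state/reward model are illustrative assumptions, not the paper's design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical): action = [RIS phase shifts, CBL split, BS beam weights]
STATE_DIM, ACTION_DIM = 8, 6

def critic(w, s, a):
    """Linear critic Q(s, a) = w . [s, a] -- a stand-in for a neural network."""
    return w @ np.concatenate([s, a])

def actor(theta, s):
    """Deterministic policy mapping the state to bounded actions (phases in [-pi, pi])."""
    return np.pi * np.tanh(theta @ s)

# Twin critics and their target copies (TD3's clipped double-Q trick)
w1 = rng.normal(size=STATE_DIM + ACTION_DIM)
w2 = rng.normal(size=STATE_DIM + ACTION_DIM)
w1_t, w2_t = w1.copy(), w2.copy()
theta = rng.normal(size=(ACTION_DIM, STATE_DIM))
theta_t = theta.copy()

gamma, tau, policy_delay, noise_clip = 0.99, 0.005, 2, 0.5

for step in range(200):
    s = rng.normal(size=STATE_DIM)                                 # observed state (toy)
    a = actor(theta, s) + rng.normal(scale=0.1, size=ACTION_DIM)   # exploration noise
    r = -np.linalg.norm(a)                # placeholder reward; the paper uses the total FBL rate
    s2 = rng.normal(size=STATE_DIM)

    # Target policy smoothing: add clipped noise to the target action
    eps = np.clip(rng.normal(scale=0.2, size=ACTION_DIM), -noise_clip, noise_clip)
    a2 = np.clip(actor(theta_t, s2) + eps, -np.pi, np.pi)

    # Clipped double-Q target: bootstrap from the minimum of the two target critics
    y = r + gamma * min(critic(w1_t, s2, a2), critic(w2_t, s2, a2))

    # SGD step on both critics toward the shared target
    x = np.concatenate([s, a])
    for w in (w1, w2):
        w += 1e-3 * (y - w @ x) * x

    if step % policy_delay == 0:
        # Delayed deterministic policy-gradient step using critic 1
        grad_a = w1[STATE_DIM:]           # dQ/da for the linear critic
        pre = theta @ s                   # chain rule through the tanh actor
        theta += 1e-4 * np.outer(grad_a * np.pi * (1 - np.tanh(pre) ** 2), s)
        # Polyak averaging of all target networks
        w1_t = (1 - tau) * w1_t + tau * w1
        w2_t = (1 - tau) * w2_t + tau * w2
        theta_t = (1 - tau) * theta_t + tau * theta
```

The delayed, smoothed target updates are what distinguish TD3 from DDPG: they curb the value over-estimation that a single bootstrapped critic tends to produce.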

Related content

Automator is an application developed by Apple for their Mac OS X operating system. Through simple point-and-click and drag-and-drop operations, a series of actions can be combined into a workflow, helping you automate (repeatably) complex tasks. Automator can also work across many different kinds of applications, including Finder, the Safari web browser, iCal, Address Book, and others. It also works with third-party applications such as Microsoft Office, Adobe Photoshop, and Pixelmator.

In this paper, we study an active IRS-aided simultaneous wireless information and power transfer (SWIPT) system. Specifically, an active IRS is deployed to assist a multi-antenna access point (AP) to convey information and energy simultaneously to multiple single-antenna information users (IUs) and energy users (EUs). Two joint transmit and reflect beamforming optimization problems are investigated with different practical objectives. The first problem maximizes the weighted sum-power harvested by the EUs subject to individual signal-to-interference-plus-noise ratio (SINR) constraints at the IUs, while the second problem maximizes the weighted sum-rate of the IUs subject to individual energy harvesting (EH) constraints at the EUs. The optimization problems are non-convex and difficult to solve optimally. To tackle these two problems, we first rigorously prove that dedicated energy beams are not required for their corresponding semidefinite relaxation (SDR) reformulations and the SDR is tight for the first problem, thus greatly simplifying the AP precoding design. Then, by capitalizing on the techniques of alternating optimization (AO), SDR, and successive convex approximation (SCA), computationally efficient algorithms are developed to obtain suboptimal solutions of the resulting optimization problems. Simulation results demonstrate that, given the same total system power budget, significant performance gains in terms of operating range of wireless power transfer (WPT), total harvested energy, as well as achievable rate can be obtained by our proposed designs over benchmark schemes (especially the one adopting a passive IRS). Moreover, it is advisable to deploy an active IRS in the proximity of the users for the effective operation of WPT/SWIPT.
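As a rough illustration of the quantities the two objectives are built from, the sketch below computes, for randomly drawn channels, the SINR of an information user and the weighted sum-power harvested by the energy users through an active IRS under a linear energy-harvesting model. The channel model, the amplification factor `amp`, the weights, and all dimensions are hypothetical assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

M, N, K_I, K_E = 4, 16, 2, 2        # AP antennas, IRS elements, IUs, EUs (toy sizes)

def cplx(*shape):
    """Unit-variance circularly symmetric complex Gaussian samples."""
    return (rng.normal(size=shape) + 1j * rng.normal(size=shape)) / np.sqrt(2)

H_ai = cplx(N, M)                    # AP -> IRS channel
h_iu = cplx(K_I, N)                  # IRS -> IU channels
h_eu = cplx(K_E, N)                  # IRS -> EU channels
W = cplx(M, K_I)                     # AP precoders: one info beam per IU, no dedicated
                                     # energy beams (shown unnecessary in the paper)

amp = 2.0                            # active-IRS amplification factor (hypothetical)
Theta = np.diag(amp * np.exp(1j * rng.uniform(0, 2 * np.pi, N)))  # reflection matrix

def sinr(k, sigma2=1e-2):
    """SINR of IU k through the cascaded channel h_k^H Theta H_ai."""
    h_eff = h_iu[k].conj() @ Theta @ H_ai
    sig = abs(h_eff @ W[:, k]) ** 2
    interf = sum(abs(h_eff @ W[:, j]) ** 2 for j in range(K_I) if j != k)
    return sig / (interf + sigma2)

def harvested(k):
    """Linear EH model: EU k harvests power from every information beam."""
    h_eff = h_eu[k].conj() @ Theta @ H_ai
    return sum(abs(h_eff @ W[:, j]) ** 2 for j in range(K_I))

alpha = np.ones(K_E)                 # EU weights (toy)
weighted_sum_power = sum(alpha[k] * harvested(k) for k in range(K_E))
```

The first problem in the paper maximizes `weighted_sum_power` over `W` and `Theta` subject to per-IU SINR floors; the second maximizes the weighted sum-rate subject to per-EU harvesting floors.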

Mobile-edge computing (MEC) is expected to provide low-latency computation services for wireless devices (WDs). However, when WDs are located at the cell edge, or the communication links between base stations (BSs) and WDs are blocked, the offloading latency becomes large. To address this issue, we propose an intelligent reflecting surface (IRS)-assisted cell-free MEC system consisting of multiple BSs and IRSs for improving the transmission environment. We then formulate a min-max latency optimization problem that jointly designs the multi-user detection (MUD) matrices, the IRSs' reflecting beamforming vectors, the WDs' transmit powers, and the edge computing resources, subject to constraints on the edge computing capability and the IRS phase shifts. To solve it, an alternating optimization algorithm based on the block coordinate descent (BCD) technique is proposed, in which the original non-convex problem is decoupled into two subproblems for alternately optimizing the computing and communication parameters. In particular, we optimize the MUD matrix based on the second-order cone programming (SOCP) technique, and then develop two efficient algorithms to optimize the IRSs' reflecting vectors based on the semi-definite relaxation (SDR) and successive convex approximation (SCA) techniques, respectively. Numerical results show that IRS-assisted cell-free MEC systems outperform conventional MEC systems, attaining up to about a 60% latency reduction. Moreover, the numerical results confirm that the proposed algorithms enjoy fast convergence, which is beneficial for practical implementation.
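The BCD idea, stripped of the SOCP/SDR/SCA machinery, can be sketched on a two-device toy problem: alternately optimize the transmit-power split and the edge-compute split, each with the other block fixed, so the max latency is non-increasing across iterations. The latency model, channel gains, and budgets below are invented purely for illustration.

```python
import numpy as np

# Toy model: latency_k = D_k / rate_k(p_k) + C_k / f_k, with budget-constrained splits
D = np.array([1.0, 2.0])       # offloaded data volumes (normalized)
C = np.array([3.0, 1.5])       # required CPU cycles (normalized)
g = np.array([0.8, 0.5])       # channel gains (toy)
P_MAX, F = 4.0, 2.0            # total transmit power and edge compute budgets

def max_latency(p, f):
    """Min-max objective: worst latency over the two devices."""
    rate = np.log2(1 + g * p)
    return np.max(D / rate + C / f)

p = np.full(2, P_MAX / 2)      # initial equal splits
f = np.full(2, F / 2)

for _ in range(30):
    # Block 1: optimize the power split with the compute split fixed (exact 1-D search)
    p = min((np.array([a, P_MAX - a]) for a in np.linspace(0.1, P_MAX - 0.1, 99)),
            key=lambda q: max_latency(q, f))
    # Block 2: optimize the compute split with the power split fixed
    f = min((np.array([a, F - a]) for a in np.linspace(0.1, F - 0.1, 99)),
            key=lambda q: max_latency(p, q))
```

Because each block is solved exactly with the other held fixed, the objective never increases, which is the convergence argument BCD-style methods rely on; the paper replaces these 1-D searches with SOCP, SDR, and SCA subproblem solvers.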

Online reinforcement learning (RL) algorithms are increasingly used to personalize digital interventions in the fields of mobile health and online education. Common challenges in designing and testing an RL algorithm in these settings include ensuring the RL algorithm can learn and run stably under real-time constraints, and accounting for the complexity of the environment, e.g., a lack of accurate mechanistic models for the user dynamics. To guide how one can tackle these challenges, we extend the PCS (Predictability, Computability, Stability) framework, a data science framework that incorporates best practices from machine learning and statistics in supervised learning (Yu and Kumbier, 2020), to the design of RL algorithms for the digital interventions setting. Further, we provide guidelines on how to design simulation environments, a crucial tool for evaluating RL candidate algorithms using the PCS framework. We illustrate the use of the PCS framework for designing an RL algorithm for Oralytics, a mobile health study aiming to improve users' tooth-brushing behaviors through the personalized delivery of intervention messages. Oralytics will go into the field in late 2022.

In this paper, we study the performance of reconfigurable intelligent surfaces (RISs) in a multicell broadcast channel (BC) that employs improper Gaussian signaling (IGS) jointly with non-orthogonal multiple access (NOMA) to optimize either the minimum-weighted rate or the energy efficiency (EE) of the network. We show that although the RIS can significantly improve the system performance, it cannot mitigate interference completely, so we have to employ other interference-management techniques to further improve performance. We show that the proposed NOMA-based IGS scheme can substantially outperform proper Gaussian signaling (PGS) and IGS schemes that treat interference as noise (TIN), particularly when the number of users per cell is larger than the number of base station (BS) antennas (referred to as overloaded networks). In other words, IGS and NOMA complement each other as interference-management techniques in multicell RIS-assisted BCs. Furthermore, we consider three different feasibility sets for the RIS components, showing that even a RIS with a small number of elements provides considerable gains for all the feasibility sets.
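Improper Gaussian signaling differs from proper signaling in having a non-zero pseudo-variance $\mathbb{E}[x^2]$. The snippet below generates improper complex Gaussian samples with unit variance and a chosen circularity coefficient `kappa`, purely to illustrate the signal model; the construction and the value of `kappa` are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Circularity coefficient: kappa = 0 gives a proper (circularly symmetric) signal,
# kappa in (0, 1] an improper one. Unequal real/imaginary powers realize kappa.
kappa = 0.8
xr = rng.normal(scale=np.sqrt((1 + kappa) / 2), size=n)   # real-part power (1+kappa)/2
xi = rng.normal(scale=np.sqrt((1 - kappa) / 2), size=n)   # imag-part power (1-kappa)/2
x = xr + 1j * xi

variance = np.mean(np.abs(x) ** 2)       # E[|x|^2] -> 1
pseudo_variance = np.mean(x ** 2)        # E[x^2]   -> kappa (non-zero: improper)
```

Optimizing over `kappa` (per user, per cell) is the extra degree of freedom IGS adds relative to PGS, which is what the paper exploits in overloaded networks.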

Signalized intersections in arterial roads result in persistent vehicle idling and excess accelerations, contributing to fuel consumption and CO2 emissions. There has thus been a line of work studying eco-driving control strategies to reduce fuel consumption and emission levels at intersections. However, methods to devise effective control strategies across a variety of traffic settings remain elusive. In this paper, we propose a reinforcement learning (RL) approach to learn effective eco-driving control strategies. We analyze the potential impact of a learned strategy on fuel consumption, CO2 emissions, and travel time, and compare it with naturalistic driving and model-based baselines. We further demonstrate the generalizability of the learned policies under mixed traffic scenarios. Simulation results indicate that scenarios with 100% penetration of connected autonomous vehicles (CAVs) may yield up to an 18% reduction in fuel consumption and a 25% reduction in CO2 emission levels while even improving travel speed by 20%. Furthermore, the results indicate that even 25% CAV penetration can bring at least 50% of the total fuel and emission reduction benefits.
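A per-step reward of the kind such an eco-driving RL agent might optimize can be sketched as follows; the weights, the idling threshold, and the quadratic acceleration penalty are illustrative assumptions, not the paper's actual reward design.

```python
def eco_driving_reward(fuel_rate: float, speed: float, accel: float,
                       w_fuel: float = 1.0, w_idle: float = 0.5,
                       w_speed: float = 0.1) -> float:
    """Toy per-step reward trading off fuel burn against travel progress.

    fuel_rate: instantaneous fuel consumption (e.g., from a surrogate fuel model)
    speed, accel: current vehicle state. Idling (near-zero speed) and harsh
    accelerations are penalized; forward progress is rewarded.
    """
    idle_penalty = w_idle if speed < 0.1 else 0.0
    return -w_fuel * fuel_rate - idle_penalty - 0.05 * accel ** 2 + w_speed * speed
```

Balancing these terms is exactly the fuel/emissions/travel-time trade-off the abstract reports on: a free-flowing vehicle earns a higher reward than one idling and burning fuel at a red light.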

Recommender systems have been widely applied in different real-life scenarios to help us find useful information. Recently, reinforcement learning (RL) based recommender systems have become an emerging research topic, often surpassing traditional recommendation models and even most deep learning-based methods, owing to their interactive nature and autonomous learning ability. Nevertheless, applying RL in recommender systems raises various challenges. To this end, we first provide a thorough overview, comparison, and summarization of RL approaches for five typical recommendation scenarios, following the three main categories of RL: value-function, policy search, and actor-critic methods. Then, we systematically analyze the challenges and relevant solutions on the basis of the existing literature. Finally, in a discussion of the open issues of RL and its limitations in recommendation, we highlight some potential research directions in this field.

Click-through rate (CTR) prediction plays a critical role in recommender systems and online advertising. The data used in these applications are multi-field categorical data, where each feature belongs to one field. Field information has proved to be important, and several works consider fields in their models. In this paper, we propose a novel approach to model the field information effectively and efficiently. The proposed approach is a direct improvement of FwFM and is named Field-matrixed Factorization Machines (FmFM, or $FM^2$). We also propose a new interpretation of FM and FwFM within the FmFM framework and compare it with the FFM. Besides pruning the cross terms, our model supports field-specific variable dimensions of the embedding vectors, which acts as soft pruning. We also propose an efficient way to minimize the dimensions while preserving the model performance. The FmFM model can be optimized further by caching the intermediate vectors, so that it takes only thousands of floating-point operations (FLOPs) to make a prediction. Our experimental results show that it can outperform the FFM, which is more complex. The FmFM model's performance is also comparable to that of DNN models, which require many more FLOPs at runtime.
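A minimal numpy sketch of the FmFM scoring function, as we read it from the abstract: each cross term transforms one feature embedding by a learned field-pair matrix before the dot product with the other embedding. FM is then the special case where every field-pair matrix is the identity, and FwFM the case where it is a scalar multiple of the identity. The dimensions, the feature-to-field map, and the random parameters below are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_features, n_fields, dim = 6, 3, 4
field_of = np.array([0, 0, 1, 1, 2, 2])   # feature -> field map (toy: 2 features per field)

V = rng.normal(scale=0.1, size=(n_features, dim))               # feature embeddings
M = rng.normal(scale=0.1, size=(n_fields, n_fields, dim, dim))  # field-pair matrices
w = rng.normal(scale=0.1, size=n_features)                      # linear weights
w0 = 0.0                                                        # global bias

def fmfm_score(active):
    """FmFM logit for a list of active feature indices (one per field)."""
    s = w0 + w[active].sum()
    for a in range(len(active)):
        for b in range(a + 1, len(active)):
            i, j = active[a], active[b]
            # Cross term: v_i transformed by the field-pair matrix, dotted with v_j.
            # With M[f,f'] = I this reduces to FM's <v_i, v_j>;
            # with M[f,f'] = r * I it reduces to FwFM.
            s += V[i] @ M[field_of[i], field_of[j]] @ V[j]
    return s
```

Because `M[f, f'] @ v_i` depends only on the field pair, these transformed vectors can be cached per field pair, which is the optimization the abstract credits for the low per-prediction FLOP count. Field-specific embedding dimensions follow by letting `M[f, f']` be rectangular.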

Interest in the field of Explainable Artificial Intelligence has been growing for decades and has accelerated recently. As Artificial Intelligence models have become more complex, and often more opaque, with the incorporation of complex machine learning techniques, explainability has become more critical. Recently, researchers have been investigating and tackling explainability with a user-centric focus, looking for explanations to consider trustworthiness, comprehensibility, explicit provenance, and context-awareness. In this chapter, we leverage our survey of explanation literature in Artificial Intelligence and closely related fields and use these past efforts to generate a set of explanation types that we feel reflect the expanded needs of explanation for today's artificial intelligence applications. We define each type and provide an example question that would motivate the need for this style of explanation. We believe this set of explanation types will help future system designers in their generation and prioritization of requirements and further help generate explanations that are better aligned to users' and situational needs.

Reinforcement learning is one of the core components in designing an artificial intelligence system that emphasizes real-time response. Reinforcement learning drives the system to take actions within an arbitrary environment, with or without prior knowledge of the environment model. In this paper, we present a comprehensive study of reinforcement learning covering several dimensions, including challenges, recent developments in state-of-the-art techniques, and future directions. The fundamental objective of this paper is to provide a presentation of the available reinforcement learning methods that is informative enough and simple to follow for new researchers and academics in this domain, considering the latest concerns. First, we illustrate the core techniques of reinforcement learning in an easily understandable and comparable way. Then, we analyze and describe recent developments in reinforcement learning approaches. Our analysis points out that most of the models focus on tuning policy values rather than other aspects of a particular state of reasoning.

This paper presents a new multi-objective deep reinforcement learning (MODRL) framework based on deep Q-networks. We propose the use of linear and non-linear methods to develop the MODRL framework, which includes both single-policy and multi-policy strategies. The experimental results on two benchmark problems, the two-objective deep sea treasure environment and the three-objective mountain car problem, indicate that the proposed framework is able to converge to the optimal Pareto solutions effectively. The proposed framework is generic, allowing the implementation of different deep reinforcement learning algorithms in different complex environments. It therefore overcomes many of the difficulties of the standard multi-objective reinforcement learning (MORL) methods in the current literature. The framework provides a testbed environment for developing methods to solve various problems associated with current MORL. Details of the framework implementation are available at //www.deakin.edu.au/~thanhthi/drl.htm.
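The linear and non-linear (e.g., Chebyshev) scalarizations that single-policy MODRL methods typically apply to a vector-valued Q-function can be sketched as follows; the Q-values, weights, and utopian point below are invented for illustration and are not taken from the paper's benchmarks.

```python
import numpy as np

# Vector-valued Q(s, a) with two objectives per action
# (e.g., treasure value vs. time penalty in deep sea treasure)
Q = np.array([[[1.0, -3.0], [4.0, -6.0]],    # state 0: two actions, two objectives each
              [[2.0, -1.0], [0.5, -0.5]]])   # state 1

def linear_scalarize(q_vec, weights):
    """Linear scalarization: weighted sum of the objective Q-values."""
    return q_vec @ weights

def chebyshev_scalarize(q_vec, weights, utopia):
    """Non-linear (Chebyshev) scalarization: negated weighted max
    distance to a utopian reference point, so larger is better."""
    return -np.max(weights * np.abs(q_vec - utopia), axis=-1)

w = np.array([0.7, 0.3])                     # preference weights over the objectives
utopia = np.array([4.0, 0.0])                # utopian point (toy)

greedy_linear = np.argmax(linear_scalarize(Q[0], w))
greedy_cheby = np.argmax(chebyshev_scalarize(Q[0], w, utopia))
```

Sweeping the weight vector `w` over the simplex is one simple way a single-policy framework can trace out approximations of the Pareto front; the Chebyshev form can additionally reach non-convex parts of the front that linear weighting misses.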
