亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We study emergent communication in a multi-agent reinforcement learning setting, where the agents solve cooperative tasks and have access to a communication channel. The communication channel may consist of either discrete symbols or continuous variables. We introduce an inductive bias to aid with the emergence of good communication protocols for continuous messages, and we look at the effect this type of inductive bias has for continuous and discrete messages in itself or when used in combination with reinforcement learning. We demonstrate that this type of inductive bias has a beneficial effect on the communication protocols learnt in two toy environments, Negotiation and Sequence Guess.

相關內容

Residual connections have been proposed as architecture-based inductive bias to mitigate the problem of exploding and vanishing gradients and increase task performance in both feed-forward and recurrent networks (RNNs) when trained with the backpropagation algorithm. Yet, little is known about how residual connections in RNNs influence their dynamics and fading memory properties. Here, we introduce weakly coupled residual recurrent networks (WCRNNs) in which residual connections result in well-defined Lyapunov exponents and allow for studying properties of fading memory. We investigate how the residual connections of WCRNNs influence their performance, network dynamics, and memory properties on a set of benchmark tasks. We show that several distinct forms of residual connections yield effective inductive biases that result in increased network expressivity. In particular, residual connections that (i) result in network dynamics at the proximity of the edge of chaos, (ii) allow networks to capitalize on characteristic spectral properties of the data, and (iii) result in heterogeneous memory properties are shown to increase practical expressivity. In addition, we demonstrate how our results can be extended to non-linear residuals and introduce a weakly coupled residual initialization scheme that can be used for Elman RNNs

The ever-growing use of wind energy makes necessary the optimization of turbine operations through pitch angle controllers and their maintenance with early fault detection. It is crucial to have accurate and robust models imitating the behavior of wind turbines, especially to predict the generated power as a function of the wind speed. Existing empirical and physics-based models have limitations in capturing the complex relations between the input variables and the power, aggravated by wind variability. Data-driven methods offer new opportunities to enhance wind turbine modeling of large datasets by improving accuracy and efficiency. In this study, we used physics-informed neural networks to reproduce historical data coming from 4 turbines in a wind farm, while imposing certain physical constraints to the model. The developed models for regression of the power, torque, and power coefficient as output variables showed great accuracy for both real data and physical equations governing the system. Lastly, introducing an efficient evidential layer provided uncertainty estimations of the predictions, proved to be consistent with the absolute error, and made possible the definition of a confidence interval in the power curve.

We first define appropriate state representation and action space, and then design an adjustment mechanism based on the actions selected by the intelligent agent. The adjustment mechanism outputs the next state and reward value of the agent. Additionally, the adjustment mechanism calculates the error between the adjusted state and the unadjusted state. Furthermore, the intelligent agent stores the acquired experience samples containing states and reward values in a buffer and replays the experiences during each iteration to learn the dynamic characteristics of the environment. We name the improved algorithm as the DQM algorithm. Experimental results demonstrate that the intelligent agent using our proposed algorithm effectively reduces the accumulated errors of inertial navigation in dynamic environments. Although our research provides a basis for achieving autonomous navigation of unmanned aerial vehicles, there is still room for significant optimization. Further research can include testing unmanned aerial vehicles in simulated environments, testing unmanned aerial vehicles in real-world environments, optimizing the design of reward functions, improving the algorithm workflow to enhance convergence speed and performance, and enhancing the algorithm's generalization ability.

The past few years have seen rapid progress in combining reinforcement learning (RL) with deep learning. Various breakthroughs ranging from games to robotics have spurred the interest in designing sophisticated RL algorithms and systems. However, the prevailing workflow in RL is to learn tabula rasa, which may incur computational inefficiency. This precludes continuous deployment of RL algorithms and potentially excludes researchers without large-scale computing resources. In many other areas of machine learning, the pretraining paradigm has shown to be effective in acquiring transferable knowledge, which can be utilized for a variety of downstream tasks. Recently, we saw a surge of interest in Pretraining for Deep RL with promising results. However, much of the research has been based on different experimental settings. Due to the nature of RL, pretraining in this field is faced with unique challenges and hence requires new design principles. In this survey, we seek to systematically review existing works in pretraining for deep reinforcement learning, provide a taxonomy of these methods, discuss each sub-field, and bring attention to open problems and future directions.

Edge computing facilitates low-latency services at the network's edge by distributing computation, communication, and storage resources within the geographic proximity of mobile and Internet-of-Things (IoT) devices. The recent advancement in Unmanned Aerial Vehicles (UAVs) technologies has opened new opportunities for edge computing in military operations, disaster response, or remote areas where traditional terrestrial networks are limited or unavailable. In such environments, UAVs can be deployed as aerial edge servers or relays to facilitate edge computing services. This form of computing is also known as UAV-enabled Edge Computing (UEC), which offers several unique benefits such as mobility, line-of-sight, flexibility, computational capability, and cost-efficiency. However, the resources on UAVs, edge servers, and IoT devices are typically very limited in the context of UEC. Efficient resource management is, therefore, a critical research challenge in UEC. In this article, we present a survey on the existing research in UEC from the resource management perspective. We identify a conceptual architecture, different types of collaborations, wireless communication models, research directions, key techniques and performance indicators for resource management in UEC. We also present a taxonomy of resource management in UEC. Finally, we identify and discuss some open research challenges that can stimulate future research directions for resource management in UEC.

Maritime activities represent a major domain of economic growth with several emerging maritime Internet of Things use cases, such as smart ports, autonomous navigation, and ocean monitoring systems. The major enabler for this exciting ecosystem is the provision of broadband, low-delay, and reliable wireless coverage to the ever-increasing number of vessels, buoys, platforms, sensors, and actuators. Towards this end, the integration of unmanned aerial vehicles (UAVs) in maritime communications introduces an aerial dimension to wireless connectivity going above and beyond current deployments, which are mainly relying on shore-based base stations with limited coverage and satellite links with high latency. Considering the potential of UAV-aided wireless communications, this survey presents the state-of-the-art in UAV-aided maritime communications, which, in general, are based on both conventional optimization and machine-learning-aided approaches. More specifically, relevant UAV-based network architectures are discussed together with the role of their building blocks. Then, physical-layer, resource management, and cloud/edge computing and caching UAV-aided solutions in maritime environments are discussed and grouped based on their performance targets. Moreover, as UAVs are characterized by flexible deployment with high re-positioning capabilities, studies on UAV trajectory optimization for maritime applications are thoroughly discussed. In addition, aiming at shedding light on the current status of real-world deployments, experimental studies on UAV-aided maritime communications are presented and implementation details are given. Finally, several important open issues in the area of UAV-aided maritime communications are given, related to the integration of sixth generation (6G) advancements.

In the past decade, we have witnessed the rise of deep learning to dominate the field of artificial intelligence. Advances in artificial neural networks alongside corresponding advances in hardware accelerators with large memory capacity, together with the availability of large datasets enabled researchers and practitioners alike to train and deploy sophisticated neural network models that achieve state-of-the-art performance on tasks across several fields spanning computer vision, natural language processing, and reinforcement learning. However, as these neural networks become bigger, more complex, and more widely used, fundamental problems with current deep learning models become more apparent. State-of-the-art deep learning models are known to suffer from issues that range from poor robustness, inability to adapt to novel task settings, to requiring rigid and inflexible configuration assumptions. Ideas from collective intelligence, in particular concepts from complex systems such as self-organization, emergent behavior, swarm optimization, and cellular systems tend to produce solutions that are robust, adaptable, and have less rigid assumptions about the environment configuration. It is therefore natural to see these ideas incorporated into newer deep learning methods. In this review, we will provide a historical context of neural network research's involvement with complex systems, and highlight several active areas in modern deep learning research that incorporate the principles of collective intelligence to advance its current capabilities. To facilitate a bi-directional flow of ideas, we also discuss work that utilize modern deep learning models to help advance complex systems research. We hope this review can serve as a bridge between complex systems and deep learning communities to facilitate the cross pollination of ideas and foster new collaborations across disciplines.

Unsupervised domain adaptation has recently emerged as an effective paradigm for generalizing deep neural networks to new target domains. However, there is still enormous potential to be tapped to reach the fully supervised performance. In this paper, we present a novel active learning strategy to assist knowledge transfer in the target domain, dubbed active domain adaptation. We start from an observation that energy-based models exhibit free energy biases when training (source) and test (target) data come from different distributions. Inspired by this inherent mechanism, we empirically reveal that a simple yet efficient energy-based sampling strategy sheds light on selecting the most valuable target samples than existing approaches requiring particular architectures or computation of the distances. Our algorithm, Energy-based Active Domain Adaptation (EADA), queries groups of targe data that incorporate both domain characteristic and instance uncertainty into every selection round. Meanwhile, by aligning the free energy of target data compact around the source domain via a regularization term, domain gap can be implicitly diminished. Through extensive experiments, we show that EADA surpasses state-of-the-art methods on well-known challenging benchmarks with substantial improvements, making it a useful option in the open world. Code is available at //github.com/BIT-DA/EADA.

Relation prediction for knowledge graphs aims at predicting missing relationships between entities. Despite the importance of inductive relation prediction, most previous works are limited to a transductive setting and cannot process previously unseen entities. The recent proposed subgraph-based relation reasoning models provided alternatives to predict links from the subgraph structure surrounding a candidate triplet inductively. However, we observe that these methods often neglect the directed nature of the extracted subgraph and weaken the role of relation information in the subgraph modeling. As a result, they fail to effectively handle the asymmetric/anti-symmetric triplets and produce insufficient embeddings for the target triplets. To this end, we introduce a \textbf{C}\textbf{o}mmunicative \textbf{M}essage \textbf{P}assing neural network for \textbf{I}nductive re\textbf{L}ation r\textbf{E}asoning, \textbf{CoMPILE}, that reasons over local directed subgraph structures and has a vigorous inductive bias to process entity-independent semantic relations. In contrast to existing models, CoMPILE strengthens the message interactions between edges and entitles through a communicative kernel and enables a sufficient flow of relation information. Moreover, we demonstrate that CoMPILE can naturally handle asymmetric/anti-symmetric relations without the need for explosively increasing the number of model parameters by extracting the directed enclosing subgraphs. Extensive experiments show substantial performance gains in comparison to state-of-the-art methods on commonly used benchmark datasets with variant inductive settings.

This paper presents a new multi-objective deep reinforcement learning (MODRL) framework based on deep Q-networks. We propose the use of linear and non-linear methods to develop the MODRL framework that includes both single-policy and multi-policy strategies. The experimental results on two benchmark problems including the two-objective deep sea treasure environment and the three-objective mountain car problem indicate that the proposed framework is able to converge to the optimal Pareto solutions effectively. The proposed framework is generic, which allows implementation of different deep reinforcement learning algorithms in different complex environments. This therefore overcomes many difficulties involved with standard multi-objective reinforcement learning (MORL) methods existing in the current literature. The framework creates a platform as a testbed environment to develop methods for solving various problems associated with the current MORL. Details of the framework implementation can be referred to //www.deakin.edu.au/~thanhthi/drl.htm.

北京阿比特科技有限公司