
The design of autonomous agents that can interact effectively with other agents without prior coordination is a core problem in multi-agent systems. Type-based reasoning methods achieve this by maintaining a belief over a set of potential behaviours for the other agents. However, current methods are limited in that they either assume full observability of the state and actions of the other agent or do not scale efficiently to larger problems with longer planning horizons. Addressing these limitations, we propose Partially Observable Type-based Meta Monte-Carlo Planning (POTMMCP), an online Monte-Carlo Tree Search based planning method for type-based reasoning in large partially observable environments. POTMMCP incorporates a novel meta-policy for guiding search and evaluating beliefs, allowing it to search more effectively to longer horizons using less planning time. We show that our method converges to the optimal solution in the limit and empirically demonstrate that it effectively adapts online to diverse sets of other agents across a range of environments. Comparisons with the state-of-the-art method on problems with up to $10^{14}$ states and $10^8$ observations indicate that POTMMCP is able to compute better solutions significantly faster.
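
The belief over types mentioned in the abstract can be illustrated with a minimal sketch. It covers only the simplified, fully observable case (the other agent's action is seen directly), which is the assumption the abstract says existing methods rely on; it is not POTMMCP itself. The function name `update_belief`, the type labels, and the action probabilities are hypothetical and chosen purely for illustration.

```python
def update_belief(belief, policies, observed_action):
    """Bayesian update of a belief over candidate types of the other agent.

    belief:   dict mapping type name -> prior probability
    policies: dict mapping type name -> {action: probability} (assumed known)
    Returns the posterior, proportional to prior * P(action | type).
    """
    unnormalised = {
        t: prior * policies[t].get(observed_action, 0.0)
        for t, prior in belief.items()
    }
    total = sum(unnormalised.values())
    if total == 0.0:
        # The action is impossible under every type: keep the prior unchanged.
        return dict(belief)
    return {t: p / total for t, p in unnormalised.items()}


# Hypothetical example: two candidate types with hand-picked policies.
prior = {"aggressive": 0.5, "passive": 0.5}
policies = {
    "aggressive": {"attack": 0.8, "wait": 0.2},
    "passive": {"attack": 0.1, "wait": 0.9},
}
posterior = update_belief(prior, policies, "attack")
```

Observing an action that one type is much more likely to take shifts the belief towards that type. Under partial observability the action is not seen directly, so the update would instead have to be carried out over sampled histories, which this sketch does not attempt.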

Related content

Current motion planning approaches for autonomous mobile robots often assume that the low-level controller of the system is able to track the planned motion with very high accuracy. In practice, however, tracking error can be affected by many factors, and could lead to potential collisions when the robot must traverse a cluttered environment. To address this problem, this paper proposes a novel receding-horizon motion planning approach based on Model Predictive Path Integral (MPPI) control theory, a flexible sampling-based control technique that requires minimal assumptions on vehicle dynamics and cost functions. This flexibility is leveraged to propose a motion planning framework that also considers a data-informed risk function. Using the MPPI algorithm as a motion planner also reduces the number of samples required by the algorithm, relaxing the hardware requirements for implementation. The proposed approach is validated through trajectory generation for a quadrotor unmanned aerial vehicle (UAV), where fast motion increases trajectory tracking error and can lead to collisions with nearby obstacles. Simulations and hardware experiments demonstrate that the MPPI motion planner proactively adapts to the obstacles that the UAV must negotiate, slowing down when near obstacles and moving quickly when away from obstacles, eliminating collisions while still producing lively motion.
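
The sampling-based nature of MPPI can be made concrete with a minimal sketch of its standard weighting step: sampled control perturbations are averaged with weights that decay exponentially in their rollout cost. This is the textbook form of the update only. The abstract's data-informed risk function and the vehicle dynamics are not reproduced, and the names `mppi_weights`, `mppi_update`, and `temperature` are assumptions made for illustration.

```python
import math


def mppi_weights(costs, temperature=1.0):
    """Softmin weights over sampled rollout costs; the weights sum to 1."""
    best = min(costs)  # subtract the minimum cost for numerical stability
    unnormalised = [math.exp(-(c - best) / temperature) for c in costs]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def mppi_update(nominal, perturbations, costs, temperature=1.0):
    """Shift a nominal control sequence by the cost-weighted mean perturbation.

    nominal:       list of controls, one per time step
    perturbations: list of sampled perturbation sequences (same length as nominal)
    costs:         rollout cost of each perturbed sequence (assumed precomputed)
    """
    weights = mppi_weights(costs, temperature)
    return [
        u + sum(w * eps[t] for w, eps in zip(weights, perturbations))
        for t, u in enumerate(nominal)
    ]


# Hypothetical example: two perturbations, the first far cheaper than the second.
new_controls = mppi_update(
    nominal=[0.0, 0.0],
    perturbations=[[1.0, 1.0], [-1.0, -1.0]],
    costs=[0.0, 100.0],
)
```

In a planner along the lines the abstract describes, a risk term would presumably enter through the rollout costs, so that samples passing close to obstacles at high speed receive low weight.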

Selecting the most suitable activation function is a critical factor in the effectiveness of deep learning models, as it influences their learning capacity, stability, and computational efficiency. In recent years, the Gaussian Error Linear Unit (GELU) activation function has emerged as a dominant method, surpassing traditional functions such as the Rectified Linear Unit (ReLU) in various applications. This study presents a rigorous mathematical investigation of the GELU activation function, exploring its differentiability, boundedness, stationarity, and smoothness properties in detail. Additionally, we conduct an extensive experimental comparison of the GELU function against a broad range of alternative activation functions, utilizing a residual convolutional network trained on the CIFAR-10, CIFAR-100, and STL-10 datasets as the empirical testbed. Our results demonstrate the superior performance of GELU compared to other activation functions, establishing its suitability for a wide range of deep learning applications. This comprehensive study contributes to a more profound understanding of the underlying mathematical properties of GELU and provides valuable insights for practitioners aiming to select activation functions that optimally align with their specific objectives and constraints in deep learning.
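
For reference, the exact GELU is $x\,\Phi(x)$, where $\Phi$ is the standard normal CDF, and its derivative is $\Phi(x) + x\,\phi(x)$ with $\phi$ the standard normal PDF. The sketch below evaluates both with the Python standard library. It illustrates the properties the abstract names (smoothness, a stationary point) and is not the study's own code; the function names are illustrative.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_grad(x: float) -> float:
    """Derivative of GELU: Phi(x) + x * phi(x), with phi the standard normal PDF."""
    cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return cdf + x * pdf


def relu(x: float) -> float:
    """ReLU, for comparison: zero for negative inputs, identity otherwise."""
    return max(0.0, x)


# The derivative changes sign between x = -1 and x = -0.5,
# so GELU has a stationary point (a minimum) in that interval.
sign_change = gelu_grad(-1.0) < 0.0 < gelu_grad(-0.5)
```

Unlike ReLU, GELU is smooth at the origin and takes small negative values for negative inputs, which is why it is non-monotonic and bounded below.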

A learning-based modular motion planning pipeline is presented that is compliant, safe, and reactive to perturbations at task execution. A nominal motion plan, defined as a nonlinear autonomous dynamical system (DS), is learned offline from kinesthetic demonstrations using a Neural Ordinary Differential Equation (NODE) model. To ensure both stability and safety during inference, a novel approach is proposed which selects a target point at each time step for the robot to follow, using a time-varying target trajectory generated by the learned NODE. A correction term to the NODE model is computed online by solving a Quadratic Program that guarantees stability and safety using Control Lyapunov Functions and Control Barrier Functions, respectively. Our approach outperforms baseline DS learning techniques on the LASA handwriting dataset and is validated on real-robot experiments where it is shown to produce stable motions, such as wiping and stirring, while being robust to physical perturbations and safe around humans and obstacles.

To effectively process data across a fleet of dynamic and distributed vehicles, it is crucial to implement resource provisioning techniques that provide reliable, cost-effective, and real-time computing services. This article explores resource provisioning for computation-intensive tasks over mobile vehicular clouds (MVCs). We use undirected weighted graphs (UWGs) to model both the execution of tasks and communication patterns among vehicles in an MVC. We then study low-latency and reliable scheduling of UWG tasks through a novel methodology named double-plan-promoted isomorphic subgraph search and optimization (DISCO). In DISCO, two complementary plans are envisioned to ensure effective task completion: Plan A and Plan B. Plan A analyzes past data to create an optimal mapping ($\alpha$) between tasks and the MVC in advance of the actual task scheduling. Plan B serves as a dependable backup, designed to find a feasible mapping ($\beta$) in case $\alpha$ fails during task scheduling due to the unpredictable nature of the network. We delve into DISCO's procedure and the key factors that contribute to its success. Additionally, we provide a case study that includes comprehensive comparisons to demonstrate DISCO's exceptional performance with regard to time efficiency and overhead. We further discuss a series of open directions for future research.

We consider the problem of learning control policies in discrete-time stochastic systems which guarantee that the system stabilizes within some specified stabilization region with probability $1$. Our approach is based on the novel notion of stabilizing ranking supermartingales (sRSMs) that we introduce in this work. Our sRSMs overcome the limitation of methods proposed in previous works whose applicability is restricted to systems in which the stabilizing region cannot be left once entered under any control policy. We present a learning procedure that learns a control policy together with an sRSM that formally certifies probability $1$ stability, both learned as neural networks. We show that this procedure can also be adapted to formally verifying that, under a given Lipschitz continuous control policy, the stochastic system stabilizes within some stabilizing region with probability $1$. Our experimental evaluation shows that our learning procedure can successfully learn provably stabilizing policies in practice.

Rather than traditional position control, impedance control is preferred to ensure the safe operation of industrial robots programmed from demonstrations. However, variable stiffness learning studies have focused on task performance rather than safety (or compliance). Thus, this paper proposes a novel stiffness learning method to satisfy both task performance and compliance requirements. The proposed method optimizes the task and compliance objectives (T/C objectives) simultaneously via multi-objective Bayesian optimization. We define the stiffness search space by segmenting a demonstration into task phases, each with constant responsible stiffness. The segmentation is performed by identifying impedance control-aware switching linear dynamics (IC-SLD) from the demonstration. We also utilize the stiffness obtained by the proposed IC-SLD as a prior for efficient optimization. Experiments on simulated tasks and a real robot demonstrate that IC-SLD-based segmentation and the use of priors improve the optimization efficiency compared to existing baseline methods.

In pace with developments in the research field of artificial intelligence, knowledge graphs (KGs) have attracted a surge of interest from both academia and industry. As a representation of semantic relations between entities, KGs have proven to be particularly relevant for natural language processing (NLP), experiencing a rapid spread and wide adoption within recent years. Given the increasing amount of research work in this area, several KG-related approaches have been surveyed in the NLP research community. However, a comprehensive study that categorizes established topics and reviews the maturity of individual research streams remains absent to this day. Contributing to closing this gap, we systematically analyzed 507 papers from the literature on KGs in NLP. Our survey encompasses a multifaceted review of tasks, research types, and contributions. As a result, we present a structured overview of the research landscape, provide a taxonomy of tasks, summarize our findings, and highlight directions for future work.

The development of autonomous agents which can interact with other agents to accomplish a given task is a core area of research in artificial intelligence and machine learning. Towards this goal, the Autonomous Agents Research Group develops novel machine learning algorithms for autonomous systems control, with a specific focus on deep reinforcement learning and multi-agent reinforcement learning. Research problems include scalable learning of coordinated agent policies and inter-agent communication; reasoning about the behaviours, goals, and composition of other agents from limited observations; and sample-efficient learning based on intrinsic motivation, curriculum learning, causal inference, and representation learning. This article provides a broad overview of the ongoing research portfolio of the group and discusses open problems for future directions.

Autonomic computing investigates how systems can achieve (user) specified control outcomes on their own, without the intervention of a human operator. Autonomic computing fundamentals have been substantially influenced by those of control theory for closed and open-loop systems. In practice, complex systems may exhibit a number of concurrent and inter-dependent control loops. Despite research into autonomic models for managing computer resources, ranging from individual resources (e.g., web servers) to a resource ensemble (e.g., multiple resources within a data center), research into integrating Artificial Intelligence (AI) and Machine Learning (ML) to improve resource autonomy and performance at scale continues to be a fundamental challenge. The integration of AI/ML to achieve such autonomic and self-management of systems can be achieved at different levels of granularity, from full to human-in-the-loop automation. In this article, leading academics, researchers, practitioners, engineers, and scientists in the fields of cloud computing, AI/ML, and quantum computing join to discuss current research and potential future directions for these fields. Further, we discuss challenges and opportunities for leveraging AI and ML in next generation computing for emerging computing paradigms, including cloud, fog, edge, serverless and quantum computing environments.

We address the task of automatically scoring the competency of candidates based on textual features, from the automatic speech recognition (ASR) transcriptions in the asynchronous video job interview (AVI). The key challenge is how to construct the dependency relation between questions and answers, and conduct the semantic level interaction for each question-answer (QA) pair. However, most of the recent studies in AVI focus on how to represent questions and answers better, but ignore the dependency information and interaction between them, which is critical for QA evaluation. In this work, we propose a Hierarchical Reasoning Graph Neural Network (HRGNN) for the automatic assessment of question-answer pairs. Specifically, we construct a sentence-level relational graph neural network to capture the dependency information of sentences in or between the question and the answer. Based on these graphs, we employ a semantic-level reasoning graph attention network to model the interaction states of the current QA session. Finally, we propose a gated recurrent unit encoder to represent the temporal question-answer pairs for the final prediction. Empirical results conducted on CHNAT (a real-world dataset) validate that our proposed model significantly outperforms text-matching based benchmark models. Ablation studies and experimental results with 10 random seeds also show the effectiveness and stability of our models.
