Replanners are efficient methods for solving non-deterministic planning problems. Despite showing good scalability, existing replanners often fail on problems involving a large number of misleading plans, i.e., weak plans that do not lead to strong solutions but, due to their minimal lengths, are likely to be found at every replanning iteration. The poor performance of replanners on such problems is due to their all-outcome determinization: when compiling from non-deterministic to classical, they include all compiled classical operators in a single deterministic domain, which leads replanners to continually generate misleading plans. We introduce an offline replanner, called Safe-Planner (SP), that relies on a single-outcome determinization to compile a non-deterministic domain into a set of classical domains, together with ordering heuristics for ranking the obtained classical domains. The proposed single-outcome determinization and the heuristics allow for alternating between different classical domains. We show experimentally that this approach allows SP to avoid generating misleading plans and instead generate weak plans that directly lead to strong solutions. The experiments show that SP outperforms state-of-the-art non-deterministic solvers by solving a broader range of problems. We also validate the practical utility of SP in real-world non-deterministic robotic tasks.
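To make the contrast concrete, the following minimal Python sketch (using a made-up string-based action representation rather than SP's actual PDDL compilation) shows how all-outcome determinization folds every outcome into a single classical domain, whereas single-outcome determinization produces one classical domain per outcome choice, between which a replanner can alternate:

```python
from itertools import product

# Toy action model: each non-deterministic action maps to a list of
# alternative effects (plain strings here; SP operates on PDDL, not this).
nd_domain = {
    "pick": ["holding", "dropped"],
    "move": ["at-goal", "slipped"],
}

# All-outcome determinization: every outcome becomes a deterministic action
# in ONE classical domain -- the compilation used by standard replanners.
all_outcome = {f"{name}_{i}": eff
               for name, effs in nd_domain.items()
               for i, eff in enumerate(effs)}

# Single-outcome determinization: one classical domain per combination of
# chosen outcomes; an SP-style replanner can rank and alternate between them.
single_outcome = [dict(zip(nd_domain, combo))
                  for combo in product(*nd_domain.values())]

print(len(all_outcome), "actions in one domain;",
      len(single_outcome), "single-outcome domains")
```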
Despite remarkable success in a variety of applications, it is well-known that deep learning can fail catastrophically when presented with out-of-distribution data. Toward addressing this challenge, we consider the domain generalization problem, wherein predictors are trained using data drawn from a family of related training domains and then evaluated on a distinct and unseen test domain. We show that under a natural model of data generation and a concomitant invariance condition, the domain generalization problem is equivalent to an infinite-dimensional constrained statistical learning problem; this problem forms the basis of our approach, which we call Model-Based Domain Generalization. Due to the inherent challenges in solving constrained optimization problems in deep learning, we exploit nonconvex duality theory to develop unconstrained relaxations of this statistical problem with tight bounds on the duality gap. Based on this theoretical motivation, we propose a novel domain generalization algorithm with convergence guarantees. In our experiments, we report improvements of up to 30 percentage points over state-of-the-art domain generalization baselines on several benchmarks including ColoredMNIST, Camelyon17-WILDS, FMoW-WILDS, and PACS.
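As a hedged illustration of the kind of unconstrained Lagrangian relaxation referred to above (a toy scalar problem with assumed step sizes, not the paper's Model-Based Domain Generalization algorithm), the sketch below alternates primal descent with projected dual ascent:

```python
# Toy primal-dual relaxation: minimize f(x) = (x - 2)^2 subject to
# g(x) = x - 1 <= 0, via the Lagrangian L(x, lam) = f(x) + lam * g(x).
x, lam = 0.0, 0.0
eta_x, eta_lam = 0.05, 0.05
for _ in range(2000):
    grad_x = 2 * (x - 2) + lam                 # d/dx [f(x) + lam * g(x)]
    x -= eta_x * grad_x                        # primal descent step
    lam = max(0.0, lam + eta_lam * (x - 1))    # dual ascent, projected to lam >= 0
print(round(x, 3), round(lam, 3))              # approaches the constrained optimum x = 1, lam = 2
```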
Measuring an overall autonomy score for a robotic system requires combining a set of relevant aspects and features of the system that may be measured in different units, be qualitative, and/or be discordant. In this paper, we build upon an existing non-contextual autonomy framework that measures and combines the Autonomy Level and the Component Performance of a system into an overall autonomy score. We examine several methods of combining features, showing how some methods produce different rankings of the same data, and we employ the weighted product method to resolve this issue. Furthermore, we introduce the non-contextual autonomy coordinate and represent the overall autonomy of a system with an autonomy distance. We apply our method to a set of seven Unmanned Aerial Systems (UAS) and obtain their absolute autonomy score as well as their relative score with respect to the best system.
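The weighted product method itself is simple to state; the sketch below (with hypothetical feature values and weights) scores each system as the product of its feature values raised to the feature weights. Because rescaling any single feature multiplies every score by the same constant, the induced ranking is unit-independent:

```python
import numpy as np

# Weighted product method: score = prod_i (feature_i ** weight_i).
features = np.array([[0.8, 120.0, 3.0],   # hypothetical feature rows, one per UAS
                     [0.6, 150.0, 4.0],
                     [0.9,  90.0, 2.0]])
weights = np.array([0.5, 0.3, 0.2])       # assumed weights summing to 1

scores = np.prod(features ** weights, axis=1)
ranking = np.argsort(-scores)             # best system first
print(scores, ranking)
```

Changing, say, the second feature from minutes to seconds scales all three scores by the same factor, so the ranking is unaffected; a weighted sum would not have this property.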
Recent literature has established that neural networks can represent good policies across a range of stochastic dynamic models in supply chain and logistics. We propose a new algorithm that incorporates variance reduction techniques to overcome limitations of the algorithms typically employed in the literature to learn such neural network policies. For the classical lost sales inventory model, the algorithm learns neural network policies that are vastly superior to those learned using model-free algorithms, while outperforming the best heuristic benchmarks by an order of magnitude. The algorithm is an interesting candidate for other stochastic dynamic problems in supply chain and logistics, because the ideas in its development are generic.
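The abstract does not spell out which variance reduction techniques are used, so the sketch below illustrates one classic example from simulation-based inventory comparison, common random numbers, on a toy zero-lead-time lost-sales model; it should be read as background, not as the paper's algorithm:

```python
import numpy as np

# Common random numbers (CRN): evaluate both policies on the SAME demand
# sample path, so common noise cancels in the difference estimate.
rng = np.random.default_rng(0)

def cost(base_stock, demand, h=1.0, p=9.0):
    total = 0.0
    for d in demand:
        inv = base_stock                              # order-up-to, zero lead time (toy)
        sales = min(inv, d)
        total += h * (inv - sales) + p * (d - sales)  # holding + lost-sales penalty
    return total / len(demand)

demand = rng.poisson(5, size=10_000)        # one shared sample path
diff = cost(7, demand) - cost(8, demand)    # low-variance CRN difference estimate
print(diff)
```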
State-of-the-art probabilistic model checkers perform verification on explicit-state Markov models defined in a high-level programming formalism like the PRISM modeling language. Typically, the low-level models resulting from such program-like specifications exhibit substantial structure, such as repeating subpatterns. Established techniques like probabilistic bisimulation minimization are able to exploit these structures; however, they operate directly on the explicit-state model. On the other hand, methods that reduce structured state spaces by reasoning about the high-level program have received comparatively little attention. In this paper, we present a new, simple, and fully automatic program-level technique to reduce the underlying Markov model. Our approach aims at computing the summary behavior of adjacent locations in the program's control-flow graph, thereby obtaining a program with fewer "control states". This reduction is immediately reflected in the program's operational semantics, enabling more efficient model checking. A key insight is that, in principle, each (combination of) program variable(s) with finite domain can play the role of the program counter that defines the flow structure. Unlike most other reduction techniques, our approach is property-directed and naturally supports unspecified model parameters. Experiments demonstrate that our simple method yields state-space reductions of up to 80% on practically relevant benchmarks.
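One way to picture "computing the summary behavior of adjacent locations" is state elimination on a small Markov chain, as in the hedged sketch below (a generic elimination step, not the paper's program-level transformation):

```python
import numpy as np

# Eliminate an intermediate state k by redirecting its incoming probability
# mass along its outgoing distribution (Gaussian-elimination-style reduction).
P = np.array([[0.0, 0.6, 0.4],
              [0.0, 0.0, 1.0],    # s1: the location to be summarized away
              [0.0, 0.0, 1.0]])   # s2: absorbing

def eliminate(P, k):
    Q = P.copy()
    for i in range(P.shape[0]):
        if i != k:
            # reroute i -> k mass through k's outgoing row, renormalized
            # to account for any self-loop at k
            Q[i] += Q[i, k] * P[k] / max(1 - P[k, k], 1e-12)
            Q[i, k] = 0.0
    keep = [i for i in range(P.shape[0]) if i != k]
    return Q[np.ix_(keep, keep)]

print(eliminate(P, 1))   # reduced chain over {s0, s2}; s0 -> s2 with prob. 1
```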
In this paper, we consider the problem of allocating human operators in a system with multiple semi-autonomous robots. Each robot is required to perform an independent sequence of tasks, subject to a chance of failing and getting stuck in a fault state at every task. If and when required, a human operator can assist or teleoperate a robot. Conventional MDP techniques used to solve such problems face scalability issues due to the exponential growth of the state and action spaces with the number of robots and operators. In this paper, we derive conditions under which the operator allocation problem is indexable, enabling the use of the Whittle index heuristic. The conditions can be easily checked to verify indexability, and we show that they hold for a wide range of problems of interest. Our key insight is to leverage the structure of the value function of individual robots, resulting in conditions that can be verified separately for each state of each robot. We apply these conditions to two types of transitions commonly seen in remote robot supervision systems. Through numerical simulations, we demonstrate the efficacy of the Whittle index policy as a near-optimal and scalable approach that outperforms existing scalable methods.
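Once indexability holds and the Whittle indices are available, the allocation policy itself is simple: assign the operators to the robots whose current states carry the largest indices. The sketch below uses a made-up index table; computing real indices requires the conditions derived in the paper:

```python
# Whittle-index allocation sketch with a hypothetical per-state index table.
whittle_index = {"ok": 0.1, "slow": 0.7, "fault": 1.5}

def allocate(robot_states, m):
    """Return the indices of the m robots to assist, ranked by Whittle index."""
    ranked = sorted(range(len(robot_states)),
                    key=lambda i: whittle_index[robot_states[i]],
                    reverse=True)
    return ranked[:m]

print(allocate(["ok", "fault", "slow", "ok", "fault"], m=2))  # -> [1, 4]
```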
Imitation learning enables agents to reuse and adapt the hard-won expertise of others, offering a solution to several key challenges in learning behavior. Although it is easy to observe behavior in the real world, the underlying actions may not be accessible. We present a new method for imitation solely from observations that achieves comparable performance to experts on challenging continuous control tasks, while also exhibiting robustness in the presence of observations unrelated to the task. Our method, which we call FORM (for "Future Observation Reward Model"), is derived from an inverse RL objective and imitates using a model of expert behavior learned by generative modelling of the expert's observations, without needing ground-truth actions. We show that FORM performs comparably to a strong baseline IRL method (GAIL) on the DeepMind Control Suite benchmark, while outperforming GAIL in the presence of task-irrelevant features.
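A minimal sketch of our reading of a FORM-style reward (an assumed form, not the paper's exact objective): fit a density model to expert observation transitions and reward the agent when the expert model explains its next observation better than the agent's own model does:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Fit simple Gaussian models to observation deltas (o' - o); the data and the
# Gaussian model class are illustrative stand-ins for a learned generative model.
rng = np.random.default_rng(0)
expert_deltas = rng.normal(1.0, 0.1, size=(500, 2))   # hypothetical expert data
agent_deltas = rng.normal(0.0, 0.5, size=(500, 2))

p_E = multivariate_normal(expert_deltas.mean(0), np.cov(expert_deltas.T))
p_A = multivariate_normal(agent_deltas.mean(0), np.cov(agent_deltas.T))

def reward(o, o_next):
    d = o_next - o
    return p_E.logpdf(d) - p_A.logpdf(d)  # high when the expert model explains the step

print(reward(np.zeros(2), np.ones(2)))
```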
We study few-shot supervised domain adaptation (DA) for regression problems, where only a few labeled target domain data and many labeled source domain data are available. Many current DA methods base their transfer assumptions on either a parametrized distribution shift or apparent distribution similarities, e.g., identical conditionals or small distributional discrepancies. However, these assumptions may preclude the possibility of adaptation from intricately shifted and apparently very different distributions. To overcome this problem, we propose mechanism transfer, a meta-distributional scenario in which a data-generating mechanism is invariant among domains. This transfer assumption can accommodate nonparametric shifts resulting in apparently different distributions while providing a solid statistical basis for DA. We take the structural equations in causal modeling as an example and propose a novel DA method, which is shown to be useful both theoretically and experimentally. Our method can be seen as the first attempt to fully leverage structural causal models for DA.
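The invariance assumption can be illustrated with a toy structural causal model: two domains share the same structural equations but differ in their exogenous noise distributions, so the joint distributions look very different while the mechanism, here the slope relating x to y, is unchanged (an illustration of the assumption only, not the proposed DA method):

```python
import numpy as np

def mechanism(e1, e2):
    """Structural equations shared across domains."""
    x = e1
    y = 2.0 * x + e2
    return x, y

rng = np.random.default_rng(0)
src = mechanism(rng.normal(0, 1, 1000), rng.normal(0, 0.1, 1000))    # source noise
tgt = mechanism(rng.uniform(3, 5, 1000), rng.laplace(0, 0.3, 1000))  # shifted target noise

# Both fitted slopes are ~2.0: the mechanism is invariant even though the
# marginal and joint distributions of the two domains differ substantially.
print(np.polyfit(src[0], src[1], 1)[0], np.polyfit(tgt[0], tgt[1], 1)[0])
```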
This paper studies the domain division problem, which aims to segment instances drawn from different probabilistic distributions. Such a problem exists in many previous recognition tasks, such as Open Set Learning (OSL) and Generalized Zero-Shot Learning (G-ZSL), where the testing instances come from either seen or novel/unseen classes with different probabilistic distributions. Previous works focused either on calibrating only the confident predictions of classifiers for seen classes (W-SVM) or on treating unseen classes as outliers. In contrast, this paper proposes a probabilistic way of directly estimating and fine-tuning the decision boundary between seen and novel/unseen classes. In particular, we propose a domain division algorithm that learns to split the testing instances into known, unknown, and uncertain domains, and then conducts recognition tasks in each domain. Two statistical tools, namely bootstrapping and the Kolmogorov-Smirnov (K-S) test, are introduced for the first time to discover and fine-tune the decision boundary of each domain. Critically, the uncertain domain is newly introduced in our framework to accommodate those instances whose domain cannot be predicted confidently. Extensive experiments demonstrate that our approach achieves state-of-the-art performance on OSL and G-ZSL benchmarks.
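The two statistical tools are standard, so a small sketch can show how they might fit together: bootstrap the seen-class confidence scores to set a candidate threshold, then use a two-sample K-S test to check whether the scores retained as "known" still match the seen-class distribution. The score distributions and the decision rule are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
seen_scores = rng.beta(8, 2, size=1000)   # hypothetical confidences on seen classes
test_scores = np.concatenate([rng.beta(8, 2, 500),    # seen-like test instances
                              rng.beta(2, 8, 500)])   # unseen-like test instances

# Bootstrapping: resample the seen scores to get a stable low-quantile cutoff.
boot = rng.choice(seen_scores, size=(200, len(seen_scores)), replace=True)
threshold = np.mean([np.quantile(b, 0.05) for b in boot])

# K-S test: do the instances kept as "known" still look like seen-class scores?
known = test_scores[test_scores >= threshold]
stat, pval = ks_2samp(known, seen_scores)
print(round(threshold, 3), round(pval, 3))
```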
Learning robot objective functions from human input has become increasingly important, but state-of-the-art techniques assume that the human's desired objective lies within the robot's hypothesis space. When this is not true, even methods that keep track of uncertainty over the objective fail, because they reason about which hypothesis might be correct, not about whether any of the hypotheses are correct. We focus specifically on learning from physical human corrections during the robot's task execution, where not having a rich enough hypothesis space leads the robot to update its objective in ways that the person did not actually intend. We observe that such corrections appear irrelevant to the robot, because they are not the best way of achieving any of the candidate objectives. Instead of naively trusting and learning from every human interaction, we propose that robots learn conservatively by reasoning in real time about how relevant the human's correction is for the robot's hypothesis space. We test our inference method in an experiment with human interaction data, and demonstrate that this alleviates unintended learning in an in-person user study with a 7DoF robot manipulator.
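A hedged sketch of relevance-gated updating (the likelihood form and blending rule are our illustrative assumptions, not the paper's model): score how well the best hypothesis explains an observed correction, and shrink the belief update when even the best explanation is poor:

```python
import numpy as np

def update_beliefs(beliefs, hypotheses, correction, temp=1.0):
    # Likelihood of the correction under each candidate objective.
    liks = np.array([np.exp(-temp * np.linalg.norm(correction - h))
                     for h in hypotheses])
    relevance = liks.max()          # low when NO hypothesis explains the correction
    posterior = beliefs * liks
    posterior /= posterior.sum()
    # Conservative blend: an irrelevant correction barely moves the beliefs.
    return relevance * posterior + (1 - relevance) * beliefs

beliefs = np.array([0.5, 0.5])
hypotheses = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
print(update_beliefs(beliefs, hypotheses, np.array([5.0, 5.0])))  # barely changes
```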
Novel neural models have been proposed in recent years for learning under domain shift. Most models, however, are evaluated on only a single task, on proprietary datasets, or against weak baselines, which makes comparison of models difficult. In this paper, we re-evaluate classic general-purpose bootstrapping approaches in the context of neural networks under domain shift against recent neural approaches, and propose a novel multi-task tri-training method that reduces the time and space complexity of classic tri-training. Extensive experiments on two benchmarks are negative: while our novel method establishes a new state of the art for sentiment analysis, it does not consistently perform best. More importantly, we arrive at the somewhat surprising conclusion that classic tri-training, with some additions, outperforms the state of the art. We conclude that classic approaches constitute an important and strong baseline.
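For reference, classic tri-training is easy to state: train three classifiers on bootstrap samples and, in each round, pseudo-label an unlabeled point for one model whenever the other two agree on it. The sketch below omits the error-rate safeguards of the original algorithm:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils import resample

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X_l, y_l, X_u = X[:100], y[:100], X[100:]          # labeled pool + unlabeled pool

# Three classifiers, each trained on its own bootstrap sample of the labeled data.
models = [LogisticRegression().fit(*resample(X_l, y_l, random_state=s))
          for s in range(3)]

for _ in range(5):                                  # a few tri-training rounds
    preds = [m.predict(X_u) for m in models]
    for i in range(3):
        j, k = (i + 1) % 3, (i + 2) % 3
        agree = preds[j] == preds[k]                # the other two models agree
        if agree.any():
            Xi = np.vstack([X_l, X_u[agree]])
            yi = np.concatenate([y_l, preds[j][agree]])
            models[i] = LogisticRegression().fit(Xi, yi)

print(models[0].score(X[100:], y[100:]))            # accuracy on the unlabeled pool
```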