
It is well known that reinforcement learning can be cast as inference in an appropriate probabilistic model. However, this commonly involves introducing a distribution over agent trajectories with probabilities proportional to exponentiated rewards. In this work, we formulate reinforcement learning as Bayesian inference without resorting to rewards, and show that rewards are derived from the agent's preferences, rather than the other way around. We argue that agent preferences should be specified stochastically rather than deterministically. Reinforcement learning via inference with stochastic preferences naturally describes agent behaviors, does not require introducing rewards or exponential weighting of trajectories, and allows reasoning about agents using the solid foundation of Bayesian statistics. Stochastic conditioning, a probabilistic programming paradigm for conditioning models on distributions rather than values, is the formalism behind agents with probabilistic preferences. We demonstrate our approach in two case studies, a two-agent coordination game and a single agent acting in a noisy environment, showing that despite superficial differences, both cases can be modeled and reasoned about based on the same principles.
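For readers unfamiliar with the conventional construction that this abstract argues against, the following minimal sketch shows a distribution over trajectories whose probabilities are proportional to exponentiated rewards. The function name and the reward values are hypothetical and chosen only for illustration; this is not the paper's method.

```python
import math


def trajectory_distribution(rewards):
    """Return probabilities proportional to exp(reward), one per trajectory."""
    # Subtracting the maximum keeps exp() numerically stable and does not
    # change the normalized result.
    top = max(rewards)
    weights = [math.exp(r - top) for r in rewards]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical total rewards of three candidate trajectories.
rewards = [1.0, 2.0, 0.5]
probs = trajectory_distribution(rewards)
```

Under this weighting, a trajectory whose reward is higher by 1 is more likely by a factor of e, which is the exponential weighting the abstract says its formulation avoids.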

Related content

Inference


Separated · Stream · Optimizer · Input distribution · Approximation

November 30, 2021

Separating k-Player from t-Player One-Way Communication, with Applications to Data Streams

Elbert Du,Michael Mitzenmacher,David P. Woodruff,Guang Yang

from arxiv, Preliminary version appeared in ICALP 2019, submitted to ToC

In a $k$-party communication problem, the $k$ players with inputs $x_1, x_2, \ldots, x_k$, respectively, want to evaluate a function $f(x_1, x_2, \ldots, x_k)$ using as little communication as possible. We consider the message-passing model, in which the inputs are partitioned in an arbitrary, possibly worst-case manner, among a smaller number $t$ of players ($t<k$). The $t$-player communication cost of computing $f$ can only be smaller than the $k$-player communication cost, since the $t$ players can trivially simulate the $k$-player protocol. But how much smaller can it be? We study deterministic and randomized protocols in the one-way model, and provide separations for product input distributions, which are optimal for low error probability protocols. We also provide much stronger separations when the input distribution is non-product. A key application of our results is in proving lower bounds for data stream algorithms. In particular, we give an optimal $\Omega(\epsilon^{-2}\log(N) \log \log(mM))$ bits of space lower bound for the fundamental problem of $(1\pm\epsilon)$-approximating the number $\|x\|_0$ of non-zero entries of an $n$-dimensional vector $x$ after $m$ integer updates each of magnitude at most $M$, and with success probability $\ge 2/3$, in a strict turnstile stream. We additionally prove the matching $\Omega(\epsilon^{-2}\log(N) \log \log(T))$ space lower bound for the problem when we have access to a heavy hitters oracle with threshold $T$. Our results match the best known upper bounds when $\epsilon\ge 1/\operatorname{polylog}(mM)$ and when $T = 2^{\operatorname{poly}(1/\epsilon)}$, respectively. They also improve on the prior $\Omega(\epsilon^{-2}\log(mM))$ lower bound and separate the complexity of approximating $L_0$ from that of approximating the $p$-norm $L_p$ for $p$ bounded away from $0$, since the latter has an $O(\epsilon^{-2}\log (mM))$ bit upper bound.
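To make the streaming problem concrete, here is a small sketch of the quantity being approximated: the number of non-zero entries of a vector after a sequence of integer updates in a strict turnstile stream, where no coordinate ever becomes negative. This baseline stores the vector explicitly, so it uses space linear in the number of non-zero coordinates and is not a small-space streaming algorithm of the kind the lower bound concerns. The function name and the update sequence are hypothetical.

```python
from collections import defaultdict


def exact_l0(updates):
    """Apply (index, delta) updates and return the number of non-zero entries."""
    x = defaultdict(int)
    for index, delta in updates:
        x[index] += delta
        if x[index] < 0:
            # Strict turnstile streams keep every coordinate non-negative.
            raise ValueError("coordinate became negative")
        if x[index] == 0:
            # Drop zero entries so len(x) counts only non-zero coordinates.
            del x[index]
    return len(x)


# Hypothetical update sequence: coordinate 0 is incremented and then cancelled.
updates = [(0, 3), (5, 1), (0, -3), (2, 2), (5, 4)]
l0 = exact_l0(updates)
```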

Round · Path · Use Case · Continuity · INFORMS

November 30, 2021

A Novel Occupancy Mapping Framework for Risk-Aware Path Planning in Unstructured Environments

Johann Laconte,Abderrahim Kasmi,François Pomerleau,Roland Chapuis,Laurent Malaterre,Christophe Debain,Romuald Aufrère

from arxiv, Published in the Special Issue "Frontiers in Mobile Robot Navigation" of Sensors. //www.mdpi.com/1424-8220/21/22/7562

In the context of autonomous robots, one of the most important tasks is to prevent potential damage to the robot during navigation. For this purpose, it is often assumed that one must deal with known probabilistic obstacles, then compute the probability of collision with each obstacle. However, in complex scenarios or unstructured environments, it might be difficult to detect such obstacles. In these cases, a metric map is used, where each position stores the information of occupancy. The most common type of metric map is the Bayesian occupancy map. However, this type of map is not well suited for computing risk assessments for continuous paths due to its discrete nature. Hence, we introduce a novel type of map called the Lambda Field, which is specially designed for risk assessment. We first propose a way to compute such a map and the expectation of a generic risk over a path. Then, we demonstrate the benefits of our generic formulation with a use case defining the risk as the expected collision force over a path. Using this risk definition and the Lambda Field, we show that our framework is capable of doing classical path planning while having a physical-based metric. Furthermore, the Lambda Field gives a natural way to deal with unstructured environments, such as tall grass. Where standard environment representations would always generate trajectories going around such obstacles, our framework allows the robot to go through the grass while being aware of the risk taken.

Round · INTERACT · Robustness · Tolerance · Extensibility

November 30, 2021

A framework to measure the robustness of programs in the unpredictable environment

Valentina Castiglioni,Michele Loreti,Simone Tini

Due to the diffusion of IoT, modern software systems are often designed to control and coordinate smart devices in order to manage assets and resources, and to guarantee efficient behaviours. For this class of systems, which interact extensively with humans and with their environment, it is thus crucial to guarantee their correct behaviour in order to avoid unexpected and possibly dangerous situations. In this paper we present a framework that allows us to measure the robustness of systems, that is, the ability of a program to tolerate changes in the environmental conditions while preserving its original behaviour. In the proposed framework, the interaction of a program with its environment is represented as a sequence of random variables describing how both evolve in time. For this reason, the considered measures are defined over probability distributions of observed data. The proposed framework is then used to define the notions of adaptability and reliability. The former indicates the ability of a program to absorb perturbations of the environmental conditions after a given amount of time. The latter expresses the ability of a program to maintain its intended behaviour (up to some reasonable tolerance) despite the presence of perturbations in the environment. Moreover, an algorithm based on statistical inference is proposed to evaluate the proposed metric and the aforementioned properties. Throughout the paper, two case studies are used to describe and evaluate the proposed approach.

Atom (text editor) · Block · MoDELS · SPIN · CASE

November 30, 2021

TaDA Live: Compositional Reasoning for Termination of Fine-grained Concurrent Programs

Emanuele D'Osualdo,Azadeh Farzan,Philippa Gardner,Julian Sutherland

from arxiv, 84 pages, 131 pages including appendix

We present TaDA Live, a concurrent separation logic for reasoning compositionally about the termination of blocking fine-grained concurrent programs. The crucial challenge is how to deal with abstract atomic blocking: that is, abstract atomic operations that have blocking behaviour arising from busy-waiting patterns as found in, for example, fine-grained spin locks. Our fundamental innovation is the design of abstract specifications that capture this blocking behaviour as liveness assumptions on the environment. We design a logic that can reason about the termination of clients that use such operations without breaking their abstraction boundaries, and about the correctness of the implementations of the operations with respect to their abstract specifications. We introduce a novel semantic model using layered subjective obligations to express liveness invariants, and a proof system that is sound with respect to the model. The subtlety of our specifications and reasoning is illustrated using several case studies.

Separated · Mutually independent · Learning to hash · Generalization theory · Bucketing

November 29, 2021

A Separation Logic for Negative Dependence

Jialu Bao,Marco Gaboardi,Justin Hsu,Joseph Tassarotti

from arxiv, 61 pages, 9 figures, to appear in Proceedings of the ACM on Programming Languages (POPL 2022)

Formal reasoning about hashing-based probabilistic data structures often requires reasoning about random variables where when one variable gets larger (such as the number of elements hashed into one bucket), the others tend to be smaller (like the number of elements hashed into the other buckets). This is an example of negative dependence, a generalization of probabilistic independence that has recently found interesting applications in algorithm design and machine learning. Despite the usefulness of negative dependence for the analyses of probabilistic data structures, existing verification methods cannot establish this property for randomized programs. To fill this gap, we design LINA, a probabilistic separation logic for reasoning about negative dependence. Following recent works on probabilistic separation logic using separating conjunction to reason about the probabilistic independence of random variables, we use separating conjunction to reason about negative dependence. Our assertion logic features two separating conjunctions, one for independence and one for negative dependence. We generalize the logic of bunched implications (BI) to support multiple separating conjunctions, and provide a sound and complete proof system. Notably, the semantics for separating conjunction relies on a non-deterministic, rather than partial, operation for combining resources. By drawing on closure properties for negative dependence, our program logic supports a Frame-like rule for negative dependence and monotone operations. We demonstrate how LINA can verify probabilistic properties of hash-based data structures and balls-into-bins processes.
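The bucket example in this abstract can be checked numerically. The sketch below enumerates every way of throwing a few balls into a few bins uniformly at random and computes the exact covariance between the counts of two bins. A negative covariance is only one symptom of negative dependence, and this brute-force check is unrelated to how LINA establishes the property; the function name and the parameters are hypothetical.

```python
from fractions import Fraction
from itertools import product


def bin_count_covariance(n_balls, n_bins):
    """Exact covariance between the loads of bin 0 and bin 1."""
    outcomes = list(product(range(n_bins), repeat=n_balls))
    p = Fraction(1, len(outcomes))  # every assignment is equally likely
    e0 = e1 = e01 = Fraction(0)
    for assignment in outcomes:
        x0 = assignment.count(0)  # number of balls in bin 0
        x1 = assignment.count(1)  # number of balls in bin 1
        e0 += p * x0
        e1 += p * x1
        e01 += p * x0 * x1
    return e01 - e0 * e1


# 4 balls into 3 bins: 81 equally likely assignments.
cov = bin_count_covariance(4, 3)
```

For uniform bins the covariance equals $-n/k^2$ for $n$ balls and $k$ bins, so here it is $-4/9$: when one bucket is fuller, the other tends to be emptier.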

Optimizer · Robustness · Linear · Constraint · Scenario

November 28, 2021

Distributionally robust possibilistic optimization problems

Romain Guillaume,Adam Kasperski,Pawel Zielinski

In this paper, a class of optimization problems with uncertain linear constraints is discussed. It is assumed that the constraint coefficients are random vectors whose probability distributions are only partially known. Possibility theory is used to model the imprecise probabilities. In one of the interpretations, a possibility distribution (a membership function of a fuzzy set) in the set of coefficient realizations induces a necessity measure, which in turn defines a family of probability distributions in this set. The distributionally robust approach is then used to transform the imprecise constraints into deterministic counterparts. Namely, the uncertain left-hand side of each constraint is replaced with the expected value with respect to the worst probability distribution that can occur. It is shown how to represent the resulting problem by using linear or second-order cone constraints. This leads to problems which are computationally tractable for a wide class of optimization models, in particular for linear programming.

Reducible · Statistic · Inference

November 27, 2021

Is Causal Reasoning Harder than Probabilistic Reasoning?

Milan Mossé,Duligur Ibeling,Thomas Icard

Many tasks in statistical and causal inference can be construed as problems of entailment in a suitable formal language. We ask whether those problems are more difficult, from a computational perspective, for causal probabilistic languages than for pure probabilistic (or "associational") languages. Despite several senses in which causal reasoning is indeed more complex, both expressively and inferentially, we show that causal entailment (or satisfiability) problems can be systematically and robustly reduced to purely probabilistic problems. Thus there is no jump in computational complexity. Along the way we answer several open problems concerning the complexity of well-known probability logics, in particular demonstrating the $\exists\mathbb{R}$-completeness of a polynomial probability calculus, as well as a seemingly much simpler system, the logic of comparative conditional probability.

Better · MoDELS · RNN · Bayesian inference · Hidden state

June 10, 2021

RNN with Particle Flow for Probabilistic Spatio-temporal Forecasting

Soumyasundar Pal,Liheng Ma,Yingxue Zhang,Mark Coates

from arxiv, ICML 2021

Spatio-temporal forecasting has numerous applications in analyzing wireless, traffic, and financial networks. Many classical statistical models fall short in handling the complexity and high non-linearity present in time-series data. Recent advances in deep learning allow for better modelling of spatial and temporal dependencies. While most of these models focus on obtaining accurate point forecasts, they do not characterize the prediction uncertainty. In this work, we consider the time-series data as a random realization from a nonlinear state-space model and target Bayesian inference of the hidden states for probabilistic forecasting. We use particle flow as the tool for approximating the posterior distribution of the states, as it is shown to be highly effective in complex, high-dimensional settings. Thorough experimentation on several real-world time-series datasets demonstrates that our approach provides better characterization of uncertainty while maintaining comparable accuracy to the state-of-the-art point forecasting methods.

Probabilistic graphical model · Graph · Inference · GM · Belief propagation

May 25, 2018

Inference in Probabilistic Graphical Models by Graph Neural Networks

KiJung Yoon,Renjie Liao,Yuwen Xiong,Lisa Zhang,Ethan Fetaya,Raquel Urtasun,Richard Zemel,Xaq Pitkow

A fundamental computation for statistical inference and accurate decision-making is to compute the marginal probabilities or most probable states of task-relevant variables. Probabilistic graphical models can efficiently represent the structure of such complex data, but performing these inferences is generally difficult. Message-passing algorithms, such as belief propagation, are a natural way to disseminate evidence amongst correlated variables while exploiting the graph structure, but these algorithms can struggle when the conditional dependency graphs contain loops. Here we use Graph Neural Networks (GNNs) to learn a message-passing algorithm that solves these inference tasks. We first show that the architecture of GNNs is well-matched to inference tasks. We then demonstrate the efficacy of this inference approach by training GNNs on a collection of graphical models and showing that they substantially outperform belief propagation on loopy graphs. Our message-passing algorithms generalize out of the training set to larger graphs and graphs with different structure.
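As background for the baseline mentioned in this abstract, the sketch below runs sum-product belief propagation on a three-variable binary chain and compares the resulting marginals with brute-force enumeration. On a loop-free graph such as this chain the two agree exactly; the loopy graphs on which the abstract reports GNNs outperforming belief propagation are not covered here. The potentials and function names are hypothetical, and nothing below reproduces the paper's GNN.

```python
from itertools import product

# Hypothetical potentials for a chain x0 - x1 - x2 of binary variables.
unary = [[1.0, 2.0], [1.0, 1.0], [3.0, 1.0]]
pair = [[2.0, 1.0], [1.0, 2.0]]  # shared edge potential that favors agreement


def normalize(values):
    total = sum(values)
    return [v / total for v in values]


def bp_marginals(unary, pair):
    """Sum-product message passing on a chain; exact because there are no loops."""
    n = len(unary)
    fwd = [[1.0, 1.0] for _ in range(n)]  # message reaching node i from the left
    bwd = [[1.0, 1.0] for _ in range(n)]  # message reaching node i from the right
    for i in range(1, n):
        fwd[i] = [sum(unary[i - 1][a] * fwd[i - 1][a] * pair[a][b] for a in range(2))
                  for b in range(2)]
    for i in range(n - 2, -1, -1):
        bwd[i] = [sum(unary[i + 1][b] * bwd[i + 1][b] * pair[a][b] for b in range(2))
                  for a in range(2)]
    return [normalize([unary[i][s] * fwd[i][s] * bwd[i][s] for s in range(2)])
            for i in range(n)]


def brute_force_marginals(unary, pair):
    """Marginals obtained by summing the unnormalized joint over all states."""
    n = len(unary)
    marg = [[0.0, 0.0] for _ in range(n)]
    for xs in product(range(2), repeat=n):
        weight = 1.0
        for i, x in enumerate(xs):
            weight *= unary[i][x]
        for i in range(n - 1):
            weight *= pair[xs[i]][xs[i + 1]]
        for i, x in enumerate(xs):
            marg[i][x] += weight
    return [normalize(m) for m in marg]


bp = bp_marginals(unary, pair)
exact = brute_force_marginals(unary, pair)
```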

INTERACT · Graph · MoDELS · Estimation/Estimator · state-of-the-art

March 29, 2018

A Causal And-Or Graph Model for Visibility Fluent Reasoning in Tracking Interacting Objects

Yuanlu Xu,Lei Qin,Xiaobai Liu,Jianwen Xie,Song-Chun Zhu

from arxiv, accepted by CVPR 2018

Tracking humans that are interacting with other subjects or the environment remains unsolved in visual tracking, because the visibility of the humans of interest in videos is unknown and might vary over time. In particular, it is still difficult for state-of-the-art human trackers to recover complete human trajectories in crowded scenes with frequent human interactions. In this work, we consider the visibility status of a subject as a fluent variable, whose change is mostly attributed to the subject's interaction with its surroundings, e.g., crossing behind another object, entering a building, or getting into a vehicle. We introduce a Causal And-Or Graph (C-AOG) to represent the cause-effect relations between an object's visibility fluent and its activities, and develop a probabilistic graph model to jointly reason about visibility fluent changes (e.g., from visible to invisible) and track humans in videos. We formulate this joint task as an iterative search for a feasible causal graph structure, which enables fast search algorithms such as dynamic programming. We apply the proposed method to challenging video sequences to evaluate its ability to estimate visibility fluent changes of subjects and to track subjects of interest over time. Comparative results demonstrate that our method outperforms the alternative trackers and can recover complete trajectories of humans in complicated scenarios with frequent human interactions.