
Markov process · Processing (programming language) · Bayesian inference · Estimation/estimator · Monte Carlo

28 September 2021

Perturbation theory for killed Markov processes and quasi-stationary distributions

Daniel Rudolf,Andi Q. Wang

from arxiv, 34 pages, 3 figures

We investigate the stability of quasi-stationary distributions of killed Markov processes to perturbations of the generator. In the first setting, we consider a general bounded self-adjoint perturbation operator, and after that, study a particular unbounded perturbation corresponding to the truncation of the killing rate. In both scenarios, we quantify the difference between eigenfunctions of the smallest eigenvalue of the perturbed and unperturbed generator in a Hilbert space norm. As a consequence, $\mathcal{L}^1$-norm estimates of the difference of the resulting quasi-stationary distributions in terms of the perturbation are provided. These results are particularly relevant to the recently-proposed class of quasi-stationary Monte Carlo methods, designed for scalable exact Bayesian inference.

Related content

Markov process

Kernelization · Point estimation · Kernel function · Estimation/estimator · Processing (programming language)

19 November 2021

Marginalised Gaussian Processes with Nested Sampling

Fergus Simpson,Vidhi Lalchand,Carl Edward Rasmussen

from arxiv, To appear in Neural Information Processing Systems (NeurIPS) 2021

Gaussian process (GP) models are a rich distribution over functions with inductive biases controlled by a kernel function. Learning occurs through the optimisation of kernel hyperparameters using the marginal likelihood as the objective. This classical approach, known as Type-II maximum likelihood (ML-II), yields point estimates of the hyperparameters, and continues to be the default method for training GPs. However, this approach risks underestimating predictive uncertainty and is prone to overfitting, especially when there are many hyperparameters. Furthermore, gradient-based optimisation makes ML-II point estimates highly susceptible to the presence of local minima. This work presents an alternative learning procedure where the hyperparameters of the kernel function are marginalised using Nested Sampling (NS), a technique that is well suited to sample from complex, multi-modal distributions. We focus on regression tasks with the spectral mixture (SM) class of kernels and find that a principled approach to quantifying model uncertainty leads to substantial gains in predictive performance across a range of synthetic and benchmark data sets. In this context, nested sampling is also found to offer a speed advantage over Hamiltonian Monte Carlo (HMC), widely considered to be the gold standard in MCMC-based inference.

MoDELS · Automator · Learned · Threshold · Prediction accuracy

18 November 2021

Differentiable Learning Under Triage

Nastaran Okati,Abir De,Manuel Gomez-Rodriguez

from arxiv, This version fixes a bug in the implementation of the baseline "Surrogate-based triage". Figure 4, the discussion of the results in Section 6, and the description of the baseline "Surrogate-based triage" in Appendix C have been updated

Multiple lines of evidence suggest that predictive models may benefit from algorithmic triage. Under algorithmic triage, a predictive model does not predict all instances but instead defers some of them to human experts. However, the interplay between the prediction accuracy of the model and the human experts under algorithmic triage is not well understood. In this work, we start by formally characterizing under which circumstances a predictive model may benefit from algorithmic triage. In doing so, we also demonstrate that models trained for full automation may be suboptimal under triage. Then, given any model and desired level of triage, we show that the optimal triage policy is a deterministic threshold rule in which triage decisions are derived deterministically by thresholding the difference between the model and human errors on a per-instance level. Building upon these results, we introduce a practical gradient-based algorithm that is guaranteed to find a sequence of triage policies and predictive models of increasing performance. Experiments on a wide variety of supervised learning tasks using synthetic and real data from two important applications -- content moderation and scientific discovery -- illustrate our theoretical results and show that the models and triage policies provided by our gradient-based algorithm outperform those provided by several competitive baselines.
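The threshold rule described in this abstract can be illustrated with a small sketch. It assumes per-instance model and human losses are already available as arrays and that the triage level is given as a maximum number of deferred instances; the function name `threshold_triage` and the `budget` parameter are hypothetical and not taken from the paper, whose exact policy may differ in detail.

```python
import numpy as np

def threshold_triage(model_loss, human_loss, budget):
    """Return a boolean mask that is True where an instance is deferred to the human.

    Hypothetical helper: defers at most `budget` instances, picking those where
    the model's loss exceeds the human's loss by the largest margin.
    """
    model_loss = np.asarray(model_loss, dtype=float)
    human_loss = np.asarray(human_loss, dtype=float)
    gap = model_loss - human_loss           # positive gap: the human does better
    order = np.argsort(-gap, kind="stable")  # largest gap first
    top = order[:budget]
    top = top[gap[top] > 0]                  # never defer when the model is at least as good
    defer = np.zeros(gap.shape, dtype=bool)
    defer[top] = True
    return defer

mask = threshold_triage([0.9, 0.1, 0.5, 0.7], [0.2, 0.3, 0.6, 0.1], budget=2)
print(mask.tolist())  # → [True, False, False, True]
```

Selecting the largest gaps is equivalent to thresholding the gap at a data-dependent value, which is one plausible reading of the "deterministic threshold rule" above.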

Functional · Discretization · Statistic · Machine learning · Computational learning theory

18 November 2021

The Fourier Discrepancy Function

Auricchio Gennaro,Codegoni Andrea,Gualandi Stefano,Zambon Lorenzo

from arxiv, 8 pages, 1 figure

In this paper, we propose the Fourier Discrepancy Function, a new discrepancy to compare discrete probability measures. We show that this discrepancy takes into account the geometry of the underlying space. We prove that the Fourier Discrepancy is convex, twice differentiable, and that its gradient has an explicit formula. We also provide a compelling statistical interpretation. Finally, we study tight lower and upper bounds for the Fourier Discrepancy in terms of the Total Variation distance.

Markov chain Monte Carlo · Example · Markov chain · INTERACT · Pairwise

17 November 2021

Random Persistence Diagram Generation

Farzana Nasrin,Theodore Papamarkou,Na Gong,Orlando Rios,Vasileios Maroulas

from arxiv, 29 pages, 7 figures and 7 tables

Topological data analysis (TDA) studies the shape patterns of data. Persistent homology (PH) is a widely used method in TDA that summarizes homological features of data at multiple scales and stores them in persistence diagrams (PDs). In this paper, we propose a random persistence diagram generation (RPDG) method that generates a sequence of random PDs from the ones produced by the data. RPDG is underpinned by (i) a model based on pairwise interacting point processes for inference of persistence diagrams, and (ii) by a reversible jump Markov chain Monte Carlo (RJ-MCMC) algorithm for generating samples of PDs. A first example, which is based on a synthetic dataset, demonstrates the efficacy of RPDG and provides a detailed comparison with other existing methods for sampling PDs. A second example demonstrates the utility of RPDG to solve a materials science problem given a real dataset of small sample size.

Training data · Score · Inference · MoDELS · Overestimation

17 November 2021

Do Not Trust Prediction Scores for Membership Inference Attacks

Dominik Hintersdorf,Lukas Struppek,Kristian Kersting

from arxiv, 15 pages, 9 figures, 9 tables

Membership inference attacks (MIAs) aim to determine whether a specific sample was used to train a predictive model. Knowing this may indeed lead to a privacy breach. Arguably, most MIAs, however, make use of the model's prediction scores - the probability of each output given some input - following the intuition that the trained model tends to behave differently on its training data. We argue that this is a fallacy for many modern deep network architectures, e.g., ReLU type neural networks produce almost always high prediction scores far away from the training data. Consequently, MIAs will miserably fail since this behavior leads to high false-positive rates not only on known domains but also on out-of-distribution data and implicitly acts as a defense against MIAs. Specifically, using generative adversarial networks, we are able to produce a potentially infinite number of samples falsely classified as part of the training data. In other words, the threat of MIAs is overestimated and less information is leaked than previously assumed. Moreover, there is actually a trade-off between the overconfidence of classifiers and their susceptibility to MIAs: the more classifiers know when they do not know, making low confidence predictions far away from the training data, the more they reveal the training data.

Approximation · Value function · Processing (programming language) · Extensibility · Bellman optimality equation

16 November 2021

Kernel-based Diffusion Approximated Markov Decision Processes for Off-Road Autonomous Navigation and Control

Junhong Xu,Kai Yin,Zheng Chen,Jason M. Gregory,Ethan A. Stump,Lantao Liu

from arxiv, arXiv admin note: substantial text overlap with arXiv:2006.02008

We propose a diffusion approximation method to the continuous-state Markov Decision Processes (MDPs) that can be utilized to address autonomous navigation and control in unstructured off-road environments. In contrast to most decision-theoretic planning frameworks that assume fully known state transition models, we design a method that eliminates such a strong assumption that is often extremely difficult to engineer in reality. We first take the second-order Taylor expansion of the value function. The Bellman optimality equation is then approximated by a partial differential equation, which only relies on the first and second moments of the transition model. By combining the kernel representation of the value function, we then design an efficient policy iteration algorithm whose policy evaluation step can be represented as a linear system of equations characterized by a finite set of supporting states. We first validate the proposed method through extensive simulations in $2D$ obstacle avoidance and $2.5D$ terrain navigation problems. The results show that the proposed approach leads to a much superior performance over several baselines. We then develop a system that integrates our decision-making framework with onboard perception and conduct real-world experiments in both cluttered indoor and unstructured outdoor environments. The results from the physical systems further demonstrate the applicability of our method in challenging real-world environments.
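The expansion step mentioned in this abstract can be written out as a short sketch. The notation is assumed rather than taken from the paper: $V$ is the value function, $R$ the reward, $\gamma$ the discount factor, and $\mu_a(s) = \mathbb{E}[s' - s \mid s, a]$ and $\Sigma_a(s) = \mathbb{E}[(s' - s)(s' - s)^\top \mid s, a]$ are the first and second moments of the state displacement under action $a$.

$$
V(s') \approx V(s) + \nabla V(s)^\top (s' - s) + \tfrac{1}{2}\,(s' - s)^\top \nabla^2 V(s)\,(s' - s)
$$

Taking the expectation over $s'$ leaves only the two moments:

$$
\mathbb{E}\left[V(s') \mid s, a\right] \approx V(s) + \mu_a(s)^\top \nabla V(s) + \tfrac{1}{2}\operatorname{tr}\!\left(\Sigma_a(s)\,\nabla^2 V(s)\right)
$$

Substituting this into the Bellman optimality equation $V(s) = \max_a \{ R(s, a) + \gamma\,\mathbb{E}[V(s') \mid s, a] \}$ gives a second-order partial differential equation in $V$, which is consistent with the statement that only the first and second moments of the transition model are needed.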

Discretization · MoDELS · Sample · Likelihood · Approximation

6 June 2021

Oops I Took A Gradient: Scalable Sampling for Discrete Distributions

Will Grathwohl,Kevin Swersky,Milad Hashemi,David Duvenaud,Chris J. Maddison

from arxiv, Energy-Based Models, Deep generative models, MCMC sampling

We propose a general and scalable approximate sampling strategy for probabilistic models with discrete variables. Our approach uses gradients of the likelihood function with respect to its discrete inputs to propose updates in a Metropolis-Hastings sampler. We show empirically that this approach outperforms generic samplers in a number of difficult settings including Ising models, Potts models, restricted Boltzmann machines, and factorial hidden Markov models. We also demonstrate the use of our improved sampler for training deep energy-based models on high dimensional discrete data. This approach outperforms variational auto-encoders and existing energy-based models. Finally, we give bounds showing that our approach is near-optimal in the class of samplers which propose local updates.

Robustness · Neural Networks · Optimizer · Networking · CIFAR-10

3 December 2020

Attribute-Guided Adversarial Training for Robustness to Natural Perturbations

Tejas Gokhale,Rushil Anirudh,Bhavya Kailkhura,Jayaraman J. Thiagarajan,Chitta Baral,Yezhou Yang

from arxiv, Accepted to AAAI 2021. Preprint

While existing work in robust deep learning has focused on small pixel-level $\ell_p$ norm-based perturbations, this may not account for perturbations encountered in several real world settings. In many such cases although test data might not be available, broad specifications about the types of perturbations (such as an unknown degree of rotation) may be known. We consider a setup where robustness is expected over an unseen test domain that is not i.i.d. but deviates from the training domain. While this deviation may not be exactly known, its broad characterization is specified a priori, in terms of attributes. We propose an adversarial training approach which learns to generate new samples so as to maximize exposure of the classifier to the attributes-space, without having access to the data from the test domain. Our adversarial training solves a min-max optimization problem, with the inner maximization generating adversarial perturbations, and the outer minimization finding model parameters by optimizing the loss on adversarial perturbations generated from the inner maximization. We demonstrate the applicability of our approach on three types of naturally occurring perturbations -- object-related shifts, geometric transformations, and common image corruptions. Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations. We demonstrate the usefulness of the proposed approach by showing the robustness gains of deep neural networks trained using our adversarial training on MNIST, CIFAR-10, and a new variant of the CLEVR dataset.

Smoothing · Attention mechanism · Backpropagation · Viterbi algorithm · Regularization term

20 February 2018

Differentiable Dynamic Programming for Structured Prediction and Attention

Arthur Mensch,Mathieu Blondel

Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows us to relax both the optimal value and solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.
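As a rough illustration of smoothing the max operator, the sketch below assumes negative entropy as the regularizer, in which case the smoothed max becomes a scaled log-sum-exp and its gradient a softmax. The function names, the `gamma` parameter, and the chain-structured scoring layout are assumptions made for this example, not the authors' implementation.

```python
import numpy as np

def smoothed_max(x, gamma=1.0):
    """Entropy-regularised max: gamma * log(sum(exp(x / gamma))).

    Returns the smoothed value and its gradient (a softmax over x / gamma).
    """
    x = np.asarray(x, dtype=float)
    m = x.max()                      # subtract the max for numerical stability
    e = np.exp((x - m) / gamma)
    s = e.sum()
    return m + gamma * np.log(s), e / s

def smoothed_viterbi_value(unary, pairwise, gamma=1.0):
    """Smoothed best-path score on a chain.

    unary:    (T, S) array of per-step state scores.
    pairwise: (S, S) array, pairwise[i, j] is the score of moving from state i to j.
    """
    unary = np.asarray(unary, dtype=float)
    pairwise = np.asarray(pairwise, dtype=float)
    T, S = unary.shape
    v = unary[0].copy()
    for t in range(1, T):
        # Replace the hard max of the Viterbi recursion with the smoothed max.
        v = np.array([smoothed_max(v + pairwise[:, j], gamma)[0] for j in range(S)]) + unary[t]
    return smoothed_max(v, gamma)[0]
```

With a small `gamma` the result approaches the ordinary Viterbi score, while larger values give a smoother, differentiable relaxation; with `gamma=1.0` the recursion computes the log-partition function of the chain.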

Discretization · Markov chain Monte Carlo · Latent · Exchangeable · Topic model

15 January 2018

Latent nested nonparametric priors

Federico Camerlenghi,David B. Dunson,Antonio Lijoi,Igor Prünster,Abel Rodríguez

Discrete random structures are important tools in Bayesian nonparametrics and the resulting models have proven effective in density estimation, clustering, topic modeling and prediction, among others. In this paper, we consider nested processes and study the dependence structures they induce. Dependence ranges between homogeneity, corresponding to full exchangeability, and maximum heterogeneity, corresponding to (unconditional) independence across samples. The popular nested Dirichlet process is shown to degenerate to the fully exchangeable case when there are ties across samples at the observed or latent level. To overcome this drawback, inherent to nesting general discrete random measures, we introduce a novel class of latent nested processes. These are obtained by adding common and group-specific completely random measures and, then, normalising to yield dependent random probability measures. We provide results on the partition distributions induced by latent nested processes, and develop a Markov chain Monte Carlo sampler for Bayesian inferences. A test for distributional homogeneity across groups is obtained as a by-product. The results and their inferential implications are showcased on synthetic and real data.
