Bayesian phylogenetic inference is often conducted via local or sequential search over topologies and branch lengths using algorithms such as random-walk Markov chain Monte Carlo (MCMC) or Combinatorial Sequential Monte Carlo (CSMC). However, when MCMC is used for evolutionary parameter learning, convergence requires long runs with inefficient exploration of the state space. We introduce Variational Combinatorial Sequential Monte Carlo (VCSMC), a powerful framework that establishes variational sequential search to learn distributions over intricate combinatorial structures. We then develop nested CSMC, an efficient proposal distribution for CSMC, and prove that nested CSMC is an exact approximation to the (intractable) locally optimal proposal. We use nested CSMC to define a second objective, VNCSMC, which yields tighter lower bounds than VCSMC. We show that VCSMC and VNCSMC are computationally efficient and explore higher probability spaces than existing methods on a range of tasks.

Related content

Monte Carlo

Estimation/Estimator · MoDELS · Model identifiability · Inference · Identifiable

August 18, 2021

Casual Inference using Deep Bayesian Dynamic Survival Model (CDS)

Jie Zhu,Blanca Gallego

Causal inference in longitudinal observational health data often requires the accurate estimation of treatment effects on time-to-event outcomes in the presence of time-varying covariates. To tackle this sequential treatment effect estimation problem, we have developed a causal dynamic survival (CDS) model that uses the potential outcomes framework with recurrent sub-networks and random seed ensembles to estimate the difference in survival curves and its confidence interval. Using simulated survival datasets, the CDS model has shown good causal effect estimation performance across scenarios of sample dimension, event rate, confounding and overlapping. However, increasing the sample size does not effectively alleviate the adverse impact of a high level of confounding. In two large clinical cohort studies, our model identified the expected conditional average treatment effect and detected individual effect heterogeneity over time and patient subgroups. CDS provides individualised absolute treatment effect estimations to improve clinical decisions.

TOOLS · Functional · Extensibility · Mean · Sample mean

August 18, 2021

Amplitude Mean of Functional Data on $\mathbb{S}^2$

Zhengwu Zhang,Bayan Saparbayeva

Manifold-valued functional data analysis (FDA) has recently become an active area of research, motivated by the rising availability of trajectories or longitudinal data observed on non-linear manifolds. The challenges of analyzing such data come from many aspects, including infinite dimensionality and nonlinearity, as well as time-domain or phase variability. In this paper, we study the amplitude part of manifold-valued functions on $\mathbb{S}^2$, which is invariant to random time warping or re-parameterization. Utilizing the nice geometry of $\mathbb{S}^2$, we develop a set of efficient and accurate tools for temporal alignment of functions, geodesic computing, and sample mean calculation. At their heart, these tools rely on gradient descent algorithms with carefully derived gradients. We show the advantages of these newly developed tools over their competitors with extensive simulations and real data, and demonstrate the importance of considering the amplitude part of functions instead of mixing it with phase variability in manifold-valued FDA.

Conformer · Neural Networks · Networking · Forward · Example

August 17, 2021

Deep neural network methods for solving forward and inverse problems of time fractional diffusion equations with conformable derivative

Yinlin Ye,Yajing Li,Hongtao Fan,Xinyi Liu,Hongbing Zhang

Physics-informed neural networks (PINNs) show great advantages in solving partial differential equations. In this paper, we propose, for the first time, to study conformable time fractional diffusion equations by using PINNs. By solving the supervised learning task, we design a new spatio-temporal function approximator with high data efficiency. The L-BFGS algorithm is used to optimize our loss function, and the back propagation algorithm is used to update our parameters to give our numerical solutions. For the forward problem, we can take IC/BCs as the data, and use PINN to solve the corresponding partial differential equation. Three numerical examples are carried out to demonstrate the effectiveness of our methods. In particular, when the order of the conformable fractional derivative $\alpha$ tends to $1$, a class of weighted PINNs is introduced to overcome the accuracy degradation caused by the singularity of solutions. For the inverse problem, we use the data obtained to train the neural network, and the estimation of parameter $\lambda$ in the equation is elaborated. Similarly, we give three numerical examples to show that our method can accurately identify the parameters, even if the training data is corrupted with 1\% uncorrelated noise.

Linear · Weight · Approximation · Separable

August 15, 2021

Approximate MDS Property of Linear Codes

Ghurumuruhan Ganesan

from arxiv, Accepted for publication in European Conference on Combinatorics, Graph Theory and Applications (EUROCOMB 2021)

In this paper, we study the weight spectrum of linear codes with \emph{super-linear} field size and use the probabilistic method to show that for nearly all such codes, the corresponding weight spectrum is very close to that of a maximum distance separable (MDS) code.

Neural Networks · Networking · Reducible · Continuity · Inference

June 21, 2021

A Survey of Quantization Methods for Efficient Neural Network Inference

Amir Gholami,Sehoon Kim,Zhen Dong,Zhewei Yao,Michael W. Mahoney,Kurt Keutzer

from arxiv, Book Chapter: Low-Power Computer Vision: Improving the Efficiency of Artificial Intelligence

As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have given an intelligent organization to ease the evaluation of future research in this area.
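To make the quantization question above concrete, here is a minimal sketch of uniform (affine) quantization, one simple way to distribute real values over a fixed discrete set. It is an illustration only and not a method taken from the survey; the function names `quantize_uniform` and `dequantize_uniform` and the choice of a single shared scale and offset are assumptions.

```python
def quantize_uniform(values, num_bits=4):
    """Map floats to integer codes in [0, 2**num_bits - 1].

    Illustrative assumption: one scale and one offset are shared by all values.
    """
    lo, hi = min(values), max(values)
    levels = 2 ** num_bits - 1
    # Guard against a zero range when all values are equal.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize_uniform(codes, scale, lo):
    """Recover approximate floats from the integer codes."""
    return [lo + c * scale for c in codes]


if __name__ == "__main__":
    weights = [-1.0, -0.3, 0.0, 0.42, 1.0]
    codes, scale, lo = quantize_uniform(weights, num_bits=4)
    print(codes)
    print(dequantize_uniform(codes, scale, lo))
```

In this sketch each reconstructed value differs from the original by at most half of `scale`, so fewer bits (a larger `scale`) trade accuracy for a smaller representation.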

Optimizer · Reward function · Robustness · Learning · Functional

June 11, 2021

Policy Gradient Bayesian Robust Optimization for Imitation Learning

Zaynah Javed,Daniel S. Brown,Satvik Sharma,Jerry Zhu,Ashwin Balakrishna,Marek Petrik,Anca D. Dragan,Ken Goldberg

from arxiv, In proceedings International Conference on Machine Learning (ICML) 2021

The difficulty in specifying rewards for many real-world problems has led to an increased focus on learning rewards from human feedback, such as demonstrations. However, there are often many different reward functions that explain the human feedback, leaving agents with uncertainty over what the true reward function is. While most policy optimization approaches handle this uncertainty by optimizing for expected performance, many applications demand risk-averse behavior. We derive a novel policy gradient-style robust optimization approach, PG-BROIL, that optimizes a soft-robust objective that balances expected performance and risk. To the best of our knowledge, PG-BROIL is the first policy optimization algorithm robust to a distribution of reward hypotheses which can scale to continuous MDPs. Results suggest that PG-BROIL can produce a family of behaviors ranging from risk-neutral to risk-averse and outperforms state-of-the-art imitation learning algorithms when learning from ambiguous demonstrations by hedging against uncertainty, rather than seeking to uniquely identify the demonstrator's reward function.

Optimizer · Reinforcement learning · Learning · state-of-the-art · SimPLe

July 25, 2018

Variational Bayesian Reinforcement Learning with Regret Bounds

Brendan O'Donoghue

We consider the exploration-exploitation trade-off in reinforcement learning and we show that an agent imbued with a risk-seeking utility function is able to explore efficiently, as measured by regret. The parameter that controls how risk-seeking the agent is can be optimized exactly, or annealed according to a schedule. We call the resulting algorithm K-learning and show that the corresponding K-values are optimistic for the expected Q-values at each state-action pair. The K-values induce a natural Boltzmann exploration policy for which the `temperature' parameter is equal to the risk-seeking parameter. This policy achieves an expected regret bound of $\tilde O(L^{3/2} \sqrt{S A T})$, where $L$ is the time horizon, $S$ is the number of states, $A$ is the number of actions, and $T$ is the total number of elapsed time-steps. This bound is only a factor of $L$ larger than the established lower bound. K-learning can be interpreted as mirror descent in the policy space, and it is similar to other well-known methods in the literature, including Q-learning, soft-Q-learning, and maximum entropy policy gradient, and is closely related to optimism and count based exploration methods. K-learning is simple to implement, as it only requires adding a bonus to the reward at each state-action and then solving a Bellman equation. We conclude with a numerical example demonstrating that K-learning is competitive with other state-of-the-art algorithms in practice.
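The Boltzmann exploration policy mentioned above can be sketched as a softmax over per-action values divided by a temperature. This is a generic illustration, not the paper's implementation: the function name `boltzmann_policy` and the example values are hypothetical, and how the K-values themselves are computed is not shown.

```python
import math


def boltzmann_policy(k_values, temperature):
    """Return action probabilities proportional to exp(K(s, a) / temperature).

    k_values: hypothetical K-values for the actions available in one state.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(k_values)
    weights = [math.exp((k - m) / temperature) for k in k_values]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    k_values = [1.0, 2.0, 3.0]
    for temperature in (0.1, 1.0, 10.0):
        print(temperature, boltzmann_policy(k_values, temperature))
```

A small temperature concentrates probability on the highest-valued action, while a large one spreads it almost uniformly.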

PAM · Inference · Vector space · Directed acyclic graph · Topic model

April 21, 2018

Variational Inference In Pachinko Allocation Machines

Akash Srivastava,Charles Sutton

The Pachinko Allocation Machine (PAM) is a deep topic model that allows representing rich correlation structures among topics by a directed acyclic graph over topics. Because of the flexibility of the model, however, approximate inference is very difficult. Perhaps for this reason, only a small number of potential PAM architectures have been explored in the literature. In this paper we present an efficient and flexible amortized variational inference method for PAM, using a deep inference network to parameterize the approximate posterior distribution in a manner similar to the variational autoencoder. Our inference method produces more coherent topics than state-of-the-art inference methods for PAM while being an order of magnitude faster, which allows exploration of a wider range of PAM architectures than have previously been studied.

Vector space · Inference · Estimation/Estimator · Networking · Learner

February 27, 2018

ADMM-based Networked Stochastic Variational Inference

Hamza Anwar,Quanyan Zhu

from arxiv, to be submitted for publishing

Owing to the recent advances in "Big Data" modeling and prediction tasks, variational Bayesian estimation has gained popularity due to its ability to provide exact solutions to approximate posteriors. One key technique for approximate inference is stochastic variational inference (SVI). SVI poses variational inference as a stochastic optimization problem and solves it iteratively using noisy gradient estimates. It aims to handle massive data for predictive and classification tasks by applying complex Bayesian models that have observed as well as latent variables. This paper aims to decentralize SVI, allowing parallel computation, secure learning and robustness benefits. We use the Alternating Direction Method of Multipliers (ADMM) in a top-down setting to develop a distributed SVI algorithm such that independent learners running inference algorithms only need to share the estimated model parameters instead of their private datasets. Our work extends the distributed SVI-ADMM algorithm that we first propose to an ADMM-based networked SVI algorithm in which the learners not only work distributively but also share information according to the rules of a graph by which they form a network. This kind of work lies under the umbrella of `deep learning over networks', and we verify our algorithm on a topic-modeling problem for a corpus of Wikipedia articles. We illustrate the results on the latent Dirichlet allocation (LDA) topic model in large document classification, compare performance with the centralized algorithm, and use numerical experiments to corroborate the analytical results.

Smoothing · Attention mechanism · Backpropagation · Viterbi algorithm · Regularization term

February 20, 2018

Differentiable Dynamic Programming for Structured Prediction and Attention

Arthur Mensch,Mathieu Blondel

Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows us to relax both the optimal value and solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.
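As a rough illustration of smoothing the max operator inside a DP recursion, the sketch below replaces `max` with a log-sum-exp in a Viterbi-style chain recursion. Log-sum-exp corresponds to one particular regularizer (negative entropy), which is assumed here for simplicity; the function names, the `gamma` smoothing parameter, and the toy scores are hypothetical and not taken from the paper.

```python
import math


def smoothed_max(values, gamma=1.0):
    """Log-sum-exp smoothing of max: gamma * log(sum(exp(v / gamma)))."""
    m = max(values)  # subtract the maximum for numerical stability
    return m + gamma * math.log(sum(math.exp((v - m) / gamma) for v in values))


def smoothed_viterbi_value(unary, pairwise, gamma=1.0):
    """Smoothed value of the best state sequence through a chain.

    unary[t][j]    : score of being in state j at step t
    pairwise[i][j] : score of moving from state i to state j
    """
    n_states = len(unary[0])
    value = list(unary[0])
    for t in range(1, len(unary)):
        # Same recursion as Viterbi, with max replaced by its smoothed version.
        value = [
            unary[t][j]
            + smoothed_max([value[i] + pairwise[i][j] for i in range(n_states)], gamma)
            for j in range(n_states)
        ]
    return smoothed_max(value, gamma)


if __name__ == "__main__":
    unary = [[1.0, 0.0], [0.0, 2.0], [1.0, 0.0]]
    pairwise = [[0.0, -1.0], [-1.0, 0.0]]
    for gamma in (1.0, 0.1, 0.001):
        print(gamma, smoothed_viterbi_value(unary, pairwise, gamma))
```

As `gamma` shrinks, the smoothed value approaches the score of the best path from the ordinary Viterbi recursion; for any positive `gamma` the result is a smooth function of the input scores, which is what makes gradient-based training possible.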