Making predictions and quantifying their uncertainty when the input data are sequential is a fundamental learning challenge that has recently attracted increasing attention. We develop SigGPDE, a new scalable sparse variational inference framework for Gaussian Processes (GPs) on sequential data. Our contribution is twofold. First, we construct inducing variables underpinning the sparse approximation so that the resulting evidence lower bound (ELBO) does not require any matrix inversion. Second, we show that the gradients of the GP signature kernel are solutions of a hyperbolic partial differential equation (PDE). This theoretical insight allows us to build an efficient back-propagation algorithm to optimize the ELBO. We showcase the significant computational gains of SigGPDE compared to existing methods, while achieving state-of-the-art performance for classification tasks on large datasets of up to 1 million multivariate time series.
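For context on the second contribution, the signature kernel $k_{x,y}$ of two (piecewise) differentiable paths $x$ and $y$ is known to solve a Goursat-type hyperbolic PDE; the gradient equations referred to above arise by differentiating a relation of this form (the notation below is ours, not the abstract's):
\[
\frac{\partial^2 k_{x,y}}{\partial s \, \partial t}(s,t) \;=\; \langle \dot{x}_s, \dot{y}_t \rangle \, k_{x,y}(s,t), \qquad k_{x,y}(0,\cdot) = k_{x,y}(\cdot,0) = 1 .
\]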
Bayesian optimization is a powerful paradigm to optimize black-box functions based on scarce and noisy data. Its data efficiency can be further improved by transfer learning from related tasks. While recent transfer models meta-learn a prior based on large amounts of data, in the low-data regime methods that exploit the closed-form posterior of Gaussian processes (GPs) have an advantage. In this setting, several analytically tractable transfer-model posteriors have been proposed, but the relative advantages of these methods are not well understood. In this paper, we provide a unified view on hierarchical GP models for transfer learning, which allows us to analyze the relationship between methods. As part of the analysis, we develop a novel closed-form boosted GP transfer model that fits between existing approaches in terms of complexity. We evaluate the performance of the different approaches in large-scale experiments and highlight strengths and weaknesses of the different transfer-learning methods.
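A minimal sketch of the residual ("boosting"-style) construction behind such closed-form transfer models: fit a GP to the source task, then fit a second GP to the target residuals with respect to the source posterior mean. This is an illustration under our own assumptions, not necessarily the paper's exact boosted model; the data and kernel choices below are placeholders.

# Illustrative residual-GP transfer sketch (not the paper's exact boosted model).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

# Source task: plentiful data from a related function.
X_src = rng.uniform(0, 1, (50, 1))
y_src = np.sin(6 * X_src[:, 0]) + 0.05 * rng.standard_normal(50)

# Target task: scarce data from a shifted version of the source function.
X_tgt = rng.uniform(0, 1, (8, 1))
y_tgt = np.sin(6 * X_tgt[:, 0]) + 0.3 * X_tgt[:, 0] + 0.05 * rng.standard_normal(8)

# Step 1: fit a GP to the source data.
gp_src = GaussianProcessRegressor(RBF(0.2) + WhiteKernel(1e-3)).fit(X_src, y_src)

# Step 2: fit a second GP to the target residuals w.r.t. the source posterior mean.
residuals = y_tgt - gp_src.predict(X_tgt)
gp_res = GaussianProcessRegressor(RBF(0.2) + WhiteKernel(1e-3)).fit(X_tgt, residuals)

# Transfer prediction: source mean plus residual correction.
X_new = np.linspace(0, 1, 5).reshape(-1, 1)
print(gp_src.predict(X_new) + gp_res.predict(X_new))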
Linear multivariate Hawkes processes (MHP) are a fundamental class of point processes with self-excitation. When estimating parameters for these processes, a difficulty is that evaluating the two main error functionals, the log-likelihood and the least squares error (LSE), as well as their gradients, has quadratic complexity in the number of observed events. In practice, this prohibits the use of exact gradient-based algorithms for parameter estimation. We construct an adaptive stratified sampling estimator of the gradient of the LSE. This results in a fast parametric estimation method for MHP with general kernels, applicable to large datasets, which compares favourably with existing methods.
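For reference, in a $d$-dimensional linear MHP with baseline rates $\mu_i$ and excitation kernels $\phi_{ij}$, the conditional intensities are
\[
\lambda_i(t) \;=\; \mu_i + \sum_{j=1}^{d} \sum_{t^j_k < t} \phi_{ij}(t - t^j_k),
\]
and the LSE on $[0,T]$ is commonly written (up to normalization by $T$) as
\[
\mathrm{LSE}(\theta) \;=\; \sum_{i=1}^{d} \left( \int_0^T \lambda_i(t;\theta)^2 \, dt \;-\; 2 \sum_{t^i_k \le T} \lambda_i(t^i_k;\theta) \right);
\]
expanding the squared intensity produces double sums over event pairs, which is the source of the quadratic cost mentioned above. The notation is ours and may differ from the paper's.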
We construct a family of genealogy-valued Markov processes that are induced by a continuous-time Markov population process. We derive exact expressions for the likelihood of a given genealogy conditional on the history of the underlying population process. These lead to a nonlinear filtering equation which can be used to design efficient Monte Carlo inference algorithms. We demonstrate these calculations with several examples. Existing full-information approaches for phylodynamic inference are special cases of the theory.
Gaussian process (GP) models define a rich distribution over functions with inductive biases controlled by a kernel function. Learning occurs through the optimisation of kernel hyperparameters using the marginal likelihood as the objective. This classical approach, known as Type-II maximum likelihood (ML-II), yields point estimates of the hyperparameters and continues to be the default method for training GPs. However, this approach risks underestimating predictive uncertainty and is prone to overfitting, especially when there are many hyperparameters. Furthermore, gradient-based optimisation makes ML-II point estimates highly susceptible to the presence of local minima. This work presents an alternative learning procedure where the hyperparameters of the kernel function are marginalised using Nested Sampling (NS), a technique that is well suited to sampling from complex, multi-modal distributions. We focus on regression tasks with the spectral mixture (SM) class of kernels and find that a principled approach to quantifying model uncertainty leads to substantial gains in predictive performance across a range of synthetic and benchmark data sets. In this context, nested sampling is also found to offer a speed advantage over Hamiltonian Monte Carlo (HMC), widely considered to be the gold standard in MCMC-based inference.
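For reference, the spectral mixture kernel in its one-dimensional form (Wilson and Adams) is
\[
k_{\mathrm{SM}}(\tau) \;=\; \sum_{q=1}^{Q} w_q \, \exp\!\left(-2\pi^2 \tau^2 v_q\right) \cos\!\left(2\pi \tau \mu_q\right), \qquad \tau = x - x',
\]
with mixture weights $w_q$, spectral means $\mu_q$ and spectral variances $v_q$; a $Q$-component mixture thus already carries $3Q$ hyperparameters per input dimension, which is one reason the hyperparameter surface is multi-modal and marginalisation, rather than ML-II point estimation, can pay off. The parameterization shown is a standard one and may differ in detail from the paper's.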
Determinantal point processes (a.k.a. DPPs) have recently become popular tools for modeling the phenomenon of negative dependence, or repulsion, in data. However, our understanding of an analogue of classical parametric statistical theory for this class of models is rather limited. In this work, we investigate a parametric family of Gaussian DPPs with a clearly interpretable effect of parametric modulation on the observed points. We show that parameter modulation impacts the observed points by introducing directionality in their repulsion structure, and the principal directions correspond to the directions of maximal (i.e. the most long-ranged) dependency. This model readily yields a novel and viable alternative to Principal Component Analysis (PCA) as a dimension reduction tool that favors directions along which the data is most spread out. This methodological contribution is complemented by a statistical analysis of a spiked model similar to that employed for covariance matrices as a framework to study PCA. These theoretical investigations unveil intriguing questions for further examination in random matrix theory, stochastic geometry and related topics.
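As background, a DPP on a continuous domain with kernel $K$ has joint intensities given by determinants,
\[
\rho_k(x_1, \dots, x_k) \;=\; \det\big[ K(x_i, x_j) \big]_{1 \le i, j \le k},
\]
so that points repel each other more strongly where the kernel correlations are large; for the Gaussian family studied here one may picture a kernel of the form $K(x, y) \propto \exp\!\big(-(x-y)^\top \Sigma^{-1} (x-y)\big)$, whose matrix parameter shapes the directionality of the repulsion (this parameterization is illustrative, not necessarily the paper's).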
We study the hierarchy of communities in real-world networks under a generic stochastic block model, in which the connection probabilities are structured in a binary tree. Under such a model, a standard recursive bipartitioning algorithm divides the network into two communities based on the Fiedler vector of the unnormalized graph Laplacian and repeats the split until a stopping rule indicates that no further community structure is present. We prove the strong consistency of this method under a wide range of model parameters, which include sparse networks with node degrees as small as $O(\log n)$. In addition, unlike most existing work, our theory covers multiscale networks in which the connection probabilities may differ by orders of magnitude, an important class of models that are practically relevant but technically challenging to deal with. Finally, we demonstrate the performance of our algorithm on synthetic data and real-world examples.
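A minimal sketch of one bipartitioning step of the kind described above, splitting nodes by the sign of the Fiedler vector of the unnormalized Laplacian (the stopping rule and recursion are omitted; the toy graph is ours):

import numpy as np

def fiedler_split(adjacency):
    """Split node indices into two groups by the sign of the Fiedler vector."""
    degrees = adjacency.sum(axis=1)
    laplacian = np.diag(degrees) - adjacency      # unnormalized Laplacian L = D - A
    _, eigvecs = np.linalg.eigh(laplacian)        # eigenvalues returned in ascending order
    fiedler = eigvecs[:, 1]                       # eigenvector of the second-smallest eigenvalue
    nodes = np.arange(adjacency.shape[0])
    return nodes[fiedler >= 0], nodes[fiedler < 0]

# Toy example: two 4-node cliques joined by a single edge.
A = np.zeros((8, 8))
A[:4, :4] = 1
A[4:, 4:] = 1
np.fill_diagonal(A, 0)
A[3, 4] = A[4, 3] = 1
print(fiedler_split(A))   # recovers the two cliques (up to the eigenvector's arbitrary sign)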
For multivariate spatial Gaussian process (GP) models, customary specifications of cross-covariance functions do not exploit relational inter-variable graphs to ensure process-level conditional independence among the variables. This is undesirable, especially in highly multivariate settings, where popular cross-covariance functions such as the multivariate Mat\'ern suffer from a "curse of dimensionality": the number of parameters and the number of floating point operations scale quadratically and cubically, respectively, with the number of variables. We propose a class of multivariate "Graphical Gaussian Processes" using a general construction called "stitching" that crafts cross-covariance functions from graphs and ensures process-level conditional independence among variables. For the Mat\'ern family of functions, stitching yields a multivariate GP whose univariate components are Mat\'ern GPs and which conforms to the process-level conditional independence specified by the graphical model. For highly multivariate settings and decomposable graphical models, stitching offers massive computational gains and parameter dimension reduction. We demonstrate the utility of the graphical Mat\'ern GP for jointly modelling highly multivariate spatial data using simulation examples and an application to air-pollution modelling.
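For reference, the univariate Mat\'ern covariance in one common parameterization is
\[
M(h \mid \sigma^2, \phi, \nu) \;=\; \sigma^2 \, \frac{2^{1-\nu}}{\Gamma(\nu)} \left( \frac{\|h\|}{\phi} \right)^{\nu} K_{\nu}\!\left( \frac{\|h\|}{\phi} \right),
\]
with marginal variance $\sigma^2$, spatial range $\phi$, smoothness $\nu$, and $K_\nu$ the modified Bessel function of the second kind; in a multivariate Mat\'ern model, every pair of variables additionally carries cross-covariance parameters, which is the source of the quadratic parameter growth noted above. The exact parameterization used in the paper may differ.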
In this paper, we propose a simple sparse approximate inverse for triangular matrices (SAIT). Using the Jacobi iteration method, we obtain an expression for the exact inverse of a triangular matrix as a finite series. The SAIT is constructed from this series. We apply the SAIT matrices to iterative methods with ILU preconditioners: the two triangular solves in the ILU preconditioning procedure are replaced by two matrix-vector multiplications, which can be parallelized in a fine-grained manner. We test this method by solving linear systems and eigenvalue problems with preconditioned iterative methods.
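A minimal sketch of the finite-series inverse behind this idea, in our own notation: writing a lower-triangular $L = D(I - M)$ with $D = \mathrm{diag}(L)$ and $M = I - D^{-1}L$ strictly lower triangular (hence nilpotent), the Jacobi splitting gives $L^{-1} = (I + M + \dots + M^{n-1})D^{-1}$, and truncating the series yields a sparse approximate inverse. The truncation level and test matrix below are illustrative.

import numpy as np

def triangular_inverse_series(L, n_terms=None):
    """Invert lower-triangular L via L^{-1} = (I + M + M^2 + ...) D^{-1},
    where D = diag(L) and M = I - D^{-1} L is strictly lower triangular,
    hence nilpotent: the series terminates after at most n - 1 powers."""
    n = L.shape[0]
    if n_terms is None:
        n_terms = n                      # full series: exact inverse; fewer terms: sparse approximation
    D_inv = np.diag(1.0 / np.diag(L))
    M = np.eye(n) - D_inv @ L            # strictly lower triangular
    S = np.eye(n)
    term = np.eye(n)
    for _ in range(n_terms - 1):
        term = term @ M
        S += term
    return S @ D_inv

L = np.tril(np.random.default_rng(1).normal(size=(5, 5))) + 5 * np.eye(5)
print(np.allclose(triangular_inverse_series(L) @ L, np.eye(5)))   # True: the full series is exact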
The Gaussian process regression (GPR) model is a popular nonparametric regression model. In GPR, features of the regression function, such as varying degrees of smoothness and periodicities, are modeled by combining various covariance kernels, each intended to capture a particular effect. The covariance kernels have unknown parameters, which are estimated by the EM algorithm or Markov chain Monte Carlo. The estimated parameters are key to inferring the features of the regression function, but the identifiability of these parameters has not been investigated. In this paper, we prove identifiability of the covariance kernel parameters in GPR with a mixture of two radial basis kernels and in GPR with a mixture of radial basis and periodic kernels. We also provide examples of non-identifiable cases in such mixed-kernel GPRs.
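For reference, standard forms of the two kernels in question (the paper's parameterization may differ) are
\[
k_{\mathrm{RBF}}(x, x') \;=\; a \, \exp\!\left( -\frac{(x - x')^2}{2\ell^2} \right),
\qquad
k_{\mathrm{per}}(x, x') \;=\; a \, \exp\!\left( -\frac{2 \sin^2\!\big(\pi (x - x')/p\big)}{\ell^2} \right),
\]
with amplitude $a$, length-scale $\ell$ and period $p$; identifiability asks whether two distinct parameter settings of a sum of such kernels can induce the same covariance function.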
Stochastic gradient Markov chain Monte Carlo (SGMCMC) has become a popular method for scalable Bayesian inference. These methods are based on sampling a discrete-time approximation to a continuous-time process, such as the Langevin diffusion. When applied to distributions defined on a constrained space, such as the simplex, the time-discretisation error can dominate when we are near the boundary of the space. We demonstrate that while current SGMCMC methods for the simplex perform well in certain cases, they struggle with sparse simplex spaces, i.e. when many of the components are close to zero. However, most popular large-scale applications of Bayesian inference on simplex spaces, such as network or topic models, are sparse. We argue that this poor performance is due to the biases of SGMCMC caused by the discretisation error. To get around this, we propose the stochastic CIR process, which removes all discretisation error, and we prove that samples from the stochastic CIR process are asymptotically unbiased. Use of the stochastic CIR process within an SGMCMC algorithm is shown to give substantially better performance for a topic model and a Dirichlet process mixture model than existing SGMCMC approaches.
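A minimal sketch of exact transition sampling for a CIR diffusion $dX_t = \kappa(\theta - X_t)\,dt + \sigma\sqrt{X_t}\,dW_t$, which has a known noncentral chi-squared transition law and therefore no time-discretisation error; this illustrates the ingredient exploited above, not the full SGMCMC algorithm, and all parameter values are placeholders.

import numpy as np

def cir_exact_step(x, h, kappa, theta, sigma, rng):
    """Draw X_{t+h} | X_t = x exactly via the noncentral chi-squared transition."""
    c = sigma**2 * (1.0 - np.exp(-kappa * h)) / (4.0 * kappa)
    df = 4.0 * kappa * theta / sigma**2        # degrees of freedom
    nc = x * np.exp(-kappa * h) / c            # noncentrality parameter
    return c * rng.noncentral_chisquare(df, nc)

rng = np.random.default_rng(0)
x, path = 0.01, []                             # start near zero, where discretised updates struggle
for _ in range(1000):
    x = cir_exact_step(x, h=0.1, kappa=2.0, theta=0.05, sigma=0.3, rng=rng)
    path.append(x)
print(np.mean(path))                           # roughly theta = 0.05 after burn-in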