
In inverse problems, the parameters of a model are estimated based on observations of the model response. The Bayesian approach is powerful for solving such problems; one formulates a prior distribution for the parameter state that is updated with the observations to compute the posterior parameter distribution. Solving for the posterior distribution can be challenging when, e.g., prior and posterior significantly differ from one another and/or the parameter space is high-dimensional. We use a sequence of importance sampling measures that arise by tempering the likelihood to approach inverse problems exhibiting a significant distance between prior and posterior. Each importance sampling measure is identified by cross-entropy minimization as proposed in the context of Bayesian inverse problems in Engel et al. (2021). To efficiently address problems with high-dimensional parameter spaces we set up the minimization procedure in a low-dimensional subspace of the original parameter space. The principal idea is to analyse the spectrum of the second-moment matrix of the gradient of the log-likelihood function to identify a suitable subspace. Following Zahm et al. (2021), an upper bound on the Kullback-Leibler-divergence between full-dimensional and subspace posterior is provided, which can be utilized to determine the effective dimension of the inverse problem corresponding to a prescribed approximation error bound. We suggest heuristic criteria for optimally selecting the number of model and model gradient evaluations in each iteration of the importance sampling sequence. We investigate the performance of this approach using examples from engineering mechanics set in various parameter space dimensions.
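The subspace construction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `likelihood_informed_subspace`, the toy gradients, and the tolerance are hypothetical, and the truncation rule assumes a bound of the form 0.5 × (sum of discarded eigenvalues), which is the form commonly quoted from Zahm et al. for a standard Gaussian prior; treat the constant as an assumption.

```python
import numpy as np

def likelihood_informed_subspace(grads, tol):
    """Pick a subspace from log-likelihood gradients.

    grads: array of shape (n_samples, d), one gradient per parameter sample.
    tol:   allowed value of the (assumed) bound 0.5 * sum of discarded eigenvalues.
    """
    # Monte Carlo estimate of the second-moment matrix H = E[g g^T].
    H = grads.T @ grads / grads.shape[0]
    # eigh returns eigenvalues in ascending order; reverse to descending.
    eigvals, eigvecs = np.linalg.eigh(H)
    eigvals = np.clip(eigvals[::-1], 0.0, None)
    eigvecs = eigvecs[:, ::-1]
    # tail[r] = sum of the eigenvalues discarded when r directions are kept.
    tail = np.concatenate([np.cumsum(eigvals[::-1])[::-1], [0.0]])
    # Smallest r whose bound meets the tolerance (tail[d] = 0 always qualifies).
    r = int(np.argmax(0.5 * tail <= tol))
    return eigvecs[:, :r], eigvals, r

# Toy data (hypothetical): gradients confined to a 2-dimensional subspace
# of a 10-dimensional parameter space.
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 10))
grads = rng.standard_normal((500, 2)) @ A
basis, eigvals, r = likelihood_informed_subspace(grads, tol=1e-8)
print(r, basis.shape)
```

The returned `r` plays the role of the effective dimension for the chosen tolerance; the columns of `basis` span the subspace in which a cross-entropy minimization could then be set up.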

Related content

tuning · controller · optimizer · MoDELS · latent cause

July 22, 2022

On Controller Tuning with Time-Varying Bayesian Optimization

Paul Brunzema, Alexander von Rohr, Sebastian Trimpe

from arXiv, to appear in the Proceedings of the 61st IEEE Conference on Decision and Control

Changing conditions or environments can cause system dynamics to vary over time. To ensure optimal control performance, controllers should adapt to these changes. When the underlying cause and time of change is unknown, we need to rely on online data for this adaptation. In this paper, we will use time-varying Bayesian optimization (TVBO) to tune controllers online in changing environments using appropriate prior knowledge on the control objective and its changes. Two properties are characteristic of many online controller tuning problems: First, they exhibit incremental and lasting changes in the objective due to changes to the system dynamics, e.g., through wear and tear. Second, the optimization problem is convex in the tuning parameters. Current TVBO methods do not explicitly account for these properties, resulting in poor tuning performance and many unstable controllers through over-exploration of the parameter space. We propose a novel TVBO forgetting strategy using Uncertainty-Injection (UI), which incorporates the assumption of incremental and lasting changes. The control objective is modeled as a spatio-temporal Gaussian process (GP) with UI through a Wiener process in the temporal domain. Further, we explicitly model the convexity assumptions in the spatial dimension through GP models with linear inequality constraints. In numerical experiments, we show that our model outperforms the state-of-the-art method in TVBO, exhibiting reduced regret and fewer unstable parameter configurations.

IP · approximate inference · approximation · inference · NeurIPS 2019

July 21, 2022

Function-space Inference with Sparse Implicit Processes

Simón Rodríguez Santana, Bryan Zaldivar, Daniel Hernández-Lobato

from arXiv, published at ICML 2022 (long oral presentation). Code available at //github.com/simonrsantana/sparse-implicit-processes

Implicit Processes (IPs) represent a flexible framework that can be used to describe a wide variety of models, from Bayesian neural networks, neural samplers and data generators to many others. IPs also allow for approximate inference in function-space. This change of formulation solves intrinsic degenerate problems of parameter-space approximate inference concerning the high number of parameters and their strong dependencies in large models. For this, previous works in the literature have attempted to employ IPs both to set up the prior and to approximate the resulting posterior. However, this has proven to be a challenging task. Existing methods that can tune the prior IP result in a Gaussian predictive distribution, which fails to capture important data patterns. By contrast, methods producing flexible predictive distributions by using another IP to approximate the posterior process cannot tune the prior IP to the observed data. We propose here the first method that can accomplish both goals. For this, we rely on an inducing-point representation of the prior IP, as often done in the context of sparse Gaussian processes. The result is a scalable method for approximate inference with IPs that can tune the prior IP parameters to the data, and that provides accurate non-Gaussian predictive distributions.

covariance matrix · stationary · reducible · optimizer · mutually independent

July 21, 2022

Cleaning the covariance matrix of strongly nonstationary systems with time-independent eigenvalues

Christian Bongiorno, Damien Challet, Grégoire Loeper

We propose a data-driven way to reduce the noise of covariance matrices of nonstationary systems. In the case of stationary systems, asymptotic approaches were proved to converge to the optimal solutions. Such methods produce eigenvalues that are highly dependent on the inputs, as common sense would suggest. Our approach proposes instead to use a set of eigenvalues that are totally independent of the inputs and that encode the long-term averaging of the influence of the future on present eigenvalues. Such an influence can be the predominant factor in nonstationary systems. Using real and synthetic data, we show that our data-driven method outperforms optimal methods designed for stationary systems for the filtering of both the covariance matrix and its inverse, as illustrated by financial portfolio variance minimization, which makes our method generically relevant to many problems of multivariate inference.
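The idea of input-independent eigenvalues can be illustrated with a short sketch. It assumes, as in rotationally invariant estimators, that the sample eigenvectors are kept and only the eigenvalues are replaced; the abstract does not spell this out. The function name `clean_covariance` and the values in `fixed_eigvals` are hypothetical placeholders: how the fixed eigenvalues are actually obtained is not described in the abstract and is not reproduced here.

```python
import numpy as np

def clean_covariance(observations, fixed_eigvals):
    """Keep the sample eigenvectors, swap in a fixed set of eigenvalues.

    observations:  array of shape (n_samples, n_variables).
    fixed_eigvals: n_variables eigenvalues chosen independently of the inputs
                   (hypothetical placeholder values in this sketch).
    """
    sample_cov = np.cov(observations, rowvar=False)
    # eigh returns eigenvalues (discarded here) in ascending order,
    # with eigenvectors in matching columns.
    _, eigvecs = np.linalg.eigh(sample_cov)
    # Sort the fixed eigenvalues ascending so ranks line up with the eigenvectors.
    fixed = np.sort(np.asarray(fixed_eigvals, dtype=float))
    return eigvecs @ np.diag(fixed) @ eigvecs.T

rng = np.random.default_rng(0)
observations = rng.standard_normal((60, 4))
cleaned = clean_covariance(observations, fixed_eigvals=[0.5, 1.0, 1.5, 3.0])
print(np.round(np.linalg.eigvalsh(cleaned), 6))
```

Because the eigenvalues are strictly positive by construction, the cleaned matrix is invertible, which matters for applications such as portfolio variance minimization that use the inverse.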

MoDELS · graph · Euclidean space · INTERACT · estimation/estimator

July 21, 2022

A nonstationary spatial covariance model for data on graphs

Michael F. Christensen, Peter D. Hoff

from arXiv, 45 pages, 8 figures

Spatial data can exhibit dependence structures more complicated than can be represented using models that rely on the traditional assumptions of stationarity and isotropy. Several statistical methods have been developed to relax these assumptions. One in particular, the "spatial deformation approach", defines a transformation from the geographic space in which data are observed to a latent space in which stationarity and isotropy are assumed to hold. Taking inspiration from this class of models, we develop a new model for spatially dependent data observed on graphs. Our method implies an embedding of the graph into Euclidean space wherein the covariance can be modeled using traditional covariance functions such as those from the Matérn family. This is done via a class of graph metrics compatible with such covariance functions. By estimating the edge weights which underlie these metrics, we can recover the "intrinsic distance" between nodes of a graph. We compare our model to existing methods for spatially dependent graph data, primarily conditional autoregressive (CAR) models and their variants, and illustrate the advantages our approach has over traditional methods. We fit our model and competitors to bird abundance data for several species in North Carolina. We find that our model fits the data best and provides insight into the interaction between species-specific spatial distributions and geography.

Boosting (a way of accelerating model training) · orthogonal · design matrix · greedy layer-wise pretraining · Projection

July 21, 2022

High-Dimensional $L_2$Boosting: Rate of Convergence

Ye Luo, Martin Spindler, Jannis Kück

from arXiv, 19 pages, 4 tables; AMS 2000 subject classifications: Primary 62J05, 62J07, 41A25; secondary 49M15, 68Q32

Boosting is one of the most significant developments in machine learning. This paper studies the rate of convergence of $L_2$Boosting, which is tailored for regression, in a high-dimensional setting. Moreover, we introduce so-called "post-Boosting". This is a post-selection estimator which applies ordinary least squares to the variables selected in the first stage by $L_2$Boosting. Another variant is "Orthogonal Boosting", where after each step an orthogonal projection is conducted. We show that both post-$L_2$Boosting and the orthogonal boosting achieve the same rate of convergence as LASSO in a sparse, high-dimensional setting. We show that the rate of convergence of the classical $L_2$Boosting depends on the design matrix described by a sparse eigenvalue constant. To show the latter results, we derive new approximation results for the pure greedy algorithm, based on analyzing the revisiting behavior of $L_2$Boosting. We also introduce feasible rules for early stopping, which can be easily implemented and used in applied work. Our results also allow a direct comparison between LASSO and boosting which has been missing from the literature. Finally, we present simulation studies and applications to illustrate the relevance of our theoretical results and to provide insights into the practical aspects of boosting. In these simulation studies, post-$L_2$Boosting clearly outperforms LASSO.
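The two-stage procedure described above can be sketched with the standard componentwise form of $L_2$Boosting followed by an ordinary least squares refit. This is an illustrative sketch under stated assumptions: the function names `l2_boost` and `post_boost`, the step size `nu`, the fixed number of steps (used here in place of the paper's early-stopping rules, which the abstract does not detail), and the simulated data are all hypothetical.

```python
import numpy as np

def l2_boost(X, y, n_steps=30, nu=0.5):
    """Componentwise L2Boosting; returns coefficients and selected columns."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.astype(float).copy()
    col_sq = (X ** 2).sum(axis=0)
    selected = set()
    for _ in range(n_steps):
        # Univariate least-squares coefficient of the residual on each column.
        coefs = X.T @ resid / col_sq
        # Reduction in residual sum of squares offered by each column.
        gain = coefs ** 2 * col_sq
        j = int(np.argmax(gain))
        # Shrunken update of the best column, then refresh the residual.
        beta[j] += nu * coefs[j]
        resid -= nu * coefs[j] * X[:, j]
        selected.add(j)
    return beta, sorted(selected)

def post_boost(X, y, selected):
    """Post-Boosting: OLS refit restricted to the selected columns."""
    beta = np.zeros(X.shape[1])
    sol, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
    beta[selected] = sol
    return beta

# Simulated sparse regression (hypothetical data): 3 active variables out of 50.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
beta_true = np.zeros(50)
beta_true[:3] = [3.0, -2.0, 1.5]
y = X @ beta_true + 0.1 * rng.standard_normal(200)

beta_boost, selected = l2_boost(X, y)
beta_post = post_boost(X, y, selected)
print(selected[:5], np.round(beta_post[:3], 2))
```

The refit removes the shrinkage that the step size `nu` introduces on the selected variables, which is the motivation for a post-selection estimator of this kind.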

parameter space · Learning · Machine Learning · example · Analysis

July 20, 2022

Exploration of Parameter Spaces Assisted by Machine Learning

A. Hammad, Myeonghun Park, Raymundo Ramos, Pankaj Saha

from arXiv, 15 pages, 5 figures. Code and instructions are available on //github.com/AHamamd150/MLscanner

We showcase a variety of functions and classes that implement sampling procedures with improved exploration of the parameter space assisted by machine learning. Special attention is paid to setting sane defaults with the objective that adjustments required by different problems remain minimal. This collection of routines can be employed for different types of analysis, from finding bounds on the parameter space to accumulating samples in areas of interest. In particular, we discuss two methods assisted by incorporating different machine learning models: regression and classification. We show that a machine learning classifier can provide higher efficiency for exploring the parameter space. Also, we introduce a boosting technique to improve the slow convergence at the start of the process. The use of these routines is best explained with the help of a few examples that illustrate the type of results one can obtain. We also include the code used to obtain these examples, as well as descriptions of the adjustments that can be made to adapt the calculation to other problems. We conclude by showing the impact of these techniques when exploring the parameter space of the two Higgs doublet model that matches the measured Higgs boson signal strength. The code used for this paper and instructions on how to use it are available on the web.

importance sampling · Monte Carlo · Gaussian distribution · sample · error rate

July 20, 2022

On the error rate of importance sampling with randomized quasi-Monte Carlo

Zhijian He, Zhan Zheng, Xiaoqun Wang

Importance sampling (IS) is valuable in reducing the variance of Monte Carlo sampling in many areas, including finance, rare event simulation, and Bayesian inference. It is natural and obvious to combine quasi-Monte Carlo (QMC) methods with IS to achieve a faster rate of convergence. However, a naive replacement of Monte Carlo with QMC may not work well. This paper investigates the convergence rates of randomized QMC-based IS for estimating integrals with respect to a Gaussian measure, in which the IS measure is a Gaussian or $t$ distribution. We prove that if the target function satisfies the so-called boundary growth condition and the covariance matrix of the IS density has eigenvalues no smaller than 1, then randomized QMC with the Gaussian proposal has a root mean squared error of $O(N^{-1+\epsilon})$ for arbitrarily small $\epsilon>0$. Similar results for the $t$ distribution as the proposal are also established. These sufficient conditions help to assess the effectiveness of IS in QMC. For some particular applications, we find that the Laplace IS, a very general approach to approximate the target function by a quadratic Taylor approximation around its mode, has eigenvalues smaller than 1, making the resulting integrand less favorable for QMC. From this point of view, when using Gaussian distributions as the IS proposal, a change of measure via Laplace IS may transform a favorable integrand into an unfavorable one for QMC, although the variance of Monte Carlo sampling is reduced. We also give some examples to verify our propositions and warn against naive replacement of MC with QMC under IS proposals. Numerical results suggest that using Laplace IS with $t$ distributions is more robust than that with Gaussian distributions.
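A one-dimensional sketch of importance sampling driven by randomized QMC points is given below. It is only an illustration of the general setup: the randomization used here is a random shift (modulo 1) of a van der Corput sequence, swapped in for simplicity and not necessarily the scrambling analysed in the paper, and the function names, the integrand `math.exp`, and the proposal parameters are hypothetical choices. The proposal standard deviation is set above 1, in the spirit of the eigenvalue condition mentioned in the abstract.

```python
import math
import random
from statistics import NormalDist

def van_der_corput(i, base=2):
    """i-th point of the van der Corput sequence (radical inverse of i)."""
    x, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        x += digit / denom
    return x

def rqmc_importance_sampling(f, mu, sigma, n, rng):
    """Estimate E[f(X)] for X ~ N(0, 1) using the proposal N(mu, sigma^2)."""
    target = NormalDist(0.0, 1.0)
    proposal = NormalDist(mu, sigma)
    shift = rng.random()  # one random shift applied to the whole point set
    total = 0.0
    for i in range(n):
        u = (van_der_corput(i) + shift) % 1.0
        # Keep u strictly inside (0, 1) so the inverse CDF is defined.
        u = min(max(u, 1e-12), 1.0 - 1e-12)
        x = proposal.inv_cdf(u)
        # Importance weight: target density over proposal density.
        total += f(x) * target.pdf(x) / proposal.pdf(x)
    return total / n

estimate = rqmc_importance_sampling(math.exp, mu=0.5, sigma=1.2, n=1024,
                                    rng=random.Random(0))
print(estimate, math.exp(0.5))  # the exact value of E[exp(X)] is exp(1/2)
```

Repeating the call with independent shifts gives independent unbiased replicates, from which a root mean squared error can be estimated empirically.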

robustness · logistic regression · sparse · decoding · Performer

July 20, 2022

Correntropy-Based Logistic Regression with Automatic Relevance Determination for Robust Sparse Brain Activity Decoding

Yuanhao Li, Badong Chen, Yuxi Shi, Natsue Yoshimura, Yasuharu Koike

Recent studies have utilized sparse classification to predict categorical variables from high-dimensional brain activity signals in order to expose human intentions and mental states, selecting the relevant features automatically during model training. However, existing sparse classification models are likely to be prone to performance degradation caused by the noise inherent in brain recordings. To address this issue, we propose a new robust and sparse classification algorithm in this study. To this end, we introduce the correntropy learning framework into the automatic relevance determination based sparse classification model, proposing a new correntropy-based robust sparse logistic regression algorithm. To demonstrate the superior brain activity decoding performance of the proposed algorithm, we evaluate it on a synthetic dataset, an electroencephalogram (EEG) dataset, and a functional magnetic resonance imaging (fMRI) dataset. The extensive experimental results confirm that the proposed method not only achieves higher classification accuracy in a noisy and high-dimensional classification task, but also selects the more informative features for the decoding scenarios. Integrating the correntropy learning approach with the automatic relevance determination technique significantly improves the robustness with respect to noise, leading to a more adequate robust sparse brain decoding algorithm. It provides a more powerful approach for real-world brain activity decoding and brain-computer interfaces.
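Why a correntropy-based criterion is less sensitive to noise than a squared-error criterion can be seen in a few lines. The sketch below uses the usual Gaussian-kernel form of empirical correntropy on a list of prediction errors; it is not the authors' algorithm, and the function names, the kernel width `sigma`, and the error values are hypothetical.

```python
import math

def correntropy(errors, sigma=1.0):
    """Empirical correntropy: mean Gaussian kernel of the errors (higher is better)."""
    return sum(math.exp(-e * e / (2.0 * sigma ** 2)) for e in errors) / len(errors)

def mean_squared_error(errors):
    """Ordinary mean squared error (lower is better)."""
    return sum(e * e for e in errors) / len(errors)

clean = [0.1, -0.2, 0.05, 0.1]
with_outlier = clean + [50.0]  # one grossly corrupted sample

print(mean_squared_error(clean), mean_squared_error(with_outlier))
print(correntropy(clean), correntropy(with_outlier))
```

Each sample contributes a kernel value between 0 and 1, so a single corrupted sample can change the correntropy by at most 1/n, whereas its contribution to the mean squared error grows with the square of the error.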

Performer · inference · estimation/estimator · Extensibility · empirical distribution

July 20, 2022

Validating Causal Inference Methods

Harsh Parikh, Carlos Varjao, Louise Xu, Eric Tchetgen Tchetgen

from arXiv, 5 figures, 13 pages

The fundamental challenge of drawing causal inference is that counterfactual outcomes are not fully observed for any unit. Furthermore, in observational studies, treatment assignment is likely to be confounded. Many statistical methods have emerged for causal inference under unconfoundedness conditions given pre-treatment covariates, including propensity score-based methods, prognostic score-based methods, and doubly robust methods. Unfortunately for applied researchers, there is no "one-size-fits-all" causal method that can perform optimally universally. In practice, causal methods are primarily evaluated quantitatively on handcrafted simulated data. Such data-generative procedures can be of limited value because they are typically stylized models of reality. They are simplified for tractability and lack the complexities of real-world data. For applied researchers, it is critical to understand how well a method performs for the data at hand. Our work introduces a deep generative model-based framework, Credence, to validate causal inference methods. The framework's novelty stems from its ability to generate synthetic data anchored at the empirical distribution for the observed sample, and therefore virtually indistinguishable from the latter. The approach allows the user to specify ground truth for the form and magnitude of causal effects and confounding bias as functions of covariates. Thus, simulated data sets are used to evaluate the potential performance of various causal estimation methods when applied to data similar to the observed sample. We demonstrate Credence's ability to accurately assess the relative performance of causal estimation techniques in an extensive simulation study and two real-world data applications from the Lalonde and Project STAR studies.

MoDELS · Performer · Processing (programming language) · learned · robustness

September 3, 2021

Learning Neural Models for Natural Language Processing in the Face of Distributional Shift

Paul Michel

from arXiv, PhD thesis

The dominating NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications (e.g., sentiment classification, span-prediction based question answering, or machine translation). However, it builds upon the assumption that the data distribution is stationary, i.e., that the data is sampled from a fixed distribution both at training and test time. This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information. Moreover, it is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime. The first goal of this thesis is to characterize the different forms this shift can take in the context of natural language processing, and to propose benchmarks and evaluation metrics to measure its effect on current deep learning architectures. We then proceed to take steps to mitigate the effect of distributional shift on NLP models. To this end, we develop methods based on parametric reformulations of the distributionally robust optimization framework. Empirically, we demonstrate that these approaches yield more robust models on a selection of realistic problems. In the third and final part of this thesis, we explore ways of efficiently adapting existing models to new domains or tasks. Our contribution to this topic takes inspiration from information geometry to derive a new gradient update rule which alleviates catastrophic forgetting issues during adaptation.