亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Bayesian inference for high-dimensional inverse problems is computationally costly and requires selecting a suitable prior distribution. Amortized variational inference addresses these challenges via a neural network that approximates the posterior distribution not only for one instance of data, but a distribution of data pertaining to a specific inverse problem. During inference, the neural network -- in our case a conditional normalizing flow -- provides posterior samples at virtually no cost. However, the accuracy of amortized variational inference relies on the availability of high-fidelity training data, which seldom exists in geophysical inverse problems due to the Earth's heterogeneity. In addition, the network is prone to errors if evaluated over out-of-distribution data. As such, we propose to increase the resilience of amortized variational inference in the presence of moderate data distribution shifts. We achieve this via a correction to the latent distribution that improves the posterior distribution approximation for the data at hand. The correction involves relaxing the standard Gaussian assumption on the latent distribution and parameterizing it via a Gaussian distribution with an unknown mean and (diagonal) covariance. These unknowns are then estimated by minimizing the Kullback-Leibler divergence between the corrected and the (physics-based) true posterior distributions. While generic and applicable to other inverse problems, by means of a linearized seismic imaging example, we show that our correction step improves the robustness of amortized variational inference with respect to changes in the number of seismic sources, noise variance, and shifts in the prior distribution. This approach provides a seismic image with limited artifacts and an assessment of its uncertainty at approximately the same cost as five reverse-time migrations.

相關內容

In this manuscript, we study the problem of scalar-on-distribution regression; that is, instances where subject-specific distributions or densities, or in practice, repeated measures from those distributions, are the covariates related to a scalar outcome via a regression model. We propose a direct regression for such distribution-valued covariates that circumvents estimating subject-specific densities and directly uses the observed repeated measures as covariates. The model is invariant to any transformation or ordering of the repeated measures. Endowing the regression function with a Gaussian Process prior, we obtain closed form or conjugate Bayesian inference. Our method subsumes the standard Bayesian non-parametric regression using Gaussian Processes as a special case. Theoretically, we show that the method can achieve an optimal estimation error bound. To our knowledge, this is the first theoretical study on Bayesian regression using distribution-valued covariates. Through simulation studies and analysis of activity count dataset, we demonstrate that our method performs better than approaches that require an intermediate density estimation step.

Forward simulation-based uncertainty quantification that studies the output distribution of quantities of interest (QoI) is a crucial component for computationally robust statistics and engineering. There is a large body of literature devoted to accurately assessing statistics of QoI, and in particular, multilevel or multifidelity approaches are known to be effective, leveraging cost-accuracy tradeoffs between a given ensemble of models. However, effective algorithms that can estimate the full distribution of outputs are still under active development. In this paper, we introduce a general multifidelity framework for estimating the cumulative distribution functions (CDFs) of vector-valued QoI associated with a high-fidelity model under a budget constraint. Given a family of appropriate control variates obtained from lower fidelity surrogates, our framework involves identifying the most cost-effective model subset and then using it to build an approximate control variates estimator for the target CDF. We instantiate the framework by constructing a family of control variates using intermediate linear approximators and rigorously analyze the corresponding algorithm. Our analysis reveals that the resulting CDF estimator is uniformly consistent and budget-asymptotically optimal, with only mild moment and regularity assumptions. The approach provides a robust multifidelity CDF estimator that is adaptive to the available budget, does not require \textit{a priori} knowledge of cross-model statistics or model hierarchy, and is applicable to general output dimensions. We demonstrate the efficiency and robustness of the approach using several test examples.

Estimating a Gibbs density function given a sample is an important problem in computational statistics and statistical learning. Although the well established maximum likelihood method is commonly used, it requires the computation of the partition function (i.e., the normalization of the density). This function can be easily calculated for simple low-dimensional problems but its computation is difficult or even intractable for general densities and high-dimensional problems. In this paper we propose an alternative approach based on Maximum A-Posteriori (MAP) estimators, we name Maximum Recovery MAP (MR-MAP), to derive estimators that do not require the computation of the partition function, and reformulate the problem as an optimization problem. We further propose a least-action type potential that allows us to quickly solve the optimization problem as a feed-forward hyperbolic neural network. We demonstrate the effectiveness of our methods on some standard data sets.

The presence of measurement error is a widespread issue which, when ignored, can render the results of an analysis unreliable. Numerous corrections for the effects of measurement error have been proposed and studied, often under the assumption of a normally distributed, additive measurement error model. One such correction is the simulation extrapolation method, which provides a flexible way of correcting for the effects of error in a wide variety of models, when the errors are approximately normally distributed. However, in many situations observed data are non-symmetric, heavy-tailed, or otherwise highly non-normal. In these settings, correction techniques relying on the assumption of normality are undesirable. We propose an extension to the simulation extrapolation method which is nonparametric in the sense that no specific distributional assumptions are required on the error terms. The technique is implemented when either validation data or replicate measurements are available, and it shares the general structure of the standard simulation extrapolation procedure, making it immediately accessible for those familiar with this technique.

As datasets grow larger, they are often distributed across multiple machines that compute in parallel and communicate with a central machine through short messages. In this paper, we focus on sparse regression and propose a new procedure for conducting selective inference with distributed data. Although many distributed procedures exist for point estimation in the sparse setting, few options are available for estimating uncertainties or conducting hypothesis tests based on the estimated sparsity. We solve a generalized linear regression on each machine, which then communicates a selected set of predictors to the central machine. The central machine uses these selected predictors to form a generalized linear model (GLM). To conduct inference in the selected GLM, our proposed procedure bases approximately-valid selective inference on an asymptotic likelihood. The proposal seeks only aggregated information, in relatively few dimensions, from each machine which is merged at the central machine for selective inference. By reusing low-dimensional summary statistics from local machines, our procedure achieves higher power while keeping the communication cost low. This method is also applicable as a solution to the notorious p-value lottery problem that arises when model selection is repeated on random splits of data.

Understanding the properties of the stochastic phase field models is crucial to model processes in several practical applications, such as soft matters and phase separation in random environments. To describe such random evolution, this work proposes and studies two mathematical models and their numerical approximations for parabolic stochastic partial differential equation (SPDE) with a logarithmic Flory--Huggins energy potential. These multiscale models are built based on a regularized energy technique and thus avoid possible singularities of coefficients. According to the large deviation principle, we show that the limit of the proposed models with small noise naturally recovers the classical dynamics in deterministic case. Moreover, when the driving noise is multiplicative, the Stampacchia maximum principle holds which indicates the robustness of the proposed model. One of the main advantages of the proposed models is that they can admit the energy evolution law and asymptotically preserve the Stampacchia maximum bound of the original problem. To numerically capture these asymptotic behaviors, we investigate the semi-implicit discretizations for regularized logrithmic SPDEs. Several numerical results are presented to verify our theoretical findings.

The analysis of large-scale datasets, especially in biomedical contexts, frequently involves a principled screening of multiple hypotheses. The celebrated two-group model jointly models the distribution of the test statistics with mixtures of two competing densities, the null and the alternative distributions. We investigate the use of weighted densities and, in particular, non-local densities as working alternative distributions, to enforce separation from the null and thus refine the screening procedure. We show how these weighted alternatives improve various operating characteristics, such as the Bayesian False Discovery rate, of the resulting tests for a fixed mixture proportion with respect to a local, unweighted likelihood approach. Parametric and nonparametric model specifications are proposed, along with efficient samplers for posterior inference. By means of a simulation study, we exhibit how our model compares with both well-established and state-of-the-art alternatives in terms of various operating characteristics. Finally, to illustrate the versatility of our method, we conduct three differential expression analyses with publicly-available datasets from genomic studies of heterogeneous nature.

We combine Tyler's robust estimator of the dispersion matrix with nonlinear shrinkage. This approach delivers a simple and fast estimator of the dispersion matrix in elliptical models that is robust against both heavy tails and high dimensions. We prove convergence of the iterative part of our algorithm and demonstrate the favorable performance of the estimator in a wide range of simulation scenarios. Finally, an empirical application demonstrates its state-of-the-art performance on real data.

Causal discovery and causal reasoning are classically treated as separate and consecutive tasks: one first infers the causal graph, and then uses it to estimate causal effects of interventions. However, such a two-stage approach is uneconomical, especially in terms of actively collected interventional data, since the causal query of interest may not require a fully-specified causal model. From a Bayesian perspective, it is also unnatural, since a causal query (e.g., the causal graph or some causal effect) can be viewed as a latent quantity subject to posterior inference -- other unobserved quantities that are not of direct interest (e.g., the full causal model) ought to be marginalized out in this process and contribute to our epistemic uncertainty. In this work, we propose Active Bayesian Causal Inference (ABCI), a fully-Bayesian active learning framework for integrated causal discovery and reasoning, which jointly infers a posterior over causal models and queries of interest. In our approach to ABCI, we focus on the class of causally-sufficient, nonlinear additive noise models, which we model using Gaussian processes. We sequentially design experiments that are maximally informative about our target causal query, collect the corresponding interventional data, and update our beliefs to choose the next experiment. Through simulations, we demonstrate that our approach is more data-efficient than several baselines that only focus on learning the full causal graph. This allows us to accurately learn downstream causal queries from fewer samples while providing well-calibrated uncertainty estimates for the quantities of interest.

Causal inference is a critical research topic across many domains, such as statistics, computer science, education, public policy and economics, for decades. Nowadays, estimating causal effect from observational data has become an appealing research direction owing to the large amount of available data and low budget requirement, compared with randomized controlled trials. Embraced with the rapidly developed machine learning area, various causal effect estimation methods for observational data have sprung up. In this survey, we provide a comprehensive review of causal inference methods under the potential outcome framework, one of the well known causal inference framework. The methods are divided into two categories depending on whether they require all three assumptions of the potential outcome framework or not. For each category, both the traditional statistical methods and the recent machine learning enhanced methods are discussed and compared. The plausible applications of these methods are also presented, including the applications in advertising, recommendation, medicine and so on. Moreover, the commonly used benchmark datasets as well as the open-source codes are also summarized, which facilitate researchers and practitioners to explore, evaluate and apply the causal inference methods.

北京阿比特科技有限公司