国产精品亚洲综合久久,91 一区二区三区四季,精品无码久久久久久,日本精品一区二区五区在线观看

Markov chain Monte Carlo (MCMC) allows one to generate dependent replicates from a posterior distribution for effectively any Bayesian hierarchical model. However, MCMC can produce a significant computational burden. This motivates us to consider finding expressions of the posterior distribution that are computationally straightforward to obtain independent replicates from directly. We focus on a broad class of Bayesian latent Gaussian process (LGP) models that allow for spatially dependent data. First, we derive a new class of distributions we refer to as the generalized conjugate multivariate (GCM) distribution. The GCM distribution's theoretical development is similar to that of the CM distribution with two main differences; namely, (1) the GCM allows for latent Gaussian process assumptions, and (2) the GCM explicitly accounts for hyperparameters through marginalization. The development of GCM is needed to obtain independent replicates directly from the exact posterior distribution, which has an efficient projection/regression form. Hence, we refer to our method as Exact Posterior Regression (EPR). Illustrative examples are provided including simulation studies for weakly stationary spatial processes and spatial basis function expansions. An additional analysis of poverty incidence data from the U.S. Census Bureau's American Community Survey (ACS) using a conditional autoregressive model is presented.

相關內容

相互獨立的

關注 1

中位數 · 估計/估計量 · 均值 · 穩健性 · 統計量 ·

2023 年 7 月 6 日

The Geometric Median and Applications to Robust Mean Estimation

Stanislav Minsker,Nate Strawn

from arxiv, 28 pages, 2 figures

This paper is devoted to the statistical and numerical properties of the geometric median, and its applications to the problem of robust mean estimation via the median of means principle. Our main theoretical results include (a) an upper bound for the distance between the mean and the median for general absolutely continuous distributions in R^d, and examples of specific classes of distributions for which these bounds do not depend on the ambient dimension $d$; (b) exponential deviation inequalities for the distance between the sample and the population versions of the geometric median, which again depend only on the trace-type quantities and not on the ambient dimension. As a corollary, we deduce improved bounds for the (geometric) median of means estimator that hold for large classes of heavy-tailed distributions. Finally, we address the error of numerical approximation, which is an important practical aspect of any statistical estimation procedure. We demonstrate that the objective function minimized by the geometric median satisfies a "local quadratic growth" condition that allows one to translate suboptimality bounds for the objective function to the corresponding bounds for the numerical approximation to the median itself. As a corollary, we propose a simple stopping rule (applicable to any optimization method) which yields explicit error guarantees. We conclude with the numerical experiments including the application to estimation of mean values of log-returns for S&P 500 data.

穩健性 · 潛在 · 自編碼器 · 變分自編碼 · 可辨認的 ·

2023 年 7 月 5 日

On the Adversarial Robustness of Generative Autoencoders in the Latent Space

Mingfei Lu,Badong Chen

from arxiv, 18 pages, 12 figures

The generative autoencoders, such as the variational autoencoders or the adversarial autoencoders, have achieved great success in lots of real-world applications, including image generation, and signal communication. However, little concern has been devoted to their robustness during practical deployment. Due to the probabilistic latent structure, variational autoencoders (VAEs) may confront problems such as a mismatch between the posterior distribution of the latent and real data manifold, or discontinuity in the posterior distribution of the latent. This leaves a back door for malicious attackers to collapse VAEs from the latent space, especially in scenarios where the encoder and decoder are used separately, such as communication and compressed sensing. In this work, we provide the first study on the adversarial robustness of generative autoencoders in the latent space. Specifically, we empirically demonstrate the latent vulnerability of popular generative autoencoders through attacks in the latent space. We also evaluate the difference between variational autoencoders and their deterministic variants and observe that the latter performs better in latent robustness. Meanwhile, we identify a potential trade-off between the adversarial robustness and the degree of the disentanglement of the latent codes. Additionally, we also verify the feasibility of improvement for the latent robustness of VAEs through adversarial training. In summary, we suggest concerning the adversarial latent robustness of the generative autoencoders, analyze several robustness-relative issues, and give some insights into a series of key challenges.

GANs · MoDELS · 似然 · 樣本 · 確切的 ·

2023 年 7 月 5 日

DiffFlow: A Unified SDE Framework for Score-Based Diffusion Models and Generative Adversarial Networks

Jingwei Zhang,Han Shi,Jincheng Yu,Enze Xie,Zhenguo Li

from arxiv, Tech Report

Generative models can be categorized into two types: explicit generative models that define explicit density forms and allow exact likelihood inference, such as score-based diffusion models (SDMs) and normalizing flows; implicit generative models that directly learn a transformation from the prior to the data distribution, such as generative adversarial nets (GANs). While these two types of models have shown great success, they suffer from respective limitations that hinder them from achieving fast sampling and high sample quality simultaneously. In this paper, we propose a unified theoretic framework for SDMs and GANs. We shown that: i) the learning dynamics of both SDMs and GANs can be described as a novel SDE named Discriminator Denoising Diffusion Flow (DiffFlow) where the drift can be determined by some weighted combinations of scores of the real data and the generated data; ii) By adjusting the relative weights between different score terms, we can obtain a smooth transition between SDMs and GANs while the marginal distribution of the SDE remains invariant to the change of the weights; iii) we prove the asymptotic optimality and maximal likelihood training scheme of the DiffFlow dynamics; iv) under our unified theoretic framework, we introduce several instantiations of the DiffFLow that provide new algorithms beyond GANs and SDMs with exact likelihood inference and have potential to achieve flexible trade-off between high sample quality and fast sampling speed.

操作 · Performer · 近似 · 前向 · MoDELS ·

2023 年 7 月 4 日

Out-of-distributional risk bounds for neural operators with applications to the Helmholtz equation

J. Antonio Lara Benitez,Takashi Furuya,Florian Faucher,Anastasis Kratsios,Xavier Tricoche,Maarten V. de Hoop

Despite their remarkable success in approximating a wide range of operators defined by PDEs, existing neural operators (NOs) do not necessarily perform well for all physics problems. We focus here on high-frequency waves to highlight possible shortcomings. To resolve these, we propose a subfamily of NOs enabling an enhanced empirical approximation of the nonlinear operator mapping wave speed to solution, or boundary values for the Helmholtz equation on a bounded domain. The latter operator is commonly referred to as the ''forward'' operator in the study of inverse problems. Our methodology draws inspiration from transformers and techniques such as stochastic depth. Our experiments reveal certain surprises in the generalization and the relevance of introducing stochastic depth. Our NOs show superior performance as compared with standard NOs, not only for testing within the training distribution but also for out-of-distribution scenarios. To delve into this observation, we offer an in-depth analysis of the Rademacher complexity associated with our modified models and prove an upper bound tied to their stochastic depth that existing NOs do not satisfy. Furthermore, we obtain a novel out-of-distribution risk bound tailored to Gaussian measures on Banach spaces, again relating stochastic depth with the bound. We conclude by proposing a hypernetwork version of the subfamily of NOs as a surrogate model for the mentioned forward operator.

估計/估計量 · INTERACT · 線性組合 · 泛函 · 邊緣化 ·

2023 年 7 月 4 日

Dimension Reduction and MARS

Yu Liu,Degui Li,Yingcun Xia

The multivariate adaptive regression spline (MARS) is one of the popular estimation methods for nonparametric multivariate regressions. However, as MARS is based on marginal splines, to incorporate interactions of covariates, products of the marginal splines must be used, which leads to an unmanageable number of basis functions when the order of interaction is high and results in low estimation efficiency. In this paper, we improve the performance of MARS by using linear combinations of the covariates which achieve sufficient dimension reduction. The special basis functions of MARS facilitate calculation of gradients of the regression function, and estimation of the linear combinations is obtained via eigen-analysis of the outer-product of the gradients. Under some technical conditions, the asymptotic theory is established for the proposed estimation method. Numerical studies including both simulation and empirical applications show its effectiveness in dimension reduction and improvement over MARS and other commonly-used nonparametric methods in regression estimation and prediction.

線性的 · 優化器 · binary · 通道 · 塊 ·

2023 年 7 月 4 日

On Optimal Finite-length Block Codes of Size Four for Binary Symmetric Channels

Yanyan Dong,Shenghao Yang

from arxiv, This is the full version of our papers at ISITA 2020 and ISIT 2023

A binary code of blocklength $n$ and codebook size $M$ is called an $(n,M)$ code, which is studied for memoryless binary symmetric channels (BSCs) with the maximum likelihood (ML) decoding. For any $n \geq 2$, some optimal codes among the linear $(n,4)$ codes have been explicitly characterized in the previous study, but whether the optimal codes among the linear codes are better than all the nonlinear codes or not is unknown. In this paper, we first show that for any $n\geq 2$, there exists an optimal code (among all the $(n,4)$ codes) that is either linear or in a subset of nonlinear codes, called Class-I codes. We identified all the optimal codes among the linear $(n,4)$ codes for each blocklength $n\geq 2$, and found ones that were not given in literature. For any $n$ from $2$ to $300$, all the optimal $(n,4)$ codes are identified, where except for $n=3$, all the optimal $(n,4)$ codes are equivalent to linear codes. There exist optimal $(3,4)$ codes that are not equivalent to linear codes. Furthermore, we derive a subset of nonlinear codes called Class-II codes and justify that for any $n >300$, the set composed of linear, Class-I and Class-II codes and their equivalent codes contains all the optimal $(n,4)$ codes. Both Class-I and Class-II codes are close to linear codes in the sense that they involve only one type of columns that are not included in linear codes. Our results are obtained using a new technique to compare the ML decoding performance of two codes, featured by a partition of the entire range of the channel output.

Performer · 情景 · 約束 · 拒絕采樣 · MoDELS ·

2023 年 7 月 4 日

On the Constrained Time-Series Generation Problem

Andrea Coletta,Sriram Gopalakrishan,Daniel Borrajo,Svitlana Vyetrenko

Synthetic time series are often used in practical applications to augment the historical time series dataset for better performance of machine learning algorithms, amplify the occurrence of rare events, and also create counterfactual scenarios described by the time series. Distributional-similarity (which we refer to as realism) as well as the satisfaction of certain numerical constraints are common requirements in counterfactual time series scenario generation requests. For instance, the US Federal Reserve publishes synthetic market stress scenarios given by the constrained time series for financial institutions to assess their performance in hypothetical recessions. Existing approaches for generating constrained time series usually penalize training loss to enforce constraints, and reject non-conforming samples. However, these approaches would require re-training if we change constraints, and rejection sampling can be computationally expensive, or impractical for complex constraints. In this paper, we propose a novel set of methods to tackle the constrained time series generation problem and provide efficient sampling while ensuring the realism of generated time series. In particular, we frame the problem using a constrained optimization framework and then we propose a set of generative methods including ``GuidedDiffTime'', a guided diffusion model to generate realistic time series. Empirically, we evaluate our work on several datasets for financial and energy data, where incorporating constraints is critical. We show that our approaches outperform existing work both qualitatively and quantitatively. Most importantly, we show that our ``GuidedDiffTime'' model is the only solution where re-training is not necessary for new constraints, resulting in a significant carbon footprint reduction.

CASE · 數值分析 ·

2023 年 7 月 3 日

Recovering coefficients in a system of semilinear Helmholtz equations from internal data

Kui Ren,Nathan Soedjak

We study an inverse problem for a coupled system of semilinear Helmholtz equations where we are interested in reconstructing multiple coefficients in the system from internal data measured in applications such as thermoacoustic imaging. We derive results on the uniqueness and stability of the inverse problem in the case of small boundary data based on the technique of first- and higher-order linearization. Numerical simulations are provided to illustrate the quality of reconstructions that can be expected from noisy data.

估計/估計量 · 隨機場 · 泛函 · 樣例 · Continuity ·

2023 年 7 月 1 日

Universal kernel-type estimation of random fields

Yu. Yu. Linke,I. S. Borisov,P. S. Ruzankin

from arxiv, To appear in Statistics

Consistent weighted least square estimators are proposed for a wide class of nonparametric regression models with random regression function, where this real-valued random function of $k$ arguments is assumed to be continuous with probability 1. We obtain explicit upper bounds for the rate of uniform convergence in probability of the new estimators to the unobservable random regression function for both fixed or random designs. In contrast to the predecessors' results, the bounds for the convergence are insensitive to the correlation structure of the $k$-variate design points. As an application, we study the problem of estimating the mean and covariance functions of random fields with additive noise under dense data conditions. The theoretical results of the study are illustrated by simulation examples which show that the new estimators are more accurate in some cases than the Nadaraya--Watson ones. An example of processing real data on earthquakes in Japan in 2012--2021 is included.

估計/估計量 · 自助法/自舉法 · MoDELS · Performer · 統計量 ·

2023 年 7 月 1 日

Bootstrapping the Cross-Validation Estimate

Bryan Cai,Fabio Pellegrini,Menglan Pang,Carl de Moor,Changyu Shen,Vivek Charu,Lu Tian

Cross-validation is a widely used technique for evaluating the performance of prediction models. It helps avoid the optimism bias in error estimates, which can be significant for models built using complex statistical learning algorithms. However, since the cross-validation estimate is a random value dependent on observed data, it is essential to accurately quantify the uncertainty associated with the estimate. This is especially important when comparing the performance of two models using cross-validation, as one must determine whether differences in error estimates are a result of chance fluctuations. Although various methods have been developed for making inferences on cross-validation estimates, they often have many limitations, such as stringent model assumptions This paper proposes a fast bootstrap method that quickly estimates the standard error of the cross-validation estimate and produces valid confidence intervals for a population parameter measuring average model performance. Our method overcomes the computational challenge inherent in bootstrapping the cross-validation estimate by estimating the variance component within a random effects model. It is just as flexible as the cross-validation procedure itself. To showcase the effectiveness of our approach, we employ comprehensive simulations and real data analysis across three diverse applications.