We prove concentration inequalities and associated PAC bounds for continuous- and discrete-time additive functionals of possibly unbounded functions of multivariate, nonreversible diffusion processes. Our analysis relies on an approach via the Poisson equation, which allows us to treat a very broad class of subexponentially ergodic processes. These results add to existing concentration inequalities for additive functionals of diffusion processes, which have so far only been available either for bounded functions or for unbounded functions of processes from a significantly smaller class. We demonstrate the power of these exponential inequalities with two examples from very different areas. Considering a possibly high-dimensional parametric nonlinear drift model under sparsity constraints, we apply the continuous-time concentration results to validate the restricted eigenvalue condition for Lasso estimation, which is fundamental for the derivation of oracle inequalities. The results for discrete additive functionals are used to investigate the unadjusted Langevin MCMC algorithm for sampling from moderately heavy-tailed densities $\pi$. In particular, we provide PAC bounds for the Monte Carlo estimator of integrals $\pi(f)$ for polynomially growing functions $f$, quantifying sample and step sizes sufficient for approximation within a prescribed margin with high probability.
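As a concrete illustration of the sampling application, here is a minimal sketch of the unadjusted Langevin algorithm and the resulting Monte Carlo estimator of $\pi(f)$; the Gaussian target, step size, and test function below are placeholder choices, not the heavy-tailed examples treated in the paper.

```python
import numpy as np

def ula_estimate(grad_log_pi, f, x0, step, n_samples, rng=None):
    """Unadjusted Langevin algorithm: estimate pi(f) by averaging f along the chain."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    total = 0.0
    for _ in range(n_samples):
        noise = rng.standard_normal(x.shape)
        x = x + step * grad_log_pi(x) + np.sqrt(2.0 * step) * noise
        total += f(x)
    return total / n_samples

# Placeholder target: standard Gaussian in R^2, f(x) = ||x||^2, so pi(f) = 2.
grad_log_pi = lambda x: -x
f = lambda x: float(x @ x)
print(ula_estimate(grad_log_pi, f, x0=np.zeros(2), step=1e-2, n_samples=50_000))
```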
Stochastic gradient algorithms are widely used for both optimization and sampling in large-scale learning and inference problems. However, in practice, tuning these algorithms is typically done using heuristics and trial-and-error rather than rigorous, generalizable theory. To address this gap between theory and practice, we provide novel insights into the effect of tuning parameters by characterizing the large-sample behavior of iterates of a very general class of preconditioned stochastic gradient algorithms with fixed step size. In the optimization setting, our results show that iterate averaging with a large fixed step size can result in statistically efficient approximation of the (local) M-estimator. In the sampling context, our results show that with appropriate choices of tuning parameters, the limiting stationary covariance can match either the Bernstein--von Mises limit of the posterior, adjustments to the posterior for model misspecification, or the asymptotic distribution of the MLE; and that with naive tuning the limit corresponds to none of these. Moreover, we argue that an essentially independent sample from the stationary distribution can be obtained after a fixed number of passes over the dataset. We validate our asymptotic results in realistic finite-sample regimes via several experiments using simulated and real data. Overall, we demonstrate that properly tuned stochastic gradient algorithms with constant step size offer a computationally efficient and statistically robust approach to obtaining point estimates or posterior-like samples.
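The optimization-side construction can be sketched as plain (identity-preconditioned) SGD with a fixed step size and Polyak--Ruppert iterate averaging; the quadratic loss, data model, and step size below are illustrative assumptions, not the paper's general setting.

```python
import numpy as np

def sgd_constant_step(grad, theta0, step, n_iters, precond=None, rng=None):
    """Fixed step-size preconditioned SGD; returns last iterate and running average."""
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta0, dtype=float)
    d = theta.size
    P = np.eye(d) if precond is None else precond   # preconditioner (identity by default)
    avg = np.zeros(d)
    for t in range(1, n_iters + 1):
        theta = theta - step * P @ grad(theta, rng)
        avg += (theta - avg) / t                    # running iterate average
    return theta, avg

# Placeholder: least squares with noisy gradients, minimizer theta_star = (1, -2).
theta_star = np.array([1.0, -2.0])
def noisy_grad(theta, rng):
    x = rng.standard_normal(2)
    y = x @ theta_star + 0.5 * rng.standard_normal()
    return (x @ theta - y) * x

last, averaged = sgd_constant_step(noisy_grad, np.zeros(2), step=0.05, n_iters=100_000)
print(last, averaged)   # the averaged iterate is typically much closer to theta_star
```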
Mark-point dependence plays a critical role in research problems that can be fitted into the general framework of marked point processes. In this work, we focus on adjusting for mark-point dependence when estimating the mean and covariance functions of the mark process, given independent replicates of the marked point process. We assume that the mark process is a Gaussian process and the point process is a log-Gaussian Cox process, where the mark-point dependence is generated through the dependence between two latent Gaussian processes. Under this framework, naive local linear estimators ignoring the mark-point dependence can be severely biased. We show that this bias can be corrected using a local linear estimator of the cross-covariance function and establish uniform convergence rates of the bias-corrected estimators. Furthermore, we propose a test statistic based on local linear estimators for mark-point independence, which is shown to converge to a normal limiting distribution at the parametric $\sqrt{n}$ convergence rate. Model diagnostic tools are developed for key model assumptions, and a robust functional permutation test is proposed for a more general class of marked point processes. The effectiveness of the proposed methods is demonstrated using extensive simulations and applications to two real data examples.
One of the most fundamental problems in network analysis is community detection. The stochastic block model (SBM) is a popular model for which various estimation methods have been developed, along with their community detection consistency results. However, the SBM is restricted by the strong assumption that all nodes in the same community are stochastically equivalent, which may not be suitable for practical applications. We introduce a pairwise covariates-adjusted stochastic block model (PCABM), a generalization of the SBM that incorporates pairwise covariate information. We study the maximum likelihood estimates of the coefficients for the covariates as well as the community assignments. It is shown that both the coefficient estimates of the covariates and the community assignments are consistent under suitable sparsity conditions. Spectral clustering with adjustment (SCWA) is introduced to efficiently fit the PCABM. Under certain conditions, we derive the error bound of community detection under SCWA and show that it is consistent for community detection. In addition, model selection in terms of the number of communities and feature selection for the pairwise covariates are investigated, and two corresponding algorithms are proposed. PCABM compares favorably with the SBM or degree-corrected stochastic block model (DCBM) under a wide range of simulated and real networks when covariate information is available.
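The adjustment step of SCWA can be sketched as follows, assuming, purely for illustration, that covariate effects enter edge probabilities multiplicatively as $\exp(\beta^\top x_{ij})$, so the adjacency matrix is rescaled entrywise by the fitted covariate effect before standard spectral clustering; the function names and this exact form are our assumptions rather than the authors' construction.

```python
import numpy as np
from scipy.sparse.linalg import eigsh
from sklearn.cluster import KMeans

def scwa_sketch(A, X_pair, beta_hat, k):
    """Covariate-adjusted spectral clustering sketch.

    A        : (n, n) adjacency matrix
    X_pair   : (n, n, p) pairwise covariates x_ij
    beta_hat : (p,) estimated covariate coefficients
    k        : number of communities
    """
    adjust = np.exp(X_pair @ beta_hat)            # assumed multiplicative covariate effect
    A_tilde = A / adjust                          # entrywise adjustment of the adjacency
    vals, vecs = eigsh(A_tilde, k=k, which='LM')  # leading eigenvectors
    return KMeans(n_clusters=k, n_init=10).fit_predict(vecs)
```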
The Johnson-Lindenstrauss lemma guarantees that certain geometric structure is preserved when high-dimensional deterministic vectors are randomly projected to low-dimensional vectors. In this work, we seek to understand how random matrices affect the norms of random vectors. In particular, we prove that the distribution of the norm of a random vector $X \in \mathbb{R}^n$ with i.i.d. entries is preserved by a random projection $S:\mathbb{R}^n \to \mathbb{R}^m$. More precisely, \[ \frac{X^TS^TSX - mn}{\sqrt{\sigma^2 m^2n+2mn^2}} \xrightarrow[\quad m/n\to 0 \quad ]{ m,n\to \infty } \mathcal{N}(0,1). \] We also prove a concentration result for the random norm transformed by either a random projection or a random embedding. Overall, our results show that random matrices have low distortion for the norms of random vectors with i.i.d. entries.
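The displayed limit can be probed numerically. The sketch below assumes standard Gaussian entries for both $X$ and $S$ and reads $\sigma^2$ as the variance of the squared entries of $X$ (so $\sigma^2 = 2$ here); both are assumptions not fixed by the statement above.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, reps = 1000, 40, 1000      # m/n small, matching the limit regime
sigma2 = 2.0                     # assumed: Var(X_i^2) for standard Gaussian entries

zs = np.empty(reps)
for r in range(reps):
    X = rng.standard_normal(n)
    S = rng.standard_normal((m, n))
    sx = S @ X
    q = sx @ sx                  # X^T S^T S X
    zs[r] = (q - m * n) / np.sqrt(sigma2 * m**2 * n + 2 * m * n**2)

print(zs.mean(), zs.std())       # should be close to 0 and 1 if the normal limit holds
```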
We present a non-asymptotic lower bound on the eigenspectrum of the design matrix generated by any linear bandit algorithm with sub-linear regret when the action set has well-behaved curvature. Specifically, we show that the minimum eigenvalue of the expected design matrix grows as $\Omega(\sqrt{n})$ whenever the expected cumulative regret of the algorithm is $O(\sqrt{n})$, where $n$ is the learning horizon and the action space has a constant Hessian around the optimal arm. This shows that such action spaces force a polynomial lower bound rather than the logarithmic lower bound shown by \cite{lattimore2017end} for discrete (i.e., well-separated) action spaces. Furthermore, while the previous result is shown to hold only in the asymptotic regime (as $n \to \infty$), our result for these ``locally rich'' action spaces is anytime. Additionally, under a mild technical assumption, we obtain a similar lower bound on the minimum eigenvalue holding with high probability. We apply our result to two practical scenarios -- \emph{model selection} and \emph{clustering} in linear bandits. For model selection, we show that an epoch-based linear bandit algorithm adapts to the true model complexity at a rate exponential in the number of epochs, by virtue of our novel spectral bound. For clustering, we consider a multi-agent framework where we show, by leveraging the spectral result, that no forced exploration is necessary -- the agents can run a linear bandit algorithm and estimate their underlying parameters simultaneously, and hence incur a low regret.
Visualization and analysis of multivariate data and their uncertainty are top research challenges in data visualization. Constructing fiber surfaces is a popular technique for multivariate data visualization that generalizes the idea of level-set visualization for univariate data to multivariate data. In this paper, we present a statistical framework to quantify positional probabilities of fibers extracted from uncertain bivariate fields. Specifically, we extend the state-of-the-art Gaussian models of uncertainty for bivariate data to other parametric distributions (e.g., uniform and Epanechnikov) and more general nonparametric probability distributions (e.g., histograms and kernel density estimation) and derive corresponding spatial probabilities of fibers. In our proposed framework, we leverage Green's theorem for closed-form computation of fiber probabilities when bivariate data are assumed to have independent parametric and nonparametric noise. Additionally, we present a nonparametric approach combined with numerical integration to study the positional probability of fibers when bivariate data are assumed to have correlated noise. For uncertainty analysis, we visualize the derived probability volumes for fibers via volume rendering and by extracting level sets based on probability thresholds. We demonstrate the utility of our proposed techniques via experiments on synthetic and simulation datasets.
Variational inference has recently emerged as a popular alternative to classical Markov chain Monte Carlo (MCMC) in large-scale Bayesian inference. The core idea of variational inference is to trade statistical accuracy for computational efficiency: it approximates the posterior at reduced computational cost, potentially compromising statistical accuracy. In this work, we study this statistical and computational trade-off in variational inference via a case study in inferential model selection. Focusing on Gaussian inferential models (a.k.a. variational approximating families) with diagonal plus low-rank precision matrices, we initiate a theoretical study of the trade-offs in two aspects: Bayesian posterior inference error and frequentist uncertainty quantification error. From the Bayesian posterior inference perspective, we characterize the error of the variational posterior relative to the exact posterior. We prove that, given a fixed computation budget, a lower-rank inferential model produces variational posteriors with a higher statistical approximation error but a lower computational error; it reduces variance in stochastic optimization and, in turn, accelerates convergence. From the frequentist uncertainty quantification perspective, we consider the precision matrix of the variational posterior as an uncertainty estimate. We find that, relative to the true asymptotic precision, the variational approximation suffers from an additional statistical error originating from the sampling uncertainty of the data. Moreover, this statistical error becomes the dominant factor as the computation budget increases. As a consequence, for small datasets, the inferential model need not be full-rank to achieve optimal estimation error. We finally demonstrate these statistical and computational trade-offs across empirical studies, corroborating the theoretical findings.
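For concreteness, the sketch below parameterizes one member of such an inferential family -- a Gaussian with precision of the form $\mathrm{diag}(d) + BB^\top$ -- and draws samples from it; the dimensions and the direct inversion are illustrative only and do not reflect the paper's optimization procedure.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, rank = 10, 2

mu = rng.standard_normal(dim)                  # variational mean
d = np.exp(rng.standard_normal(dim))           # positive diagonal of the precision
B = rng.standard_normal((dim, rank))           # low-rank factor

precision = np.diag(d) + B @ B.T               # diagonal plus low-rank precision matrix
cov = np.linalg.inv(precision)                 # implied covariance of the variational posterior

samples = rng.multivariate_normal(mu, cov, size=1000)
print(samples.mean(axis=0))                    # ~ mu
```

In practice one would exploit the diagonal-plus-low-rank structure (e.g., via the Woodbury identity) rather than forming the dense inverse.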
The stability of solutions to optimal transport problems under variation of the measures is fundamental from a mathematical viewpoint: it is closely related to the convergence of numerical approaches for solving optimal transport problems and justifies many of the applications of optimal transport. In this article, we introduce the notion of strong $c$-concavity, and we show that it plays an important role in proving stability results in optimal transport for general cost functions $c$. We then introduce a differential criterion for proving that a function is strongly $c$-concave, under a hypothesis on the cost originally introduced by Ma-Trudinger-Wang for establishing the regularity of optimal transport maps. Finally, we provide two examples where this stability result can be applied, for cost functions taking the value $+\infty$ on the sphere: the reflector problem and the Gaussian curvature measure prescription problem.
The preferential attachment (PA) model is a popular way of modeling dynamic social networks, such as collaboration networks. Assuming that the PA function takes a parametric form, we propose and study the maximum likelihood estimator of the parameter. Using a supercritical continuous-time branching process framework, we prove the almost sure consistency and asymptotic normality of this estimator. We also provide an estimator that depends only on the final snapshot of the network and prove its consistency and asymptotic normality under general conditions. We compare the performance of the estimators to a nonparametric estimator in a small simulation study.
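As one concrete instance, suppose the PA function is affine, $f(k) = k + \delta$ (this parametric form is our illustrative assumption); the sketch below simulates such a tree and recovers $\delta$ by numerically maximizing the attachment log-likelihood.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def simulate_pa_tree(n_nodes, delta, rng):
    """Grow a PA tree where an existing node is chosen with probability prop. to degree + delta."""
    deg = np.zeros(n_nodes)
    deg[0] = deg[1] = 1                       # start from a single edge 0-1
    history = []                              # (number of existing nodes t, degree of chosen node)
    for t in range(2, n_nodes):
        w = deg[:t] + delta
        target = rng.choice(t, p=w / w.sum())
        history.append((t, deg[target]))
        deg[target] += 1
        deg[t] = 1
    return history

def neg_log_lik(delta, history):
    # In a tree on t nodes the degrees sum to 2(t-1), so the normalizer is 2(t-1) + t*delta.
    return -sum(np.log(d + delta) - np.log(2 * (t - 1) + t * delta) for t, d in history)

rng = np.random.default_rng(2)
hist = simulate_pa_tree(3000, delta=1.5, rng=rng)
res = minimize_scalar(lambda d: neg_log_lik(d, hist), bounds=(1e-3, 20.0), method='bounded')
print(res.x)   # MLE of delta, expected to be close to the true value 1.5
```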
Motivated by applications to COVID-19 dynamics, we describe a branching process in random environments model $\{Z_n\}$ whose path behavior changes when crossing upper and lower thresholds. This introduces a cyclical path behavior involving periods of increase and decrease leading to supercritical and subcritical regimes. Even though the process is not Markov, we identify subsequences at random time points $\{(\tau_j, \nu_j)\}$ -- specifically the values of the process at crossing times, viz., $\{(Z_{\tau_j}, Z_{\nu_j})\}$ -- along which the process retains the Markov structure. Under mild moment and regularity conditions, we establish that these subsequences possess a regenerative structure and prove that the limiting normal distributions of the growth rates of the process in the supercritical and subcritical regimes decouple. Building on this, we establish limit theorems concerning the length of the supercritical and subcritical regimes and the proportion of time the process spends in these regimes. As a byproduct of our analysis, we explicitly identify the limiting variances in terms of functionals of the offspring distribution, the threshold distribution, and the environmental sequences.
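A minimal simulation sketch of such a threshold-modulated process: Poisson offspring with a mean drawn afresh each generation from a supercritical or subcritical environment, switching regimes at upper and lower thresholds and recording the crossing times; all distributions and threshold values are placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
lower, upper = 50, 5_000            # lower and upper crossing thresholds
n_gen = 2_000

z = 100                             # initial population
supercritical = True                # current regime
crossings = []                      # (generation, regime entered) at crossing times

for n in range(n_gen):
    # Random environment: offspring mean drawn afresh each generation.
    mean = rng.uniform(1.05, 1.4) if supercritical else rng.uniform(0.6, 0.95)
    z = rng.poisson(mean * z)       # sum of z i.i.d. Poisson(mean) offspring counts
    if supercritical and z >= upper:
        supercritical = False
        crossings.append((n, 'subcritical'))
    elif not supercritical and z <= lower:
        supercritical = True
        crossings.append((n, 'supercritical'))
    if z == 0:
        break                       # extinction

print(crossings[:6])
```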