The reach of a set $M \subset \mathbb R^d$, also known as the condition number when $M$ is a manifold, was introduced by Federer in 1959 and is a central concept in geometric measure theory, set estimation, and manifold learning, among other areas. We introduce a universally consistent estimator of the reach, assuming only that the reach is positive. A necessary condition for the universal convergence of the reach estimator is that the Hausdorff distance between the sample and the set converges to zero; without further assumptions, we show that the convergence rate of this distance can be arbitrarily slow. Under a weak additional assumption, however, we provide rates of convergence for the reach estimator. We also show that it is not possible to determine, from a finite sample, whether the reach of the support of a density is zero or not. We provide a small simulation study and a bias correction method for the case where $M$ is a manifold.
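As a hedged illustration of the quantity being estimated, the sketch below evaluates a standard plug-in estimator based on Federer's pairwise characterization $\mathrm{reach}(M) = \inf_{p \neq q} \|q-p\|^2 / (2\, d(q-p, T_p M))$ on a noiseless sample from a circle of radius $r$, whose reach is exactly $r$. The tangent lines are taken as known for simplicity (in practice they would be estimated, e.g. by local PCA), and this is not necessarily the estimator proposed in the abstract.

```python
# Minimal sketch: plug-in reach estimate on a noiseless circle sample (true reach = r).
import numpy as np

rng = np.random.default_rng(0)
r, n = 2.0, 400
theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
X = r * np.column_stack([np.cos(theta), np.sin(theta)])    # sample on the circle
normals = np.column_stack([np.cos(theta), np.sin(theta)])  # unit normals (orthogonal to T_p)

# Plug-in estimate: min over p != q of ||q - p||^2 / (2 * dist(q - p, T_p)).
est = np.inf
for i in range(n):
    diff = X - X[i]                               # q - p for all q
    dist_sq = np.sum(diff ** 2, axis=1)
    ortho = np.abs(diff @ normals[i])             # distance from q - p to the tangent line at p
    mask = dist_sq > 1e-12                        # exclude q = p
    est = min(est, float(np.min(dist_sq[mask] / (2.0 * ortho[mask]))))

print(f"estimated reach ~ {est:.3f}, true reach = {r}")
```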
This paper introduces SPDE bridges with observation noise and analyzes their spatially semidiscrete approximations. The SPDEs are treated as mild solutions in an abstract Hilbert space framework suitable for parabolic equations; they are assumed to be linear with additive noise in the form of a cylindrical Wiener process. The observation noise is also cylindrical, and SPDE bridges are formulated via conditional distributions of Gaussian random variables in Hilbert spaces. A general framework for the spatial discretization of these bridge processes is introduced, and explicit convergence rates are derived for a spectral and a finite element based method. It is shown that, for sufficiently rough observation noise, the rates are essentially the same as those of the corresponding discretization of the original SPDE.
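As a rough, hedged illustration of the spectral discretization mentioned above, the sketch below simulates the unconditioned linear SPDE $\mathrm{d}u = \Delta u\,\mathrm{d}t + \mathrm{d}W$ on $(0,\pi)$ with Dirichlet boundary conditions and cylindrical Wiener noise, using $N$ spectral modes and the exact Ornstein-Uhlenbeck update for each mode. The bridge construction itself requires an additional Gaussian conditioning step on the noisy observation, which is not reproduced here; the equation, domain, and truncation level are assumptions of this sketch.

```python
# Minimal sketch: spectral Galerkin simulation of du = u_xx dt + dW (no bridge conditioning).
import numpy as np

rng = np.random.default_rng(1)
N = 64                          # spectral modes e_k(x) = sqrt(2/pi) * sin(k x)
k = np.arange(1, N + 1)
lam = k.astype(float) ** 2      # eigenvalues of the negative Dirichlet Laplacian on (0, pi)

T, steps = 1.0, 200
dt = T / steps
X = np.zeros(N)                 # spectral coefficients, zero initial condition

# Each mode is an independent Ornstein-Uhlenbeck process; use its exact one-step update.
decay = np.exp(-lam * dt)
noise_std = np.sqrt((1.0 - np.exp(-2.0 * lam * dt)) / (2.0 * lam))
for _ in range(steps):
    X = decay * X + noise_std * rng.standard_normal(N)

# Reconstruct the truncated field u(T, x) on a grid.
x = np.linspace(0.0, np.pi, 201)
u = np.sqrt(2.0 / np.pi) * np.sin(np.outer(x, k)) @ X
print("u(T, x) at a few grid points:", np.round(u[::50], 3))
```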
In finance, insurance, system reliability, and other fields, it is often of interest to measure the dependence among variables by modeling a multivariate distribution using a copula. Parametric copula models are easy to estimate but can be highly biased when the parametric assumptions are false, while empirical copulas are non-smooth and often not genuine copulas, making inference about dependence challenging in practice. As a compromise, the empirical Bernstein copula provides a smooth estimator, but the selection of its tuning parameters remains elusive. In this paper, using the so-called empirical checkerboard copula, we build a hierarchical empirical Bayes model that enables the estimation of a smooth copula function in arbitrary dimensions. The proposed estimator, based on multivariate Bernstein polynomials, is itself a genuine copula, and the selection of its dimension-varying degrees is data-dependent. We also show that the proposed copula estimator provides a more accurate estimate of several multivariate dependence measures, which can be obtained in closed form. We investigate the asymptotic and finite-sample performance of the proposed estimator and compare it with some nonparametric estimators through simulation studies. An application to portfolio risk management is presented, along with a quantification of estimation uncertainty.
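For orientation, the sketch below computes the classical bivariate empirical Bernstein copula with a fixed, user-chosen degree $m$ from pseudo-observations; the degree, sample size, and Gaussian example are assumptions of this sketch, and it does not reproduce the paper's hierarchical empirical Bayes estimator with data-dependent, dimension-varying degrees.

```python
# Minimal sketch: classical empirical Bernstein copula in two dimensions, fixed degree m.
import numpy as np
from scipy.stats import binom, rankdata

def empirical_copula(U, u, v):
    """Empirical copula C_n(u, v) from pseudo-observations U (n x 2)."""
    return np.mean((U[:, 0] <= u) & (U[:, 1] <= v))

def bernstein_copula(U, u, v, m=10):
    """Degree-m empirical Bernstein copula at (u, v)."""
    j = np.arange(m + 1)
    grid = np.array([[empirical_copula(U, a / m, b / m) for b in j] for a in j])
    wu = binom.pmf(j, m, u)          # Bernstein weights in u
    wv = binom.pmf(j, m, v)          # Bernstein weights in v
    return wu @ grid @ wv

# Pseudo-observations from a sample with positive dependence.
rng = np.random.default_rng(2)
Z = rng.multivariate_normal([0, 0], [[1, 0.7], [0.7, 1]], size=300)
U = np.column_stack([rankdata(Z[:, 0]), rankdata(Z[:, 1])]) / (Z.shape[0] + 1)

print("C_B(0.5, 0.5) ~", bernstein_copula(U, 0.5, 0.5))   # typically > 0.25 under positive dependence
```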
We define a bivariate copula that captures the scale-invariant extent of dependence of a single random variable $Y$ on a set of potential explanatory random variables $X_1, \dots, X_d$. The copula itself encodes whether $Y$ is completely dependent on $X_1, \dots, X_d$ and whether $Y$ and $X_1, \dots, X_d$ are independent. Evaluating this copula uniformly along the diagonal, i.e. calculating Spearman's footrule, leads to the so-called 'simple measure of conditional dependence' recently introduced by Azadkia and Chatterjee [1]. On the other hand, evaluating this copula uniformly over the unit square, i.e. calculating Spearman's rho, leads to a distribution-free coefficient of determination. Applying the techniques introduced in [1], we construct an estimator of this copula and show that it is strongly consistent. Since, for $d=1$, the copula under consideration coincides with the well-known Markov product of copulas, we also obtain, as a by-product, a strongly consistent estimator of the Markov product. A simulation study illustrates the small-sample performance of the proposed estimator.
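As a small illustration of the case of a single explanatory variable ($d=1$), where the Azadkia-Chatterjee measure reduces to Chatterjee's rank correlation (equivalently, the Dette-Siburg-Stoimenov measure), the sketch below computes that coefficient directly from ranks, assuming no ties; the copula-based estimator constructed in the paper is not reproduced here.

```python
# Minimal sketch: Chatterjee's rank correlation xi_n, the d = 1 special case (no ties assumed).
import numpy as np

def chatterjee_xi(x, y):
    """Chatterjee's rank correlation xi_n(X, Y), assuming no ties."""
    order = np.argsort(x)                       # sort pairs by X
    r = np.argsort(np.argsort(y[order])) + 1    # ranks of Y in that order
    n = len(x)
    return 1.0 - 3.0 * np.sum(np.abs(np.diff(r))) / (n ** 2 - 1)

rng = np.random.default_rng(3)
x = rng.uniform(-1, 1, 2000)
print("independent:", chatterjee_xi(x, rng.uniform(-1, 1, 2000)))  # close to 0
print("Y = X^2    :", chatterjee_xi(x, x ** 2))                    # close to 1 (complete dependence)
```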
A constraint satisfaction problem (CSP), $\textsf{Max-CSP}(\mathcal{F})$, is specified by a finite set of constraints $\mathcal{F} \subseteq \{[q]^k \to \{0,1\}\}$ for positive integers $q$ and $k$. An instance of the problem on $n$ variables is given by $m$ applications of constraints from $\mathcal{F}$ to subsequences of the $n$ variables, and the goal is to find an assignment to the variables that satisfies the maximum number of constraints. In the $(\gamma,\beta)$-approximation version of the problem for parameters $0 \leq \beta < \gamma \leq 1$, the goal is to distinguish instances where at least $\gamma$ fraction of the constraints can be satisfied from instances where at most $\beta$ fraction of the constraints can be satisfied. In this work we consider the approximability of this problem in the context of sketching algorithms and give a dichotomy result. Specifically, for every family $\mathcal{F}$ and every $\beta < \gamma$, we show that either a linear sketching algorithm solves the problem in polylogarithmic space, or the problem is not solvable by any sketching algorithm in $o(\sqrt{n})$ space.
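To make the $(\gamma,\beta)$-gap formulation concrete, here is a toy brute-force example (the instance, constraint, and thresholds are made up for illustration, and this has nothing to do with sketching algorithms): a single DICUT-type constraint $f(x,y) = \mathbf{1}[x = 1 \wedge y = 0]$ with $q = 2$, $k = 2$, applied to a handful of variable pairs.

```python
# Toy illustration of the Max-CSP(F) definition and the (gamma, beta)-gap decision task.
from itertools import product

def f(x, y):                       # the single constraint in F (DICUT-type)
    return int(x == 1 and y == 0)

constraints = [(0, 1), (1, 2), (2, 0), (0, 3), (3, 1)]   # m applications to variable pairs
n, m = 4, len(constraints)

# Brute force over all q^n assignments (only feasible for tiny toy instances).
best = max(sum(f(a[i], a[j]) for i, j in constraints) for a in product([0, 1], repeat=n))
value = best / m
gamma, beta = 0.8, 0.5
print(f"optimum fraction of satisfied constraints = {value:.2f}")
print("gap answer:", "at least gamma" if value >= gamma else
      ("at most beta" if value <= beta else "outside the promise"))
```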
Discrepancy measures between probability distributions, often termed statistical distances, are ubiquitous in probability theory, statistics, and machine learning. To combat the curse of dimensionality when estimating these distances from data, recent work has proposed smoothing out local irregularities in the measured distributions via convolution with a Gaussian kernel. Motivated by the scalability of this framework to high dimensions, we investigate the structural and statistical behavior of the Gaussian-smoothed $p$-Wasserstein distance $\mathsf{W}_p^{(\sigma)}$, for arbitrary $p\geq 1$. After establishing basic metric and topological properties of $\mathsf{W}_p^{(\sigma)}$, we explore the asymptotic statistical behavior of $\mathsf{W}_p^{(\sigma)}(\hat{\mu}_n,\mu)$, where $\hat{\mu}_n$ is the empirical distribution of $n$ independent observations from $\mu$. We prove that $\mathsf{W}_p^{(\sigma)}$ enjoys a parametric empirical convergence rate of $n^{-1/2}$, which contrasts with the $n^{-1/d}$ rate for unsmoothed $\mathsf{W}_p$ when $d \geq 3$. Our proof relies on controlling $\mathsf{W}_p^{(\sigma)}$ by a $p$th-order smooth Sobolev distance $\mathsf{d}_p^{(\sigma)}$ and deriving the limit distribution of $\sqrt{n}\,\mathsf{d}_p^{(\sigma)}(\hat{\mu}_n,\mu)$, for all dimensions $d$. As applications, we provide asymptotic guarantees for two-sample testing and minimum distance estimation using $\mathsf{W}_p^{(\sigma)}$, with experiments for $p=2$ using a maximum mean discrepancy formulation of $\mathsf{d}_2^{(\sigma)}$.
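As a one-dimensional numerical illustration of the definition (the dimension, $p=1$, and the Gaussian example are assumptions of this sketch; the paper works in general dimension $d$), recall that for $p=1$ in one dimension the Wasserstein distance is the $L^1$ distance between CDFs, and convolving the empirical measure with $\mathcal N(0,\sigma^2)$ simply averages Gaussian CDFs centered at the sample points.

```python
# Minimal sketch: Gaussian-smoothed W_1 between an empirical measure and mu = N(0,1) in 1D.
import numpy as np
from scipy.stats import norm

def smoothed_w1(sample, smoothed_true_cdf, sigma, grid):
    """W_1 between (empirical law of sample)*N(0, sigma^2) and mu*N(0, sigma^2), via CDFs on a grid."""
    emp_cdf = norm.cdf((grid[:, None] - sample[None, :]) / sigma).mean(axis=1)
    return np.sum(np.abs(emp_cdf - smoothed_true_cdf(grid))) * (grid[1] - grid[0])

rng = np.random.default_rng(4)
sigma = 0.5
grid = np.linspace(-8.0, 8.0, 2001)

def true_cdf(t):
    # mu = N(0,1), so the smoothed measure mu * N(0, sigma^2) is N(0, 1 + sigma^2).
    return norm.cdf(t / np.sqrt(1.0 + sigma ** 2))

for n in (100, 1000):
    vals = [smoothed_w1(rng.standard_normal(n), true_cdf, sigma, grid) for _ in range(20)]
    print(f"n = {n:5d}: average smoothed W1 ~ {np.mean(vals):.4f}")   # should shrink as n grows
```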
We study algorithms for approximating pairwise similarity matrices that arise in natural language processing. Generally, computing a similarity matrix for $n$ data points requires $\Omega(n^2)$ similarity computations. This quadratic scaling is a significant bottleneck, especially when similarities are computed via expensive functions, e.g., via transformer models. Approximation methods reduce this quadratic complexity, often by using a small subset of exactly computed similarities to approximate the remainder of the complete pairwise similarity matrix. Significant work focuses on the efficient approximation of positive semidefinite (PSD) similarity matrices, which arise, e.g., in kernel methods. However, much less is understood about indefinite (non-PSD) similarity matrices, which often arise in NLP. Motivated by the observation that many of these matrices are still somewhat close to PSD, we introduce a generalization of the popular Nystr\"{o}m method to the indefinite setting. Our algorithm can be applied to any similarity matrix and runs in sublinear time in the size of the matrix, producing a rank-$s$ approximation with just $O(ns)$ similarity computations. We show that our method, along with a simple variant of CUR decomposition, performs very well in approximating a variety of similarity matrices arising in NLP tasks. We demonstrate high accuracy of the approximated similarity matrices in the downstream tasks of document classification, sentence similarity, and cross-document coreference.
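For context, the sketch below implements the classical Nyström approximation from $s$ uniformly sampled landmark columns, which already exhibits the $O(ns)$ similarity-computation budget; the kernel, point set, and sampling scheme are assumptions of this sketch, and the paper's indefinite-aware generalization and its CUR variant are not reproduced.

```python
# Minimal sketch: classical Nystrom approximation from s landmark columns.
import numpy as np

def nystrom(similarity, n, s, rng):
    """Rank-<=s approximation that touches only s columns of the n x n similarity matrix."""
    idx = rng.choice(n, size=s, replace=False)
    C = np.array([[similarity(i, j) for j in idx] for i in range(n)])  # n x s block, O(n s) calls
    W = C[idx, :]                                                      # s x s landmark block
    return C @ np.linalg.pinv(W) @ C.T

rng = np.random.default_rng(5)
pts = rng.standard_normal((300, 2))
sim = lambda i, j: float(np.exp(-np.sum((pts[i] - pts[j]) ** 2) / 4.0))  # toy PSD similarity

# The full matrix is formed here only to measure the approximation error.
A = np.array([[sim(i, j) for j in range(300)] for i in range(300)])
A_hat = nystrom(sim, 300, 40, rng)
print("relative Frobenius error:", np.linalg.norm(A - A_hat) / np.linalg.norm(A))
```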
We study the problem of estimating the density $f(\boldsymbol x)$ of a random vector ${\boldsymbol X}$ in $\mathbb R^d$. For a spanning tree $T$ defined on the vertex set $\{1,\dots ,d\}$, the tree density $f_{T}$ is a product of bivariate conditional densities. An optimal spanning tree minimizes the Kullback-Leibler divergence between $f$ and $f_{T}$. From i.i.d. data we identify an optimal tree $T^*$ and efficiently construct a tree density estimate $f_n$ such that, without any regularity conditions on the density $f$, one has $\lim_{n\to \infty} \int |f_n(\boldsymbol x)-f_{T^*}(\boldsymbol x)|d\boldsymbol x=0$ a.s. For Lipschitz $f$ with bounded support, $\mathbb E \left\{ \int |f_n(\boldsymbol x)-f_{T^*}(\boldsymbol x)|d\boldsymbol x\right\}=O\big(n^{-1/4}\big)$, a dimension-free rate.
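As a rough illustration of the optimal spanning tree idea, the sketch below performs a Chow-Liu-style selection: estimate pairwise mutual information by two-dimensional histograms and take a maximum-weight spanning tree. The histogram estimator, bin count, and synthetic chain example are assumptions of this sketch, not the estimator analyzed in the abstract.

```python
# Minimal sketch: Chow-Liu-style tree selection from histogram mutual-information estimates.
import numpy as np

def mi_histogram(x, y, bins=15):
    """Plug-in mutual information estimate from a 2-D histogram."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz])))

def max_spanning_tree(weights):
    """Prim's algorithm for a maximum-weight spanning tree; returns an edge list."""
    d = weights.shape[0]
    in_tree, edges = {0}, []
    while len(in_tree) < d:
        best = max(((weights[i, j], i, j) for i in in_tree for j in range(d)
                    if j not in in_tree), key=lambda t: t[0])
        edges.append((best[1], best[2]))
        in_tree.add(best[2])
    return edges

# Synthetic data with a chain dependence 0 -> 1 -> 2 -> 3.
rng = np.random.default_rng(6)
x0 = rng.standard_normal(5000)
x1 = x0 + 0.5 * rng.standard_normal(5000)
x2 = x1 + 0.5 * rng.standard_normal(5000)
x3 = x2 + 0.5 * rng.standard_normal(5000)
X = np.column_stack([x0, x1, x2, x3])

d = X.shape[1]
W = np.zeros((d, d))
for i in range(d):
    for j in range(i + 1, d):
        W[i, j] = W[j, i] = mi_histogram(X[:, i], X[:, j])

print("selected tree edges:", max_spanning_tree(W))   # typically recovers the chain (0,1),(1,2),(2,3)
```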
We prove various approximation theorems with polynomials whose coefficients with respect to the Bernstein basis of a given order are all integers. In the extreme, we draw the coefficients from the set $\{ \pm 1\}$ only. We show that for any Lipschitz function $f:[0,1] \to [-1,1]$ and for any positive integer $n$, there are signs $\sigma_0,\dots,\sigma_n \in \{\pm 1\}$ such that $$\left |f(x) - \sum_{k=0}^n \sigma_k \, \binom{n}{k} x^k (1-x)^{n-k} \right | \leq \frac{C (1+|f|_{\mathrm{Lip}})}{1+\sqrt{nx(1-x)}} ~\mbox{ for all } x \in [0,1].$$ These polynomial approximations are not constrained by saturation of Bernstein polynomials, and we show that higher accuracy is indeed achievable for smooth functions: If $f$ has a Lipschitz $(s{-}1)$st derivative, then accuracy of order $O(n^{-s/2})$ is achievable with $\pm 1$ coefficients provided $\|f \|_\infty < 1$, and accuracy of order $O(n^{-s})$ is achievable with unrestricted integer coefficients. Our approximations are constructive in nature.
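One natural way to realize such $\pm 1$ approximations, shown in the hedged sketch below, is to choose the signs by first-order noise shaping (a Sigma-Delta-type quantization of the target Bernstein coefficients $f(k/n)$); the paper's constructive scheme and its error analysis are not reproduced here, and the test function and the evaluation interval away from the endpoints are assumptions of this sketch.

```python
# Minimal sketch: +-1 Bernstein approximation with signs chosen by first-order Sigma-Delta quantization.
import numpy as np
from scipy.stats import binom

def pm1_bernstein(f, n, xs):
    """Evaluate sum_k sigma_k * C(n,k) x^k (1-x)^(n-k) with Sigma-Delta signs sigma_k in {+-1}."""
    coeffs = f(np.arange(n + 1) / n)        # target coefficients f(k/n), assumed in [-1, 1]
    sigma, u = np.empty(n + 1), 0.0
    for k, c in enumerate(coeffs):          # first-order noise shaping keeps the running error bounded
        sigma[k] = 1.0 if c + u >= 0 else -1.0
        u += c - sigma[k]
    basis = binom.pmf(np.arange(n + 1)[None, :], n, xs[:, None])   # Bernstein basis values
    return basis @ sigma

f = lambda x: 0.8 * np.sin(2 * np.pi * x)   # Lipschitz, values in [-1, 1]
xs = np.linspace(0.1, 0.9, 401)             # away from the endpoints, where the pointwise bound degrades
for n in (50, 200, 800):
    err = np.max(np.abs(f(xs) - pm1_bernstein(f, n, xs)))
    print(f"n = {n:4d}: max error on [0.1, 0.9] = {err:.3f}")   # should decrease roughly like n^(-1/2)
```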
While Generative Adversarial Networks (GANs) have empirically produced impressive results on learning complex real-world distributions, recent work has shown that they suffer from a lack of diversity, or mode collapse. The theoretical work of Arora et al.~\cite{AroraGeLiMaZh17} suggests a dilemma about GANs' statistical properties: powerful discriminators cause overfitting, whereas weak discriminators cannot detect mode collapse. In contrast, we show in this paper that GANs can in principle learn distributions in Wasserstein distance (or KL divergence in many cases) with polynomial sample complexity, if the discriminator class has strong distinguishing power against the particular generator class (instead of against all possible generators). For various generator classes, such as mixtures of Gaussians, exponential families, and invertible neural network generators, we design corresponding discriminators (often neural nets of specific architectures) such that the Integral Probability Metric (IPM) induced by the discriminators can provably approximate the Wasserstein distance and/or KL divergence. This implies that if training is successful, then the learned distribution is close to the true distribution in Wasserstein distance or KL divergence, and thus cannot drop modes. Our preliminary experiments on synthetic datasets show that the test IPM is well correlated with KL divergence, indicating that the lack of diversity may be caused by suboptimal optimization rather than by statistical inefficiency.
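As a toy one-dimensional illustration of a discriminator-induced IPM (the distributions, discriminator class, and parameter grid are assumptions of this sketch, unrelated to the paper's generator-specific constructions), the code below compares the IPM over a small class of 1-Lipschitz ReLU discriminators with the empirical Wasserstein-1 distance, which it can never exceed.

```python
# Toy sketch: IPM over a restricted 1-Lipschitz discriminator class vs. empirical W_1 in 1D.
import numpy as np

rng = np.random.default_rng(7)
mu = rng.standard_normal(5000)            # "real" samples
nu = rng.standard_normal(5000) + 2.0      # "generated" samples with a shifted mean

# Exact empirical W_1 in one dimension: average distance between sorted samples.
w1 = np.mean(np.abs(np.sort(mu) - np.sort(nu)))

# IPM over discriminators f(x) = max(w x + b, 0) with |w| <= 1, maximized on a parameter grid.
ws = np.linspace(-1.0, 1.0, 41)
bs = np.linspace(-5.0, 5.0, 101)
ipm = max(abs(np.maximum(w * mu + b, 0).mean() - np.maximum(w * nu + b, 0).mean())
          for w in ws for b in bs)

print(f"empirical W1 = {w1:.3f}, restricted-class IPM = {ipm:.3f}")   # IPM <= W1 by 1-Lipschitzness
```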
We develop an approach to risk minimization and stochastic optimization that provides a convex surrogate for variance, allowing near-optimal and computationally efficient trading between approximation and estimation error. Our approach builds on techniques from distributionally robust optimization and Owen's empirical likelihood, and we provide a number of finite-sample and asymptotic results characterizing the theoretical performance of the estimator. In particular, we show that our procedure comes with certificates of optimality, achieving (in some scenarios) faster rates of convergence than empirical risk minimization by virtue of automatically balancing bias and variance. We give corroborating empirical evidence showing that, in practice, the estimator indeed trades between variance and absolute performance on a training sample, improving out-of-sample (test) performance over standard empirical risk minimization for a number of classification problems.
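As a minimal numerical sketch of the variance surrogate (the constraint normalization, radius $\rho$, and loss distribution are assumptions here, not necessarily those of the paper), the code below evaluates, for a fixed loss vector, the chi-square-ball distributionally robust objective and compares it with the closed-form value $\bar\ell_n + \sqrt{2\rho\,\widehat{\mathrm{Var}}_n/n}$, which coincides with it whenever the nonnegativity constraints are inactive.

```python
# Minimal sketch: chi-square-ball robust objective vs. its mean-plus-standard-deviation form.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(8)
n, rho = 50, 2.0
losses = rng.exponential(scale=1.0, size=n)

# Closed-form robust value (biased variance makes the identity exact when all weights stay >= 0).
closed_form = losses.mean() + np.sqrt(2.0 * rho * losses.var() / n)

# Direct numerical solution of   max_p  p @ losses   s.t.  p >= 0, sum(p) = 1, sum((n p - 1)^2) <= 2 rho.
res = minimize(
    lambda p: -(p @ losses),
    x0=np.full(n, 1.0 / n),
    method="SLSQP",
    bounds=[(0.0, None)] * n,
    constraints=[{"type": "eq", "fun": lambda p: p.sum() - 1.0},
                 {"type": "ineq", "fun": lambda p: 2.0 * rho - np.sum((n * p - 1.0) ** 2)}],
)
print(f"closed form = {closed_form:.4f}, numerical = {-res.fun:.4f}")   # should roughly agree
```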