We study the classical problem of moment estimation of an underlying vector whose $n$ coordinates are implicitly defined through a series of updates in a data stream. We show that if the updates to the vector arrive in the random-order insertion-only model, then there exist space-efficient algorithms with improved dependencies on the approximation parameter $\varepsilon$. In particular, for any real $p > 2$, we first obtain an algorithm for $F_p$ moment estimation using $\tilde{\mathcal{O}}\left(\frac{1}{\varepsilon^{4/p}}\cdot n^{1-2/p}\right)$ bits of memory. Our techniques also give algorithms for $F_p$ moment estimation with $p>2$ on arbitrary-order insertion-only and turnstile streams, using $\tilde{\mathcal{O}}\left(\frac{1}{\varepsilon^{4/p}}\cdot n^{1-2/p}\right)$ bits of space and two passes, which is the first optimal multi-pass $F_p$ estimation algorithm up to $\log n$ factors. Finally, we give an improved lower bound of $\Omega\left(\frac{1}{\varepsilon^2}\cdot n^{1-2/p}\right)$ for one-pass insertion-only streams. Our results separate the complexity of this problem both between random-order and arbitrary-order streams, and between one-pass and multi-pass streams.
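For concreteness, the quantity being estimated can be pinned down with a short sketch. This is only an illustration of the exact, linear-space computation of $F_p$ from an insertion-only stream, not the sublinear-space streaming algorithm the abstract describes:

```python
import numpy as np

def exact_fp(stream, n, p):
    """Exact F_p = sum_i |x_i|^p of the frequency vector defined by a stream
    of coordinate insertions. This uses Theta(n) space; the point of the
    paper is to approximate F_p in roughly n^{1-2/p} space for p > 2."""
    x = np.zeros(n)
    for i in stream:        # insertion-only: each update increments one coordinate
        x[i] += 1
    return np.sum(np.abs(x) ** p)

# Example: a stream of 1000 updates over n = 8 coordinates.
rng = np.random.default_rng(0)
stream = rng.integers(0, 8, size=1000)
print(exact_fp(stream, n=8, p=3.0))
```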
This paper studies the estimation of the conditional density $f(x, \cdot)$ of $Y_i$ given $X_i = x$, from the observation of an i.i.d. sample $(X_i, Y_i) \in \mathbb{R}^d \times \mathbb{R}$, $i = 1, \ldots, n$. We assume that $f$ depends only on $r$ unknown components, with typically $r \ll d$. We provide an adaptive, fully nonparametric strategy based on kernel rules to estimate $f$. To select the bandwidth of our kernel rule, we propose a new fast iterative algorithm inspired by the Rodeo algorithm (Wasserman and Lafferty (2006)) to detect the sparsity structure of $f$. More precisely, in the minimax setting, our pointwise estimator, which is adaptive to both the regularity and the sparsity, achieves the quasi-optimal rate of convergence. Its computational complexity is only $O(dn \log n)$.
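As a point of reference, a fixed-bandwidth kernel rule of the kind the paper starts from can be sketched as follows. The bandwidths h (covariates) and b (response) are hypothetical fixed values here; the paper's contribution is to select them adaptively with its Rodeo-style iteration so as to exploit the sparsity structure:

```python
import numpy as np

def cond_density(x, y, X, Y, h, b):
    """Nadaraya-Watson-type kernel estimate of the conditional density of Y
    given X = x, with a product Gaussian kernel over the d covariates.
    Fixed bandwidths h and b are an illustrative simplification."""
    Kx = np.exp(-0.5 * np.sum(((X - x) / h) ** 2, axis=1))   # covariate weights
    Ky = np.exp(-0.5 * ((Y - y) / b) ** 2) / (b * np.sqrt(2 * np.pi))
    return np.sum(Kx * Ky) / max(np.sum(Kx), 1e-12)

# toy example: Y | X ~ N(X_0, 1); d = 3 but only one coordinate matters (r = 1)
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
Y = X[:, 0] + rng.normal(size=500)
print(cond_density(np.zeros(3), 0.0, X, Y, h=0.5, b=0.5))   # compare to N(0,1) at 0
```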
The weighted nonlinear least-squares problem for low-rank signal estimation is considered. The problem of constructing a numerical solution that is stable and fast for long time series is addressed. A modified weighted Gauss-Newton method, which can be implemented through direct variable projection onto a space of low-rank signals, is proposed. For a weight matrix which provides the maximum likelihood estimator of the signal in the presence of autoregressive noise of order $p$, the computational cost of an iteration is $O(N r^2 + N p^2 + r N \log N)$ as $N$ tends to infinity, where $N$ is the time-series length and $r$ is the rank of the approximating time series. Moreover, the proposed method can be applied to data with missing values without increasing the computational cost. The method is compared with state-of-the-art methods based on the variable projection approach in terms of floating-point numerical stability and computational cost.
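A minimal sketch of the unweighted building block may help: projection onto rank-$r$ series via a truncated SVD of the trajectory (Hankel) matrix, followed by anti-diagonal averaging. The window length L is a standard, hypothetical choice, and this is not the paper's weighted Gauss-Newton iteration itself:

```python
import numpy as np
from scipy.linalg import hankel

def lowrank_signal(series, r, L=None):
    """Project a time series onto the set of rank-r series: rank-r truncated
    SVD of the L x (N-L+1) trajectory matrix, then hankelization (averaging
    over anti-diagonals). The paper iterates *weighted* Gauss-Newton steps
    around such projections, with FFT-based structure giving the
    O(N r^2 + N p^2 + r N log N) per-iteration cost."""
    N = len(series)
    L = L or N // 2
    H = hankel(series[:L], series[L - 1:])        # H[i, j] = series[i + j]
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    Hr = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]    # rank-r approximation
    out = np.zeros(N)
    counts = np.zeros(N)
    for i in range(L):                            # average the anti-diagonals
        out[i:i + N - L + 1] += Hr[i]
        counts[i:i + N - L + 1] += 1
    return out / counts

t = np.arange(200)
noisy = np.sin(0.2 * t) + 0.3 * np.random.default_rng(2).normal(size=200)
print(np.round(lowrank_signal(noisy, r=2)[:5], 3))
```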
Estimators for causal quantities sometimes suffer from outliers. We investigate outlier-resistant estimation of the average treatment effect (ATE) under challenging but realistic settings: we assume that the ratio of outliers is not necessarily small and that it can depend on covariates. We propose three types of estimators for the ATE, which combine the well-known inverse probability weighting (IPW) and doubly robust (DR) estimators with the density-power weight. Under heterogeneous contamination, our methods reduce the bias caused by outliers; in particular, under homogeneous contamination, our estimators are approximately consistent for the true ATE. An influence-function-based analysis indicates that the adverse effect of outliers is negligible if the ratio of outliers is small, even under heterogeneous contamination. We also derive the asymptotic properties of our estimators, and we evaluate their performance through Monte Carlo simulations and real data analysis. The comparative methods, which estimate the median of the potential outcomes, do not have enough outlier resistance, and our methods outperform them in the experiments.
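A rough sketch of the idea behind density-power weighting, grafted onto the plain IPW estimator, is below. The pooled KDE weight and the exponent gamma are illustrative simplifications introduced here, not the authors' proposed estimators:

```python
import numpy as np
from scipy.stats import gaussian_kde

def ipw_dpw(Y, T, e, gamma=0.5):
    """IPW estimate of the ATE with a density-power weight f_hat(Y)^gamma
    that downweights outlying outcomes. e = e(X) are propensity scores and
    f_hat is a crude pooled KDE of the outcomes; treat this as an
    illustrative simplification of the density-power idea."""
    w = gaussian_kde(Y)(Y) ** gamma      # density-power weight, small on outliers
    w1 = w * T / e
    w0 = w * (1 - T) / (1 - e)
    return np.sum(w1 * Y) / np.sum(w1) - np.sum(w0 * Y) / np.sum(w0)

rng = np.random.default_rng(3)
X = rng.normal(size=2000)
e = 1 / (1 + np.exp(-X))                       # propensity scores
T = (rng.uniform(size=2000) < e).astype(float)
Y = T + X + rng.normal(size=2000)
Y[rng.uniform(size=2000) < 0.05] += 50         # 5% gross outliers
print(ipw_dpw(Y, T, e))                        # should be near the true ATE = 1
```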
We study the problem of recovering an unknown signal $\boldsymbol x$ given measurements obtained from a generalized linear model with a Gaussian sensing matrix. Two popular solutions are based on a linear estimator $\hat{\boldsymbol x}^{\rm L}$ and a spectral estimator $\hat{\boldsymbol x}^{\rm s}$. The former is a data-dependent linear combination of the columns of the measurement matrix, and its analysis is quite simple. The latter is the principal eigenvector of a data-dependent matrix, and a recent line of work has studied its performance. In this paper, we show how to optimally combine $\hat{\boldsymbol x}^{\rm L}$ and $\hat{\boldsymbol x}^{\rm s}$. At the heart of our analysis is the exact characterization of the joint empirical distribution of $(\boldsymbol x, \hat{\boldsymbol x}^{\rm L}, \hat{\boldsymbol x}^{\rm s})$ in the high-dimensional limit. This allows us to compute the Bayes-optimal combination of $\hat{\boldsymbol x}^{\rm L}$ and $\hat{\boldsymbol x}^{\rm s}$, given the limiting distribution of the signal $\boldsymbol x$. When the distribution of the signal is Gaussian, the Bayes-optimal combination has the form $\theta\hat{\boldsymbol x}^{\rm L}+\hat{\boldsymbol x}^{\rm s}$, and we derive the optimal combination coefficient $\theta$. In order to establish the limiting distribution of $(\boldsymbol x, \hat{\boldsymbol x}^{\rm L}, \hat{\boldsymbol x}^{\rm s})$, we design and analyze an Approximate Message Passing (AMP) algorithm whose iterates give $\hat{\boldsymbol x}^{\rm L}$ and approach $\hat{\boldsymbol x}^{\rm s}$. Numerical simulations demonstrate the improvement of the proposed combination over the two estimators considered separately.
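A self-contained sketch of the two estimators and their combination, under an assumed ReLU link and identity preprocessing $T(y) = y$ for the spectral matrix, is given below. The paper derives $\theta$ analytically from the limiting joint distribution; this toy tunes it by grid search against the ground truth, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
d, n = 200, 4000
x = rng.normal(size=d); x /= np.linalg.norm(x)
A = rng.normal(size=(n, d))                           # Gaussian sensing matrix
y = np.maximum(A @ x, 0) + 0.3 * rng.normal(size=n)   # GLM: noisy ReLU link

x_lin = A.T @ y / n                                   # the linear estimator
x_lin /= np.linalg.norm(x_lin)
D = (A * y[:, None]).T @ A / n                        # spectral matrix, T(y) = y
x_spec = np.linalg.eigh(D)[1][:, -1]                  # principal eigenvector
x_spec *= np.sign(x_spec @ x_lin)                     # resolve the global sign

overlap = lambda u: abs(u @ x) / np.linalg.norm(u)    # |cosine| with the signal
# grid search over theta in the combination theta * x_lin + x_spec
theta = max(np.linspace(0, 5, 101), key=lambda t: overlap(t * x_lin + x_spec))
print(overlap(x_lin), overlap(x_spec), overlap(theta * x_lin + x_spec))
```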
This paper establishes the optimal approximation error characterization of deep ReLU networks for smooth functions in terms of both width and depth simultaneously. To that end, we first prove that multivariate polynomials can be approximated by deep ReLU networks of width $\mathcal{O}(N)$ and depth $\mathcal{O}(L)$ with an approximation error $\mathcal{O}(N^{-L})$. Through local Taylor expansions and their deep ReLU network approximations, we show that deep ReLU networks of width $\mathcal{O}(N\ln N)$ and depth $\mathcal{O}(L\ln L)$ can approximate $f\in C^s([0,1]^d)$ with a nearly optimal approximation error $\mathcal{O}(\|f\|_{C^s([0,1]^d)}N^{-2s/d}L^{-2s/d})$. Our estimate is non-asymptotic in the sense that it is valid for arbitrary width and depth specified by $N\in\mathbb{N}^+$ and $L\in\mathbb{N}^+$, respectively.
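The first step, approximating $x^2$ (and hence, via polarization, products and polynomials) by deep ReLU networks, can be illustrated with Yarotsky's classical sawtooth construction. This is a standard sketch of the building block, not the paper's full width-depth trade-off construction:

```python
import numpy as np

relu = lambda z: np.maximum(z, 0)

def hat(x):
    # the "tooth" function on [0,1], exactly a width-3 ReLU layer
    return 2 * relu(x) - 4 * relu(x - 0.5) + 2 * relu(x - 1.0)

def relu_square(x, m=6):
    """Yarotsky's construction: a depth-O(m) ReLU network computing x^2 on
    [0,1] with uniform error 2^{-2m-2}, via x^2 = x - sum_s g_s(x) / 4^s,
    where g_s is the s-fold composition of the hat function."""
    g, out = x, x.copy()
    for s in range(1, m + 1):
        g = hat(g)
        out -= g / 4.0 ** s
    return out

x = np.linspace(0, 1, 1001)
print(np.max(np.abs(relu_square(x) - x ** 2)))   # about 2^-14 for m = 6
```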
In this paper, we study the estimation of the derivative of a regression function in a standard univariate regression model. The estimators are defined either by differentiating nonparametric least-squares estimators of the regression function or by estimating the projection of the derivative directly. We prove two simple risk bounds that allow us to compare our estimators. More elaborate bounds under a stability assumption are then provided. The bases and spaces on which we illustrate our assumptions and first results are of both compact and non-compact type, and we discuss the rates reached by our estimators; these turn out to be optimal in the compact case. Lastly, we propose a model selection procedure and prove the associated risk bound. Considering bases with non-compact support makes the problem difficult.
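A minimal sketch of the first strategy, differentiating a nonparametric least-squares fit, might look as follows; the Legendre basis on $[-1,1]$ and the fixed dimension m are hypothetical choices for illustration:

```python
import numpy as np
from numpy.polynomial import legendre

def deriv_estimate(X, Y, m, x):
    """Least-squares projection estimator of f' in the regression model
    Y = f(X) + noise on [-1, 1]: fit a degree-m Legendre expansion of f by
    least squares, then differentiate the fitted expansion. The paper's
    alternative strategy estimates the projection of f' directly instead."""
    coef = legendre.legfit(X, Y, m)     # least-squares coefficients of f
    dcoef = legendre.legder(coef)       # coefficients of the derivative
    return legendre.legval(x, dcoef)

rng = np.random.default_rng(5)
X = rng.uniform(-1, 1, 400)
Y = np.sin(2 * X) + 0.1 * rng.normal(size=400)
x = np.linspace(-0.8, 0.8, 5)
print(np.round(deriv_estimate(X, Y, m=8, x=x), 2))   # compare to 2*cos(2x)
```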
The problem of solving linear systems is one of the most fundamental problems in computer science, where given a satisfiable linear system $(A,b)$, for $A \in \mathbb{R}^{n \times n}$ and $b \in \mathbb{R}^n$, we wish to find a vector $x \in \mathbb{R}^n$ such that $Ax = b$. The current best algorithms for solving dense linear systems reduce the problem to matrix multiplication, and run in time $O(n^{\omega})$. We consider the problem of finding $\varepsilon$-approximate solutions to linear systems with respect to the $L_2$-norm, that is, given a satisfiable linear system $(A \in \mathbb{R}^{n \times n}, b \in \mathbb{R}^n)$, find an $x \in \mathbb{R}^n$ such that $\|Ax - b\|_2 \leq \varepsilon\|b\|_2$. Our main result is a fine-grained reduction from computing the rank of a matrix to finding $\varepsilon$-approximate solutions to linear systems. In particular, if the best known $O(n^\omega)$ time algorithm for computing the rank of $n \times O(n)$ matrices is optimal (which we conjecture is true), then finding an $\varepsilon$-approximate solution to a dense linear system also requires $\tilde{\Omega}(n^{\omega})$ time, even for $\varepsilon$ as large as $(1 - 1/\text{poly}(n))$. We also prove (under some modified conjectures for the rank-finding problem) optimal hardness of approximation for sparse linear systems, linear systems over positive semidefinite matrices, well-conditioned linear systems, and approximately solving linear systems with respect to the $L_p$-norm, for $p \geq 1$. At the heart of our results is a novel reduction from the rank problem to a decision version of the approximate linear systems problem. This reduction preserves properties such as matrix sparsity and bit complexity.
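The approximation notion at stake is easy to state in code. The snippet below is only a definition check paired with an off-the-shelf solver, not the fine-grained reduction itself:

```python
import numpy as np

def is_eps_approx(A, b, x, eps):
    """The paper's approximation notion: x is an eps-approximate solution
    of the system (A, b) iff ||Ax - b||_2 <= eps * ||b||_2."""
    return np.linalg.norm(A @ x - b) <= eps * np.linalg.norm(b)

rng = np.random.default_rng(6)
A = rng.normal(size=(50, 50))
b = A @ rng.normal(size=50)                  # satisfiable by construction
x = np.linalg.lstsq(A, b, rcond=None)[0]     # an (essentially exact) solver
print(is_eps_approx(A, b, x, eps=1e-6))
```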
We consider the approximability of constraint satisfaction problems in the streaming setting. For every constraint satisfaction problem (CSP) on $n$ variables taking values in $\{0,\ldots,q-1\}$, we prove that improving over the trivial approximability by a factor of $q$ requires $\Omega(n)$ space even on instances with $O(n)$ constraints. We also identify a broad subclass of problems for which any improvement over the trivial approximability requires $\Omega(n)$ space. The key technical core is an optimal $q^{-(k-1)}$-inapproximability result for the case where every constraint is given by a system of $k-1$ linear equations $\bmod\; q$ over $k$ variables. Prior to our work, no such hardness was known for an approximation factor less than $1/2$ for any CSP. Our work builds on and extends the work of Kapralov and Krachun (Proc. STOC 2019), who showed a linear lower bound on any non-trivial approximation of the max cut in graphs. This corresponds roughly to the case of Max $k$-LIN-$\bmod\; q$ with $k=q=2$. Each of these extensions poses non-trivial technical challenges that we overcome in this work.
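The "trivial approximability" baseline is worth making concrete: for Max-CUT ($k = q = 2$), a uniformly random assignment satisfies each constraint with probability $1/q$, so a streaming algorithm can output $m/q$ while storing essentially nothing beyond a constraint count. A small Monte-Carlo illustration on a hypothetical 10-cycle instance:

```python
import numpy as np

def random_assignment_value(edges, n, q=2, trials=2000, rng=None):
    """Average fraction of Max-CUT constraints satisfied by a uniformly
    random q-ary assignment; it concentrates near 1/q, the trivial
    approximation the paper proves is hard to beat in o(n) space."""
    rng = rng or np.random.default_rng(7)
    vals = []
    for _ in range(trials):
        sigma = rng.integers(0, q, size=n)
        vals.append(sum(sigma[u] != sigma[v] for u, v in edges))
    return np.mean(vals) / len(edges)

edges = [(i, (i + 1) % 10) for i in range(10)]   # a 10-cycle
print(random_assignment_value(edges, n=10))      # about 0.5 = 1/q
```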
Bernstein estimators are well known to avoid the boundary bias problem of traditional kernel estimators. The theoretical properties of these estimators have been studied extensively on compact intervals and hypercubes, but never on the simplex, except for the mean squared error of the density estimator in Tenbusch (1994) when $d = 2$. The simplex is an important case as it is the natural domain of compositional data. In this paper, we make an effort to prove several asymptotic results (bias, variance, mean squared error (MSE), mean integrated squared error (MISE), asymptotic normality, uniform strong consistency) for Bernstein estimators of cumulative distribution functions and density functions on the $d$-dimensional simplex. Our results generalize the ones in Leblanc (2012) and Babu et al. (2002), who treated the case $d = 1$, and significantly extend those found in Tenbusch (1994). In particular, our rates of convergence for the MSE and MISE are optimal.
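For orientation, the classical $d = 1$ Bernstein CDF estimator that the paper generalizes can be sketched in a few lines; the polynomial degree m is a hypothetical fixed choice here:

```python
import numpy as np
from scipy.stats import binom

def bernstein_cdf(x, data, m):
    """Bernstein estimator of a CDF on [0, 1], the classical d = 1 case of
    Babu et al. (2002) / Leblanc (2012): smooth the empirical CDF F_n on
    the grid k/m with Binomial(m, x) weights, i.e.
    F_hat(x) = sum_k F_n(k/m) C(m,k) x^k (1-x)^(m-k)."""
    Fn = np.array([np.mean(data <= k / m) for k in range(m + 1)])
    weights = binom.pmf(np.arange(m + 1), m, x)
    return np.dot(Fn, weights)

rng = np.random.default_rng(8)
data = rng.beta(2, 5, size=500)
print(bernstein_cdf(0.3, data, m=40))   # compare to the Beta(2, 5) CDF at 0.3
```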
We show that for the problem of testing if a matrix $A \in F^{n \times n}$ has rank at most $d$, or requires changing an $\epsilon$-fraction of entries to have rank at most $d$, there is a non-adaptive query algorithm making $\widetilde{O}(d^2/\epsilon)$ queries. Our algorithm works for any field $F$. This improves upon the previous $O(d^2/\epsilon^2)$ bound (SODA'03), and bypasses an $\Omega(d^2/\epsilon^2)$ lower bound of (KDD'14) which holds if the algorithm is required to read a submatrix. Our algorithm is the first such algorithm which does not read a submatrix, and instead reads a carefully selected non-adaptive pattern of entries in rows and columns of $A$. We complement our algorithm with a matching query complexity lower bound for non-adaptive testers over any field. We also give tight bounds of $\widetilde{\Theta}(d^2)$ queries in the sensing model, for which query access comes in the form of $\langle X_i, A\rangle := \mathrm{tr}(X_i^\top A)$; perhaps surprisingly, these bounds do not depend on $\epsilon$. We next develop a novel property testing framework for testing numerical properties of a real-valued matrix $A$ more generally, which includes the stable rank, Schatten-$p$ norms, and SVD entropy. Specifically, we propose a bounded entry model, where $A$ is required to have entries bounded by $1$ in absolute value. We give upper and lower bounds for a wide range of problems in this model, and discuss connections to the sensing model above.
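Of the numerical properties covered by the framework, the stable rank is the easiest to make concrete. The following is a plain, non-query-efficient computation under the bounded-entry normalization, not the testing algorithm itself:

```python
import numpy as np

def stable_rank(A):
    """Stable rank ||A||_F^2 / ||A||_2^2, one of the numerical properties
    tested in the bounded-entry model (entries of A bounded by 1 in absolute
    value). Unlike the exact rank, it is robust to small entry-wise
    perturbations, which makes a property-testing treatment natural."""
    s = np.linalg.svd(A, compute_uv=False)
    return np.sum(s ** 2) / s[0] ** 2

rng = np.random.default_rng(9)
A = rng.uniform(-1, 1, size=(100, 5)) @ rng.uniform(-1, 1, size=(5, 100))
A /= np.max(np.abs(A))                  # bounded-entry normalization
print(stable_rank(A))                   # at most the exact rank, here 5
```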