
$\ell_1$-penalized quantile regression is widely used for analyzing high-dimensional data with heterogeneity. It is now recognized that the $\ell_1$-penalty introduces non-negligible estimation bias, while a proper use of concave regularization may lead to estimators with refined convergence rates and oracle properties as the signal strengthens. Although folded concave penalized $M$-estimation with strongly convex loss functions has been well studied, the extant literature on quantile regression remains relatively silent. The main difficulty is that the quantile loss is piecewise linear: it is non-smooth and has curvature concentrated at a single point. To overcome the lack of smoothness and strong convexity, we propose and study a convolution-type smoothed quantile regression with iteratively reweighted $\ell_1$-regularization. The resulting smoothed empirical loss is twice continuously differentiable and (provably) locally strongly convex with high probability. We show that the iteratively reweighted $\ell_1$-penalized smoothed quantile regression estimator, after a few iterations, achieves the optimal rate of convergence, and moreover, the oracle rate and the strong oracle property under an almost necessary and sufficient minimum signal strength condition. Extensive numerical studies corroborate our theoretical results.
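To illustrate the convolution-smoothing idea, here is a minimal NumPy sketch of the smoothed check loss with a uniform kernel; the closed form below is specific to that kernel choice, and the paper treats general kernels and the full penalized estimator.

```python
import numpy as np

def check_loss(u, tau):
    """Standard quantile (check) loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

def smoothed_check_loss(u, tau, h):
    """Convolution of rho_tau with the uniform kernel on [-h, h]:
    ell_h(u) = (1/2h) * int_{u-h}^{u+h} rho_tau(v) dv.
    The result is C^1, has constant curvature 1/(2h) on (-h, h),
    and coincides with rho_tau outside [-h, h]."""
    u = np.asarray(u, dtype=float)
    inner = (tau * (u + h) ** 2 + (1 - tau) * (u - h) ** 2) / (4 * h)
    return np.where(np.abs(u) <= h, inner, check_loss(u, tau))
```

Outside the bandwidth the smoothed loss agrees with the check loss, while inside it gains the curvature that the piecewise-linear loss lacks, which is what restores local strong convexity.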

Related content

We consider the problem of online linear regression in the stochastic setting. We derive high probability regret bounds for online ridge regression and the forward algorithm. This enables us to compare online regression algorithms more accurately and eliminate assumptions of bounded observations and predictions. Our study advocates for the use of the forward algorithm in lieu of ridge due to its enhanced bounds and robustness to the regularization parameter. Moreover, we explain how to integrate it in algorithms involving linear function approximation to remove a boundedness assumption without deteriorating theoretical bounds. Lastly, we provide numerical experiments to illustrate our results and endorse our intuitions.
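A compact sketch of the two prediction rules being compared (a standard formulation of online ridge and of the forward, i.e. Vovk-Azoury-Warmuth, algorithm; parameter names are ours):

```python
import numpy as np

def online_regression(X, y, lam=1.0, forward=True):
    """Online ridge vs. the forward algorithm.

    At round t, ridge predicts with A_t = lam*I + sum_{s<t} x_s x_s^T,
    while the forward algorithm also adds the current outer product
    x_t x_t^T to A_t before predicting, which is the source of its
    improved bounds and robustness to lam."""
    n, d = X.shape
    A = lam * np.eye(d)
    b = np.zeros(d)
    preds = np.zeros(n)
    for t in range(n):
        x = X[t]
        if forward:
            preds[t] = x @ np.linalg.solve(A + np.outer(x, x), b)
        else:
            preds[t] = x @ np.linalg.solve(A, b)
        A += np.outer(x, x)  # both update with the observed pair
        b += y[t] * x
    return preds
```

On noiseless linear data both rules converge to accurate predictions; the regret comparison in the text concerns their cumulative squared losses.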

Momentum methods have been shown to accelerate the convergence of the standard gradient descent algorithm in practice and theory. In particular, the minibatch-based gradient descent methods with momentum (MGDM) are widely used to solve large-scale optimization problems with massive datasets. Despite the success of the MGDM methods in practice, their theoretical properties are still underexplored. To this end, we investigate the theoretical properties of MGDM methods based on linear regression models. We first study the numerical convergence properties of the MGDM algorithm and further provide the theoretically optimal tuning parameter specification to achieve a faster convergence rate. In addition, we explore the relationship between the statistical properties of the resulting MGDM estimator and the tuning parameters. Based on these theoretical findings, we give the conditions for the resulting estimator to achieve the optimal statistical efficiency. Finally, extensive numerical experiments are conducted to verify our theoretical results.
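A minimal NumPy sketch of the MGDM iteration on least squares (heavy-ball form; the learning rate and momentum values here are ours, and choosing them well is exactly what the paper's tuning analysis addresses):

```python
import numpy as np

def mgdm_linear_regression(X, y, batch_size=32, lr=0.05, momentum=0.9,
                           n_epochs=200, seed=0):
    """Minibatch gradient descent with heavy-ball momentum that
    minimizes (1/2n) * ||y - X beta||^2."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    beta = np.zeros(d)
    v = np.zeros(d)
    for _ in range(n_epochs):
        idx = rng.permutation(n)           # reshuffle each epoch
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            grad = X[batch].T @ (X[batch] @ beta - y[batch]) / len(batch)
            v = momentum * v + grad        # momentum accumulation
            beta = beta - lr * v
    return beta
```

On consistent (noiseless) data the iterates converge to the least-squares solution; with noisy data, the interplay between batch size, learning rate, and momentum determines the statistical efficiency of the limit, which is the subject of the abstract above.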

The function-on-function linear regression model in which the response and predictors consist of random curves has become a general framework to investigate the relationship between the functional response and functional predictors. Existing methods to estimate the model parameters may be sensitive to outlying observations, common in empirical applications. In addition, these methods may be severely affected by such observations, leading to undesirable estimation and prediction results. A robust estimation method, based on iteratively reweighted simple partial least squares, is introduced to improve the prediction accuracy of the function-on-function linear regression model in the presence of outliers. The performance of the proposed method depends on the number of partial least squares components used to estimate the function-on-function linear regression model. Thus, the optimum number of components is determined via a data-driven error criterion. The finite-sample performance of the proposed method is investigated via several Monte Carlo experiments and an empirical data analysis. In addition, a nonparametric bootstrap method is applied to construct pointwise prediction intervals for the response function. The results are compared with some of the existing methods to illustrate the improvement potentially gained by the proposed method.
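The iterative-reweighting idea can be seen in a scalar-response analogue (this is plain IRLS with Huber weights, not the paper's functional partial least squares; the tuning constant c = 1.345 is the conventional Huber choice):

```python
import numpy as np

def irls_robust_fit(X, y, c=1.345, n_iter=50):
    """Iteratively reweighted least squares with Huber weights.

    Observations with large standardized residuals are downweighted,
    so gross outliers have bounded influence on the fit."""
    n, d = X.shape
    w = np.ones(n)
    beta = np.zeros(d)
    for _ in range(n_iter):
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)
        r = y - X @ beta
        # robust scale via the median absolute deviation
        scale = np.median(np.abs(r - np.median(r))) / 0.6745 + 1e-12
        u = np.abs(r / scale)
        w = np.where(u <= c, 1.0, c / u)   # Huber weights
    return beta
```

With a single gross outlier, ordinary least squares is pulled away from the true slope while the reweighted fit is essentially unaffected; the functional method in the abstract applies the same downweighting within a partial least squares decomposition.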

We analyze the problem of simultaneous support recovery and estimation of the coefficient vector ($\beta^*$) in a linear model with independent and identically distributed Normal errors. We apply the penalized least squares estimator based on non-linear penalties of stochastic gates (STG) [YLNK20] to estimate the coefficients. Considering Gaussian design matrices, we show that under reasonable conditions on the dimension and sparsity of $\beta^*$ the STG-based estimator converges to the true data generating coefficient vector and also detects its support set with high probability. We propose a new projection-based algorithm for the linear model setup to improve upon the existing STG estimator that was originally designed for general non-linear models. Our new procedure outperforms many classical estimators for support recovery in synthetic data analysis.

This paper considers the problem of estimating high dimensional Laplacian constrained precision matrices by minimizing Stein's loss. We obtain a necessary and sufficient condition for the existence of this estimator, which boils down to checking whether a certain data dependent graph is connected. We also prove consistency in the high dimensional setting under the symmetrized Stein loss. We show that the error rate does not depend on the graph sparsity, or other types of structure, and that Laplacian constraints are sufficient for high dimensional consistency. Our proofs exploit properties of graph Laplacians, and a characterization of the proposed estimator based on effective graph resistances. We validate our theoretical claims with numerical experiments.
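The existence check reduces to graph connectivity, which is straightforward to test; the edge rule below (positive off-diagonal sample covariance) is only a placeholder for the paper's exact data-dependent condition:

```python
import numpy as np
from collections import deque

def is_connected(adj):
    """Breadth-first search check that an undirected graph, given as a
    boolean adjacency matrix, is connected."""
    p = adj.shape[0]
    seen = {0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j in np.flatnonzero(adj[i]):
            if j not in seen:
                seen.add(j)
                queue.append(int(j))
    return len(seen) == p

def existence_check(X):
    """Sketch: build a data-dependent graph from the sample covariance of
    X (rows = observations) and test connectivity. The edge rule here is
    a placeholder; the paper derives the exact condition."""
    S = np.cov(X, rowvar=False)
    adj = (S > 0) & ~np.eye(S.shape[0], dtype=bool)
    return is_connected(adj)
```

Connectivity can be verified in O(p^2) time this way, so the existence condition is cheap to check before attempting the optimization.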

COVID-19 incidence is analyzed in the provinces of some Spanish Communities during the period February-October 2020. Two infinite-dimensional regression approaches are tested. The first one is implemented in the regression framework introduced in Ruiz-Medina, Miranda and Espejo (2019). Specifically, a Bayesian framework is adopted in the estimation of the pure point spectrum of the temporal autocorrelation operator, characterizing the second-order structure of a surface sequence. The second approach is formulated in the context of spatial curve regression. A nonparametric estimator of the spectral density operator, based on the spatial periodogram operator, is computed to approximate the spatial correlation between curves. Dimension reduction is achieved by projection onto the empirical eigenvectors of the long-run spatial covariance operator. Cross-validation procedures are implemented to test the performance of the two functional regression approaches.

Minimum residual methods such as the least-squares finite element method (FEM) or the discontinuous Petrov--Galerkin method with optimal test functions (DPG) usually exclude singular data, e.g., non-square-integrable loads. We consider a DPG method and a least-squares FEM for the Poisson problem. For both methods we analyze regularization approaches that allow the use of $H^{-1}$ loads, and also study the case of point loads. For all cases we prove appropriate convergence orders. We present various numerical experiments that confirm our theoretical results. Our approach extends to general well-posed second-order problems.

In this note, we study a concatenation of quasi-Monte Carlo and plain Monte Carlo rules for high-dimensional numerical integration in weighted function spaces. In particular, we consider approximating the integral of periodic functions defined over the $s$-dimensional unit cube by using rank-1 lattice point sets only for the first $d\, (<s)$ coordinates and random points for the remaining $s-d$ coordinates. We prove that, by exploiting a decay of the weights of function spaces, almost the optimal order of the mean squared worst-case error is achieved by such a concatenated quadrature rule as long as $d$ scales at most linearly with the number of points. This result might be useful for numerical integration in extremely high dimensions, such as partial differential equations with random coefficients for which even the standard fast component-by-component algorithm is considered computationally expensive.
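A sketch of the concatenated rule (the generating vector below is ad hoc rather than constructed component-by-component, and the smooth test integrand is only a sanity check, not drawn from the weighted-space setting):

```python
import numpy as np

def concatenated_rule(f, n, z, s, seed=0):
    """Quadrature over [0,1]^s: a rank-1 lattice in the first
    d = len(z) coordinates, i.i.d. uniform points in the remaining
    s - d coordinates, averaging f over the n combined points."""
    rng = np.random.default_rng(seed)
    d = len(z)
    i = np.arange(n)[:, None]
    lattice = (i * np.asarray(z)[None, :] / n) % 1.0   # n x d lattice block
    mc = rng.random((n, s - d))                        # n x (s-d) MC block
    pts = np.hstack([lattice, mc])
    return np.mean(f(pts))
```

The point of the concatenation is that the lattice block only needs to cover the important leading coordinates, so the generating vector search stays low-dimensional even when s is extremely large.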

Sampling methods (e.g., node-wise, layer-wise, or subgraph sampling) have become an indispensable strategy to speed up training large-scale Graph Neural Networks (GNNs). However, existing sampling methods are mostly based on the graph structural information and ignore the dynamicity of optimization, which leads to high variance in estimating the stochastic gradients. The high variance issue can be very pronounced in extremely large graphs, where it results in slow convergence and poor generalization. In this paper, we theoretically analyze the variance of sampling methods and show that, due to the composite structure of empirical risk, the variance of any sampling method can be decomposed into \textit{embedding approximation variance} in the forward stage and \textit{stochastic gradient variance} in the backward stage, which necessitates mitigating both types of variance to obtain a faster convergence rate. We propose a decoupled variance reduction strategy that employs (approximate) gradient information to adaptively sample nodes with minimal variance, and explicitly reduces the variance introduced by embedding approximation. We show theoretically and empirically that the proposed method, even with smaller mini-batch sizes, enjoys a faster convergence rate and better generalization compared to existing methods.
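The decomposition is, at heart, the law of total variance applied to the composite estimator; a tiny exhaustive example (toy numbers of ours) where v is the sampled node and u the sampled neighbor:

```python
import numpy as np
from itertools import product

# g[(v, u)]: value of a scalar gradient estimator when node v (outer
# randomness, "stochastic gradient") samples neighbor u (inner
# randomness, "embedding approximation"); both sampled uniformly.
g = {
    (0, 0): 1.0, (0, 1): 3.0,
    (1, 0): -2.0, (1, 1): 0.0,
    (2, 0): 4.0, (2, 1): 2.0,
}
vals = np.array([g[vu] for vu in product(range(3), range(2))])
total_var = vals.var()

cond_mean = np.array([np.mean([g[(v, u)] for u in range(2)]) for v in range(3)])
cond_var = np.array([np.var([g[(v, u)] for u in range(2)]) for v in range(3)])
grad_var = cond_mean.var()   # variance of the exact per-node gradients
embed_var = cond_var.mean()  # extra variance from neighbor sampling

# law of total variance: Var(g) = Var_v(E_u[g|v]) + E_v(Var_u[g|v])
assert np.isclose(total_var, grad_var + embed_var)
```

Since the two terms add, shrinking only one of them leaves a variance floor, which is why the strategy in the abstract reduces both.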

We develop an approach to risk minimization and stochastic optimization that provides a convex surrogate for variance, allowing near-optimal and computationally efficient trading between approximation and estimation error. Our approach builds on techniques for distributionally robust optimization and Owen's empirical likelihood, and we provide a number of finite-sample and asymptotic results characterizing the theoretical performance of the estimator. In particular, we show that our procedure comes with certificates of optimality, achieving (in some scenarios) faster rates of convergence than empirical risk minimization by virtue of automatically balancing bias and variance. We give corroborating empirical evidence showing that in practice, the estimator indeed trades between variance and absolute performance on a training sample, improving out-of-sample (test) performance over standard empirical risk minimization for a number of classification problems.
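In the chi-squared-ball formulation of distributionally robust optimization, the worst-case empirical risk has, for small radii, a closed form that is exactly a variance-based regularizer; a sketch under our parameterization (the radius rho and the nonnegativity caveat follow the standard small-radius expansion):

```python
import numpy as np

def variance_regularized_risk(losses, rho):
    """Worst-case mean of `losses` over weights p with sum(p) = 1 and
    chi^2 divergence from the uniform weights at most rho/n, assuming the
    maximizer keeps every weight nonnegative. The value then equals
        mean(losses) + sqrt(2 * rho / n) * std(losses),
    i.e. the empirical risk plus a variance penalty."""
    losses = np.asarray(losses, dtype=float)
    n = losses.size
    mean, s = losses.mean(), losses.std()
    if s == 0.0:
        return mean, np.full(n, 1.0 / n)
    t = np.sqrt(2.0 * rho / (n**3 * s**2))
    p = 1.0 / n + t * (losses - mean)      # worst-case weights
    assert p.min() >= 0.0, "closed form valid only for small enough rho"
    return float(p @ losses), p
```

For losses [0, 1, 2, 3] and rho = 0.1 the worst-case weights tilt toward the larger losses, and the resulting value matches the mean-plus-scaled-standard-deviation formula, which is the convex variance surrogate the abstract refers to.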
