
We construct bootstrap confidence intervals for a monotone regression function. It has been shown that the ordinary nonparametric bootstrap, based on the nonparametric least squares estimator (LSE) $\hat f_n$, is inconsistent in this situation. We show, however, that a consistent bootstrap can be based on the smoothed $\hat f_n$, to be called the SLSE (Smoothed Least Squares Estimator). The asymptotic pointwise distribution of the SLSE is derived. The confidence intervals, based on the smoothed bootstrap, are compared to intervals based on the (not necessarily monotone) Nadaraya-Watson estimator, and the effect of Studentization is investigated. We also give a method for automatic bandwidth choice, correcting work in Sen and Xu (2015). The procedure is illustrated using a well-known dataset related to climate change.
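As a rough illustration of the two estimators named above, the following sketch computes a monotone LSE with the pool-adjacent-violators algorithm and then smooths it with a Gaussian kernel average. This is only a stand-in under stated assumptions: the abstract does not give the exact definition of the SLSE, so the normalized kernel average, the bandwidth `h`, the function names, and the toy responses are all hypothetical, and no bootstrap step is shown.

```python
import math

def pava(y):
    """Pool-adjacent-violators: nondecreasing least-squares fit to y (equal weights)."""
    blocks = []  # each block stores [sum of values, number of values]
    for value in y:
        blocks.append([value, 1])
        # Merge while the previous block's mean exceeds the last block's mean.
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit

def smooth_fit(x_design, fit, x, h):
    """Gaussian-kernel weighted average of the isotonic fit at point x (assumed smoother)."""
    weights = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in x_design]
    return sum(w * f for w, f in zip(weights, fit)) / sum(weights)

xs = [i / 9 for i in range(10)]
ys = [0.1, 0.0, 0.3, 0.2, 0.5, 0.4, 0.45, 0.8, 0.7, 1.0]  # hypothetical responses
lse = pava(ys)                                             # piecewise-constant monotone fit
smooth = [smooth_fit(xs, lse, x, h=0.15) for x in xs]      # smoothed monotone fit
```

A Gaussian kernel is used here because a normalized average with a log-concave kernel keeps the smoothed fit nondecreasing; other kernels or boundary corrections may behave differently.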

Related content

Bootstrap method


Loss function (machine learning) · Weight · Functional · Loss · Networking

July 11, 2023

Prediction intervals for neural network models using weighted asymmetric loss functions

Milo Grillo,Yunpeng Han,Agnieszka Werpachowska

from arxiv, 14 pages, 3 figures

We propose a simple and efficient approach to generate prediction intervals (PIs) for approximated and forecasted trends. Our method leverages a weighted asymmetric loss function to estimate the lower and upper bounds of the PI, with the weights determined by its coverage probability. We provide a concise mathematical proof of the method, show how it can be extended to derive PIs for parametrised functions and argue why the method works for predicting PIs of dependent variables. The presented tests of the method on a real-world forecasting task using a neural network-based model show that it can produce reliable PIs in complex machine learning scenarios.
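To make the idea of an asymmetric loss concrete, here is a minimal sketch using the standard pinball (quantile) loss. This is an assumption: the abstract does not state the exact loss the authors use, and the function names and simulated data are hypothetical. The sketch fits a constant lower and upper bound rather than a neural network, which is enough to show how the weights relate to coverage.

```python
import random

def pinball_loss(y, pred, q):
    """Asymmetric loss: under-prediction is weighted by q, over-prediction by 1 - q."""
    diff = y - pred
    return q * diff if diff >= 0 else (q - 1) * diff

def best_constant(ys, q):
    """Return the sample value that minimises the total pinball loss over ys."""
    return min(ys, key=lambda c: sum(pinball_loss(y, c, q) for y in ys))

random.seed(0)
ys = [random.gauss(0.0, 1.0) for _ in range(500)]  # simulated targets

# A 90% interval uses the 5% and 95% levels for its lower and upper bounds.
lower = best_constant(ys, 0.05)
upper = best_constant(ys, 0.95)
coverage = sum(lower <= y <= upper for y in ys) / len(ys)
```

Because the minimiser of the pinball loss is an empirical quantile, the share of points between `lower` and `upper` comes out close to the nominal 90%.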

Estimation/Estimator · Statistic · Data-generating process · Sample · Processing (programming language)

July 11, 2023

Differentially Private Statistical Inference through $\beta$-Divergence One Posterior Sampling

Jack Jewson,Sahra Ghalebikesabi,Chris Holmes

Differential privacy guarantees allow the results of a statistical analysis involving sensitive data to be released without compromising the privacy of any individual taking part. Achieving such guarantees generally requires the injection of noise, either directly into parameter estimates or into the estimation process. Instead of artificially introducing perturbations, sampling from Bayesian posterior distributions has been shown to be a special case of the exponential mechanism, producing consistent and efficient private estimates without altering the data generative process. The application of current approaches has, however, been limited by their strong bounding assumptions, which do not hold for basic models such as simple linear regressors. To ameliorate this, we propose $\beta$D-Bayes, a posterior sampling scheme from a generalised posterior targeting the minimisation of the $\beta$-divergence between the model and the data generating process. This provides private estimation that is generally applicable without requiring changes to the underlying model and consistently learns the data generating parameter. We show that $\beta$D-Bayes produces more precise inference estimation for the same privacy guarantees, and further facilitates differentially private estimation via posterior sampling for complex classifiers and continuous regression models such as neural networks for the first time.

Square matrix · Inference · MoDELS · motivation · INFORMS

July 11, 2023

Predicting milk traits from spectral data using Bayesian probabilistic partial least squares regression

Szymon Urbas,Pierre Lovera,Robert Daly,Alan O'Riordan,Donagh Berry,Isobel Claire Gormley

from arxiv, 36 pages, 6 figures; Supplement: 18 pages, 12 figures

High-dimensional spectral data -- routinely generated in dairy production -- are used to predict a range of traits in milk products. Partial least squares regression (PLSR) is ubiquitously used for these prediction tasks. However, PLSR is not typically viewed as arising from statistical inference of a probabilistic model, and parameter uncertainty is rarely quantified. Additionally, PLSR does not easily lend itself to model-based modifications, coherent prediction intervals are not readily available, and the process of choosing the latent-space dimension, $\mathtt{Q}$, can be subjective and sensitive to data size. We introduce a Bayesian latent-variable model, emulating the desirable properties of PLSR while accounting for parameter uncertainty. The need to choose $\mathtt{Q}$ is eschewed through a nonparametric shrinkage prior. The flexibility of the proposed Bayesian partial least squares regression (BPLSR) framework is exemplified by considering sparsity modifications and allowing for multivariate response prediction. The BPLSR framework is used in two motivating settings: 1) trait prediction from mid-infrared spectral analyses of milk samples, and 2) milk pH prediction from surface-enhanced Raman spectral data. The prediction performance of BPLSR at least matches that of PLSR. Additionally, the provision of correctly calibrated prediction intervals objectively provides richer, more informative inference for stakeholders in dairy production.

Robustness · Normalized · Noise · Manifold · Estimation/Estimator

July 11, 2023

Robust Inference of Manifold Density and Geometry by Doubly Stochastic Scaling

Boris Landa,Xiuyuan Cheng

The Gaussian kernel and its traditional normalizations (e.g., row-stochastic) are popular approaches for assessing similarities between data points. Yet, they can be inaccurate under high-dimensional noise, especially if the noise magnitude varies considerably across the data, e.g., under heteroskedasticity or outliers. In this work, we investigate a more robust alternative -- the doubly stochastic normalization of the Gaussian kernel. We consider a setting where points are sampled from an unknown density on a low-dimensional manifold embedded in high-dimensional space and corrupted by possibly strong, non-identically distributed, sub-Gaussian noise. We establish that the doubly stochastic affinity matrix and its scaling factors concentrate around certain population forms, and provide corresponding finite-sample probabilistic error bounds. We then utilize these results to develop several tools for robust inference under general high-dimensional noise. First, we derive a robust density estimator that reliably infers the underlying sampling density and can substantially outperform the standard kernel density estimator under heteroskedasticity and outliers. Second, we obtain estimators for the pointwise noise magnitudes, the pointwise signal magnitudes, and the pairwise Euclidean distances between clean data points. Lastly, we derive robust graph Laplacian normalizations that accurately approximate various manifold Laplacians, including the Laplace-Beltrami operator, improving over traditional normalizations in noisy settings. We exemplify our results in simulations and on real single-cell RNA-sequencing data. For the latter, we show that in contrast to traditional methods, our approach is robust to variability in technical noise levels across cell types.
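The doubly stochastic normalization mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the damped symmetric Sinkhorn-type iteration, the kernel parametrization `exp(-||x - y||^2 / bandwidth)`, the retained diagonal, the function names, and the toy points are all assumptions. The goal is a positive vector `d` such that `diag(d) K diag(d)` has unit row and column sums.

```python
import math

def gaussian_kernel(points, bandwidth):
    """Symmetric matrix with entries exp(-||x_i - x_j||^2 / bandwidth)."""
    n = len(points)
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(points[i], points[j])) / bandwidth)
             for j in range(n)] for i in range(n)]

def doubly_stochastic_scaling(K, iters=200):
    """Find d > 0 so that W = diag(d) K diag(d) has row sums equal to 1."""
    n = len(K)
    d = [1.0] * n
    for _ in range(iters):
        Kd = [sum(K[i][j] * d[j] for j in range(n)) for i in range(n)]
        # Damped update: geometric mean of the current d and the plain update 1 / (K d).
        d = [math.sqrt(d[i] / Kd[i]) for i in range(n)]
    W = [[d[i] * K[i][j] * d[j] for j in range(n)] for i in range(n)]
    return W, d

points = [(0.0, 0.0), (1.0, 0.2), (0.3, 1.1), (2.0, 1.5), (1.2, 0.9)]  # toy data
K = gaussian_kernel(points, bandwidth=1.0)
W, d = doubly_stochastic_scaling(K)
row_sums = [sum(row) for row in W]
```

Since `W` is symmetric, unit row sums imply unit column sums. The damping avoids the two-cycle that the undamped update `d = 1 / (K d)` can fall into.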

Ridge regression · Normalized · Estimation/Estimator · contrastive · Identical

July 9, 2023

Nonlinear Generalized Ridge Regression

Robert L. Obenchain

from arxiv, 9 pages, 3 Figures, 3 Tables, 11 References

A Two-Stage approach is described that literally "straightens out" any potentially nonlinear relationship between a y-outcome variable and each of p = 2 or more potential x-predictor variables. The y-outcome is then predicted from all p of these "linearized" spline-predictors using the form of Generalized Ridge Regression that is most likely to yield minimal MSE risk under Normal distribution-theory. These estimates are then compared and contrasted with those from the Generalized Additive Model that uses the same x-variables.

Estimation/Estimator · TD · Scenario · Decomposition · Extensibility

July 9, 2023

Decomposing Triple-Differences Regression under Staggered Adoption

Anton Strezhnev

The triple-differences (TD) design is a popular identification strategy for causal effects in settings where researchers do not believe the parallel trends assumption of conventional difference-in-differences (DiD) is satisfied. TD designs augment the conventional 2x2 DiD with a "placebo" stratum -- observations that are nested in the same units and time periods but are known to be entirely unaffected by the treatment. However, many TD applications go beyond this simple 2x2x2 and use observations on many units in many "placebo" strata across multiple time periods. A popular estimator for this setting is the triple-differences regression (TDR) fixed-effects estimator -- an extension of the common "two-way fixed effects" estimator for DiD. This paper decomposes the TDR estimator into its component two-group/two-period/two-strata triple-differences and illustrates how interpreting this parameter causally in settings with arbitrary staggered adoption requires strong effect homogeneity assumptions as many placebo DiDs incorporate observations under treatment. The decomposition clarifies the implied identifying variation behind the triple-differences regression estimator and suggests researchers should be cautious when implementing these estimators in settings more complex than the 2x2x2 case. Alternative approaches that only incorporate "clean placebos" such as direct imputation of the counterfactual may be more appropriate. The paper concludes by demonstrating the utility of this imputation estimator in an application of the "gravity model" to the estimation of the effect of the WTO/GATT on international trade.
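The simple 2x2x2 building block described above can be written out as arithmetic on cell means: a difference-in-differences in the affected stratum minus a difference-in-differences in the placebo stratum. The sketch below is illustrative only; the cell labels, function names, and cell means are hypothetical and are not taken from the paper.

```python
def did(means, stratum):
    """Difference-in-differences within one stratum:
    (treated post - treated pre) - (control post - control pre)."""
    treated = means[("treated", stratum, "post")] - means[("treated", stratum, "pre")]
    control = means[("control", stratum, "post")] - means[("control", stratum, "pre")]
    return treated - control

def triple_difference(means):
    """2x2x2 triple difference: DiD in the affected stratum minus DiD in the placebo stratum."""
    return did(means, "affected") - did(means, "placebo")

# Hypothetical cell means indexed by (group, stratum, period).
means = {
    ("treated", "affected", "pre"): 10.0, ("treated", "affected", "post"): 15.0,
    ("control", "affected", "pre"): 8.0,  ("control", "affected", "post"): 9.0,
    ("treated", "placebo", "pre"): 7.0,   ("treated", "placebo", "post"): 9.0,
    ("control", "placebo", "pre"): 6.0,   ("control", "placebo", "post"): 7.0,
}

td = triple_difference(means)  # (5 - 1) - (2 - 1) = 3
```

The placebo DiD picks up any differential trend between treated and control units that is unrelated to treatment, and subtracting it is what relaxes the parallel trends assumption of the plain DiD.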

Estimation/Estimator · Networking · Neural Networks · Functional · Loss function (machine learning)

July 8, 2023

Sup-Norm Convergence of Deep Neural Network Estimator for Nonparametric Regression by Adversarial Training

Masaaki Imaizumi

from arxiv, 38 pages

We show the sup-norm convergence of deep neural network estimators with a novel adversarial training scheme. For the nonparametric regression problem, it has been shown that an estimator using deep neural networks can achieve better performance in the sense of the $L^2$-norm. In contrast, it is difficult for the neural estimator with least squares to achieve sup-norm convergence, due to the deep structure of neural network models. In this study, we develop an adversarial training scheme and investigate the sup-norm convergence of deep neural network estimators. First, we find that ordinary adversarial training makes neural estimators inconsistent. Second, we show that a deep neural network estimator achieves the optimal rate in the sup-norm sense by the proposed adversarial training with correction. We extend our adversarial training to general setups of a loss function and a data-generating function. Our experiments support the theoretical findings.

Minimax · Optimizer · Smoothing · Functional · Mutually independent

July 8, 2023

Phase transitions in nonparametric regressions

Ying Zhu

from arxiv, 4 Tables

When the unknown regression function of a single variable is known to have derivatives up to the $(\gamma+1)$th order bounded in absolute values by a common constant everywhere or a.e. (i.e., $(\gamma+1)$th degree of smoothness), the minimax optimal rate of the mean integrated squared error (MISE) is stated as $\left(\frac{1}{n}\right)^{\frac{2\gamma+2}{2\gamma+3}}$ in the literature. This paper shows that: (i) if $n\leq\left(\gamma+1\right)^{2\gamma+3}$, the minimax optimal MISE rate is $\frac{\log n}{n\log(\log n)}$ and the optimal degree of smoothness to exploit is roughly $\max\left\{ \left\lfloor \frac{\log n}{2\log\left(\log n\right)}\right\rfloor ,\,1\right\} $; (ii) if $n>\left(\gamma+1\right)^{2\gamma+3}$, the minimax optimal MISE rate is $\left(\frac{1}{n}\right)^{\frac{2\gamma+2}{2\gamma+3}}$ and the optimal degree of smoothness to exploit is $\gamma+1$. The fundamental contribution of this paper is a set of metric entropy bounds we develop for smooth function classes. Some of our bounds are original, and some of them improve and/or generalize the ones in the literature (e.g., Kolmogorov and Tikhomirov, 1959). Our metric entropy bounds allow us to show phase transitions in the minimax optimal MISE rates associated with some commonly seen smoothness classes as well as non-standard smoothness classes, and can also be of independent interest outside the nonparametric regression problems.
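The two regimes stated above can be evaluated numerically. The sketch below simply encodes the threshold and rate formulas from the abstract, ignoring constants; the function name and the example values of `n` and `gamma` are hypothetical, and it assumes `n >= 3` so that `log(log n)` is positive.

```python
import math

def minimax_mise_rate(n, gamma):
    """Return (regime, rate, smoothness to exploit) following the formulas in the abstract.
    Rates are stated up to constants; requires n >= 3 so that log(log(n)) > 0."""
    threshold = (gamma + 1) ** (2 * gamma + 3)
    if n <= threshold:
        loglog = math.log(math.log(n))
        rate = math.log(n) / (n * loglog)
        smoothness = max(math.floor(math.log(n) / (2 * loglog)), 1)
        return "small-n", rate, smoothness
    rate = (1 / n) ** ((2 * gamma + 2) / (2 * gamma + 3))
    return "large-n", rate, gamma + 1

# With gamma = 3 the threshold is 4**9 = 262144.
small = minimax_mise_rate(1_000, 3)      # below the threshold
large = minimax_mise_rate(1_000_000, 3)  # above the threshold
```

With `gamma = 3`, a sample of size 1,000 falls in the first regime, while a sample of size 1,000,000 falls in the second, where the classical rate $n^{-8/9}$ applies.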

Functional · Estimation/Estimator · Cognition · MoDELS · Transformation

July 7, 2023

Density-on-Density Regression

Yi Zhao,Abhirup Datta,Bohao Tang,Vadim Zipunnikov,Brian S. Caffo

In this study, a density-on-density regression model is introduced, where the association between densities is elucidated via a warping function. The proposed model has the advantage of being a straightforward demonstration of how one density transforms into another. Using the Riemannian representation of density functions, which is the square-root function (or half density), the model is defined in the correspondingly constructed Riemannian manifold. To estimate the warping function, it is proposed to minimize the average Hellinger distance, which is equivalent to minimizing the average Fisher-Rao distance between densities. An optimization algorithm is introduced by estimating the smooth monotone transformation of the warping function. Asymptotic properties of the proposed estimator are discussed. Simulation studies demonstrate the superior performance of the proposed approach over competing approaches in predicting outcome density functions. Applied to a proteomic-imaging study from the Alzheimer's Disease Neuroimaging Initiative, the proposed approach illustrates the connection between the distribution of protein abundance in the cerebrospinal fluid and the distribution of brain regional volume. Discrepancies among cognitively normal subjects, patients with mild cognitive impairment, and patients with Alzheimer's disease (AD) are identified, and the findings are in line with existing knowledge about AD.
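For readers unfamiliar with the two distances named above, the sketch below computes both from the square-root representation of two densities on a grid. It is a minimal illustration under assumptions: the normalization $H^2 = 1 - \int \sqrt{fg}$, the arccos form of the Fisher-Rao distance, the two normal densities, and all function names are chosen here for illustration and are not taken from the paper.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def bhattacharyya_coefficient(f, g, grid):
    """Trapezoidal approximation of the integral of sqrt(f * g) over the grid."""
    vals = [math.sqrt(f(x) * g(x)) for x in grid]
    return sum(0.5 * (vals[i] + vals[i + 1]) * (grid[i + 1] - grid[i])
               for i in range(len(grid) - 1))

def hellinger(bc):
    """Hellinger distance under the convention H^2 = 1 - integral of sqrt(f * g)."""
    return math.sqrt(max(0.0, 1.0 - bc))

def fisher_rao(bc):
    """Geodesic distance between square-root densities on the unit sphere."""
    return math.acos(min(1.0, bc))

grid = [-10.0 + 0.005 * i for i in range(4201)]  # covers [-10, 11]
f = lambda x: normal_pdf(x, 0.0, 1.0)
g = lambda x: normal_pdf(x, 1.0, 1.0)

bc = bhattacharyya_coefficient(f, g, grid)
h_dist = hellinger(bc)
fr_dist = fisher_rao(bc)
```

Both distances are decreasing functions of the same inner product of square-root densities, so they rank pairs of densities identically. For two unit-variance normals whose means differ by 1, that inner product has the closed form $e^{-1/8}$, which gives a check on the numerical integral.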

Learning · Active learning · White-box · MoDELS · Black-box

July 7, 2023

Black-Box Batch Active Learning for Regression

Andreas Kirsch

from arxiv, 12 pages + 11 pages appendix

Batch active learning is a popular approach for efficiently training machine learning models on large, initially unlabelled datasets by repeatedly acquiring labels for batches of data points. However, many recent batch active learning methods are white-box approaches and are often limited to differentiable parametric models: they score unlabeled points using acquisition functions based on model embeddings or first- and second-order derivatives. In this paper, we propose black-box batch active learning for regression tasks as an extension of white-box approaches. Crucially, our method only relies on model predictions. This approach is compatible with a wide range of machine learning models, including regular and Bayesian deep learning models and non-differentiable models such as random forests. It is rooted in Bayesian principles and utilizes recent kernel-based approaches. This allows us to extend a wide range of existing state-of-the-art white-box batch active learning methods (BADGE, BAIT, LCMD) to black-box models. We demonstrate the effectiveness of our approach through extensive experimental evaluations on regression datasets, achieving surprisingly strong performance compared to white-box approaches for deep learning models.