
Robustness · Neural Networks · Networking · Estimation/Estimator · Predictor/Decision Function

July 21, 2021

Robust Nonparametric Regression with Deep Neural Networks

Guohao Shen,Yuling Jiao,Yuanyuan Lin,Jian Huang

from arxiv, Guohao Shen and Yuling Jiao contributed equally to this work. Corresponding authors: Yuanyuan Lin (Email: [email protected]) and Jian Huang (Email: jian-). arXiv admin note: substantial text overlap with arXiv:2104.06708

In this paper, we study the properties of robust nonparametric estimation using deep neural networks for regression models with heavy tailed error distributions. We establish the non-asymptotic error bounds for a class of robust nonparametric regression estimators using deep neural networks with ReLU activation under suitable smoothness conditions on the regression function and mild conditions on the error term. In particular, we only assume that the error distribution has a finite p-th moment with p greater than one. We also show that the deep robust regression estimators are able to circumvent the curse of dimensionality when the distribution of the predictor is supported on an approximate lower-dimensional set. An important feature of our error bound is that, for ReLU neural networks with network width and network size (number of parameters) no more than the order of the square of the dimensionality d of the predictor, our excess risk bounds depend sub-linearly on d. Our assumption relaxes the exact manifold support assumption, which could be restrictive and unrealistic in practice. We also relax several crucial assumptions on the data distribution, the target regression function and the neural networks required in the recent literature. Our simulation studies demonstrate that the robust methods can significantly outperform the least squares method when the errors have heavy-tailed distributions and illustrate that the choice of loss function is important in the context of deep nonparametric regression.
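To make the contrast between least squares and a robust loss concrete, here is a minimal sketch using the Huber loss. The abstract does not name the robust losses studied, so the choice of Huber, the threshold `delta`, and the sample residuals are assumptions made purely for illustration.

```python
def squared_loss(residual: float) -> float:
    """Least squares loss: grows quadratically in the residual."""
    return 0.5 * residual ** 2


def huber_loss(residual: float, delta: float = 1.0) -> float:
    """Huber loss: quadratic near zero, linear once |residual| exceeds delta."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a ** 2
    return delta * (a - 0.5 * delta)


def mean_loss(loss, residuals):
    """Average a per-residual loss over a sample (the empirical risk)."""
    return sum(loss(r) for r in residuals) / len(residuals)


# Hypothetical residuals with a single heavy-tailed outlier.
residuals = [0.1, -0.2, 0.05, 50.0]
print(mean_loss(squared_loss, residuals))  # dominated by the outlier
print(mean_loss(huber_loss, residuals))    # the outlier contributes only linearly
```

Because the Huber loss grows linearly in the tails, a single extreme residual cannot dominate the empirical risk the way it does under the squared loss.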

Related content

Robustness

Estimation/Estimator · INFORMS · Ridge Regression · Predictor/Decision Function · Lasso Regression

September 23, 2021

High-dimensional regression with potential prior information on variable importance

Benjamin G. Stokell,Rajen D. Shah

from arxiv, 16 pages, 7 figures

There are a variety of settings where vague prior information may be available on the importance of predictors in high-dimensional regression settings. Examples include ordering on the variables offered by their empirical variances (which is typically discarded through standardisation), the lag of predictors when fitting autoregressive models in time series settings, or the level of missingness of the variables. Whilst such orderings may not match the true importance of variables, we argue that there is little to be lost, and potentially much to be gained, by using them. We propose a simple scheme involving fitting a sequence of models indicated by the ordering. We show that the computational cost for fitting all models when ridge regression is used is no more than for a single fit of ridge regression, and describe a strategy for Lasso regression that makes use of previous fits to greatly speed up fitting the entire sequence of models. We propose to select a final estimator by cross-validation and provide a general result on the quality of the best performing estimator on a test set selected from among a number $M$ of competing estimators in a high-dimensional linear regression setting. Our result requires no sparsity assumptions and shows that only a $\log M$ price is incurred compared to the unknown best estimator. We demonstrate the effectiveness of our approach when applied to missing or corrupted data, and time series settings. An R package is available on github.

Weight · Low-Rank Matrix Approximation · Approximation · Extensibility · Rank

September 22, 2021

Weighted Low Rank Matrix Approximation and Acceleration

Elena Tuzhilina,Trevor Hastie

Low-rank matrix approximation is one of the central concepts in machine learning, with applications in dimension reduction, de-noising, multivariate statistical methodology, and many more. A recent extension to LRMA is called low-rank matrix completion (LRMC). It solves the LRMA problem when some observations are missing and is especially useful for recommender systems. In this paper, we consider an element-wise weighted generalization of LRMA. The resulting weighted low-rank matrix approximation technique therefore covers LRMC as a special case with binary weights. WLRMA has many applications. For example, it is an essential component of GLM optimization algorithms, where an exponential family is used to model the entries of a matrix, and the matrix of natural parameters admits a low-rank structure. We propose an algorithm for solving the weighted problem, as well as two acceleration techniques. Further, we develop a non-SVD modification of the proposed algorithm that is able to handle extremely high-dimensional data. We compare the performance of all the methods on a small simulation example as well as a real-data application.
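As an illustration of the element-wise weighted problem, the sketch below fits a rank-1 approximation by alternating weighted least squares, minimizing the sum over entries of `W[i][j] * (X[i][j] - u[i] * v[j]) ** 2`. This is a generic textbook scheme chosen for brevity, not the algorithm or the acceleration techniques proposed in the paper; the function name, the rank-1 restriction, and the example matrix are assumptions. Binary weights reproduce the matrix-completion special case mentioned above.

```python
def weighted_rank1(X, W, n_iter=50):
    """Alternating weighted least squares for a rank-1 fit u v^T."""
    n, m = len(X), len(X[0])
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Update each u[i] with v held fixed (a one-dimensional weighted LS problem).
        for i in range(n):
            num = sum(W[i][j] * X[i][j] * v[j] for j in range(m))
            den = sum(W[i][j] * v[j] ** 2 for j in range(m))
            if den > 0:
                u[i] = num / den
        # Update each v[j] with u held fixed.
        for j in range(m):
            num = sum(W[i][j] * X[i][j] * u[i] for i in range(n))
            den = sum(W[i][j] * u[i] ** 2 for i in range(n))
            if den > 0:
                v[j] = num / den
    return u, v


# Hypothetical rank-1 matrix (outer product of [1, 2, 3] and [1, 2]).
# The last entry is treated as missing: its stored value is a placeholder
# and its weight is zero.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 0.0]]
W = [[1.0, 1.0], [1.0, 1.0], [1.0, 0.0]]

u, v = weighted_rank1(X, W)
print(u[2] * v[1])  # fitted value for the missing entry
```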

BART · MoDELS · INTERACT · Predictor/Decision Function · Performer

September 22, 2021

Semi-parametric Bayesian Additive Regression Trees

Estevão B. Prado,Andrew C. Parnell,Nathan McJames,Ann O'Shea,Rafael A. Moral

We propose a new semi-parametric model based on Bayesian Additive Regression Trees (BART). In our approach, the response variable is approximated by a linear predictor and a BART model, where the first component is responsible for estimating the main effects and BART accounts for the non-specified interactions and non-linearities. The novelty in our approach lies in the way we change tree generation moves in BART to deal with confounding between the parametric and non-parametric components when they have covariates in common. Through synthetic and real-world examples, we demonstrate that the performance of the new semi-parametric BART is competitive when compared to regression models and other tree-based methods. The implementation of the proposed method is available at //github.com/ebprado/SP-BART.

Estimation/Estimator · Loss Function (Machine Learning) · Networking · Square Matrix · Parameter Space

September 22, 2021

Cramér-Rao bound-informed training of neural networks for quantitative MRI

Xiaoxia Zhang,Quentin Duchemin,Kangning Liu,Sebastian Flassbeck,Cem Gultekin,Carlos Fernandez-Granda,Jakob Assländer

from arxiv, Xiaoxia Zhang, Quentin Duchemin, and Kangning Liu contributed equally to this work

Neural networks are increasingly used to estimate parameters in quantitative MRI, in particular in magnetic resonance fingerprinting. Their advantages over the gold standard non-linear least squares fitting are their superior speed and their immunity to the non-convexity of many fitting problems. We find, however, that in heterogeneous parameter spaces, i.e. in spaces in which the variance of the estimated parameters varies considerably, good performance is hard to achieve and requires arduous tweaking of the loss function, hyperparameters, and the distribution of the training data in parameter space. Here, we address these issues with a theoretically well-founded loss function: the Cramér-Rao bound (CRB) provides a theoretical lower bound for the variance of an unbiased estimator and we propose to normalize the squared error with the respective CRB. With this normalization, we balance the contributions of hard-to-estimate and not-so-hard-to-estimate parameters and areas in parameter space, and avoid a dominance of the former in the overall training loss. Further, the CRB-based loss function equals one for a maximally-efficient unbiased estimator, which we consider the ideal estimator. Hence, the proposed CRB-based loss function provides an absolute evaluation metric. We compare a network trained with the CRB-based loss with a network trained with the commonly used mean squared error loss and demonstrate the advantages of the former in numerical, phantom, and in vivo experiments.
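The normalization idea can be sketched in a few lines: each squared error is divided by the Cramér-Rao bound of the corresponding parameter before averaging. The exact form used in the paper (how terms are averaged or weighted) is not given in the abstract, so the function below and its example values are assumptions for illustration.

```python
def mse_loss(estimates, targets):
    """Plain mean squared error."""
    terms = [(e - t) ** 2 for e, t in zip(estimates, targets)]
    return sum(terms) / len(terms)


def crb_normalized_loss(estimates, targets, crbs):
    """Mean of squared errors, each divided by its Cramér-Rao bound."""
    terms = [(e - t) ** 2 / c for e, t, c in zip(estimates, targets, crbs)]
    return sum(terms) / len(terms)


# Hypothetical example: two parameters whose achievable variances differ
# by four orders of magnitude. Each squared error equals its CRB.
targets = [1.0, 100.0]
estimates = [1.1, 110.0]
crbs = [0.01, 100.0]

print(mse_loss(estimates, targets))                   # dominated by the second parameter
print(crb_normalized_loss(estimates, targets, crbs))  # both parameters contribute equally
```

In this example both normalized terms are close to one, matching the abstract's remark that the loss equals one for an estimator whose error attains the bound.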

Learning · Predictor/Decision Function · Neural Networks · Networking · Performer

September 21, 2021

Learning PAC-Bayes Priors for Probabilistic Neural Networks

Maria Perez-Ortiz,Omar Rivasplata,Benjamin Guedj,Matthew Gleeson,Jingyu Zhang,John Shawe-Taylor,Miroslaw Bober,Josef Kittler

Recent works have investigated deep learning models trained by optimising PAC-Bayes bounds, with priors that are learnt on subsets of the data. This combination has been shown to lead not only to accurate classifiers, but also to remarkably tight risk certificates, bearing promise towards self-certified learning (i.e. use all the data to learn a predictor and certify its quality). In this work, we empirically investigate the role of the prior. We experiment on 6 datasets with different strategies and amounts of data to learn data-dependent PAC-Bayes priors, and we compare them in terms of their effect on test performance of the learnt predictors and tightness of their risk certificate. We ask what is the optimal amount of data which should be allocated for building the prior and show that the optimum may be dataset dependent. We demonstrate that using a small percentage of the prior-building data for validation of the prior leads to promising results. We include a comparison of underparameterised and overparameterised models, along with an empirical study of different training objectives and regularisation strategies to learn the prior distribution.

Estimation/Estimator · Robustness · Vectorization · MoDELS · Loss Function (Machine Learning)

September 21, 2021

Robust Estimation of High-Dimensional Vector Autoregressive Models

Linbo Liu,Danna Zhang

High dimensional non-Gaussian time series data are increasingly encountered in a wide range of applications. Conventional estimation methods and technical tools are inadequate when it comes to ultra high dimensional and heavy-tailed data. We investigate robust estimation of high dimensional autoregressive models with fat-tailed innovation vectors by solving a regularized regression problem using a convex robust loss function. As a significant improvement, the dimension can be allowed to increase exponentially with the sample size to ensure consistency under very mild moment conditions. To develop the consistency theory, we establish a new Bernstein type inequality for the sum of autoregressive models. Numerical results indicate a good performance of robust estimates.

Estimation/Estimator · Sampling Method · Kernel Density Estimation · Cumulative Distribution Function · Kernelization

September 21, 2021

Non-parametric Kernel-Based Estimation of Probability Distributions for Precipitation Modeling

Andrew Pavlides,Vasiliki Agou,Dionissios T. Hristopulos

from arxiv, 49 pages, 21 figures

The probability distribution of precipitation amount strongly depends on geography, climate zone, and time scale considered. Closed-form parametric probability distributions are not sufficiently flexible to provide accurate and universal models for precipitation amount over different time scales. In this paper we derive non-parametric estimates of the cumulative distribution function (CDF) of precipitation amount for wet time intervals. The CDF estimates are obtained by integrating the kernel density estimator leading to semi-explicit CDF expressions for different kernel functions. We investigate kernel-based CDF estimation with an adaptive plug-in bandwidth (KCDE), using both synthetic data sets and reanalysis precipitation data from the island of Crete (Greece). We show that KCDE provides better estimates of the probability distribution than the standard empirical (staircase) estimate and kernel-based estimates that use the normal reference bandwidth. We also demonstrate that KCDE enables the simulation of non-parametric precipitation amount distributions by means of the inverse transform sampling method.
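The two steps named in the abstract, integrating a kernel density estimate to obtain a CDF and then simulating by inverse transform sampling, can be sketched as below. The Gaussian kernel, the hand-picked fixed bandwidth `h`, the bisection-based inversion, and the sample data are all assumptions; in particular this is not the adaptive plug-in bandwidth (KCDE) of the paper.

```python
import math
import random


def kernel_cdf(x, sample, h):
    """Gaussian-kernel CDF estimate: the average of normal CDFs centred at the data."""
    return sum(
        0.5 * (1.0 + math.erf((x - xi) / (h * math.sqrt(2.0)))) for xi in sample
    ) / len(sample)


def inverse_cdf(p, sample, h, tol=1e-9):
    """Invert the kernel CDF by bisection (it is continuous and increasing)."""
    lo = min(sample) - 10.0 * h
    hi = max(sample) + 10.0 * h
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kernel_cdf(mid, sample, h) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def draw(sample, h, rng):
    """Inverse transform sampling: map a uniform draw through the inverse CDF."""
    return inverse_cdf(rng.random(), sample, h)


# Hypothetical precipitation amounts (mm) for wet intervals.
amounts = [0.4, 1.2, 2.5, 3.1, 7.8, 12.0]
rng = random.Random(0)
simulated = [draw(amounts, h=1.0, rng=rng) for _ in range(5)]
print(kernel_cdf(3.0, amounts, h=1.0))
print(simulated)
```

Unlike the empirical staircase estimate, the kernel-based CDF is smooth, so its inverse is well defined at every probability level. Note that a Gaussian kernel places some mass on negative amounts, which a kernel suited to non-negative data would avoid.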

Discretization · Graph · Graphics Processing Unit · Neural Networks · Networking

March 28, 2019

Learning Discrete Structures for Graph Neural Networks

Luca Franceschi,Mathias Niepert,Massimiliano Pontil,Xiao He

from arxiv, 18 pages

Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph-structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.

Discretization · Markov Chain Monte Carlo · Latent · Exchangeable · Topic Model

January 15, 2018

Latent nested nonparametric priors

Federico Camerlenghi,David B. Dunson,Antonio Lijoi,Igor Prünster,Abel Rodríguez

Discrete random structures are important tools in Bayesian nonparametrics and the resulting models have proven effective in density estimation, clustering, topic modeling and prediction, among others. In this paper, we consider nested processes and study the dependence structures they induce. Dependence ranges between homogeneity, corresponding to full exchangeability, and maximum heterogeneity, corresponding to (unconditional) independence across samples. The popular nested Dirichlet process is shown to degenerate to the fully exchangeable case when there are ties across samples at the observed or latent level. To overcome this drawback, inherent to nesting general discrete random measures, we introduce a novel class of latent nested processes. These are obtained by adding common and group-specific completely random measures and, then, normalising to yield dependent random probability measures. We provide results on the partition distributions induced by latent nested processes, and develop a Markov chain Monte Carlo sampler for Bayesian inferences. A test for distributional homogeneity across groups is obtained as a by-product. The results and their inferential implications are showcased on synthetic and real data.

Performer · Estimation/Estimator · Empirical Risk Minimization · Empirical Risk · Variance

December 14, 2017

Variance-based regularization with convex objectives

John Duchi,Hongseok Namkoong

We develop an approach to risk minimization and stochastic optimization that provides a convex surrogate for variance, allowing near-optimal and computationally efficient trading between approximation and estimation error. Our approach builds off of techniques for distributionally robust optimization and Owen's empirical likelihood, and we provide a number of finite-sample and asymptotic results characterizing the theoretical performance of the estimator. In particular, we show that our procedure comes with certificates of optimality, achieving (in some scenarios) faster rates of convergence than empirical risk minimization by virtue of automatically balancing bias and variance. We give corroborating empirical evidence showing that in practice, the estimator indeed trades between variance and absolute performance on a training sample, improving out-of-sample (test) performance over standard empirical risk minimization for a number of classification problems.
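To show what trading between empirical risk and variance means, the sketch below evaluates a variance-penalized risk: the sample mean of the losses plus a multiple of their standard error. This penalized quantity is in general not convex in the model parameters, which is the motivation for a convex surrogate; the sketch computes the penalized quantity directly and does not implement the surrogate from the paper. The constant `c`, the function name, and the loss values are assumptions.

```python
import math


def variance_penalized_risk(losses, c):
    """Empirical risk plus c times the standard error of the losses."""
    n = len(losses)
    mean = sum(losses) / n
    var = sum((l - mean) ** 2 for l in losses) / (n - 1)  # unbiased sample variance
    return mean + c * math.sqrt(var / n)


# Hypothetical per-example losses of two models on the same sample.
losses_a = [1.0, 1.0, 1.0, 1.0]  # higher mean, no variance
losses_b = [0.0, 0.0, 0.0, 3.6]  # lower mean, high variance

print(variance_penalized_risk(losses_a, c=1.0))
print(variance_penalized_risk(losses_b, c=1.0))
```

Plain empirical risk minimization would prefer the second model (mean 0.9 against 1.0), whereas the penalized criterion prefers the first because its losses are stable.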


Related topics

Neural Networks

Estimation/Estimator

Predictor/Decision Function
