亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We consider regression estimation with modified ReLU neural networks in which network weight matrices are first modified by a function $\alpha$ before being multiplied by input vectors. We give an example of continuous, piecewise linear function $\alpha$ for which the empirical risk minimizers over the classes of modified ReLU networks with $l_1$ and squared $l_2$ penalties attain, up to a logarithmic factor, the minimax rate of prediction of unknown $\beta$-smooth function.

相關內容

Neural Stochastic Differential Equations (NSDEs) model the drift and diffusion functions of a stochastic process as neural networks. While NSDEs are known to make accurate predictions, their uncertainty quantification properties have been remained unexplored so far. We report the empirical finding that obtaining well-calibrated uncertainty estimations from NSDEs is computationally prohibitive. As a remedy, we develop a computationally affordable deterministic scheme which accurately approximates the transition kernel, when dynamics is governed by a NSDE. Our method introduces a bidimensional moment matching algorithm: vertical along the neural net layers and horizontal along the time direction, which benefits from an original combination of effective approximations. Our deterministic approximation of the transition kernel is applicable to both training and prediction. We observe in multiple experiments that the uncertainty calibration quality of our method can be matched by Monte Carlo sampling only after introducing high computational cost. Thanks to the numerical stability of deterministic training, our method also improves prediction accuracy.

Functional quadratic regression models postulate a polynomial relationship between a scalar response rather than a linear one. As in functional linear regression, vertical and specially high-leverage outliers may affect the classical estimators. For that reason, the proposal of robust procedures providing reliable estimators in such situations is an important issue. Taking into account that the functional polynomial model is equivalent to a regression model that is a polynomial of the same order in the functional principal component scores of the predictor processes, our proposal combines robust estimators of the principal directions with robust regression estimators based on a bounded loss function and a preliminary residual scale estimator. Fisher-consistency of the proposed method is derived under mild assumptions. The results of a numerical study show, for finite samples, the benefits of the robust proposal over the one based on sample principal directions and least squares. The usefulness of the proposed approach is also illustrated through the analysis of a real data set which also reveals that when the potential outliers are removed the classical and robust methods behave very similarly.

Point cloud compression (PCC) is a key enabler for various 3-D applications, owing to the universality of the point cloud format. Ideally, 3D point clouds endeavor to depict object/scene surfaces that are continuous. Practically, as a set of discrete samples, point clouds are locally disconnected and sparsely distributed. This sparse nature is hindering the discovery of local correlation among points for compression. Motivated by an analysis with fractal dimension, we propose a heterogeneous approach with deep learning for lossy point cloud geometry compression. On top of a base layer compressing a coarse representation of the input, an enhancement layer is designed to cope with the challenging geometric residual/details. Specifically, a point-based network is applied to convert the erratic local details to latent features residing on the coarse point cloud. Then a sparse convolutional neural network operating on the coarse point cloud is launched. It utilizes the continuity/smoothness of the coarse geometry to compress the latent features as an enhancement bit-stream that greatly benefits the reconstruction quality. When this bit-stream is unavailable, e.g., due to packet loss, we support a skip mode with the same architecture which generates geometric details from the coarse point cloud directly. Experimentation on both dense and sparse point clouds demonstrate the state-of-the-art compression performance achieved by our proposal. Our code is available at //github.com/InterDigitalInc/GRASP-Net.

In this paper we propose a method for nonparametric estimation and inference for heterogeneous bounds for causal effect parameters in general sample selection models where the initial treatment can affect whether a post-intervention outcome is observed or not. Treatment selection can be confounded by observable covariates while the outcome selection can be confounded by both observables and unobservables. The method provides conditional effect bounds as functions of policy relevant pre-treatment variables. It allows for conducting valid statistical inference on the unidentified conditional effect curves. We use a flexible semiparametric de-biased machine learning approach that can accommodate flexible functional forms and high-dimensional confounding variables between treatment, selection, and outcome processes. Easily verifiable high-level conditions for estimation and misspecification robust inference guarantees are provided as well.

It is well-known that the parameterized family of functions representable by fully-connected feedforward neural networks with ReLU activation function is precisely the class of piecewise linear functions with finitely many pieces. It is less well-known that for every fixed architecture of ReLU neural network, the parameter space admits positive-dimensional spaces of symmetries, and hence the local functional dimension near any given parameter is lower than the parametric dimension. In this work we carefully define the notion of functional dimension, show that it is inhomogeneous across the parameter space of ReLU neural network functions, and continue an investigation - initiated in [14] and [5] - into when the functional dimension achieves its theoretical maximum. We also study the quotient space and fibers of the realization map from parameter space to function space, supplying examples of fibers that are disconnected, fibers upon which functional dimension is non-constant, and fibers upon which the symmetry group acts non-transitively.

Categorical data are present in key areas such as health or supply chain, and this data require specific treatment. In order to apply recent machine learning models on such data, encoding is needed. In order to build interpretable models, one-hot encoding is still a very good solution, but such encoding creates sparse data. Gradient estimators are not suited for sparse data: the gradient is mainly considered as zero while it simply does not always exists, thus a novel gradient estimator is introduced. We show what this estimator minimizes in theory and show its efficiency on different datasets with multiple model architectures. This new estimator performs better than common estimators under similar settings. A real world retail dataset is also released after anonymization. Overall, the aim of this paper is to thoroughly consider categorical data and adapt models and optimizers to these key features.

Discrete data are abundant and often arise as counts or rounded data. These data commonly exhibit complex distributional features such as zero-inflation, over- or under-dispersion, boundedness, and heaping, which render many parametric models inadequate. Yet even for parametric regression models, conjugate priors and closed-form posteriors are typically unavailable, which necessitates approximations such as MCMC for posterior inference. This paper introduces a Bayesian modeling and algorithmic framework that enables semiparametric regression analysis for discrete data with Monte Carlo (not MCMC) sampling. The proposed approach pairs a nonparametric marginal model with a latent linear regression model to encourage both flexibility and interpretability, and delivers posterior consistency even under model misspecification. For a parametric or large-sample approximation of this model, we identify a class of conjugate priors with (pseudo) closed-form posteriors. All posterior and predictive distributions are available analytically or via Monte Carlo sampling. These tools are broadly useful for linear regression, nonlinear models via basis expansions, and variable selection with discrete data. Simulation studies demonstrate significant advantages in computing, prediction, estimation, and selection relative to existing alternatives. This novel approach is applied to self-reported mental health data that exhibit zero-inflation, overdispersion, boundedness, and heaping.

In this paper, we propose an analysis of Covid19 evolution and prediction on Romania combined with the mathematical model of SIRD, an extension of the classical model SIR, which includes the deceased as a separate category. The reason is that, because we can not fully trust the reported numbers of infected or recovered people, we base our analysis on the more reliable number of deceased people. In addition, one of the parameters of our model includes the proportion of infected and tested versus infected. Since there are many factors which have an impact on the evolution of the pandemic, we decide to treat the estimation and the prediction based on the previous 7 days of data, particularly important here being the number of deceased. We perform the estimation and prediction using neural networks in two steps. Firstly, by simulating data with our model, we train several neural networks which learn the parameters of the model. Secondly, we use an ensemble of ten of these neural networks to forecast the parameters from the real data of Covid19 in Romania. Many of these results are backed up by a theorem which guarantees that we can recover the parameters from the reported data.

We revisit heavy-tailed corrupted least-squares linear regression assuming to have a corrupted $n$-sized label-feature sample of at most $\epsilon n$ arbitrary outliers. We wish to estimate a $p$-dimensional parameter $b^*$ given such sample of a label-feature pair $(y,x)$ satisfying $y=\langle x,b^*\rangle+\xi$ with heavy-tailed $(x,\xi)$. We only assume $x$ is $L^4-L^2$ hypercontractive with constant $L>0$ and has covariance matrix $\Sigma$ with minimum eigenvalue $1/\mu^2>0$ and bounded condition number $\kappa>0$. The noise $\xi$ can be arbitrarily dependent on $x$ and nonsymmetric as long as $\xi x$ has finite covariance matrix $\Xi$. We propose a near-optimal computationally tractable estimator, based on the power method, assuming no knowledge on $(\Sigma,\Xi)$ nor the operator norm of $\Xi$. With probability at least $1-\delta$, our proposed estimator attains the statistical rate $\mu^2\Vert\Xi\Vert^{1/2}(\frac{p}{n}+\frac{\log(1/\delta)}{n}+\epsilon)^{1/2}$ and breakdown-point $\epsilon\lesssim\frac{1}{L^4\kappa^2}$, both optimal in the $\ell_2$-norm, assuming the near-optimal minimum sample size $L^4\kappa^2(p\log p + \log(1/\delta))\lesssim n$, up to a log factor. To the best of our knowledge, this is the first computationally tractable algorithm satisfying simultaneously all the mentioned properties. Our estimator is based on a two-stage Multiplicative Weight Update algorithm. The first stage estimates a descent direction $\hat v$ with respect to the (unknown) pre-conditioned inner product $\langle\Sigma(\cdot),\cdot\rangle$. The second stage estimate the descent direction $\Sigma\hat v$ with respect to the (known) inner product $\langle\cdot,\cdot\rangle$, without knowing nor estimating $\Sigma$.

As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have given an intelligent organization to ease the evaluation of future research in this area.

北京阿比特科技有限公司