
We propose a new iterative method using machine learning algorithms to fit an imprecise regression model to data that consist of intervals rather than point values. The method is based on a single-layer interval neural network which can be trained to produce an interval prediction. It seeks parameters for the optimal model that minimize the mean squared error between the actual and predicted interval values of the dependent variable, using first-order gradient-based optimization together with interval-analysis computations to model the measurement imprecision of the data. The method captures the relationship between the explanatory variables and a dependent variable by fitting an imprecise regression model that is linear with respect to the unknown interval parameters even when the regression model is nonlinear in the explanatory variables. We consider the explanatory variables to be precise point values, whereas the measured dependent values are characterized by interval bounds without any probabilistic information. Thus, the imprecision is modeled non-probabilistically, while the scatter of the dependent values is modeled probabilistically by homoscedastic Gaussian distributions. The proposed iterative method estimates the lower and upper bounds of the expectation region, which is the envelope of all possible precise regression lines obtained by ordinary regression analysis on any configuration of real-valued points drawn from the respective intervals, paired with their x-values.
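As a concrete illustration, the following is a minimal sketch (not the authors' implementation) of fitting an interval linear model by gradient descent on the interval mean squared error. It assumes nonnegative x-values so that interval multiplication keeps the predicted bounds ordered; all data and settings are invented for the example.

```python
# Minimal sketch, not the authors' code: gradient descent on the interval MSE
# for a linear model with interval parameters [w_lo, w_hi], [b_lo, b_hi].
# Assumes x >= 0 so the predicted bounds are simply lo = w_lo*x + b_lo, etc.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 5, 200)                        # precise explanatory variable
y_mid = 2.0 * x + 1.0 + rng.normal(0, 0.3, 200)   # latent precise response
half = rng.uniform(0.2, 0.8, 200)                 # measurement imprecision
y_lo, y_hi = y_mid - half, y_mid + half           # observed interval bounds

w_lo = w_hi = b_lo = b_hi = 0.0
lr = 1e-3
for _ in range(5000):
    e_lo = (w_lo * x + b_lo) - y_lo               # lower-bound residuals
    e_hi = (w_hi * x + b_hi) - y_hi               # upper-bound residuals
    # first-order gradient step on the interval mean squared error
    w_lo -= lr * 2 * np.mean(e_lo * x); b_lo -= lr * 2 * np.mean(e_lo)
    w_hi -= lr * 2 * np.mean(e_hi * x); b_hi -= lr * 2 * np.mean(e_hi)
    # keep the parameter intervals ordered
    w_lo, w_hi = min(w_lo, w_hi), max(w_lo, w_hi)
    b_lo, b_hi = min(b_lo, b_hi), max(b_lo, b_hi)

print(f"w in [{w_lo:.2f}, {w_hi:.2f}], b in [{b_lo:.2f}, {b_hi:.2f}]")
```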

Related content

In this paper we propose a new time-varying econometric model, called Time-Varying Poisson AutoRegressive with eXogenous covariates (TV-PARX), suited to model and forecast time series of counts. We show that the score-driven framework is particularly suitable for recovering the evolution of time-varying parameters and provides the flexibility required to model and forecast time series of counts characterized by convoluted nonlinear dynamics and structural breaks. We study the asymptotic properties of the TV-PARX model and prove that, under mild conditions, maximum likelihood estimation (MLE) yields strongly consistent and asymptotically normal parameter estimates. Finite-sample performance and forecasting accuracy are evaluated through Monte Carlo simulations. The empirical usefulness of the time-varying specification of the proposed TV-PARX model is shown by analyzing the number of new daily COVID-19 infections in Italy and the number of corporate defaults in the US.
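For intuition, here is a toy simulation of a Poisson autoregression with an exogenous covariate in which the intercept follows score-driven dynamics, updated by the lagged prediction error y - λ (the score with respect to the log-intensity). The parameterization is invented for the example and is not the paper's exact TV-PARX specification.

```python
# Toy score-driven Poisson autoregression with an exogenous covariate;
# an illustration of the model class, not the paper's exact TV-PARX.
import numpy as np

rng = np.random.default_rng(1)
T = 500
x = rng.normal(size=T)                  # exogenous covariate
alpha, beta, gamma = 0.3, 0.4, 0.2      # fixed autoregressive coefficients
a, b, omega_bar = 0.05, 0.9, 0.5        # dynamics of the time-varying intercept

y = np.zeros(T, dtype=int)
lam = np.ones(T)
omega = np.full(T, omega_bar)
for t in range(1, T):
    # intercept updated by the lagged score y - lambda
    omega[t] = omega_bar * (1 - b) + b * omega[t-1] + a * (y[t-1] - lam[t-1])
    lam[t] = max(omega[t] + alpha * y[t-1] + beta * lam[t-1] + gamma * x[t], 1e-6)
    y[t] = rng.poisson(lam[t])
```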

Implicit Processes (IPs) are a flexible framework that can describe a wide variety of models, from Bayesian neural networks and neural samplers to data generators, among many others. IPs also allow for approximate inference in function space. This formulation sidesteps intrinsic degeneracies of parameter-space approximate inference that stem from the high number of parameters and their strong dependencies in large models. To this end, previous works in the literature have attempted to employ IPs both to set up the prior and to approximate the resulting posterior. However, this has proven to be a challenging task. Existing methods that can tune the prior IP result in a Gaussian predictive distribution, which fails to capture important data patterns. By contrast, methods that produce flexible predictive distributions by using a second IP to approximate the posterior process cannot tune the prior IP to the observed data. We propose here the first method that can accomplish both goals. To do so, we rely on an inducing-point representation of the prior IP, as is often done in the context of sparse Gaussian processes. The result is a scalable method for approximate inference with IPs that can tune the prior IP parameters to the data and that provides accurate non-Gaussian predictive distributions.
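To make the notion of an implicit process concrete, the sketch below draws prior functions from a neural sampler (a single-hidden-layer network with random weights) and summarizes the resulting non-Gaussian marginals by pointwise quantiles. The architecture and sizes are arbitrary choices for illustration, not the paper's method.

```python
# Illustrative implicit process prior: a neural sampler (single-hidden-layer
# network with random weights) induces a distribution over functions whose
# marginals are generally non-Gaussian. Architecture and sizes are arbitrary.
import numpy as np

def sample_prior_function(x, rng, hidden=50):
    # one draw f(.) from the IP prior
    W1 = rng.normal(size=(1, hidden)); b1 = rng.normal(size=hidden)
    W2 = rng.normal(size=(hidden, 1)) / np.sqrt(hidden)
    return (np.tanh(x[:, None] @ W1 + b1) @ W2)[:, 0]

rng = np.random.default_rng(2)
x = np.linspace(-3, 3, 100)
draws = np.stack([sample_prior_function(x, rng) for _ in range(200)])
# summarize the sample-based (non-Gaussian) predictive by pointwise quantiles
lo, hi = np.quantile(draws, [0.05, 0.95], axis=0)
```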

Model selection in machine learning (ML) is a crucial part of the Bayesian learning procedure. Model choice may impose strong biases on the resulting predictions, which can hinder the performance of methods such as Bayesian neural networks and neural samplers. On the other hand, newly proposed approaches for Bayesian ML exploit features of approximate inference in function space with implicit stochastic processes (a generalization of Gaussian processes). The approach of Sparse Implicit Processes (SIP) is particularly successful in this regard, since it is fully trainable and achieves flexible predictions. Here, we expand on the original experiments to show that SIP is capable of correcting model bias when the data-generating mechanism differs strongly from the one implied by the model. We use synthetic datasets to show that SIP is capable of providing predictive distributions that reflect the data better than the exact predictions of the initially assumed, misspecified model.
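The flavor of such a misspecification experiment can be reproduced in a few lines: generate data whose noise is bimodal and note that any single-Gaussian predictive hides the two modes. This toy contrast is purely illustrative; the paper's experiments use SIP rather than the simple diagnostic below.

```python
# Toy misspecification experiment: the true noise is bimodal, which a single
# Gaussian likelihood cannot represent. Illustrative only; the paper's
# experiments use SIP rather than the simple contrast shown here.
import numpy as np

rng = np.random.default_rng(9)
n = 2000
x = rng.uniform(-2, 2, n)
modes = rng.choice([-1.0, 1.0], n)               # bimodal noise component
y = np.sin(x) + modes + rng.normal(0, 0.1, n)

resid = y - np.sin(x)                            # residuals around the mean
print(f"Gaussian fit: mean {resid.mean():.2f}, sd {resid.std():.2f}")
# the single-Gaussian predictive (sd ~ 1.0) misses that the residuals
# concentrate near -1 and +1; a flexible predictive would show two modes
frac_near_modes = np.mean(np.abs(np.abs(resid) - 1.0) < 0.3)
print(f"fraction of residuals within 0.3 of +/-1: {frac_near_modes:.2f}")
```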

We propose a data-driven way to reduce the noise of covariance matrices of nonstationary systems. In the case of stationary systems, asymptotic approaches have been proved to converge to the optimal solutions. Such methods produce eigenvalues that are highly dependent on the inputs, as common sense would suggest. Our approach proposes instead to use a set of eigenvalues totally independent from the inputs, which encode the long-term averaging of the influence of the future on present eigenvalues. Such an influence can be the predominant factor in nonstationary systems. Using real and synthetic data, we show that our data-driven method outperforms optimal methods designed for stationary systems for the filtering of both the covariance matrix and its inverse, as illustrated by financial portfolio variance minimization, which makes our method generically relevant to many problems of multivariate inference.
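For context, a standard baseline for the stationary case is Marchenko-Pastur eigenvalue clipping, sketched below. The paper's data-driven method replaces the noisy eigenvalues differently (with values independent of the in-sample inputs), but the filtering pipeline it plugs into has the same shape.

```python
# Baseline for the stationary case: Marchenko-Pastur eigenvalue clipping.
# The paper's nonstationary, data-driven filter replaces the noisy
# eigenvalues differently; this sketch only shows the standard pipeline
# such methods are compared against.
import numpy as np

rng = np.random.default_rng(3)
T, N = 500, 100
R = rng.normal(size=(T, N))                  # toy return matrix (T obs, N assets)
C = np.corrcoef(R, rowvar=False)
vals, vecs = np.linalg.eigh(C)
q = N / T
mp_upper = (1 + np.sqrt(q)) ** 2             # Marchenko-Pastur upper edge
noise = vals < mp_upper                      # eigenvalues consistent with noise
vals_f = vals.copy()
vals_f[noise] = vals[noise].mean()           # clip to their mean (keeps trace)
C_filtered = vecs @ np.diag(vals_f) @ vecs.T
```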

Spatial data can exhibit dependence structures more complicated than can be represented using models that rely on the traditional assumptions of stationarity and isotropy. Several statistical methods have been developed to relax these assumptions. One in particular, the "spatial deformation approach", defines a transformation from the geographic space in which data are observed to a latent space in which stationarity and isotropy are assumed to hold. Taking inspiration from this class of models, we develop a new model for spatially dependent data observed on graphs. Our method implies an embedding of the graph into Euclidean space wherein the covariance can be modeled using traditional covariance functions such as those from the Matérn family. This is done via a class of graph metrics compatible with such covariance functions. By estimating the edge weights which underlie these metrics, we can recover the "intrinsic distance" between nodes of a graph. We compare our model to existing methods for spatially dependent graph data, primarily conditional autoregressive (CAR) models and their variants, and illustrate the advantages our approach has over traditional methods. We fit our model and competitors to bird abundance data for several species in North Carolina. We find that our model fits the data best, and provides insight into the interaction between species-specific spatial distributions and geography.
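The basic construction can be sketched as follows: given (estimated) edge weights, compute a node-to-node distance and plug it into a covariance function. Note that the toy shortest-path metric below is not guaranteed to yield a valid covariance on arbitrary graphs; restricting to metrics compatible with Matérn-type covariances is precisely the paper's contribution.

```python
# Toy construction: distances from (estimated) edge weights plugged into a
# covariance function. The shortest-path metric used here is NOT guaranteed
# to produce a valid covariance in general; the paper restricts attention to
# graph metrics that are compatible with Matern-type covariance functions.
import numpy as np
from scipy.sparse.csgraph import shortest_path

# weighted adjacency matrix: entries are edge lengths (0 = no edge)
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 2.0],
              [0.0, 0.0, 2.0, 0.0]])
D = shortest_path(W, directed=False)        # "intrinsic distance" between nodes
rho, sigma2 = 1.5, 1.0                      # illustrative range and variance
K = sigma2 * np.exp(-D / rho)               # exponential kernel (Matern nu=1/2)
print(np.linalg.eigvalsh(K) > 0)            # valid here since the graph is a tree
```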

Boosting is one of the most significant developments in machine learning. This paper studies the rate of convergence of $L_2$Boosting, which is tailored for regression, in a high-dimensional setting. Moreover, we introduce so-called "post-Boosting". This is a post-selection estimator which applies ordinary least squares to the variables selected in the first stage by $L_2$Boosting. Another variant is "Orthogonal Boosting", where after each step an orthogonal projection is conducted. We show that both post-$L_2$Boosting and orthogonal boosting achieve the same rate of convergence as LASSO in a sparse, high-dimensional setting. We show that the rate of convergence of the classical $L_2$Boosting depends on the design matrix, as described by a sparse eigenvalue constant. To show the latter results, we derive new approximation results for the pure greedy algorithm, based on analyzing the revisiting behavior of $L_2$Boosting. We also introduce feasible rules for early stopping, which can be easily implemented and used in applied work. Our results also allow a direct comparison between LASSO and boosting that has been missing from the literature. Finally, we present simulation studies and applications to illustrate the relevance of our theoretical results and to provide insights into the practical aspects of boosting. In these simulation studies, post-$L_2$Boosting clearly outperforms LASSO.
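A minimal sketch of componentwise $L_2$Boosting with the post-selection OLS refit is given below; the step size, fixed number of steps, and standardization are illustrative choices rather than the paper's stopping rules.

```python
# Minimal componentwise L2Boosting with a post-OLS refit on the selected set;
# step size, number of steps, and standardization are illustrative choices.
import numpy as np

def l2boost(X, y, steps=200, nu=0.1):
    """Componentwise L2Boosting; assumes standardized columns of X."""
    n, p = X.shape
    beta, r, selected = np.zeros(p), y.copy(), set()
    for _ in range(steps):
        corr = X.T @ r                           # componentwise least squares
        j = int(np.argmax(np.abs(corr)))         # best single predictor
        coef = corr[j] / (X[:, j] @ X[:, j])
        beta[j] += nu * coef                     # shrunken update (step size nu)
        r -= nu * coef * X[:, j]
        selected.add(j)
    return beta, sorted(selected)

rng = np.random.default_rng(4)
n, p, s = 100, 500, 5
X = rng.normal(size=(n, p))
X = (X - X.mean(0)) / X.std(0)
beta_true = np.zeros(p); beta_true[:s] = 1.0
y = X @ beta_true + rng.normal(0, 0.5, n)
y = y - y.mean()

beta_boost, S = l2boost(X, y)
# post-L2Boosting: ordinary least squares on the variables selected above
beta_post = np.linalg.lstsq(X[:, S], y, rcond=None)[0]
```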

We establish the minimax risk for parameter estimation in sparse high-dimensional Gaussian mixture models and show that a constrained maximum likelihood estimator (MLE) achieves minimax optimality. However, the optimization-based constrained MLE is computationally intractable due to the non-convexity of the problem. Therefore, we propose a Bayesian approach, based on a continuous spike-and-slab prior, to estimate high-dimensional Gaussian mixtures whose cluster centers exhibit sparsity, and prove that the posterior contraction rate of the proposed Bayesian method is minimax optimal. The mis-clustering rate is obtained as a by-product using tools from matrix perturbation theory. Computationally, posterior inference of the proposed Bayesian method can be implemented via an efficient Gibbs sampler with data augmentation, circumventing the challenging nonconvex optimization faced by frequentist approaches. The proposed Bayesian sparse Gaussian mixture model does not require pre-specifying the number of clusters, which is allowed to grow with the sample size and can be adaptively estimated via posterior inference. The validity and usefulness of the proposed method are demonstrated through simulation studies and the analysis of a real-world single-cell RNA sequencing dataset.
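A stripped-down Gibbs sweep with data augmentation might look as follows; this toy version fixes the number of clusters, uses unit noise variance and a plain Gaussian prior on the centers, and omits the continuous spike-and-slab draws that induce sparsity in the paper.

```python
# Stripped-down Gibbs sampler with data augmentation for a Gaussian mixture:
# K fixed, unit noise variance, plain N(0, tau2 I) prior on the centers.
# The paper's continuous spike-and-slab draws (which induce sparsity in the
# center coordinates) are omitted here for brevity.
import numpy as np

rng = np.random.default_rng(5)
n, p, K, tau2 = 300, 10, 2, 10.0
mu_true = np.zeros((K, p)); mu_true[0, :3] = 3.0     # sparse cluster centers
z_true = rng.integers(0, K, n)
X = mu_true[z_true] + rng.normal(size=(n, p))

mu = rng.normal(size=(K, p))
for sweep in range(200):
    # 1) data augmentation: draw cluster labels given the current centers
    logp = -0.5 * ((X[:, None, :] - mu[None]) ** 2).sum(-1)
    logp -= logp.max(1, keepdims=True)
    prob = np.exp(logp); prob /= prob.sum(1, keepdims=True)
    z = (rng.random(n)[:, None] > np.cumsum(prob, 1)).sum(1)
    # 2) draw centers given labels (conjugate normal update per cluster)
    for k in range(K):
        Xk = X[z == k]; nk = len(Xk)
        var = 1.0 / (nk + 1.0 / tau2)
        mean = var * Xk.sum(0) if nk else np.zeros(p)
        mu[k] = mean + np.sqrt(var) * rng.normal(size=p)
```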

This paper studies the design of two-wave experiments in the presence of spillover effects when the researcher aims to conduct precise inference on treatment effects. We consider units connected through a single network, local dependence among individuals, and a general class of estimands encompassing average treatment and average spillover effects. We introduce a statistical framework for designing two-wave experiments on networks, in which the researcher optimizes over participants and treatment assignments to minimize the variance of the estimators of interest, using a first-wave (pilot) experiment to estimate that variance. We derive guarantees for inference on treatment effects and regret guarantees on the variance obtained from the proposed design mechanism. Our results illustrate the existence of a trade-off in the choice of the pilot study and formally characterize the pilot's size relative to the main experiment. Simulations on synthetic and real-world networks illustrate the advantages of the method.
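Stripped of networks and spillovers, the pilot-then-optimize logic reduces to a familiar special case: estimate arm variances from the first wave and choose the second-wave allocation that minimizes the variance of the difference-in-means estimator (Neyman allocation). The sketch below shows only this simplified case, not the paper's network design.

```python
# Simplest form of the pilot-then-optimize idea, without networks/spillovers:
# estimate arm variances from the first wave, then set the second-wave
# treatment share by Neyman allocation. Numbers are illustrative.
import numpy as np

rng = np.random.default_rng(6)
pilot_treated = rng.normal(1.0, 2.0, 30)     # pilot outcomes, treated arm
pilot_control = rng.normal(0.0, 1.0, 30)     # pilot outcomes, control arm
s_t = pilot_treated.std(ddof=1)
s_c = pilot_control.std(ddof=1)

share_treated = s_t / (s_t + s_c)            # minimizes diff-in-means variance
n_main = 500
n_t = round(n_main * share_treated)
print(f"assign {n_t}/{n_main} main-wave units to treatment "
      f"(share {share_treated:.2f})")
```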

A time-varying zero-inflated serially dependent Poisson process is proposed. The model assumes that the intensity of the Poisson process evolves according to a generalized autoregressive conditional heteroscedastic (GARCH) formulation. The proposed model is a generalization of the zero-inflated Poisson Integer GARCH model proposed by Fukang Zhu in 2012, which in turn is a generalization of the Integer GARCH (INGARCH) model introduced by Ferland, Latour, and Oraichi in 2006. The proposed model builds on previous work by allowing the zero-inflation parameter to vary over time, governed by a deterministic function or by an exogenous variable. Both the Expectation Maximization (EM) and the Maximum Likelihood Estimation (MLE) approaches are presented as possible estimation methods. A simulation study shows that both parameter estimation methods provide good estimates. Applications to two real-life data sets show that the proposed INGARCH model provides a better fit than the traditional zero-inflated INGARCH model in the cases considered.
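A simulation of such a data-generating process might look as follows; the INGARCH(1,1) coefficients and the logistic link driving the time-varying zero-inflation probability are illustrative choices, not estimates from the paper.

```python
# Simulating a zero-inflated INGARCH(1,1) whose zero-inflation probability
# varies over time through a logistic link in an exogenous variable.
# All parameter values are illustrative, not estimates from the paper.
import numpy as np

rng = np.random.default_rng(7)
T = 400
x = np.sin(2 * np.pi * np.arange(T) / 100)      # exogenous driver
omega, alpha, beta = 0.5, 0.3, 0.4              # INGARCH(1,1) parameters
pi_t = 1.0 / (1.0 + np.exp(-(-1.0 + 1.5 * x)))  # time-varying inflation prob

y = np.zeros(T, dtype=int)
lam = np.full(T, omega / (1 - alpha - beta))    # start at the stationary mean
for t in range(1, T):
    lam[t] = omega + alpha * y[t-1] + beta * lam[t-1]
    y[t] = 0 if rng.random() < pi_t[t] else rng.poisson(lam[t])
```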

Deep operator learning has emerged as a promising tool for reduced-order modelling and PDE model discovery. Leveraging the expressive power of deep neural networks, especially in high dimensions, such methods learn the mapping between functional state variables. While proposed methods have assumed noise only in the dependent variables, experimental and numerical data for operator learning typically exhibit noise in the independent variables as well, since both variables represent signals that are subject to measurement error. In regression on scalar data, failure to account for noisy independent variables can lead to biased parameter estimates. With noisy independent variables, linear models fitted via ordinary least squares (OLS) will show attenuation bias, wherein the slope will be underestimated. In this work, we derive an analogue of attenuation bias for linear operator regression with white noise in both the independent and dependent variables. In the nonlinear setting, we computationally demonstrate underprediction of the action of the Burgers operator in the presence of noise in the independent variable. We propose error-in-variables (EiV) models for two operator regression methods, MOR-Physics and DeepONet, and demonstrate that these new models reduce bias in the presence of noisy independent variables for a variety of operator learning problems. Considering the Burgers operator in 1D and 2D, we demonstrate that EiV operator learning robustly recovers operators in high-noise regimes that defeat OLS operator learning. We also introduce an EiV model for time-evolving PDE discovery and show that OLS and EiV perform similarly in learning the Kuramoto-Sivashinsky evolution operator from corrupted data, suggesting that the effect of bias in OLS operator learning depends on the regularity of the target operator.
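The scalar analogue mentioned above is easy to reproduce: with noise in the independent variable, the OLS slope shrinks by the reliability ratio, and dividing by that ratio de-attenuates the estimate (assuming the noise variance is known). This is only the classical scalar correction, not the paper's EiV operator models.

```python
# Scalar analogue of attenuation bias: OLS with noise in the independent
# variable shrinks the slope by the reliability ratio; dividing by that ratio
# de-attenuates it. Assumes the noise variance se2 is known. Illustrative only.
import numpy as np

rng = np.random.default_rng(8)
n, slope, sx2, se2 = 10_000, 2.0, 1.0, 0.5
x_true = rng.normal(0, np.sqrt(sx2), n)
x_obs = x_true + rng.normal(0, np.sqrt(se2), n)    # noisy independent variable
y = slope * x_true + rng.normal(0, 0.1, n)

b_ols = np.cov(x_obs, y)[0, 1] / x_obs.var(ddof=1)
b_eiv = b_ols / (sx2 / (sx2 + se2))                # de-attenuated estimate
print(f"OLS {b_ols:.2f} (biased low), corrected {b_eiv:.2f}, true {slope}")
```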
