
With the rapid advances of data acquisition techniques, spatio-temporal data are becoming increasingly abundant in a diverse array of disciplines. Here we develop spatio-temporal regression methodology for analyzing large amounts of spatially referenced data collected over time, motivated by environmental studies utilizing remotely sensed satellite data. In particular, we specify a semiparametric autoregressive model without the usual Gaussian assumption and devise a computationally scalable procedure that enables the regression analysis of large datasets. We estimate the model parameters by quasi maximum likelihood and show that the computational complexity can be reduced from cubic to linear in the sample size. We further establish asymptotic properties under suitable regularity conditions, which confirm that the computational procedure is efficient and scalable. A simulation study is conducted to evaluate the finite-sample properties of the parameter estimation and statistical inference. We illustrate our methodology with a dataset of over 2.96 million observations of annual land surface temperature, and a comparison with an existing state-of-the-art approach highlights the advantages of our method.
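
As a rough illustration of the quasi-maximum-likelihood idea (not the authors' semiparametric spatio-temporal model), the sketch below fits a simple first-order autoregressive regression by maximizing a working Gaussian likelihood even though the simulated errors are non-Gaussian; all names and settings are illustrative.

```python
# Toy sketch of quasi-maximum-likelihood (QML) estimation for
# y_t = rho * y_{t-1} + x_t' beta + e_t, with non-Gaussian errors.
# This is NOT the paper's model; it only illustrates the QML principle.
import numpy as np
from scipy.optimize import minimize

def neg_quasi_loglik(params, y, X):
    rho, sigma2 = params[0], np.exp(params[-1])     # log-parameterized variance
    beta = params[1:-1]
    resid = y[1:] - rho * y[:-1] - X[1:] @ beta     # one-step-ahead residuals
    n = resid.size
    return 0.5 * (n * np.log(2 * np.pi * sigma2) + np.sum(resid**2) / sigma2)

rng = np.random.default_rng(0)
T, p = 500, 2
X = rng.normal(size=(T, p))
y = np.zeros(T)
for t in range(1, T):                               # simulate with Laplace (non-Gaussian) errors
    y[t] = 0.6 * y[t - 1] + X[t] @ np.array([1.0, -0.5]) + rng.laplace(scale=0.5)

start = np.zeros(p + 2)
fit = minimize(neg_quasi_loglik, start, args=(y, X), method="BFGS")
print("rho and beta estimates:", fit.x[:-1])        # compare with 0.6, 1.0, -0.5
```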

Related content

Latent class analysis (LCA) is a useful tool for investigating the heterogeneity of a disease population with time-to-event data. We propose a new method based on the nonparametric maximum likelihood estimator (NPMLE), which facilitates a theoretically validated inference procedure for covariate effects and cumulative hazard functions. We assess the proposed method via extensive simulation studies and demonstrate improved predictive performance over the standard Cox regression model. We further illustrate the practical utility of the proposed method through an application to a mild cognitive impairment (MCI) cohort dataset.
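
For context, the sketch below computes a standard nonparametric (Nelson-Aalen-type) cumulative hazard estimate for a single homogeneous sample; the latent-class method above estimates class-specific versions of this quantity jointly with covariate effects, which this sketch does not attempt.

```python
# Minimal Nelson-Aalen-type cumulative hazard estimate (no ties assumed).
import numpy as np

def nelson_aalen(time, event):
    """Cumulative hazard H(t) evaluated at each observed failure time."""
    order = np.argsort(time)
    time, event = time[order], event[order]
    n = len(time)
    at_risk = n - np.arange(n)                 # risk-set size just before each time
    t_fail = time[event == 1]
    increments = 1.0 / at_risk[event == 1]     # one failure per increment
    return t_fail, np.cumsum(increments)

rng = np.random.default_rng(8)
t = rng.exponential(scale=2.0, size=300)       # true hazard rate 1/2
c = rng.exponential(scale=3.0, size=300)       # censoring times
obs, ev = np.minimum(t, c), (t <= c).astype(int)
times, H = nelson_aalen(obs, ev)
print("truth vs estimate at last failure time:", times[-1] / 2, H[-1])
```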

Wearable devices such as the ActiGraph are now commonly used in health studies to monitor or track physical activity. This trend aligns well with the growing need to accurately assess the effects of physical activity on health outcomes such as obesity. When assessing the association between these device-based physical activity measures and health outcomes such as body mass index, the device-based data are treated as functions, while the outcome is scalar-valued. The regression model applied in these settings is scalar-on-function regression (SoFR). Most estimation approaches in SoFR assume that the functional covariates are precisely observed, or treat measurement errors as simple random noise. Violation of this assumption can lead to both underestimation of the model parameters and suboptimal analysis. Measurement-error-corrected approaches to SoFR are sparse in the frequentist literature and virtually non-existent in the Bayesian literature. This paper considers a fully nonparametric Bayesian measurement-error-corrected SoFR model that relaxes the constraining assumptions often made in these models. Our estimation relies on an instrumental variable (IV) to identify the measurement error model, and we introduce an IV quality scalar parameter that is estimated jointly with all model parameters. Our method is easy to implement, and we demonstrate its finite-sample properties through extensive simulations. Finally, the developed methods are applied to the National Health and Nutrition Examination Survey to assess the relationship between wearable-device-based measures of physical activity and body mass index among adults living in the United States.
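
As a point of reference, the following sketch fits a naive scalar-on-function regression by expanding the functional coefficient in a small Fourier basis and ignoring measurement error entirely; it is not the Bayesian IV-corrected model described above, and all names and settings are illustrative.

```python
# Naive SoFR fit: expand beta(t) in a small Fourier basis and solve by
# least squares. Measurement error in the curves is ignored on purpose.
import numpy as np

rng = np.random.default_rng(1)
n, T = 200, 100                       # subjects and grid points (e.g., minutes of activity)
tgrid = np.linspace(0, 1, T)
W = rng.normal(size=(n, T)).cumsum(axis=1) / np.sqrt(T)   # observed activity curves

K = 5                                 # number of basis functions
basis = np.column_stack([np.ones(T)] +
                        [np.sin(2 * np.pi * k * tgrid) for k in range(1, K)])
true_beta = np.sin(2 * np.pi * tgrid)                      # hypothetical beta(t)
y = W @ true_beta / T + rng.normal(scale=0.1, size=n)      # scalar outcome (e.g., BMI)

Z = W @ basis / T                     # design matrix: integral of W_i(t) * basis_k(t) dt
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
beta_hat = basis @ coef               # estimated functional coefficient on the grid
print("max abs error of beta(t):", np.abs(beta_hat - true_beta).max())
```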

We study frequentist risk properties of predictive density estimators for mean mixtures of multivariate normal distributions, involving an unknown location parameter $\theta \in \mathbb{R}^d$, and which include multivariate skew normal distributions. We provide explicit representations for Bayesian posterior and predictive densities, including the benchmark minimum risk equivariant (MRE) density, which is minimax and generalized Bayes with respect to an improper uniform density for $\theta$. For four dimensions or more, we obtain Bayesian densities that improve uniformly on the MRE density under Kullback-Leibler loss. We also provide plug-in type improvements, investigate implications for certain types of parametric restrictions on $\theta$, and illustrate and comment on the findings based on numerical evaluations.
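
For reference, and with notation of our own choosing rather than the paper's, the Kullback-Leibler loss and the Bayes predictive density referred to above take the standard forms

```latex
% Kullback-Leibler loss of a predictive density \hat q for the true density p_\theta:
$$ L_{\mathrm{KL}}\bigl(\theta,\hat q\bigr) \;=\; \int p_\theta(y)\,\log\frac{p_\theta(y)}{\hat q(y)}\,dy. $$
% Bayes predictive density under a prior \pi, given data x:
$$ \hat q_\pi(y \mid x) \;=\; \int p_\theta(y)\,\pi(\theta \mid x)\,d\theta, $$
% with the MRE density corresponding to the improper uniform prior \pi(\theta) \propto 1.
```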

Algorithmic feature learners provide high-dimensional vector representations for non-matrix structured signals, like images, audio, text, and graphs. Low-dimensional projections derived from these representations can be used to explore variation across collections of these data. However, it is not clear how to assess the uncertainty associated with these projections. We adapt methods developed for bootstrapping principal components analysis to the setting where features are learned from non-matrix data. We empirically compare the derived confidence regions in simulations, varying factors that influence both feature learning and the bootstrap. Approaches are illustrated on spatial proteomic data. Code, data, and trained models are released as an R compendium.
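
A minimal sketch of the bootstrap idea, with illustrative choices (a plain SVD-based PCA and a Procrustes rotation for alignment) rather than the released R compendium:

```python
# Bootstrap the rows of a learned-feature matrix, recompute the PCA
# projection, and align each replicate to the original layout so that
# per-sample variability of the projection can be summarized.
import numpy as np
from scipy.linalg import orthogonal_procrustes

def pca_scores(X, k=2):
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                      # n x k projection

rng = np.random.default_rng(2)
Z = rng.normal(size=(300, 128))               # stand-in for learned feature vectors
base = pca_scores(Z)

boot_scores = []
for _ in range(200):
    idx = rng.integers(0, Z.shape[0], Z.shape[0])      # bootstrap resample of rows
    sc = pca_scores(Z[idx])
    R, _ = orthogonal_procrustes(sc, base[idx])        # align to the original layout
    aligned = np.full_like(base, np.nan)
    aligned[idx] = sc @ R                              # scores for resampled points
    boot_scores.append(aligned)

spread = np.nanstd(np.stack(boot_scores), axis=0)      # per-point bootstrap spread
print("median bootstrap SD of PC1 scores:", np.median(spread[:, 0]))
```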

Massive survival datasets are becoming increasingly prevalent with the development of the healthcare industry. Such datasets pose computational challenges unprecedented in traditional survival analysis use cases. A popular way of coping with massive datasets is to downsample them to a more manageable size, so that the required computational resources remain affordable to the researcher. Cox proportional hazards regression has remained one of the most popular statistical models for the analysis of survival data to date. This work addresses the setting of right-censored and possibly left-truncated data with rare events, such that the observed failure times constitute only a small portion of the overall sample. We propose Cox regression subsampling-based estimators that approximate their full-data partial-likelihood-based counterparts by assigning optimal sampling probabilities to censored observations and including all observed failures in the analysis. Asymptotic properties of the proposed estimators are established under suitable regularity conditions, and simulation studies are carried out to evaluate their finite-sample performance. We further apply our procedure to UK Biobank data on colorectal cancer genetic and environmental risk factors.
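
A simplified sketch of the subsampling idea is given below, using uniform sampling probabilities for the censored records and inverse-probability weights in a weighted Cox fit; the paper derives optimal (non-uniform) probabilities, which this sketch does not implement, and the lifelines-based fit and all settings are illustrative.

```python
# Keep every observed failure, subsample the many censored records, and fit
# a weighted Cox model with inverse-probability weights on the censored rows.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(3)
n = 50_000
x = rng.normal(size=n)
latent = rng.exponential(scale=np.exp(-0.5 * x))        # true log hazard ratio 0.5
cens = rng.exponential(scale=0.05, size=n)              # heavy censoring -> rare events
time = np.minimum(latent, cens)
event = (latent <= cens).astype(int)

df = pd.DataFrame({"time": time, "event": event, "x": x})
failures = df[df.event == 1].copy()
censored = df[df.event == 0]

q = 2_000 / len(censored)                               # keep ~2000 censored records
keep = censored.sample(frac=q, random_state=0).copy()
keep["w"] = 1.0 / q                                     # inverse-probability weight
failures["w"] = 1.0                                     # all failures are retained

sub = pd.concat([failures, keep])
cph = CoxPHFitter()
cph.fit(sub, duration_col="time", event_col="event", weights_col="w", robust=True)
print(cph.params_)                                      # compare with the true value 0.5
```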

Recently, conditional average treatment effect (CATE) estimation has been attracting much attention due to its importance in fields such as statistics and the social and biomedical sciences. This study proposes a partially linear nonparametric Bayes model for heterogeneous treatment effect estimation. A partially linear model is a semiparametric model that consists of linear and nonparametric components in an additive form. A nonparametric Bayes model that uses a Gaussian process for the nonparametric component has already been studied; however, that model cannot handle heterogeneity of the treatment effect. In our proposed model, both the nonparametric component and the heterogeneous effect of the treatment variable are modeled with Gaussian process priors. We derive the analytic form of the posterior distribution of the CATE and prove that the posterior is consistent, that is, it concentrates around the true distribution. We show the effectiveness of the proposed method through numerical experiments based on synthetic data.
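
As a simplified illustration (a T-learner with two separate Gaussian process regressions rather than the joint partially linear model above), the sketch below recovers a heterogeneous treatment effect from synthetic data; names and settings are ours.

```python
# T-learner sketch: fit one GP regression per treatment arm and take the
# difference of posterior means as the CATE estimate on a grid.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(4)
n = 400
X = rng.uniform(-2, 2, size=(n, 1))
T = rng.integers(0, 2, size=n)
tau = np.sin(X[:, 0])                                   # true heterogeneous effect
y = 0.5 * X[:, 0] + T * tau + rng.normal(scale=0.3, size=n)

kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gp1 = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X[T == 1], y[T == 1])
gp0 = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X[T == 0], y[T == 0])

grid = np.linspace(-2, 2, 50).reshape(-1, 1)
cate_hat = gp1.predict(grid) - gp0.predict(grid)        # estimated CATE on a grid
print("max error vs sin(x):", np.abs(cate_hat - np.sin(grid[:, 0])).max())
```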

Given a bichromatic point set $P=\textbf{R} \cup \textbf{B}$ of red and blue points, a separator is an object of a certain type that separates $\textbf{R}$ and $\textbf{B}$. We study the geometric separability problem when the separator is (a) a rectangular annulus of fixed orientation, (b) a rectangular annulus of arbitrary orientation, (c) a square annulus of fixed orientation, or (d) an orthogonal convex polygon. In this paper, we give polynomial-time algorithms to construct separators of each of the above types that also optimize a given parameter.
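
A naive feasibility check for the fixed-orientation rectangular-annulus case is sketched below; it only decides whether a separator exists and does not optimize any parameter, unlike the algorithms in the paper.

```python
# Feasibility check for an axis-parallel rectangular-annulus separator:
# take the outer rectangle as the bounding box of the points that must lie
# in the annulus, the inner hole as the bounding box of the other color's
# points trapped inside it, and test that the hole is empty of the first color.
import numpy as np

def bbox(P):
    return P.min(axis=0), P.max(axis=0)

def inside(P, lo, hi):
    return np.all((P >= lo) & (P <= hi), axis=1)

def annulus_separates(inside_pts, outside_pts):
    """Can inside_pts lie in an axis-parallel annulus with outside_pts excluded?"""
    lo_out, hi_out = bbox(inside_pts)                # smallest admissible outer rect
    trapped = outside_pts[inside(outside_pts, lo_out, hi_out)]
    if trapped.size == 0:
        return True                                  # a degenerate (empty) hole works
    lo_in, hi_in = bbox(trapped)                     # smallest hole covering trapped pts
    return not np.any(inside(inside_pts, lo_in, hi_in))

rng = np.random.default_rng(5)
B = rng.uniform(-2, 2, size=(100, 2))
B = B[np.abs(B).max(axis=1) > 1]                     # blue points in a square ring
R = rng.uniform(-0.8, 0.8, size=(40, 2))             # red points in the hole
print(annulus_separates(B, R) or annulus_separates(R, B))
```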

Long-term outcomes of experimental evaluations are necessarily observed after long delays. We develop semiparametric methods for combining the short-term outcomes of an experimental evaluation with observational measurements of the joint distribution of short-term and long-term outcomes to estimate long-term treatment effects. We characterize semiparametric efficiency bounds for estimation of the average effect of a treatment on a long-term outcome in several instances of this problem. These calculations facilitate the construction of semiparametrically efficient estimators. The finite-sample performance of these estimators is analyzed with a simulation calibrated to a randomized evaluation of the long-term effects of a poverty alleviation program.
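
A simplified surrogate-index style plug-in estimator is sketched below to illustrate how the two samples are combined; it is not one of the semiparametrically efficient estimators derived in the paper, and all names and settings are illustrative.

```python
# Learn E[long-term outcome | short-term outcomes] in observational data,
# impute it for the experimental units, and contrast the arms.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(6)

# Observational data: short-term outcomes S and long-term outcome Y jointly observed.
n_obs = 5_000
S_obs = rng.normal(size=(n_obs, 3))
Y_obs = S_obs @ np.array([1.0, 0.5, -0.25]) + rng.normal(scale=1.0, size=n_obs)
surrogate_index = LinearRegression().fit(S_obs, Y_obs)

# Experimental data: only short-term outcomes observed, with random assignment D.
n_exp = 2_000
D = rng.integers(0, 2, size=n_exp)
S_exp = rng.normal(size=(n_exp, 3)) + 0.3 * D[:, None]   # treatment shifts short-term outcomes
Y_hat = surrogate_index.predict(S_exp)                   # imputed long-term outcome

effect = Y_hat[D == 1].mean() - Y_hat[D == 0].mean()
print("estimated long-term treatment effect:", effect)   # truth here is 0.375
```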

With the development of deep learning, Deep Metric Learning (DML) has achieved great improvements in face recognition. However, the widely used softmax loss often brings large intra-class variations during training, while feature normalization is exploited only in the testing process to compute pair similarities. To bridge this gap, we constrain the intra-class cosine similarity between features and weight vectors in the softmax loss to be larger than a margin during training, and extend this idea in four directions. First, we explore the effect of a hard-sample mining strategy. Second, to alleviate the human labor of tuning the margin hyper-parameter, a self-adaptive margin updating strategy is proposed. Third, a normalized version is given to take full advantage of the cosine similarity constraint. Finally, we strengthen the constraint to force the intra-class cosine similarity to exceed the mean inter-class cosine similarity by a margin in the exponential feature projection space. Extensive experiments on the Labeled Faces in the Wild (LFW), YouTube Faces (YTF), and IARPA Janus Benchmark A (IJB-A) datasets demonstrate that the proposed methods outperform mainstream DML methods and approach state-of-the-art performance.
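
A generic additive cosine-margin softmax (CosFace-style) sketch of the basic constraint is shown below; it is not the authors' exact loss nor their adaptive-margin or inter-class extensions, and the margin and scale values are illustrative.

```python
# Additive cosine-margin softmax: normalize features and class weights,
# subtract a margin from the target-class cosine, scale, then cross-entropy.
import torch
import torch.nn.functional as F

def cosine_margin_loss(features, weights, labels, s=30.0, m=0.35):
    f = F.normalize(features, dim=1)          # unit-norm features
    w = F.normalize(weights, dim=1)           # unit-norm class weight vectors
    cos = f @ w.t()                           # cosine similarities, shape (batch, classes)
    onehot = F.one_hot(labels, num_classes=weights.size(0)).float()
    logits = s * (cos - m * onehot)           # subtract margin only for the true class
    return F.cross_entropy(logits, labels)

# Toy usage with random embeddings for a batch of 8 faces and 10 identities.
feats = torch.randn(8, 128, requires_grad=True)
cls_w = torch.randn(10, 128, requires_grad=True)
lbls = torch.randint(0, 10, (8,))
loss = cosine_margin_loss(feats, cls_w, lbls)
loss.backward()
print(float(loss))
```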

We consider the task of learning the parameters of a {\em single} component of a mixture model when we are given {\em side information} about that component; we call this the "search problem" in mixture models. We would like to solve it with lower computational and sample complexity than solving the full problem, in which one learns the parameters of all components. Our main contributions are a simple but general model for the notion of side information and a corresponding simple matrix-based algorithm for solving the search problem in this general setting. We then specialize this model and algorithm to four common scenarios: Gaussian mixture models, LDA topic models, subspace clustering, and mixed linear regression. For each of these we show that if (and only if) the side information is informative, we obtain parameter estimates with greater accuracy and lower computational complexity than existing moment-based mixture-model algorithms (e.g., tensor methods). We also illustrate several natural ways one can obtain such side information for specific problem instances. Our experiments on real datasets (NY Times, Yelp, BSDS500) further demonstrate the practicality of our algorithms, showing significant improvements in runtime and accuracy.
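
As a toy illustration of how side information can seed estimation of a single component (a heuristic EM-style refinement, not the matrix-based algorithm proposed here), consider:

```python
# Seed the target Gaussian component from a few side-information samples and
# refine only that component against a crude pooled background model.
import numpy as np
from scipy.stats import multivariate_normal as mvn

rng = np.random.default_rng(7)
d = 2
means = np.array([[0.0, 0.0], [4.0, 4.0], [-4.0, 3.0]])
X = np.vstack([rng.normal(m, 1.0, size=(500, d)) for m in means])   # full mixture sample
side = rng.normal(means[1], 1.0, size=(10, d))                      # side info: component 1

mu = side.mean(axis=0)                                              # seed from side info
bg_mu, bg_cov = X.mean(axis=0), np.cov(X.T) * 4                     # crude background model
pi = 1.0 / len(means)

for _ in range(20):                                                 # refine the target only
    p_target = pi * mvn.pdf(X, mu, np.eye(d))
    p_bg = (1 - pi) * mvn.pdf(X, bg_mu, bg_cov)
    r = p_target / (p_target + p_bg)                                # responsibilities
    mu = (r[:, None] * X).sum(axis=0) / r.sum()
    pi = r.mean()

print("recovered mean of the target component:", mu)                # close to (4, 4)
```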
