
I consider the estimation of the average treatment effect (ATE) in a population that can be divided into $G$ groups, and in which one has unbiased and uncorrelated estimators of the conditional average treatment effect (CATE) in each group. These conditions are for instance met in stratified randomized experiments. I assume that the outcome is homoscedastic, and that each CATE is bounded in absolute value by $B$ standard deviations of the outcome, for some known constant $B$. I derive, across all linear combinations of the CATEs' estimators, the estimator of the ATE with the lowest worst-case mean-squared error. This estimator assigns a weight equal to group $g$'s share in the population to the most precisely estimated CATEs, and a weight proportional to one over the estimator's variance to the least precisely estimated CATEs. Given $B$, this optimal estimator is feasible: the weights only depend on known quantities. I then allow the estimators to have positive covariances, known up to the outcome's variance. This condition is met by differences-in-differences estimators in staggered adoption designs, if potential outcomes are homoscedastic and uncorrelated. Under those assumptions, I show that the minimax estimator is still feasible and can easily be computed. In realistic numerical examples, the minimax estimator can lead to substantial precision and worst-case MSE gains relative to the unbiased estimator.
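For readers who want the objective being minimized spelled out, here is a sketch in generic notation (not the paper's): writing $\hat{\tau}(w)=\sum_{g} w_g \hat{\tau}_g$ for a linear combination of the CATE estimators, with $V_g=\mathrm{Var}(\hat{\tau}_g)$, $p_g$ group $g$'s population share, and $|\tau_g|\le B\sigma$, unbiasedness and uncorrelatedness of the $\hat{\tau}_g$ give
$$\max_{|\tau_1|,\dots,|\tau_G|\le B\sigma}\mathrm{MSE}\big(\hat{\tau}(w)\big)=\sum_{g=1}^{G} w_g^2 V_g + B^2\sigma^2\Big(\sum_{g=1}^{G}|w_g-p_g|\Big)^2,$$
and the minimax estimator is the linear combination whose weights minimize this worst-case criterion.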

Related Content

Training neural networks with binary weights and activations is a challenging problem due to the lack of gradients and the difficulty of optimization over discrete weights. Many successful experimental results have been achieved with empirical straight-through (ST) approaches, which propose a variety of ad-hoc rules for propagating gradients through non-differentiable activations and updating discrete weights. At the same time, ST methods can be rigorously derived as estimators in the stochastic binary network (SBN) model with Bernoulli weights. We advance these derivations to a more complete and systematic study. We analyze their properties and estimation accuracy, obtain different forms of correct ST estimators for activations and weights, explain existing empirical approaches and their shortcomings, and explain how latent weights arise from the mirror descent method when optimizing over probabilities. This allows us to reintroduce ST methods, long known empirically, as sound approximations, to apply them with clarity, and to develop further improvements.
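For readers unfamiliar with the empirical straight-through rule the paper revisits, here is a minimal PyTorch sketch of the classical ST estimator (sign forward, clipped identity backward). It illustrates the generic ad-hoc rule discussed above, not the corrected estimators derived in the paper, and the hard-tanh clipping window is just one common choice.

```python
import torch

class SignSTE(torch.autograd.Function):
    """Sign activation with a straight-through (clipped identity) gradient."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Pass the incoming gradient through unchanged where |x| <= 1, zero elsewhere.
        return grad_output * (x.abs() <= 1).to(grad_output.dtype)

x = torch.randn(4, requires_grad=True)
SignSTE.apply(x).sum().backward()
print(x.grad)   # nonzero only where |x| <= 1
```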

In this paper, we propose a reduced-bias estimator of the extreme value index (EVI) for Pareto-type (heavy-tailed) distributions, derived using the weighted least squares method. It is shown that the estimator is unbiased, consistent, and asymptotically normal under second-order conditions on the underlying distribution of the data. The finite-sample properties of the proposed estimator are studied through a simulation study. The results show that it is competitive with existing estimators of the extreme value index in terms of bias and mean squared error. In addition, it yields estimates of $\gamma>0$ that are less sensitive to the number of top-order statistics and can hence be used for selecting an optimal tail fraction. The proposed estimator is further illustrated using practical datasets from the pedochemical and insurance fields.
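The weighted-least-squares construction itself is not spelled out in the abstract, so the sketch below only illustrates the family of ideas it belongs to: fitting a least-squares line to the upper part of a Pareto quantile plot, whose slope estimates $\gamma$, and showing how that estimate moves with the number of top-order statistics $k$. The function name and simulated data are illustrative; this is the classical unweighted regression estimator, not the proposed reduced-bias one.

```python
import numpy as np

def pareto_qq_slope(x, k):
    """OLS slope of the Pareto quantile plot using the k largest observations.

    Points are (log((n + 1) / j), log x_{(n - j + 1)}) for j = 1, ..., k;
    the slope is a simple (unweighted) estimate of the tail index gamma.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    j = np.arange(1, k + 1)
    u = np.log((n + 1) / j)      # theoretical quantiles
    v = np.log(x[n - j])         # k largest order statistics x_{(n - j + 1)}
    slope, _ = np.polyfit(u, v, 1)
    return slope

rng = np.random.default_rng(0)
sample = rng.pareto(a=2.0, size=2000) + 1.0   # Pareto tail with true gamma = 0.5
for k in (50, 100, 200, 400):
    print(k, round(pareto_qq_slope(sample, k), 3))
```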

Modeling and drawing inference on the joint associations between single nucleotide polymorphisms and a disease has sparked interest in genome-wide association studies. In the motivating Boston Lung Cancer Survival Cohort (BLCSC) data, the presence of a large number of single nucleotide polymorphisms of interest, though smaller than the sample size, challenges inference on their joint associations with the disease outcome. In similar settings, we find that neither the de-biased lasso approach (van de Geer et al. 2014), which assumes sparsity of the inverse information matrix, nor the standard maximum likelihood method can yield confidence intervals with satisfactory coverage probabilities for generalized linear models. Under this "large $n$, diverging $p$" scenario, we propose an alternative de-biased lasso approach that directly inverts the Hessian matrix without imposing the matrix-sparsity assumption, which further reduces bias compared to the original de-biased lasso and ensures valid confidence intervals with nominal coverage probabilities. We establish the asymptotic distributions of any linear combinations of the parameter estimates, which lays the theoretical ground for drawing inference. Simulations show that the proposed refined de-biased estimating method performs well in removing bias and yields honest confidence interval coverage. We use the proposed method to analyze the aforementioned BLCSC data, a large-scale hospital-based epidemiological cohort study, to investigate the joint effects of genetic variants on lung cancer risk.
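A schematic of the de-biasing step for logistic regression, with the Hessian inverted directly instead of being approximated under a matrix-sparsity assumption, is sketched below. It is a simplified illustration of the general recipe (lasso fit followed by a one-step Newton-type correction), not the authors' implementation; the penalty level and the simulated data are arbitrary.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def debias_logistic(X, y, C=1.0):
    """One-step de-biased lasso for logistic regression.

    beta_db = beta_hat + H^{-1} grad, where grad is the score of the unpenalized
    log-likelihood at the lasso estimate beta_hat and H is the full (negative)
    Hessian divided by n, inverted directly with no sparsity assumption on H^{-1}.
    """
    n, p = X.shape
    lasso = LogisticRegression(penalty="l1", solver="liblinear", C=C,
                               fit_intercept=False).fit(X, y)
    beta = lasso.coef_.ravel()
    mu = 1.0 / (1.0 + np.exp(-X @ beta))           # fitted probabilities
    grad = X.T @ (y - mu) / n                      # score of the log-likelihood / n
    H = X.T @ (X * (mu * (1 - mu))[:, None]) / n   # observed information / n
    beta_db = beta + np.linalg.solve(H, grad)      # de-biased estimate
    se = np.sqrt(np.diag(np.linalg.inv(H)) / n)    # plug-in standard errors for Wald CIs
    return beta_db, se

rng = np.random.default_rng(1)
X = rng.standard_normal((2000, 50))
beta_true = np.r_[1.0, -1.0, 0.5, np.zeros(47)]
y = rng.binomial(1, 1 / (1 + np.exp(-X @ beta_true)))
est, se = debias_logistic(X, y)
print(est[:3], se[:3])
```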

Full-history recursive multilevel Picard (MLP) approximation schemes have been shown to overcome the curse of dimensionality in the numerical approximation of high-dimensional semilinear partial differential equations (PDEs) with general time horizons and Lipschitz continuous nonlinearities. However, the error analyses for MLP approximation schemes in the existing literature study only the $L^2$-root-mean-square distance between the exact solution of the PDE under consideration and the considered MLP approximation; none of them provides an upper bound for the more general $L^p$-distance between the exact solution and the MLP approximation. The key contribution of this article is to extend the $L^2$-error analysis for MLP approximation schemes in the literature to a more general $L^p$-error analysis with $p\in (0,\infty)$. In particular, the main result of this article proves that the proposed MLP approximation scheme indeed overcomes the curse of dimensionality in the numerical approximation of high-dimensional semilinear PDEs with the approximation error measured in the $L^p$-sense with $p \in (0,\infty)$.
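For concreteness, and in generic notation rather than the article's, the $L^p$-distance referred to above can be read as
$$\big(\mathbb{E}\big[\,|U(t,x)-u(t,x)|^{p}\,\big]\big)^{1/p},\qquad p\in(0,\infty),$$
where $u$ is the exact solution of the PDE and $U$ is the (random) MLP approximation; the $L^2$-root-mean-square distance studied in earlier error analyses is the special case $p=2$.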

We derive a posteriori error estimates for a fully discrete, time-implicit finite element approximation of the stochastic total variation flow (STVF) with additive space-time noise. The estimates are first derived for an implementable fully discrete approximation of a regularized stochastic total variation flow. We then show that the derived a posteriori estimates remain valid for the unregularized flow up to a perturbation term that can be controlled by the regularization parameter. Based on the derived a posteriori estimates, we propose a pathwise algorithm for adaptive space-time refinement and perform numerical simulations for the regularized STVF to demonstrate the behavior of the proposed algorithm.

When studying treatment effects in multilevel studies, investigators commonly use (semi-)parametric estimators, which make strong parametric assumptions about the outcome, the treatment, and/or the correlation between individuals. We propose two nonparametric, doubly robust, asymptotically normal estimators of treatment effects that do not make such assumptions. The first estimator is an extension of the cross-fitting estimator to clustered settings. The second is a new estimator that uses conditional propensity scores and an outcome covariance model to improve efficiency. We apply our estimators in simulation and empirical studies and find that they consistently obtain the smallest standard errors.
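As a rough illustration of the cross-fitting ingredient behind the first estimator (ignoring the clustered-data extensions and the outcome covariance model of the second), here is a generic doubly robust AIPW estimator with sample splitting. The nuisance models, data-generating process, and function names are illustrative choices, not the authors'.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.model_selection import KFold

def aipw_crossfit(X, A, Y, n_splits=2, seed=0):
    """Cross-fitted augmented inverse probability weighting (AIPW) estimate of the ATE."""
    psi = np.zeros(len(Y))
    for train, test in KFold(n_splits, shuffle=True, random_state=seed).split(X):
        ps = GradientBoostingClassifier().fit(X[train], A[train])
        m1 = GradientBoostingRegressor().fit(X[train][A[train] == 1], Y[train][A[train] == 1])
        m0 = GradientBoostingRegressor().fit(X[train][A[train] == 0], Y[train][A[train] == 0])
        e = np.clip(ps.predict_proba(X[test])[:, 1], 0.01, 0.99)
        mu1, mu0 = m1.predict(X[test]), m0.predict(X[test])
        # Efficient influence function values on the held-out fold.
        psi[test] = (mu1 - mu0
                     + A[test] * (Y[test] - mu1) / e
                     - (1 - A[test]) * (Y[test] - mu0) / (1 - e))
    return psi.mean(), psi.std(ddof=1) / np.sqrt(len(Y))   # estimate, standard error

rng = np.random.default_rng(2)
X = rng.standard_normal((1000, 4))
A = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
Y = 2.0 * A + X[:, 1] + rng.standard_normal(1000)          # true ATE = 2
print(aipw_crossfit(X, A, Y))
```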

This article develops new closed-form variance expressions for power analyses for commonly used difference-in-differences (DID) and comparative interrupted time series (CITS) panel data estimators. The main contribution is to incorporate variation in treatment timing into the analysis. The power formulas also account for other key design features that arise in practice: autocorrelated errors, unequal measurement intervals, and clustering due to the unit of treatment assignment. We consider power formulas for both cross-sectional and longitudinal models and allow for covariates. An illustrative power analysis provides guidance on appropriate sample sizes. The key finding is that accounting for treatment timing increases required sample sizes. Further, DID estimators have considerably more power than standard CITS and interrupted time series (ITS) estimators. A Shiny R dashboard is available to perform the sample size calculations for the considered estimators.
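The paper's closed-form variance expressions are not reproduced here; the snippet below only shows the generic calculation such expressions plug into, searching for the smallest number of units whose implied standard error reaches a target power for a two-sided test. The variance formula used is a deliberately simple placeholder, not one of the paper's DID or CITS expressions.

```python
from math import sqrt
from scipy.stats import norm

def power(effect, se, alpha=0.05):
    """Power of a two-sided Wald test for a given effect size and standard error."""
    z = norm.ppf(1 - alpha / 2)
    return norm.cdf(effect / se - z) + norm.cdf(-effect / se - z)

def required_units(effect, var_of_n, target=0.80, alpha=0.05, n_max=10_000):
    """Smallest n with power >= target, given a closed-form variance var_of_n(n)."""
    for n in range(2, n_max):
        if power(effect, sqrt(var_of_n(n)), alpha) >= target:
            return n
    return None

# Placeholder variance (NOT one of the paper's formulas): cross-sectional comparison
# of means with outcome variance sigma^2 and treated share p.
sigma2, p = 1.0, 0.5
print(required_units(effect=0.25, var_of_n=lambda n: sigma2 / (n * p * (1 - p))))
```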

Counterfactual explanations are usually generated through heuristics that are sensitive to the search's initial conditions. The absence of guarantees of performance and robustness hinders trustworthiness. In this paper, we take a disciplined approach towards counterfactual explanations for tree ensembles. We advocate for a model-based search aiming at "optimal" explanations and propose efficient mixed-integer programming approaches. We show that isolation forests can be modeled within our framework to focus the search on plausible explanations with a low outlier score. We provide comprehensive coverage of additional constraints that model important objectives, heterogeneous data types, and structural constraints on the feature space, along with resource and actionability restrictions. Our experimental analyses demonstrate that the proposed search approach requires a computational effort that is orders of magnitude smaller than that of previous mathematical programming algorithms. It scales up to large data sets and tree ensembles, where it provides, within seconds, systematic explanations grounded on well-defined models solved to optimality.
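To make the model-based search idea tangible, here is a toy mixed-integer program (using PuLP) that finds the closest counterfactual for a single hand-coded decision tree: binary leaf-membership variables, big-M path constraints, an L1 objective, and a constraint that the predicted class flips. It is only a sketch of the modeling style; the paper's formulation covers full tree ensembles, isolation-forest plausibility terms, and the richer constraints listed above, none of which appear here, and the tree, bounds, and factual point are made up.

```python
import pulp

# Hand-coded decision tree (a stand-in for one member of an ensemble):
#   if x1 <= 5: (if x2 <= 3 -> leaf A, class 0; else -> leaf B, class 1)
#   else:       leaf C, class 1
x0 = {"x1": 7.0, "x2": 1.0}     # factual point, predicted class 1 (leaf C)
M, eps = 10.0, 1e-4             # big-M for features bounded in [0, 10]

prob = pulp.LpProblem("counterfactual", pulp.LpMinimize)
x1 = pulp.LpVariable("x1", 0, 10)
x2 = pulp.LpVariable("x2", 0, 10)
zA = pulp.LpVariable("zA", cat="Binary")
zB = pulp.LpVariable("zB", cat="Binary")
zC = pulp.LpVariable("zC", cat="Binary")
d1 = pulp.LpVariable("d1", 0)   # |x1 - x0["x1"]| via linearization
d2 = pulp.LpVariable("d2", 0)   # |x2 - x0["x2"]|

prob += d1 + d2                 # minimize L1 distance to the factual point
prob += zA + zB + zC == 1       # exactly one leaf is active
# Leaf-membership (path) constraints with big-M:
prob += x1 <= 5 + M * (1 - zA)
prob += x2 <= 3 + M * (1 - zA)
prob += x1 <= 5 + M * (1 - zB)
prob += x2 >= 3 + eps - M * (1 - zB)
prob += x1 >= 5 + eps - M * (1 - zC)
# Require the predicted class to flip from 1 to 0 (only leaf A predicts class 0):
prob += zA == 1
# L1 distance linearization:
prob += d1 >= x1 - x0["x1"]
prob += d1 >= x0["x1"] - x1
prob += d2 >= x2 - x0["x2"]
prob += d2 >= x0["x2"] - x2

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print({v.name: v.value() for v in (x1, x2)}, "distance:", pulp.value(prob.objective))
```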

Conditional density estimation (CDE) models can be useful for many statistical applications, especially because the full conditional density is estimated instead of a traditional regression point estimate, revealing more information about the uncertainty of the random variable of interest. In this paper, we propose a new methodology called the Odds Conditional Density Estimator (OCDE) to estimate conditional densities in a supervised learning scheme. The main idea is that directly estimating $p_{x,y}$ and $p_{x}$ in order to obtain the conditional density $p_{y|x}$ is very difficult; instead, by introducing an instrumental distribution, we transform the CDE problem into a problem of odds estimation, or, equivalently, of training a binary probabilistic classifier. We demonstrate how OCDE works using simulated data and then test its performance against other state-of-the-art CDE methods on real data. Overall, OCDE is competitive with these methods on real datasets.
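To make the classification trick concrete, the sketch below draws "fake" responses from an instrumental distribution $q(y)$, trains a probabilistic classifier to separate real $(x, y)$ pairs from instrumental ones, and converts the predicted odds into an (unnormalized) conditional density estimate via $\hat{p}(y|x) \approx q(y)\,c(x,y)/(1-c(x,y))$. This is a generic density-ratio-by-classification illustration of the idea described above, not the authors' OCDE implementation; the instrumental choice and the classifier are arbitrary.

```python
import numpy as np
from scipy.stats import norm
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(3)
n = 5000
x = rng.uniform(-2, 2, n)
y = np.sin(x) + 0.3 * rng.standard_normal(n)       # true p(y|x) = N(sin(x), 0.3^2)

# Instrumental distribution q(y): a normal fitted to the marginal of y.
q_mu, q_sd = y.mean(), y.std()
y_fake = rng.normal(q_mu, q_sd, n)                 # same x, response redrawn from q

Z = np.column_stack([np.r_[x, x], np.r_[y, y_fake]])
label = np.r_[np.ones(n), np.zeros(n)]             # 1 = real pair, 0 = instrumental pair
clf = GradientBoostingClassifier().fit(Z, label)

def cde(x0, y_grid):
    """Unnormalized conditional density estimate: q(y) * odds(x0, y)."""
    c = clf.predict_proba(np.column_stack([np.full_like(y_grid, x0), y_grid]))[:, 1]
    c = np.clip(c, 1e-3, 1 - 1e-3)
    return norm.pdf(y_grid, q_mu, q_sd) * c / (1 - c)

y_grid = np.linspace(-2, 2, 9)
print(np.round(cde(1.0, y_grid), 3))               # should peak near y = sin(1) ~ 0.84
```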

From only positive (P) and unlabeled (U) data, a binary classifier can be trained with PU learning, for which the state of the art is unbiased PU learning. However, if its model is very flexible, the empirical risk on training data will go negative, and we will suffer from serious overfitting. In this paper, we propose a non-negative risk estimator for PU learning: when it is minimized, it is more robust against overfitting, and thus we are able to use very flexible models (such as deep neural networks) given limited P data. Moreover, we analyze the bias, consistency, and mean-squared-error reduction of the proposed risk estimator, and bound the estimation error of the resulting empirical risk minimizer. Experiments demonstrate that our risk estimator fixes the overfitting problem of its unbiased counterparts.
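A minimal sketch of the non-negative risk described above, using a sigmoid surrogate loss and PyTorch tensors: the unlabeled data estimate the negative-class risk after subtracting the positives' contribution weighted by the class prior $\pi$, and that term is clipped at zero so the total risk cannot go negative. Only the risk computation is shown, not a full training procedure; the linear model and random data are purely illustrative.

```python
import torch

def nn_pu_risk(scores_p, scores_u, prior):
    """Non-negative PU risk with the sigmoid surrogate loss.

    scores_p: model outputs on positive data; scores_u: outputs on unlabeled data;
    prior: class prior pi = P(Y = +1), assumed known.
    """
    loss = lambda s, t: torch.sigmoid(-t * s)             # sigmoid loss l(s, t)
    r_p_pos = loss(scores_p, +1.0).mean()                  # positives labeled +1
    r_p_neg = loss(scores_p, -1.0).mean()                  # positives labeled -1
    r_u_neg = loss(scores_u, -1.0).mean()                  # unlabeled labeled -1
    neg_part = r_u_neg - prior * r_p_neg                   # unbiased negative-class risk
    return prior * r_p_pos + torch.clamp(neg_part, min=0.0)  # clip at zero

# Toy usage with a linear scorer on random data (illustrative only).
torch.manual_seed(0)
w = torch.zeros(5, requires_grad=True)
x_p, x_u = torch.randn(64, 5) + 1.0, torch.randn(256, 5)
risk = nn_pu_risk(x_p @ w, x_u @ w, prior=0.3)
risk.backward()
print(float(risk), w.grad.norm().item())
```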
