
In the study of causal inference, statisticians have shown growing interest in estimating and analyzing heterogeneity in causal effects in observational studies. However, there is usually a trade-off between accuracy and interpretability when developing a desirable estimator for treatment effects. To address this issue, we propose a non-parametric framework for estimating the Conditional Average Treatment Effect (CATE) function. The framework integrates two components: (i) it leverages the joint use of propensity and prognostic scores in a matching algorithm to obtain a proxy of the heterogeneous treatment effect for each observation; (ii) it utilizes non-parametric regression trees to construct an estimator for the CATE function conditional on the two scores. The method naturally stratifies treatment effects into subgroups over a two-dimensional grid whose axes are the propensity and prognostic scores. We conduct benchmark experiments on multiple simulated datasets and demonstrate clear advantages of the proposed estimator over state-of-the-art methods. We also evaluate empirical performance in real-life settings, using two observational social studies in the United States, and interpret policy implications following the numerical results.
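
As a rough illustration of the two-score pipeline described above, the following sketch (with simulated data and scikit-learn models; the variable names and modelling choices are ours, not the authors') estimates a propensity score and a prognostic score, imputes unit-level effects by nearest-neighbour matching in score space, and fits a regression tree on the score pairs.

```python
# Illustrative sketch of the two-score idea, not the authors' implementation.
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n, p = 2000, 5
X = rng.normal(size=(n, p))
Z = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))           # treatment indicator
Y = X[:, 1] + Z * (1 + X[:, 2]) + rng.normal(size=n)      # outcome

e = LogisticRegression().fit(X, Z).predict_proba(X)[:, 1]    # propensity score
m = LinearRegression().fit(X[Z == 0], Y[Z == 0]).predict(X)  # prognostic score
S = np.column_stack([e, m])

# Match each treated unit to its nearest control (and vice versa) in score space
# to obtain a unit-level proxy of the treatment effect.
nn_c = NearestNeighbors(n_neighbors=1).fit(S[Z == 0])
nn_t = NearestNeighbors(n_neighbors=1).fit(S[Z == 1])
idx_c = nn_c.kneighbors(S[Z == 1], return_distance=False).ravel()
idx_t = nn_t.kneighbors(S[Z == 0], return_distance=False).ravel()
tau = np.empty(n)
tau[Z == 1] = Y[Z == 1] - Y[Z == 0][idx_c]
tau[Z == 0] = Y[Z == 1][idx_t] - Y[Z == 0]

# A regression tree on (propensity, prognostic) stratifies the CATE over a 2-D grid.
cate_tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=100).fit(S, tau)
print(cate_tree.predict(S[:5]))
```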

Related content

A completely randomized experiment allows us to estimate the causal effect by the difference in the averages of the outcome under treatment and control. However, difference-in-means-type estimators behave poorly if the potential outcomes are heavy-tailed or contain a few extreme observations or outliers. We study an alternative estimator due to Rosenbaum that estimates the causal effect by inverting a randomization test based on ranks. We study the asymptotic properties of this estimator and develop a framework for comparing the efficiencies of different estimators of the treatment effect in the setting of randomized experiments. In particular, we show that the variance of the Rosenbaum estimator is asymptotically, in the worst case, at most about 1.16 times the variance of the difference-in-means estimator, and can be much smaller when the potential outcomes are not light-tailed. We further derive a consistent estimator of the asymptotic standard error of the Rosenbaum estimator, which immediately yields a readily computable confidence interval for the treatment effect and thereby alleviates the expensive numerical calculations needed to implement Rosenbaum's original proposal. Furthermore, we propose a regression-adjusted version of the Rosenbaum estimator to incorporate additional covariate information in randomization inference, and we prove an efficiency gain from this regression adjustment under a linear regression model. Finally, we illustrate through simulations that, unlike the difference-in-means-based estimators, whether unadjusted or regression-adjusted, these rank-based estimators are efficient and robust against heavy-tailed distributions, contamination, and various model misspecifications.
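
A minimal sketch of the rank-inversion idea, assuming a constant-effect model and the Wilcoxon rank-sum statistic (one common choice, not necessarily the exact statistic analyzed in the paper): the value of the effect that brings the rank statistic of the adjusted outcomes back to its null expectation coincides with the two-sample Hodges-Lehmann estimate, the median of pairwise treated-control differences.

```python
# Illustrative sketch only; the simulated data and statistic choice are ours.
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(1)
n = 200
Z = rng.binomial(1, 0.5, n)
Y = rng.standard_t(df=2, size=n) + 1.5 * Z   # heavy-tailed outcomes, true effect 1.5

def rank_sum(tau, Y, Z):
    """Rank-sum of treated units after removing a hypothesized constant effect tau."""
    adj = Y - tau * Z
    return rankdata(adj)[Z == 1].sum()

# Point estimate: the tau bringing the statistic to its null mean n1*(n+1)/2;
# under the constant-effect model this is the median of pairwise differences.
tau_hat = np.median(np.subtract.outer(Y[Z == 1], Y[Z == 0]))
n1 = Z.sum()
print(tau_hat, rank_sum(tau_hat, Y, Z), n1 * (n + 1) / 2)
```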

We present a method for comparing point forecasts in a region of interest, such as the tails or centre of a variable's range. This method cannot be hedged, in contrast to conditionally selecting events to evaluate and then using a scoring function that would have been consistent (or proper) prior to event selection. Our method also gives decompositions of scoring functions that are consistent for the mean or a particular quantile or expectile. Each member of each decomposition is itself a consistent scoring function that emphasises performance over a selected region of the variable's range. The score of each member of the decomposition has a natural interpretation rooted in optimal decision theory. It is the weighted average of economic regret over user decision thresholds, where the weight emphasises those decision thresholds in the corresponding region of interest.
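
A hedged worked example of the mixture idea for the mean functional: the squared error admits the classical representation $(x-y)^2=\int S_\theta(x,y)\,d\theta$ with elementary scores $S_\theta(x,y)=2\cdot\mathbf{1}\{\theta \text{ between } x \text{ and } y\}\cdot|\theta-y|$, and reweighting the decision thresholds with a nonnegative weight $w(\theta)$ yields a consistent scoring function that emphasises the chosen region. The numerical sketch below is our illustration, not the paper's construction.

```python
import numpy as np

def elementary_score(theta, x, y):
    """Elementary (extremal) score for the mean at decision threshold theta."""
    lo, hi = np.minimum(x, y), np.maximum(x, y)
    return 2.0 * ((lo <= theta) & (theta < hi)) * np.abs(theta - y)

def weighted_score(x, y, w, grid):
    """Numerically integrate w(theta) * S_theta(x, y) over a threshold grid."""
    vals = np.array([w(t) * elementary_score(t, x, y) for t in grid])
    dtheta = grid[1] - grid[0]
    return vals.sum(axis=0) * dtheta

grid = np.linspace(-10, 10, 4001)
x, y = np.array([1.0, 3.0]), np.array([2.0, -1.0])
unweighted = weighted_score(x, y, lambda t: 1.0, grid)
print(unweighted, (x - y) ** 2)            # approximately recovers squared error
tail_only = weighted_score(x, y, lambda t: float(t > 2.0), grid)
print(tail_only)                           # emphasises the upper tail only
```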

This article introduces the R package hermiter, which facilitates estimation of univariate and bivariate probability density functions and cumulative distribution functions, along with full quantile functions (univariate) and nonparametric correlation coefficients (bivariate), using Hermite-series-based estimators. The algorithms implemented in the hermiter package are particularly useful in the sequential setting (both stationary and non-stationary) and in one-pass batch estimation for large data sets. In addition, the Hermite-series-based estimators are approximately mergeable, allowing decentralized estimation.
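
hermiter itself is an R package; as a language-neutral illustration of the underlying estimator (not the package's API), the sketch below builds the Hermite series density estimate $\hat f(x)=\sum_{k=0}^N \hat a_k h_k(x)$ with orthonormal Hermite functions $h_k$, where each coefficient $\hat a_k$ is a running mean of $h_k(X_i)$. That running-mean structure is what makes the estimator updatable one observation at a time and mergeable across data streams.

```python
import numpy as np
from scipy.special import eval_hermite, gammaln

def hermite_functions(x, N):
    """Orthonormal Hermite functions h_0..h_N evaluated at points x."""
    x = np.atleast_1d(x)
    H = np.array([eval_hermite(k, x) for k in range(N + 1)])
    k = np.arange(N + 1)[:, None]
    lognorm = 0.5 * (k * np.log(2.0) + gammaln(k + 1) + 0.5 * np.log(np.pi))
    return H * np.exp(-0.5 * x**2 - lognorm)

rng = np.random.default_rng(2)
N = 20
coef = np.zeros(N + 1)
for i, xi in enumerate(rng.normal(size=5000), start=1):
    coef += (hermite_functions(xi, N).ravel() - coef) / i   # streaming mean update

xs = np.linspace(-4, 4, 9)
f_hat = coef @ hermite_functions(xs, N)
print(np.round(f_hat, 3))  # close to the standard normal density on this grid
```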

Censored data, where the event time is only partially observed, are challenging for survival probability estimation. In this paper, we introduce a novel nonparametric fiducial approach to interval-censored data, including right-censored, current status, case II censored, and mixed case censored data. The proposed approach, which leverages a simple Gibbs sampler, has the useful property of being "one size fits all": it automatically adapts to all types of non-informative censoring mechanisms. As shown in extensive simulations, the proposed fiducial confidence intervals significantly outperform existing methods in terms of both coverage and length. In addition, the proposed fiducial point estimator has much smaller estimation errors than the nonparametric maximum likelihood estimator. Furthermore, we apply the proposed method to Austrian rubella data and a study of hemophiliacs infected with the human immunodeficiency virus. The strength of the proposed fiducial approach lies not only in estimation and uncertainty quantification but also in its automatic adaptation to a variety of censoring mechanisms.
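
The authors' sampler is nonparametric; purely to illustrate the imputation mechanic of a Gibbs-style sampler on current status (case I interval-censored) data, here is a toy parametric data-augmentation sketch under a working exponential model. It is our simplification, not the proposed fiducial procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate current status data: event times T ~ Exp(1), inspection times
# C ~ Uniform(0, 3); only (C, delta) is observed.
n = 500
T = rng.exponential(1.0, n)
C = rng.uniform(0.0, 3.0, n)
delta = (T <= C).astype(int)   # 1 if the event occurred by inspection time

# Toy data-augmentation sampler (illustrative, NOT the authors' fiducial
# Gibbs sampler): impute T consistently with the observed interval, then
# resample the rate from the complete-data Gamma distribution.
lam = 1.0
for it in range(200):
    u = rng.uniform(size=n)
    F_C = 1.0 - np.exp(-lam * C)
    T_imp = np.where(
        delta == 1,
        -np.log(1.0 - u * F_C) / lam,   # delta=1: T in (0, C], truncated draw
        C - np.log(u) / lam,            # delta=0: T > C, memoryless tail draw
    )
    lam = rng.gamma(shape=n, scale=1.0 / T_imp.sum())

print(f"sampled rate after burn-in: {lam:.3f} (truth 1.0)")
```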

Determinantal point processes (DPPs) are statistical models for repulsive point patterns. Both sampling and inference are tractable for DPPs, a rare feature among models with negative dependence that explains their popularity in machine learning and spatial statistics. Parametric and nonparametric inference methods have been proposed in the finite case, i.e., when the point patterns live in a finite ground set. In the continuous case, only parametric methods have been investigated, while nonparametric maximum likelihood for DPPs -- an optimization problem over trace-class operators -- has remained an open question. In this paper, we show that a restricted version of this maximum likelihood estimation (MLE) problem falls within the scope of a recent representer theorem for nonnegative functions in an RKHS. This leads to a finite-dimensional problem, with strong statistical ties to the original MLE. Moreover, we propose, analyze, and demonstrate a fixed-point algorithm to solve this finite-dimensional problem. Finally, we also provide a controlled estimate of the correlation kernel of the DPP, thus providing more interpretability.
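
For orientation, in the finite case the L-ensemble likelihood that both parametric and nonparametric methods target is $P(A)=\det(L_A)/\det(L+I)$. The sketch below evaluates this log-likelihood over observed subsets; it is a finite-case illustration only, whereas the paper's contribution concerns the continuous, RKHS-restricted MLE.

```python
import numpy as np

def dpp_loglik(L, samples):
    """Log-likelihood of observed subsets under the L-ensemble DPP with kernel L."""
    _, logdet_norm = np.linalg.slogdet(L + np.eye(L.shape[0]))
    ll = 0.0
    for A in samples:
        A = np.asarray(A)
        _, logdet_A = np.linalg.slogdet(L[np.ix_(A, A)]) if len(A) else (1.0, 0.0)
        ll += logdet_A - logdet_norm
    return ll

# Tiny example: a 4-point ground set with a smooth similarity kernel.
x = np.linspace(0, 1, 4)
L = np.exp(-((x[:, None] - x[None, :]) ** 2) / 0.1)
print(dpp_loglik(L, samples=[[0, 2], [1, 3], [0, 3]]))
```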

The beta regression model is useful in the analysis of bounded continuous outcomes such as proportions. It is well known that, for any regression model, the presence of multicollinearity leads to poor performance of the maximum likelihood estimators. Ridge-type estimators have been proposed to alleviate the adverse effects of multicollinearity. Furthermore, when some of the predictors have insignificant or weak effects on the outcomes, it is desirable to recover as much information as possible from these predictors instead of discarding them altogether. In this paper, we propose ridge-type shrinkage estimators for low- and high-dimensional beta regression models that address both issues simultaneously. We compute the biases and variances of the proposed estimators in closed form and use Monte Carlo simulations to evaluate their performance. The results show that, in both low- and high-dimensional data, the performance of the proposed estimators is superior to that of ridge estimators that discard weak or insignificant predictors. We conclude by applying the proposed methods to two real datasets from econometrics and medicine.
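
A hedged sketch of what a ridge-penalized beta regression fit can look like, assuming a logit link and a penalty on the slope coefficients; this is an illustration of the general idea, not the authors' shrinkage estimator.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln

def neg_penalized_loglik(params, X, y, lam):
    beta, log_phi = params[:-1], params[-1]
    phi = np.exp(log_phi)                       # precision parameter > 0
    mu = expit(X @ beta)                        # mean in (0, 1) via logit link
    ll = (gammaln(phi) - gammaln(mu * phi) - gammaln((1 - mu) * phi)
          + (mu * phi - 1) * np.log(y) + ((1 - mu) * phi - 1) * np.log1p(-y))
    return -ll.sum() + lam * np.sum(beta[1:] ** 2)   # no penalty on the intercept

rng = np.random.default_rng(3)
n = 300
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)             # nearly collinear predictor
X = np.column_stack([np.ones(n), x1, x2])
mu = expit(X @ np.array([0.2, 0.5, 0.5]))
phi_true = 20.0
y = rng.beta(mu * phi_true, (1 - mu) * phi_true)

fit = minimize(neg_penalized_loglik, x0=np.zeros(X.shape[1] + 1),
               args=(X, y, 1.0), method="BFGS")
print(np.round(fit.x[:-1], 3), np.exp(fit.x[-1]))   # shrunken slopes, precision
```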

We study the problem of density estimation for a random vector ${\boldsymbol X}$ in $\mathbb R^d$ with probability density $f(\boldsymbol x)$. For a spanning tree $T$ defined on the vertex set $\{1,\dots ,d\}$, the tree density $f_{T}$ is a product of bivariate conditional densities. The optimal spanning tree $T^*$ is the spanning tree $T$ for which the Kullback-Leibler divergence of $f$ and $f_{T}$ is smallest. From i.i.d. data we identify the optimal tree $T^*$ and construct, in a computationally efficient manner, a tree density estimate $f_n$ such that, without any regularity conditions on the density $f$, $\lim_{n\to \infty} \int |f_n(\boldsymbol x)-f_{T^*}(\boldsymbol x)|d\boldsymbol x=0$ a.s. For Lipschitz continuous $f$ with bounded support, $\mathbb E\{ \int |f_n(\boldsymbol x)-f_{T^*}(\boldsymbol x)|d\boldsymbol x\}=O(n^{-1/4})$.
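
The tree-selection step is the classical Chow-Liu argument: minimizing the Kullback-Leibler divergence over spanning trees is equivalent to finding the maximum-weight spanning tree under pairwise mutual information. The sketch below estimates mutual information by simple histogram discretization; the paper's estimator is fully nonparametric, so this only illustrates the tree selection.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def mutual_information(a, b, bins=10):
    """Plug-in MI estimate between two samples after histogram discretization."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p = joint / joint.sum()
    pa, pb = p.sum(axis=1, keepdims=True), p.sum(axis=0, keepdims=True)
    mask = p > 0
    return float((p[mask] * np.log(p[mask] / (pa @ pb)[mask])).sum())

rng = np.random.default_rng(4)
n, d = 5000, 4
X = np.empty((n, d))
X[:, 0] = rng.normal(size=n)                 # a Markov chain: 0 -> 1 -> 2 -> 3
for j in range(1, d):
    X[:, j] = 0.9 * X[:, j - 1] + rng.normal(size=n)

W = np.zeros((d, d))
for i in range(d):
    for j in range(i + 1, d):
        W[i, j] = mutual_information(X[:, i], X[:, j])

# Maximum-weight spanning tree = minimum spanning tree on negated weights.
tree = minimum_spanning_tree(-W).toarray()
edges = [(i, j) for i in range(d) for j in range(d) if tree[i, j] != 0]
print(edges)   # should recover the chain edges (0,1), (1,2), (2,3)
```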

Recently, many estimators of network treatment effects have been proposed, but their optimality properties in terms of semiparametric efficiency have yet to be established. We present a simple yet flexible asymptotic framework to derive the efficient influence function and the semiparametric efficiency lower bound for a family of network causal effects under partial interference. An important corollary of our results is that one of the existing estimators, by Liu et al. (2019), is locally efficient. We also present other estimators that are efficient and discuss results on adaptive estimation. We conclude by using the efficient estimators to study the direct and spillover effects of conditional cash transfer programs in Colombia.
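
For context, in the classical no-interference case the efficient influence function for the average treatment effect yields the familiar augmented inverse-probability-weighted (AIPW) estimator. The sketch below implements that special case; it is our illustration of the influence-function machinery, not the paper's network estimator under partial interference.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(5)
n = 4000
X = rng.normal(size=(n, 3))
e_true = 1 / (1 + np.exp(-X[:, 0]))
Z = rng.binomial(1, e_true)
Y = X[:, 1] + Z * 2.0 + rng.normal(size=n)   # true average treatment effect = 2

e = LogisticRegression().fit(X, Z).predict_proba(X)[:, 1]      # propensity model
m1 = LinearRegression().fit(X[Z == 1], Y[Z == 1]).predict(X)   # outcome model, treated
m0 = LinearRegression().fit(X[Z == 0], Y[Z == 0]).predict(X)   # outcome model, control

# AIPW / efficient-influence-function estimator and its plug-in standard error.
psi = m1 - m0 + Z * (Y - m1) / e - (1 - Z) * (Y - m0) / (1 - e)
print(psi.mean(), psi.std(ddof=1) / np.sqrt(n))
```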

The simultaneous estimation of many parameters based on data collected from corresponding studies is a key research problem that has received renewed attention in the high-dimensional setting. Many practical situations involve heterogeneous data, where the heterogeneity is captured by a nuisance parameter. Effectively pooling information across samples while correctly accounting for heterogeneity presents a significant challenge in large-scale estimation problems. We address this issue by introducing the "Nonparametric Empirical Bayes Structural Tweedie" (NEST) estimator, which efficiently estimates the unknown effect sizes and properly adjusts for heterogeneity via a generalized version of Tweedie's formula. For the normal means problem, NEST simultaneously handles the two main selection biases introduced by heterogeneity: the selection bias in the mean, which cannot be effectively corrected without also correcting for the selection bias in the variance. Our theoretical results show that NEST has strong asymptotic properties without requiring explicit assumptions about the prior. Extensions to other two-parameter members of the exponential family are discussed. Simulation studies show that NEST outperforms competing methods, with substantial efficiency gains in many settings. The proposed method is demonstrated on two applications: estimating the batting averages of baseball players and the Sharpe ratios of mutual fund returns.
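
For the homoscedastic normal means problem, classical Tweedie's formula, which NEST generalizes to heterogeneous variances, reads $E[\mu_i \mid x] = x + \sigma^2 \frac{d}{dx}\log f(x)$, where $f$ is the marginal density of the observations. Below is a minimal sketch using a kernel density estimate of $f$; it is our illustration of the classical formula, not NEST itself.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(6)
n, sigma = 5000, 1.0
mu = rng.normal(0.0, 2.0, size=n)          # unknown effect sizes
x = mu + sigma * rng.normal(size=n)        # observed noisy estimates

kde = gaussian_kde(x)                      # estimate the marginal density f
eps = 1e-3
score = (np.log(kde(x + eps)) - np.log(kde(x - eps))) / (2 * eps)  # d/dx log f
mu_hat = x + sigma**2 * score              # Tweedie posterior-mean estimate

print(np.mean((x - mu) ** 2), np.mean((mu_hat - mu) ** 2))  # shrinkage lowers MSE
```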

Large margin nearest neighbor (LMNN) is a metric learning method that optimizes the performance of the popular $k$NN classifier. However, the resulting metric relies on pre-selected target neighbors. In this paper, we address the feasibility of LMNN's optimization constraints with respect to these target points and introduce a mathematical measure of the size of the feasible region of the optimization problem. We enhance the optimization framework of LMNN with a weighting scheme that prefers data triplets yielding a larger feasible region. This increases the chance of obtaining a good metric as the solution of LMNN's problem. We evaluate the performance of the resulting feasibility-based LMNN algorithm on synthetic and real datasets. The empirical results show improved accuracy across different types of datasets in comparison to regular LMNN.
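
For reference, the standard LMNN objective that the feasibility weighting modifies combines a pull term over pre-selected target neighbors and a hinged push term over differently labeled "impostors". Below is a minimal sketch of that objective for a linear map L defining the metric $d(a,b)=\|L(a-b)\|^2$; the toy data and target-neighbor choices are ours, and this is not the paper's weighted variant.

```python
import numpy as np

def lmnn_loss(L, X, y, targets, mu=0.5):
    """Standard LMNN loss; targets[i] lists the same-class target neighbors of i."""
    XL = X @ L.T
    def d(i, j):
        diff = XL[i] - XL[j]
        return diff @ diff
    pull, push = 0.0, 0.0
    for i, nbrs in enumerate(targets):
        for j in nbrs:
            pull += d(i, j)                              # pull target neighbors close
            for l in np.flatnonzero(y != y[i]):          # candidate impostors
                push += max(0.0, 1.0 + d(i, j) - d(i, l))  # unit-margin hinge
    return (1 - mu) * pull + mu * push

rng = np.random.default_rng(7)
X = np.vstack([rng.normal(0, 1, (10, 2)), rng.normal(3, 1, (10, 2))])
y = np.array([0] * 10 + [1] * 10)
targets = [[(i + 1) % 10 if i < 10 else 10 + (i + 1) % 10] for i in range(20)]
print(lmnn_loss(np.eye(2), X, y, targets))   # loss of the Euclidean metric
```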
