亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

In randomized experiments, adjusting for observed features when estimating treatment effects has been proposed as a way to improve asymptotic efficiency. However, only linear regression has been proven to form an estimate of the average treatment effect that is asymptotically no less efficient than the treated-minus-control difference in means regardless of the true data generating process. Randomized treatment assignment provides this "do-no-harm" property, with neither truth of a linear model nor a generative model for the outcomes being required. We present a general calibration method which confers the same no-harm property onto estimators leveraging a broad class of nonlinear models. This recovers the usual regression-adjusted estimator when ordinary least squares is used, and further provides non-inferior treatment effect estimators using methods such as logistic and Poisson regression. The resulting estimators are non-inferior to both the difference in means estimator and to treatment effect estimators that have not undergone calibration. We show that our estimator is asymptotically equivalent to an inverse probability weighted estimator using a logit link with predicted potential outcomes as covariates. In a simulation study, we demonstrate that common nonlinear estimators without our calibration procedure may perform markedly worse than both the calibrated estimator and the unadjusted difference in means.

相關內容

Standard Monte Carlo computation is widely known to exhibit a canonical square-root convergence speed in terms of sample size. Two recent techniques, one based on control variate and one on importance sampling, both derived from an integration of reproducing kernels and Stein's identity, have been proposed to reduce the error in Monte Carlo computation to supercanonical convergence. This paper presents a more general framework to encompass both techniques that is especially beneficial when the sample generator is biased and noise-corrupted. We show our general estimator, which we call the doubly robust Stein-kernelized estimator, outperforms both existing methods in terms of convergence rates across different scenarios. We also demonstrate the superior performance of our method via numerical examples.

We consider an elliptic linear-quadratic parameter estimation problem with a finite number of parameters. An adaptive finite element method driven by an a posteriori error estimator for the error in the parameters is presented. Unlike prior results in the literature, our estimator, which is composed of standard energy error residual estimators for the state equation and suitable co-state problems, reflects the faster convergence of the parameter error compared to the (co)-state variables. We show optimal convergence rates of our method; in particular and unlike prior works, we prove that the estimator decreases with a rate that is the sum of the best approximation rates of the state and co-state variables. Experiments confirm that our method matches the convergence rate of the parameter error.

We propose a doubly robust approach to characterizing treatment effect heterogeneity in observational studies. We utilize posterior distributions for both the propensity score and outcome regression models to provide valid inference on the conditional average treatment effect even when high-dimensional or nonparametric models are used. We show that our approach leads to conservative inference in finite samples or under model misspecification, and provides a consistent variance estimator when both models are correctly specified. In simulations, we illustrate the utility of these results in difficult settings such as high-dimensional covariate spaces or highly flexible models for the propensity score and outcome regression. Lastly, we analyze environmental exposure data from NHANES to identify how the effects of these exposures vary by subject-level characteristics.

We consider a stochastic bandit problem with a possibly infinite number of arms. We write $p^*$ for the proportion of optimal arms and $\Delta$ for the minimal mean-gap between optimal and sub-optimal arms. We characterize the optimal learning rates both in the cumulative regret setting, and in the best-arm identification setting in terms of the problem parameters $T$ (the budget), $p^*$ and $\Delta$. For the objective of minimizing the cumulative regret, we provide a lower bound of order $\Omega(\log(T)/(p^*\Delta))$ and a UCB-style algorithm with matching upper bound up to a factor of $\log(1/\Delta)$. Our algorithm needs $p^*$ to calibrate its parameters, and we prove that this knowledge is necessary, since adapting to $p^*$ in this setting is impossible. For best-arm identification we also provide a lower bound of order $\Omega(\exp(-cT\Delta^2 p^*))$ on the probability of outputting a sub-optimal arm where $c>0$ is an absolute constant. We also provide an elimination algorithm with an upper bound matching the lower bound up to a factor of order $\log(T)$ in the exponential, and that does not need $p^*$ or $\Delta$ as parameter. Our results apply directly to the three related problems of competing against the $j$-th best arm, identifying an $\epsilon$ good arm, and finding an arm with mean larger than a quantile of a known order.

Perturb-and-MAP offers an elegant approach to approximately sample from a energy-based model (EBM) by computing the maximum-a-posteriori (MAP) configuration of a perturbed version of the model. Sampling in turn enables learning. However, this line of research has been hindered by the general intractability of the MAP computation. Very few works venture outside tractable models, and when they do, they use linear programming approaches, which as we will show, have several limitations. In this work we present perturb-and-max-product (PMP), a parallel and scalable mechanism for sampling and learning in discrete EBMs. Models can be arbitrary as long as they are built using tractable factors. We show that (a) for Ising models, PMP is orders of magnitude faster than Gibbs and Gibbs-with-Gradients (GWG) at learning and generating samples of similar or better quality; (b) PMP is able to learn and sample from RBMs; (c) in a large, entangled graphical model in which Gibbs and GWG fail to mix, PMP succeeds.

In this article, we study the problem of high-dimensional conditional independence testing, a key building block in statistics and machine learning. We propose an inferential procedure based on double generative adversarial networks (GANs). Specifically, we first introduce a double GANs framework to learn two generators of the conditional distributions. We then integrate the two generators to construct a test statistic, which takes the form of the maximum of generalized covariance measures of multiple transformation functions. We also employ data-splitting and cross-fitting to minimize the conditions on the generators to achieve the desired asymptotic properties, and employ multiplier bootstrap to obtain the corresponding $p$-value. We show that the constructed test statistic is doubly robust, and the resulting test both controls type-I error and has the power approaching one asymptotically. Also notably, we establish those theoretical guarantees under much weaker and practically more feasible conditions compared to the existing tests, and our proposal gives a concrete example of how to utilize some state-of-the-art deep learning tools, such as GANs, to help address a classical but challenging statistical problem. We demonstrate the efficacy of our test through both simulations and an application to an anti-cancer drug dataset. A Python implementation of the proposed procedure is available at //github.com/tianlinxu312/dgcit.

During the COVID-19 pandemic, governments and public health authorities have used seroprevalence studies to estimate the proportion of persons within a given population who have antibodies to SARS-CoV-2. Seroprevalence is crucial for estimating quantities such as the infection fatality ratio, proportion of asymptomatic cases, and differences in infection rates across population subgroups. However, serologic assays are prone to false positives and negatives, and non-probability sampling methods may induce selection bias. In this paper, we consider nonparametric and parametric seroprevalence estimators that address both challenges by leveraging validation data and assuming equal probabilities of sample inclusion within covariate-defined strata. Both estimators are shown to be consistent and asymptotically normal, and consistent variance estimators are derived. Simulation studies are presented comparing the finite sample performance of the estimators over a range of assay characteristics and sampling scenarios. The methods are used to estimate SARS-CoV-2 seroprevalence in asymptomatic individuals in Belgium and North Carolina.

This paper focuses on learning rate analysis of distributed kernel ridge regression for strong mixing sequences. Using a recently developed integral operator approach and a classical covariance inequality for Banach-valued strong mixing sequences, we succeed in deriving optimal learning rate for distributed kernel ridge regression. As a byproduct, we also deduce a sufficient condition for the mixing property to guarantee the optimal learning rates for kernel ridge regression. Our results extend the applicable range of distributed learning from i.i.d. samples to non-i.i.d. sequences.

Large margin nearest neighbor (LMNN) is a metric learner which optimizes the performance of the popular $k$NN classifier. However, its resulting metric relies on pre-selected target neighbors. In this paper, we address the feasibility of LMNN's optimization constraints regarding these target points, and introduce a mathematical measure to evaluate the size of the feasible region of the optimization problem. We enhance the optimization framework of LMNN by a weighting scheme which prefers data triplets which yield a larger feasible region. This increases the chances to obtain a good metric as the solution of LMNN's problem. We evaluate the performance of the resulting feasibility-based LMNN algorithm using synthetic and real datasets. The empirical results show an improved accuracy for different types of datasets in comparison to regular LMNN.

In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely: the function $F(\xb) \triangleq \sum_{i=1}^{m}f_i(\xb)$ is strongly convex and smooth, either strongly convex or smooth or just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors) with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions to the proposed setup such as proximal friendly functions, time-varying graphs, improvement of the condition numbers.

北京阿比特科技有限公司