免费在线黄色电影,在线成人H视频免费,国产精品亚洲AⅤ片

from arxiv, 38 pages. Advised by Prof. Alexis Diamond. This paper is supplemented by a Python library, SyntheticControlMethods (//github.com/OscarEngelbrektson/SyntheticControlMethods), implementing all methods covered in this paper, including the original synthetic control method. As of March 20, 2021 the package has been downloaded 12,424 times

This paper extends the literature on the theoretical properties of synthetic controls to the case of non-linear generative models, showing that the synthetic control estimator is generally biased in such settings. I derive a lower bound for the bias, showing that the only component of it that is affected by the choice of synthetic control is the weighted sum of pairwise differences between the treated unit and the untreated units in the synthetic control. To address this bias, I propose a novel synthetic control estimator that allows for a constant difference of the synthetic control to the treated unit in the pre-treatment period, and that penalizes the pairwise discrepancies. Allowing for a constant offset makes the model more flexible, thus creating a larger set of potential synthetic controls, and the penalization term allows for the selection of the potential solution that will minimize bias. I study the properties of this estimator and propose a data-driven process for parameterizing the penalization term.

相關內容

估計/估計量

關注 3

統計量 · Performer · 推斷 · 置信度 · 方差 ·

2022 年 1 月 27 日

Interpretation and inference for altmetric indicators arising from sparse data statistics

Lawrence Smolinsky,Bernhard Klingenberg,Brian D. Marx

from arxiv, To appear in the Journal of Informetrics

In 2018 Bornmann and Haunschild (2018a) introduced a new indicator called the Mantel-Haenszel quotient (MHq) to measure alternative metrics (or altmetrics) of scientometric data. In this article we review the Mantel-Haenszel statistics, point out two errors in the literature, and introduce a new indicator. First, we correct the interpretation of MHq and mention that it is still a meaningful indicator. Second, we correct the variance formula for MHq, which leads to narrower confidence intervals. A simulation study shows the superior performance of our variance estimator and confidence intervals. Since MHq does not match its original description in the literature, we propose a new indicator, the Mantel-Haenszel row risk ratio (MHRR), to meet that need. Interpretation and statistical inference for MHRR are discussed. For both MHRR and MHq, a value greater (less) than one means performance is better (worse) than in the reference set called the world.

平滑 · 線性的 · 估計/估計量 · MoDELS · 可理解性 ·

2022 年 1 月 26 日

Local linear smoothing in additive models as data projection

Munir Hiabu,Enno Mammen,Joseph T. Meyer

We discuss local linear smooth backfitting for additive non-parametric models. This procedure is well known for achieving optimal convergence rates under appropriate smoothness conditions. In particular, it allows for the estimation of each component of an additive model with the same asymptotic accuracy as if the other components were known. The asymptotic discussion of local linear smooth backfitting is rather complex because typically an overwhelming notation is required for a detailed discussion. In this paper we interpret the local linear smooth backfitting estimator as a projection of the data onto a linear space with a suitably chosen semi-norm. This approach simplifies both the mathematical discussion as well as the intuitive understanding of properties of this version of smooth backfitting.

估計/估計量 · 優化器 · 得分 · 社區發現 · 特化 ·

2022 年 1 月 26 日

Optimal Estimation of the Number of Communities

Jiashun Jin,Zheng Tracy Ke,Shengming Luo,Minzhe Wang

In network analysis, how to estimate the number of communities $K$ is a fundamental problem. We consider a broad setting where we allow severe degree heterogeneity and a wide range of sparsity levels, and propose Stepwise Goodness-of-Fit (StGoF) as a new approach. This is a stepwise algorithm, where for $m = 1, 2, \ldots$, we alternately use a community detection step and a goodness-of-fit (GoF) step. We adapt SCORE \cite{SCORE} for community detection, and propose a new GoF metric. We show that at step $m$, the GoF metric diverges to $\infty$ in probability for all $m < K$ and converges to $N(0,1)$ if $m = K$. This gives rise to a consistent estimate for $K$. Also, we discover the right way to define the signal-to-noise ratio (SNR) for our problem and show that consistent estimates for $K$ do not exist if $\mathrm{SNR} \goto 0$, and StGoF is uniformly consistent for $K$ if $\mathrm{SNR} \goto \infty$. Therefore, StGoF achieves the optimal phase transition. Similar stepwise methods (e.g., \cite{wang2017likelihood, ma2018determining}) are known to face analytical challenges. We overcome the challenges by using a different stepwise scheme in StGoF and by deriving sharp results that are not available before. The key to our analysis is to show that SCORE has the {\it Non-Splitting Property (NSP)}. Primarily due to a non-tractable rotation of eigenvectors dictated by the Davis-Kahan $\sin(\theta)$ theorem, the NSP is non-trivial to prove and requires new techniques we develop.

MoDELS · Processing（編程語言） · 無監督學習 · 穩健性 · 無監督 ·

2022 年 1 月 25 日

Evaluating Sensitivity to the Stick-Breaking Prior in Bayesian Nonparametrics

Ryan Giordano,Runjing Liu,Michael I. Jordan,Tamara Broderick

from arxiv, 69 pages. Accepted for submission at Bayesian Analysis

Bayesian models based on the Dirichlet process and other stick-breaking priors have been proposed as core ingredients for clustering, topic modeling, and other unsupervised learning tasks. However, due to the flexibility of these models, the consequences of prior choices can be opaque. And so prior specification can be relatively difficult. At the same time, prior choice can have a substantial effect on posterior inferences. Thus, considerations of robustness need to go hand in hand with nonparametric modeling. In the current paper, we tackle this challenge by exploiting the fact that variational Bayesian methods, in addition to having computational advantages in fitting complex nonparametric models, also yield sensitivities with respect to parametric and nonparametric aspects of Bayesian models. In particular, we demonstrate how to assess the sensitivity of conclusions to the choice of concentration parameter and stick-breaking distribution for inferences under Dirichlet process mixtures and related mixture models. We provide both theoretical and empirical support for our variational approach to Bayesian sensitivity analysis.

估計/估計量 · 經驗熵 · 矩 · Continuity · INFORMS ·

2022 年 1 月 24 日

Dimension-Free Empirical Entropy Estimation

Doron Cohen,Aryeh Kontorovich,Aaron Koolyk,Geoffrey Wolfer

We seek an entropy estimator for discrete distributions with fully empirical accuracy bounds. As stated, this goal is infeasible without some prior assumptions on the distribution. We discover that a certain information moment assumption renders the problem feasible. We argue that the moment assumption is natural and, in some sense, {\em minimalistic} -- weaker than finite support or tail decay conditions. Under the moment assumption, we provide the first finite-sample entropy estimates for infinite alphabets, nearly recovering the known minimax rates. Moreover, we demonstrate that our empirical bounds are significantly sharper than the state-of-the-art bounds, for various natural distributions and non-trivial sample regimes. Along the way, we give a dimension-free analogue of the Cover-Thomas result on entropy continuity (with respect to total variation distance) for finite alphabets, which may be of independent interest. Additionally, we resolve all of the open problems posed by J\"urgensen and Matthews, 2010.

估計/估計量 · 優化器 · 衰減系數 · Continuity · 相互獨立的 ·

2022 年 1 月 23 日

On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation

Xiaohong Chen,Zhengling Qi

We study the off-policy evaluation (OPE) problem in an infinite-horizon Markov decision process with continuous states and actions. We recast the $Q$-function estimation into a special form of the nonparametric instrumental variables (NPIV) estimation problem. We first show that under one mild condition the NPIV formulation of $Q$-function estimation is well-posed in the sense of $L^2$-measure of ill-posedness with respect to the data generating distribution, bypassing a strong assumption on the discount factor $\gamma$ imposed in the recent literature for obtaining the $L^2$ convergence rates of various $Q$-function estimators. Thanks to this new well-posed property, we derive the first minimax lower bounds for the convergence rates of nonparametric estimation of $Q$-function and its derivatives in both sup-norm and $L^2$-norm, which are shown to be the same as those for the classical nonparametric regression (Stone, 1982). We then propose a sieve two-stage least squares estimator and establish its rate-optimality in both norms under some mild conditions. Our general results on the well-posedness and the minimax lower bounds are of independent interest to study not only other nonparametric estimators for $Q$-function but also efficient estimation on the value of any target policy in off-policy settings.

估計/估計量 · 近似 · 混合分布 · 邊緣化 · 混合 ·

2022 年 1 月 22 日

Stochastic Approximation Algorithm for Estimating Mixing Distribution for Dependent Observations

Nilabja Guha,Anindya Roy

Estimating the mixing density of a mixture distribution remains an interesting problem in statistics literature. Using a stochastic approximation method, Newton and Zhang (1999) introduced a fast recursive algorithm for estimating the mixing density of a mixture. Under suitably chosen weights the stochastic approximation estimator converges to the true solution. In Tokdar et. al. (2009) the consistency of this recursive estimation method was established. However, the proof of consistency of the resulting estimator used independence among observations as an assumption. Here, we extend the investigation of performance of Newton's algorithm to several dependent scenarios. We first prove that the original algorithm under certain conditions remains consistent when the observations are arising form a weakly dependent process with fixed marginal with the target mixture as the marginal density. For some of the common dependent structures where the original algorithm is no longer consistent, we provide a modification of the algorithm that generates a consistent estimator.

似然 · 估計/估計量 · 近似 · Continuity · Processing（編程語言） ·

2022 年 1 月 21 日

Asymptotic approximation of the likelihood of stationary determinantal point processes

Arnaud Poinas,Frédéric Lavancier

Continuous determinantal point processes (DPPs) are a class of repulsive point processes on $\mathbb{R}^d$ with many statistical applications. Although an explicit expression of their density is known, it is too complicated to be used directly for maximum likelihood estimation. In the stationary case, an approximation using Fourier series has been suggested, but it is limited to rectangular observation windows and no theoretical results support it. In this contribution, we investigate a different way to approximate the likelihood by looking at its asymptotic behaviour when the observation window grows towards $\mathbb{R}^d$. This new approximation is not limited to rectangular windows, is faster to compute than the previous one, does not require any tuning parameter, and some theoretical justifications are provided. It moreover provides an explicit formula for estimating the asymptotic variance of the associated estimator. The performances are assessed in a simulation study on standard parametric models on $\mathbb{R}^d$ and compare favourably to common alternative estimation methods for continuous DPPs.

MoDELS · 控制器 · 生成模型 · 樣本 · 聯合分布 ·

2021 年 10 月 21 日

Controllable and Compositional Generation with Latent-Space Energy-Based Models

Weili Nie,Arash Vahdat,Anima Anandkumar

from arxiv, 32 pages, NeurIPS 2021

Controllable generation is one of the key requirements for successful adoption of deep generative models in real-world applications, but it still remains as a great challenge. In particular, the compositional ability to generate novel concept combinations is out of reach for most current models. In this work, we use energy-based models (EBMs) to handle compositional generation over a set of attributes. To make them scalable to high-resolution image generation, we introduce an EBM in the latent space of a pre-trained generative model such as StyleGAN. We propose a novel EBM formulation representing the joint distribution of data and attributes together, and we show how sampling from it is formulated as solving an ordinary differential equation (ODE). Given a pre-trained generator, all we need for controllable generation is to train an attribute classifier. Sampling with ODEs is done efficiently in the latent space and is robust to hyperparameters. Thus, our method is simple, fast to train, and efficient to sample. Experimental results show that our method outperforms the state-of-the-art in both conditional sampling and sequential editing. In compositional generation, our method excels at zero-shot generation of unseen attribute combinations. Also, by composing energy functions with logical operators, this work is the first to achieve such compositionality in generating photo-realistic images of resolution 1024x1024.

估計/估計量 · 估計誤差 · MoDELS · 學成 · 無偏 ·

2020 年 12 月 17 日

The Causal Learning of Retail Delinquency

Yiyan Huang,Cheuk Hang Leung,Xing Yan,Qi Wu,Nanbo Peng,Dongdong Wang,Zhixiang Huang

from arxiv, This paper was accepted and will be published in the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)

This paper focuses on the expected difference in borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook the confounding effects and hence the estimation error can be magnificent. As such, we propose another approach to construct the estimators such that the error can be greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of estimating the causal quantities between the classical estimators and the proposed estimators. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction of estimation error is strikingly substantial if the causal effects are accounted for correctly.