The occurrence of Simpson's paradox (SP) in $2\times 2$ contingency tables has been well studied. The present work comprehensively revisits this problem using a combination of philosophical reflections, causal considerations, and probability theory. The first contribution is to provide a schematic analysis of SP in $2\times 2$ contingency tables and present new results, detailed proofs of previous results and a unifying view of the important examples of SP that have been reported in the literature. The second contribution of the paper suggests a new perspective on the surprise element of SP, raises some critical questions regarding the influential causal analyses of SP and provides a broad perspective on logic, probability, and statistics with SP at its center. The upshot of this research is that we need both causal concepts and statistical tools coupled with philosophical analyses to sort out issues regarding SP.
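
To make the reversal in $2\times 2$ tables concrete, here is a minimal sketch that checks whether an association reverses between the subgroup tables and the pooled table. The counts are illustrative numbers chosen for the example, and the names `tables`, `pooled`, and `simpson_reversal` are hypothetical; none of this is taken from the paper.

```python
from fractions import Fraction

# (successes, total) for arms A and B within each subgroup; illustrative counts only.
tables = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes, total):
    """Exact success rate as a fraction, so comparisons carry no rounding error."""
    return Fraction(successes, total)

def pooled(tables, arm):
    """Success rate of one arm after collapsing all subgroups into one table."""
    successes = sum(t[arm][0] for t in tables.values())
    total = sum(t[arm][1] for t in tables.values())
    return rate(successes, total)

def simpson_reversal(tables):
    """True if one arm wins in every subgroup but loses in the pooled table."""
    a_wins_everywhere = all(rate(*t["A"]) > rate(*t["B"]) for t in tables.values())
    b_wins_everywhere = all(rate(*t["A"]) < rate(*t["B"]) for t in tables.values())
    pa, pb = pooled(tables, "A"), pooled(tables, "B")
    return (a_wins_everywhere and pa < pb) or (b_wins_everywhere and pa > pb)

print(simpson_reversal(tables))
```

With these counts, arm A has the higher success rate in both subgroups (81/87 vs 234/270 and 192/263 vs 55/80), yet the pooled rates are 273/350 for A and 289/350 for B, so the direction flips.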

Related content

SAT · Extensibility · Reducible · Tractable · Inference

15 November 2021

MAJORITY-3SAT (and Related Problems) in Polynomial Time

Shyan Akmal,Ryan Williams

from arxiv, Abstract shortened to fit arXiv requirements

Majority-SAT is the problem of determining whether an input $n$-variable formula in conjunctive normal form (CNF) has at least $2^{n-1}$ satisfying assignments. Majority-SAT and related problems have been studied extensively in various AI communities interested in the complexity of probabilistic planning and inference. Although Majority-SAT has been known to be PP-complete for over 40 years, the complexity of a natural variant has remained open: Majority-$k$SAT, where the input CNF formula is restricted to have clause width at most $k$. We prove that for every $k$, Majority-$k$SAT is in P. In fact, for any positive integer $k$ and rational $\rho \in (0,1)$ with bounded denominator, we give an algorithm that can determine whether a given $k$-CNF has at least $\rho \cdot 2^n$ satisfying assignments, in deterministic linear time (whereas the previous best-known algorithm ran in exponential time). Our algorithms have interesting positive implications for counting complexity and the complexity of inference, significantly reducing the known complexities of related problems such as E-MAJ-$k$SAT and MAJ-MAJ-$k$SAT. At the heart of our approach is an efficient method for solving threshold counting problems by extracting sunflowers found in the corresponding set system of a $k$-CNF. We also show that the tractability of Majority-$k$SAT is somewhat fragile. For the closely related GtMajority-SAT problem (where we ask whether a given formula has greater than $2^{n-1}$ satisfying assignments) which is known to be PP-complete, we show that GtMajority-$k$SAT is in P for $k\le 3$, but becomes NP-complete for $k\geq 4$. These results are counterintuitive, because the ``natural'' classifications of these problems would have been PP-completeness, and because there is a stark difference in the complexity of GtMajority-$k$SAT and Majority-$k$SAT for all $k\ge 4$.
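
The two threshold questions in the abstract differ only in `>=` versus `>`. The sketch below spells out those definitions with a brute-force counter. It runs in exponential time and is meant only to pin down the problem statements; it is not the sunflower-based algorithm of the paper. The DIMACS-style clause encoding and the function names are assumptions made for this example.

```python
from itertools import product

def count_satisfying(clauses, n):
    """Count assignments of n Boolean variables that satisfy a CNF formula.

    Each clause is a list of nonzero integers: 3 means x3, -3 means NOT x3.
    """
    count = 0
    for bits in product([False, True], repeat=n):
        # A clause holds if at least one literal matches the assignment.
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            count += 1
    return count

def is_majority_sat(clauses, n):
    """Majority-SAT: at least 2^(n-1) satisfying assignments."""
    return count_satisfying(clauses, n) >= 2 ** (n - 1)

def is_gt_majority_sat(clauses, n):
    """GtMajority-SAT: strictly more than 2^(n-1) satisfying assignments."""
    return count_satisfying(clauses, n) > 2 ** (n - 1)

# The single clause (x1) over two variables has exactly 2 = 2^(2-1) models.
print(is_majority_sat([[1]], 2), is_gt_majority_sat([[1]], 2))
```

The formula `(x1)` over two variables sits exactly on the threshold, so it is a yes-instance of the first problem and a no-instance of the second.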

Statistic · Example · PDE · Robustness · Similarity

15 November 2021

Theoretical Guarantees for the Statistical Finite Element Method

Yanni Papandreou,Jon Cockayne,Mark Girolami,Andrew B. Duncan

from arxiv, 27 pages for main article, 11 pages for supplement, 8 figures

The statistical finite element method (StatFEM) is an emerging probabilistic method that allows observations of a physical system to be synthesised with the numerical solution of a PDE intended to describe it in a coherent statistical framework, to compensate for model error. This work presents a new theoretical analysis of the statistical finite element method demonstrating that it has similar convergence properties to the finite element method on which it is based. Our results constitute a bound on the Wasserstein-2 distance between the ideal prior and posterior and the StatFEM approximation thereof, and show that this distance converges at the same mesh-dependent rate as finite element solutions converge to the true solution. Several numerical examples are presented to demonstrate our theory, including an example that tests the robustness of StatFEM when extended to nonlinear quantities of interest.

Estimation/Estimator · CASES · Inference · Robustness · Continuity

14 November 2021

Theory of Low Frequency Contamination from Nonstationarity and Misspecification: Consequences for HAR Inference

Alessandro Casini,Taosong Deng,Pierre Perron

We establish theoretical results about the low frequency contamination (i.e., long memory effects) induced by general nonstationarity for estimates such as the sample autocovariance and the periodogram, and deduce consequences for heteroskedasticity and autocorrelation robust (HAR) inference. We present explicit expressions for the asymptotic bias of these estimates. We distinguish cases where this contamination only occurs as a small-sample problem and cases where the contamination continues to hold asymptotically. We show theoretically that nonparametric smoothing over time is robust to low frequency contamination. Our results provide new insights on the debate between consistent and inconsistent long-run variance (LRV) estimation. Existing LRV estimators tend to be inflated when the data are nonstationary. This results in HAR tests that can be undersized and exhibit dramatic power losses. Our theory indicates that long bandwidths or fixed-b HAR tests suffer more from low frequency contamination relative to HAR tests based on HAC estimators, whereas recently introduced double kernel HAC estimators do not suffer from this problem. Finally, we present second-order Edgeworth expansions under nonstationarity about the distribution of HAC and DK-HAC estimators and about the corresponding t-test in the linear regression model.
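
A small simulation can illustrate how nonstationarity inflates the sample autocovariance. The sketch below compares white noise with the same noise plus a one-time mean shift. The break design (a jump of 2 at mid-sample), the sample size, the lag, and the seed are all assumptions chosen for illustration and are not taken from the paper.

```python
import random

def sample_autocovariance(x, lag):
    """Sample autocovariance at `lag`, centred at the full-sample mean, divided by n."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 400
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Hypothetical nonstationarity: the mean jumps from 0 to 2 halfway through.
shifted = [e + (2.0 if t >= n // 2 else 0.0) for t, e in enumerate(noise)]

lag = 10
acov_stationary = sample_autocovariance(noise, lag)
acov_shifted = sample_autocovariance(shifted, lag)
print(acov_stationary, acov_shifted)
```

The true lag-10 autocovariance of the noise is zero, but centring the shifted series at its full-sample mean leaves most observations on the same side of that mean as their lag-10 partner, which pushes the estimate toward the squared half-shift and mimics long memory.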

Visual identity system · Vector space · Integration · Inference · Identifiable

13 November 2021

Variational Inference with Holder Bounds

Junya Chen,Danni Lu,Zidi Xiu,Ke Bai,Lawrence Carin,Chenyang Tao

The recent introduction of thermodynamic integration techniques has provided a new framework for understanding and improving variational inference (VI). In this work, we present a careful analysis of the thermodynamic variational objective (TVO), bridging the gap between existing variational objectives and shedding new insights to advance the field. In particular, we elucidate how the TVO naturally connects the three key variational schemes, namely the importance-weighted VI, Renyi-VI, and MCMC-VI, which subsumes most VI objectives employed in practice. To explain the performance gap between theory and practice, we reveal how the pathological geometry of thermodynamic curves negatively affects TVO. By generalizing the integration path from the geometric mean to the weighted Holder mean, we extend the theory of TVO and identify new opportunities for improving VI. This motivates our new VI objectives, named the Holder bounds, which flatten the thermodynamic curves and promise to achieve a one-step approximation of the exact marginal log-likelihood. A comprehensive discussion on the choices of numerical estimators is provided. We present strong empirical evidence on both synthetic and real-world datasets to support our claims.
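
The step from the geometric mean to the weighted Holder mean can be seen on two positive numbers. The sketch below implements the scalar weighted power mean and treats `p = 0` as its limiting case, the weighted geometric mean. It only illustrates the mean itself; applying it to densities along an integration path, as the abstract describes, is not attempted here, and the function name and argument order are assumptions.

```python
import math

def holder_mean(a, b, weight, p):
    """Weighted Holder (power) mean of positive a and b, with `weight` on b.

    p = 1 gives the arithmetic mean, p = -1 the harmonic mean, and p = 0 is
    defined as the limit p -> 0, which is the weighted geometric mean.
    """
    if p == 0:
        return a ** (1 - weight) * b ** weight
    return ((1 - weight) * a ** p + weight * b ** p) ** (1 / p)

for p in (1, 1e-6, 0, -1):
    print(p, holder_mean(1.0, 4.0, 0.5, p))
```

For `a = 1`, `b = 4`, and equal weights, the mean moves from 2.5 at `p = 1` through the geometric value 2 as `p` approaches 0 and down to 1.6 at `p = -1`.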

Estimation/Estimator · Example · Shaping · Better · contrastive

12 November 2021

Histograms lie about distribution shapes and Pearson's coefficient of variation lies about variability

Paulo S. P. Silveira,Jose O. Siqueira

from arxiv, 24 pages, 7 figures, R scripts included in the text and appendices. This manuscript has been under consideration by TQMP (The Quantitative Methods for Psychology \url{//www.tqmp.org}) since 28 Oct 2021

Background and Objective: Histograms and Pearson's coefficient of variation are among the most popular summary statistics. Researchers use histograms to judge the shape of a quantitative data distribution by visual inspection. The coefficient of variation is taken as an estimator of the relative variability of these data. We explore properties of histograms and the coefficient of variation through examples in R, and offer better alternatives: density plots and Eisenhauer's relative dispersion coefficient. Methods: Hypothetical examples developed in R are used to create histograms and density plots and to compute the coefficient of variation and the relative dispersion coefficient. Results: These hypothetical examples clearly show that the two traditional approaches are flawed. Histograms are incapable of reflecting the distribution of probabilities. The coefficient of variation has issues when negative and positive values occur in the same dataset, is sensitive to outliers, and is severely affected by the mean value of a distribution. Potential replacements are explained and applied for contrast. Conclusions: With modern computers and the R language it is easy to replace histograms with density plots, which are able to approximate the theoretical probability distribution. In addition, Eisenhauer's relative dispersion coefficient is suggested as a suitable estimator of relative variability, including corrections for lower and upper bounds.
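
The dependence of the coefficient of variation on the mean is easy to reproduce. The sketch below uses Python rather than the authors' R scripts, and the three small datasets are hypothetical values with identical spread. Eisenhauer's coefficient is deliberately not implemented here.

```python
import statistics

def coefficient_of_variation(values):
    """Pearson's coefficient of variation: sample standard deviation over the mean."""
    return statistics.stdev(values) / statistics.mean(values)

data = [8.0, 9.0, 10.0, 11.0, 12.0]   # mean 10
shifted = [v + 90.0 for v in data]    # same spread, mean 100
centred = [v - 10.0 for v in data]    # same spread, mean 0, mixed signs

print(coefficient_of_variation(data))
print(coefficient_of_variation(shifted))

try:
    coefficient_of_variation(centred)
except ZeroDivisionError:
    print("undefined when the mean is zero")
```

All three datasets have the same standard deviation, yet the coefficient drops by a factor of ten when the mean moves from 10 to 100 and is undefined when positive and negative values cancel to a zero mean.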

Statistic · Precision/Accuracy · Model evaluation · Robustness · CASE

12 November 2021

Accuracy, precision, and agreement statistical tests for Bland-Altman method

P. S. P. Silveira,J. E. Vieira,A. A. Ferraro,J. O. Siqueira

from arxiv, Previous version was rejected by Clinics (//www.clinicsjournal.com/). The current version was submitted today (12Nov2021) to BMC Medical Research Methodology

Background: The Bland and Altman plot method is a widely cited graphical approach to assess the equivalence of quantitative measurement techniques. It has been widely applied but is often misinterpreted for lack of inferential statistical support. We aim to develop and distribute a statistical method in R that adds robust and suitable inferential statistics of equivalence. Methods: Three nested tests based on structural regressions are proposed to assess the equivalence of structural means (accuracy), equivalence of structural variances (precision), and concordance with the structural bisector line (agreement in measurements of data pairs obtained from the same subject) to reach statistical support for the equivalence of measurement techniques. Graphical outputs illustrating these three tests were added to follow Bland and Altman's principles of easy communication. Results: Statistical p-values and a robust bootstrapping approach, with corresponding graphs, provide objective, robust measures of equivalence. Five pairs of data sets were analyzed in order to critique previously published articles that applied Bland and Altman's principles, thus showing the suitability of the present statistical approach. One case demonstrated strict equivalence, three cases showed partial equivalence, and one case showed poor equivalence. A package containing open code and data is available with installation instructions on GitHub for free distribution. Conclusions: Statistical p-values and the robust approach assess the equivalence of accuracy, precision, and agreement for measurement techniques. Decomposition into three tests helps locate any disagreement as a means to fix a new technique.
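
For orientation, the quantities behind a classic Bland and Altman plot are the mean difference between paired measurements (the bias) and the conventional 95% limits of agreement at the bias plus or minus 1.96 standard deviations of the differences. The sketch below computes only those descriptive quantities, in Python rather than R. It is not the authors' package and does not implement the three structural-regression tests; the paired readings are hypothetical.

```python
import statistics

def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    The limits are the conventional bias +/- 1.96 * SD of the paired differences.
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired readings from two measurement techniques on the same subjects.
readings_a = [10.0, 12.0, 14.0, 16.0]
readings_b = [9.0, 12.0, 13.0, 16.0]

bias, lower, upper = bland_altman(readings_a, readings_b)
print(bias, lower, upper)
```

These limits are purely descriptive, which is the gap the abstract points to: they carry no p-value or test of equivalence on their own.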

Test error · Biased · Networking · Neural Networks · Learned

12 November 2021

A Random Matrix Perspective on Mixtures of Nonlinearities for Deep Learning

Ben Adlam,Jake Levinson,Jeffrey Pennington

One of the distinguishing characteristics of modern deep learning systems is that they typically employ neural network architectures that utilize enormous numbers of parameters, often in the millions and sometimes even in the billions. While this paradigm has inspired significant research on the properties of large networks, relatively little work has been devoted to the fact that these networks are often used to model large complex datasets, which may themselves contain millions or even billions of constraints. In this work, we focus on this high-dimensional regime in which both the dataset size and the number of features tend to infinity. We analyze the performance of random feature regression with features $F=f(WX+B)$ for a random weight matrix $W$ and random bias vector $B$, obtaining exact formulae for the asymptotic training and test errors for data generated by a linear teacher model. The role of the bias can be understood as parameterizing a distribution over activation functions, and our analysis directly generalizes to such distributions, even those not expressible with a traditional additive bias. Intriguingly, we find that a mixture of nonlinearities can improve both the training and test errors over the best single nonlinearity, suggesting that mixtures of nonlinearities might be useful for approximate kernel methods or neural network architecture design.

INFORMS · Confidence · Singular · Fisher information matrix · Extensibility

12 November 2021

Confidence Regions Near Singular Information and Boundary Points With Applications to Mixed Models

Karl Oskar Ekvall,Matteo Bottai

We propose confidence regions with asymptotically correct uniform coverage probability of parameters whose Fisher information matrix can be singular at important points of the parameter set. Our work is motivated by the need for reliable inference on scale parameters close or equal to zero in mixed models, which is obtained as a special case. The confidence regions are constructed by inverting a continuous extension of the score test statistic standardized by expected information, which we show exists at points of singular information under regularity conditions. Similar results have previously only been obtained for scalar parameters, under conditions stronger than ours, and applications to mixed models have not been considered. In simulations our confidence regions have near-nominal coverage with as few as $n = 20$ independent observations, regardless of how close to the boundary the true parameter is. It is a corollary of our main results that the proposed test statistic has an asymptotic chi-square distribution with degrees of freedom equal to the number of tested parameters, even if they are on the boundary of the parameter set.

Statistic · Robustness · Estimation/Estimator · Statistical efficiency · Sample complexity

12 November 2021

Differential privacy and robust statistics in high dimensions

Xiyang Liu,Weihao Kong,Sewoong Oh

We introduce a universal framework for characterizing the statistical efficiency of a statistical estimation problem with differential privacy guarantees. Our framework, which we call High-dimensional Propose-Test-Release (HPTR), builds upon three crucial components: the exponential mechanism, robust statistics, and the Propose-Test-Release mechanism. Gluing all these together is the concept of resilience, which is central to robust statistical estimation. Resilience guides the design of the algorithm, the sensitivity analysis, and the success probability analysis of the test step in Propose-Test-Release. The key insight is that if we design an exponential mechanism that accesses the data only via one-dimensional robust statistics, then the resulting local sensitivity can be dramatically reduced. Using resilience, we can provide tight local sensitivity bounds. These tight bounds readily translate into near-optimal utility guarantees in several cases. We give a general recipe for applying HPTR to a given instance of a statistical estimation problem and demonstrate it on canonical problems of mean estimation, linear regression, covariance estimation, and principal component analysis. We introduce a general utility analysis technique that proves that HPTR nearly achieves the optimal sample complexity under several scenarios studied in the literature.

Identifiable · Graph · Normalized · INFORMS · Knowledge graph

23 March 2020

What is Normal, What is Strange, and What is Missing in a Knowledge Graph: Unified Characterization via Inductive Summarization

Caleb Belth,Xinyi Zheng,Jilles Vreeken,Danai Koutra

from arxiv, 10 pages, plus 2 pages of references. 5 figures. Accepted at The Web Conference 2020

Knowledge graphs (KGs) store highly heterogeneous information about the world in the structure of a graph, and are useful for tasks such as question answering and reasoning. However, they often contain errors and are missing information. Vibrant research in KG refinement has worked to resolve these issues, tailoring techniques to either detect specific types of errors or complete a KG. In this work, we introduce a unified solution to KG characterization by formulating the problem as unsupervised KG summarization with a set of inductive, soft rules, which describe what is normal in a KG, and thus can be used to identify what is abnormal, whether it be strange or missing. Unlike first-order logic rules, our rules are labeled, rooted graphs, i.e., patterns that describe the expected neighborhood around a (seen or unseen) node, based on its type, and information in the KG. Stepping away from the traditional support/confidence-based rule mining techniques, we propose KGist, Knowledge Graph Inductive SummarizaTion, which learns a summary of inductive rules that best compress the KG according to the Minimum Description Length principle---a formulation that we are the first to use in the context of KG rule mining. We apply our rules to three large KGs (NELL, DBpedia, and Yago), and tasks such as compression, various types of error detection, and identification of incomplete information. We show that KGist outperforms task-specific, supervised and unsupervised baselines in error detection and incompleteness identification (identifying the location of up to 93% of missing entities, over 10% more than baselines), while also being efficient for large knowledge graphs.