
For testing conditional independence (CI) of a response Y and a predictor X given covariates Z, the recently introduced model-X (MX) framework has been the subject of active methodological research, especially in the context of MX knockoffs and their successful application to genome-wide association studies. In this paper, we study the power of MX CI tests, yielding quantitative explanations for empirically observed phenomena and novel insights to guide the design of MX methodology. We show that any valid MX CI test must also be valid conditionally on Y and Z; this conditioning allows us to reformulate the problem as testing a point null hypothesis involving the conditional distribution of X. The Neyman-Pearson lemma then implies that the conditional randomization test (CRT) based on a likelihood statistic is the most powerful MX CI test against a point alternative. We also obtain a related optimality result for MX knockoffs. Switching to an asymptotic framework with arbitrarily growing covariate dimension, we derive an expression for the limiting power of the CRT against local semiparametric alternatives in terms of the prediction error of the machine learning algorithm on which its test statistic is based. Finally, we exhibit a resampling-free test with uniform asymptotic Type-I error control under the assumption that only the first two moments of X given Z are known, a significant relaxation of the MX assumption.
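As a concrete reference for the conditional randomization test (CRT) mentioned above, the sketch below shows the generic resampling recipe: draw fresh copies of X from its (assumed known) conditional distribution given Z and compare the observed test statistic to its resampled counterparts. The Gaussian X | Z model, the coefficient vector `beta`, the noise level `sigma`, and the absolute-correlation statistic are all illustrative assumptions; the paper's optimality result concerns a likelihood-based statistic, which is not reproduced here.

```python
import numpy as np

def crt_pvalue(X, Y, Z, sample_X_given_Z, statistic, B=500, rng=None):
    """Conditional randomization test (CRT) p-value.

    X: (n,) observed predictor; Y: (n,) response; Z: (n, p) covariates.
    sample_X_given_Z: draws a fresh copy of X from P(X | Z) (the model-X assumption).
    statistic: T(X, Y, Z); larger values indicate stronger dependence.
    """
    rng = np.random.default_rng(rng)
    t_obs = statistic(X, Y, Z)
    t_null = np.array([statistic(sample_X_given_Z(Z, rng), Y, Z) for _ in range(B)])
    # Finite-sample valid p-value when P(X | Z) is correctly specified.
    return (1 + np.sum(t_null >= t_obs)) / (B + 1)

# --- illustrative example (all model choices below are assumptions) ---
rng = np.random.default_rng(0)
n, p = 300, 20
Z = rng.standard_normal((n, p))
beta = rng.standard_normal(p) / np.sqrt(p)   # assumed known X | Z model
sigma = 1.0
X = Z @ beta + sigma * rng.standard_normal(n)
Y = 0.5 * X + Z[:, 0] + rng.standard_normal(n)   # Y depends on X, so H0 is false

def sample_X_given_Z(Z, rng):
    return Z @ beta + sigma * rng.standard_normal(Z.shape[0])

def abs_corr(X, Y, Z):
    return abs(np.corrcoef(X, Y)[0, 1])

print("CRT p-value:", crt_pvalue(X, Y, Z, sample_X_given_Z, abs_corr, rng=1))
```

Validity of the resulting p-value holds conditionally on Y and Z, in line with the paper's observation; power, by contrast, hinges on the choice of test statistic.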

Related Content

Holonomic functions play an essential role in Computer Algebra since they allow the application of many symbolic algorithms. Among all algorithmic attempts to find formulas for power series, the holonomic property remains the most important requirement to be satisfied by the function under consideration. The functions targeted are mainly meromorphic functions. However, expressions like $\tan(z)$, $z/(\exp(z)-1)$, $\sec(z)$, etc. are not holonomic, and therefore their power series are inaccessible to non-pattern-matching implementations like the current Maple \texttt{convert/FormalPowerSeries}. From the mathematical dictionaries, one can observe that most of the known closed-form formulas for non-holonomic power series involve another sequence whose evaluation depends on some finite summations. In the case of $\tan(z)$ and $\sec(z)$, the corresponding sequences are the Bernoulli and Euler numbers, respectively. Thus, a symbolic approach that yields complete representations when linear summations appear in the power series coefficients of non-holonomic functions can be seen as a step towards the representation of non-holonomic power series. By adapting the method of ansatz with undetermined coefficients, we build an algorithm that computes least-order quadratic differential equations with polynomial coefficients for a large class of non-holonomic functions. A differential equation resulting from this procedure is converted into a recurrence equation by applying the Cauchy product formula and rewriting powers into polynomials and derivatives into shifts. Finally, using enough initial values, we are able to give normal-form representations that characterize several non-holonomic power series and to prove non-trivial identities. We discuss this algorithm and its implementation for Maple 2022.
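The ODE-to-recurrence step can be illustrated on the simplest example from the abstract: $y=\tan(z)$ satisfies the quadratic differential equation $y' = 1 + y^2$, and applying the Cauchy product to $y^2$ turns this into a recurrence for the series coefficients. The short Python sketch below (not the Maple implementation discussed in the paper) computes the coefficients exactly with rational arithmetic.

```python
from fractions import Fraction

def tan_series_coeffs(n_terms):
    """Coefficients a_k of tan(z) = sum_k a_k z^k, derived from y' = 1 + y^2.

    Comparing coefficients of z^n on both sides and using the Cauchy product
    for y^2 gives: (n + 1) a_{n+1} = [n == 0] + sum_{k=0}^{n} a_k a_{n-k}.
    """
    a = [Fraction(0)] * n_terms
    for n in range(n_terms - 1):
        conv = sum(a[k] * a[n - k] for k in range(n + 1))
        a[n + 1] = (Fraction(1 if n == 0 else 0) + conv) / (n + 1)
    return a

print(tan_series_coeffs(8))   # [0, 1, 0, 1/3, 0, 2/15, 0, 17/315]
```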

We introduce the concepts of dependence and independence in a very general framework. We use a concept of rank to study dependence and independence. By means of the rank we identify (total) dependence with inability to create more diversity, and (total) independence with the presence of maximum diversity. We show that our theory of dependence and independence covers a variety of dependence concepts, for example the seemingly unrelated concepts of linear dependence in algebra and dependence of variables in logic.

We prove tight H\"olderian error bounds for all $p$-cones. Surprisingly, the exponents differ in several ways from those previously conjectured; moreover, they illuminate $p$-cones as a curious example of a class of objects that possess properties in 3 dimensions that they lack in 4 or more. Using our error bounds, we analyse least squares problems with $p$-norm regularization, where our results enable us to compute the corresponding KL exponents for previously inaccessible values of $p$. Another application is a (relatively) simple proof that most $p$-cones are neither self-dual nor homogeneous. Our error bounds are obtained under the framework of facial residual functions, and we expand this framework by establishing, for general cones, an optimality criterion under which the resulting error bound must be tight.

While momentum-based methods, in conjunction with stochastic gradient descent (SGD), are widely used when training machine learning models, there is little theoretical understanding of the generalization error of such methods. In this work, we first show that there exists a convex loss function for which algorithmic stability fails to establish generalization guarantees when SGD with standard heavy-ball momentum (SGDM) is run for multiple epochs. Then, for smooth Lipschitz loss functions, we analyze a modified momentum-based update rule, i.e., SGD with early momentum (SGDEM), and show that it admits an upper bound on the generalization error. Thus, our results show that machine learning models can be trained for multiple epochs of SGDEM with a guarantee for generalization. Finally, for the special case of strongly convex loss functions, we find a range of momentum values such that multiple epochs of standard SGDM, as a special form of SGDEM, also generalizes. Extending our results on generalization, we also develop an upper bound on the expected true risk in terms of the number of training steps, the size of the training set, and the momentum parameter. Experimental evaluations verify the consistency between the numerical results and our theoretical bounds, and the effectiveness of SGDEM for smooth Lipschitz loss functions.
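For reference, the sketch below implements the standard heavy-ball SGDM update together with an "early momentum" variant in which the momentum parameter is simply set to zero after a cutoff step. This cutoff-based reading of SGDEM, as well as the toy least squares objective and the step sizes, are assumptions for illustration and may differ from the paper's exact definitions.

```python
import numpy as np

def sgd_momentum(grad, w0, lr=0.02, mu=0.9, n_steps=2000,
                 momentum_cutoff=None, rng=None):
    """SGD with heavy-ball momentum.

    Update: v_{t+1} = mu_t * v_t - lr * g_t,  w_{t+1} = w_t + v_{t+1},
    where g_t is a stochastic gradient, mu_t = mu while t < momentum_cutoff
    (our illustrative reading of 'early momentum'), and mu_t = 0 afterwards.
    """
    rng = np.random.default_rng(rng)
    w, v = w0.copy(), np.zeros_like(w0)
    for t in range(n_steps):
        mu_t = mu if (momentum_cutoff is None or t < momentum_cutoff) else 0.0
        g = grad(w, rng)
        v = mu_t * v - lr * g
        w = w + v
    return w

# Toy smooth objective: least squares on a random design, one sample per step.
rng = np.random.default_rng(0)
A = rng.standard_normal((500, 10))
w_star = rng.standard_normal(10)
b = A @ w_star + 0.1 * rng.standard_normal(500)

def stochastic_grad(w, rng):
    i = rng.integers(len(b))
    return (A[i] @ w - b[i]) * A[i]

w_sgdm  = sgd_momentum(stochastic_grad, np.zeros(10))                       # standard SGDM
w_sgdem = sgd_momentum(stochastic_grad, np.zeros(10), momentum_cutoff=200)  # early momentum only
print(np.linalg.norm(w_sgdm - w_star), np.linalg.norm(w_sgdem - w_star))
```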

While a substantial literature on structural break (change point) analysis exists for univariate time series, research on large panel data models has not been as extensive. In this paper, a novel method for estimating panel models with multiple structural changes is proposed. The breaks are allowed to occur at unknown points in time and may affect the multivariate slope parameters individually. Our method adapts Haar wavelets to the structure of the observed variables in order to detect the change points of the parameters consistently. We also develop methods to address endogenous regressors within our modeling framework. The asymptotic properties of our estimator are established. In our application, we examine the impact of algorithmic trading on standard measures of market quality such as liquidity and volatility over a time period that covers the financial meltdown that began in 2007. We are able to detect jumps in regression slope parameters automatically without using ad hoc subsample selection criteria.
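The role of Haar-type contrasts in break detection can be illustrated in a simplified univariate setting: the contrast at a candidate break compares local means on either side, and breaks are flagged where its absolute value exceeds a threshold. The sketch below is a toy illustration of this idea (window size and threshold are arbitrary choices), not the panel-data estimator developed in the paper.

```python
import numpy as np

def haar_contrast(y, k, h):
    """Haar-type contrast at location k with window h: a scaled difference
    of the means of y over (k - h, k] and (k, k + h]."""
    left, right = y[k - h:k], y[k:k + h]
    return np.sqrt(h / 2.0) * (right.mean() - left.mean())

def detect_breaks(y, h=30, threshold=3.0):
    """Flag local maxima of the absolute Haar contrast that exceed a threshold."""
    ks = np.arange(h, len(y) - h)
    scores = np.array([abs(haar_contrast(y, k, h)) for k in ks])
    return [ks[i] for i in range(1, len(scores) - 1)
            if scores[i] > threshold
            and scores[i] >= scores[i - 1] and scores[i] >= scores[i + 1]]

# Piecewise-constant mean with breaks at indices 200 and 350, plus noise.
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0, 1, 200), rng.normal(2, 1, 150), rng.normal(-1, 1, 150)])
print(detect_breaks(y))   # flagged locations cluster near the true breaks
```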

In this paper we consider the uniformity testing problem for high-dimensional discrete distributions (multinomials) under sparse alternatives. More precisely, we derive sharp detection thresholds for testing, based on $n$ samples, whether a discrete distribution supported on $d$ elements differs from the uniform distribution only in $s$ (out of the $d$) coordinates and is $\varepsilon$-far (in total variation distance) from uniformity. Our results reveal various interesting phase transitions which depend on the interplay of the sample size $n$ and the signal strength $\varepsilon$ with the dimension $d$ and the sparsity level $s$. For instance, if the sample size is less than a threshold (which depends on $d$ and $s$), then all tests are asymptotically powerless, irrespective of the magnitude of the signal strength. On the other hand, if the sample size is above the threshold, then the detection boundary undergoes a further phase transition depending on the signal strength. Here, a $\chi^2$-type test attains the detection boundary in the dense regime, whereas in the sparse regime a Bonferroni correction of two maximum-type tests and a version of the Higher Criticism test is optimal up to sharp constants. These results combined provide a complete description of the phase diagram for the sparse uniformity testing problem across all regimes of the parameters $n$, $d$, and $s$. One of the challenges in dealing with multinomials is that the parameters are always constrained to lie in the simplex. This results in the aforementioned two-layered phase transition, a new phenomenon which does not arise in classical high-dimensional sparse testing problems.
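For reference, a standard instance of a $\chi^2$-type test is Pearson's chi-squared statistic for the multinomial counts against the uniform distribution, which compares observed counts to the expected count $n/d$. The sketch below is an illustrative dense-regime test with an asymptotic calibration, not the sparse-regime Bonferroni/Higher-Criticism procedure analysed in the paper; the simulated sparse alternative is likewise only an assumption for demonstration.

```python
import numpy as np
from scipy.stats import chi2

def chi2_uniformity_test(samples, d):
    """Pearson chi-squared test of uniformity over {0, ..., d-1}.

    Returns the statistic and an asymptotic p-value (chi-squared, d - 1 df).
    """
    n = len(samples)
    counts = np.bincount(samples, minlength=d)
    expected = n / d
    stat = np.sum((counts - expected) ** 2) / expected
    return stat, chi2.sf(stat, df=d - 1)

# Sparse alternative: only s out of d coordinates carry extra mass.
rng = np.random.default_rng(0)
d, s, n, eps = 1000, 10, 20000, 0.05
probs = np.full(d, (1 - eps) / d)
probs[:s] += eps / s                      # s perturbed coordinates
samples = rng.choice(d, size=n, p=probs)
print(chi2_uniformity_test(samples, d))
```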

A central problem in Binary Hypothesis Testing (BHT) is to determine the optimal tradeoff between the Type I error probability (false alarm) and the Type II error probability (miss). In this context, the exponential rate of convergence of the optimal miss error probability -- as the sample size tends to infinity -- given some (positive) restrictions on the false alarm probability is a fundamental question to address in theory. Considering the more realistic context of a BHT with a finite number of observations, this paper presents a new non-asymptotic result for the scenario with a monotonic (sub-exponentially decreasing) restriction on the Type I error probability, which extends the result presented by Strassen in 2009. Building on the use of concentration inequalities, we offer new upper and lower bounds on the optimal Type II error probability for the case of finite observations. Finally, the derived bounds are evaluated and interpreted numerically (as a function of the number of samples) for some vanishing Type I error restrictions.
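The finite-sample quantity discussed above can be computed exactly in the simplest Bernoulli instance: by the Neyman-Pearson lemma, the optimal test is a randomized threshold on the number of successes, so the minimum Type II error under a Type I restriction is a closed-form expression in binomial tails. The sketch below is a toy illustration of this quantity (including one arbitrary choice of vanishing Type I restriction, $\alpha_n = 1/n$), not the concentration-inequality bounds derived in the paper.

```python
import numpy as np
from scipy.stats import binom

def optimal_type_ii_error(n, p0, p1, alpha):
    """Minimum Type II error for H0: Bern(p0) vs H1: Bern(p1), with p1 > p0,
    from n i.i.d. samples under the constraint Type I error <= alpha (< 1).

    By the Neyman-Pearson lemma, the optimal test is a randomized threshold
    on the number of successes K (the likelihood ratio is monotone in K)."""
    ks = np.arange(n + 1)
    tail0 = binom.sf(ks - 1, n, p0)          # P0(K >= k) for each k
    k_star = int(np.argmax(tail0 <= alpha))  # smallest k with P0(K >= k) <= alpha
    pmf0_below = binom.pmf(k_star - 1, n, p0)
    gamma = (alpha - tail0[k_star]) / pmf0_below if pmf0_below > 0 else 0.0
    # Type II error: accept H0 when K < k_star, minus the randomized mass at k_star - 1.
    return binom.cdf(k_star - 1, n, p1) - gamma * binom.pmf(k_star - 1, n, p1)

# Fixed and vanishing Type I restrictions; for fixed alpha, the exponent
# -log(beta)/n approaches D(Bern(p0) || Bern(p1)) as n grows (Stein's lemma).
for n in [50, 200, 800]:
    beta_fixed = optimal_type_ii_error(n, 0.3, 0.5, alpha=0.05)
    beta_vanishing = optimal_type_ii_error(n, 0.3, 0.5, alpha=1.0 / n)
    print(n, beta_fixed, beta_vanishing, -np.log(beta_fixed) / n)
```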

This paper studies distributed binary testing of statistical independence under communication (information bits) constraints. While testing independence is relevant in various applications, distributed independence testing is particularly useful for event detection in sensor networks, where data correlation often occurs among observations of devices in the presence of a signal of interest. Focusing on the case of two devices because of its tractability, we begin by investigating conditions on the Type I error probability restriction under which the minimum Type II error admits an exponential behavior with the sample size. Then, we study the finite sample-size regime of this problem. We derive new upper and lower bounds for the gap between the minimum Type II error and its exponential approximation under different setups, including restrictions imposed on the vanishing Type I error probability. Our theoretical results shed light on the sample-size regimes at which approximations of the Type II error probability via error exponents become informative enough, in the sense of accurately predicting the actual error probability. We finally discuss an application of our results in which the gap is evaluated numerically, and we show that exponential approximations are not only tractable but also a valuable proxy for the Type II probability of error in the finite-length regime.

This paper focuses on the expected difference in a borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook the confounding effects, and hence the estimation error can be significant. We therefore propose another approach to constructing the estimators such that the error can be greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of the classical and the proposed estimators in estimating the causal quantities. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, degrees of nonlinearity, and distributional properties. Most importantly, we apply our approach to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction in estimation error is strikingly substantial when the causal effects are accounted for correctly.
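The confounding problem described above can be made concrete in a small simulation: when the credit decision depends on a covariate that also drives repayment, a naive comparison of approved and rejected borrowers is biased, while an adjustment such as inverse-propensity weighting recovers the effect. The IPW estimator below is a generic illustration of confounding correction on synthetic data, not the specific estimator proposed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
credit_score = rng.normal(0, 1, n)                        # confounder Z
p_approve = 1 / (1 + np.exp(-2 * credit_score))           # decision depends on Z
T = rng.binomial(1, p_approve)                            # lender's credit decision
repayment = 1.0 * T + 2.0 * credit_score + rng.normal(0, 1, n)   # true effect = 1.0

# Naive estimator: difference in mean repayment, ignoring confounding.
naive = repayment[T == 1].mean() - repayment[T == 0].mean()

# Inverse-propensity-weighted estimator, using the (here, known) propensity.
ipw = np.mean(T * repayment / p_approve) - np.mean((1 - T) * repayment / (1 - p_approve))

print(f"naive: {naive:.2f}, IPW: {ipw:.2f}, truth: 1.00")
```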

We propose a new method of estimation in topic models that is not a variation on the existing simplex-finding algorithms and that estimates the number of topics K from the observed data. We derive new finite-sample minimax lower bounds for the estimation of A, as well as new upper bounds for our proposed estimator. We describe the scenarios in which our estimator is minimax adaptive. Our finite-sample analysis is valid for any number of documents (n), individual document length (N_i), dictionary size (p), and number of topics (K), and both p and K are allowed to increase with n, a situation not handled well by previous analyses. We complement our theoretical results with a detailed simulation study. We illustrate that the new algorithm is faster and more accurate than the current ones, even though it starts with the computational and theoretical disadvantage of not knowing the correct number of topics K, while the competing methods are provided with the correct value in our simulations.
