We propose an estimator of the kernel-based conditional mean dependence measure, obtained by suitably modifying a naive estimator built from the usual empirical estimators. We then establish the asymptotic normality of this estimator both under the hypothesis of conditional mean independence and under the alternative hypothesis. Finally, we introduce a new test for conditional mean independence of random variables taking values in Hilbert spaces.

Related content

Estimation/Estimator

Estimation/Estimator · Variance · Confidence · Identifiable · Coverage

September 19, 2022

Sharp bounds for variance of treatment effect estimators in the finite population in the presence of covariates

Ruoyu Wang, Qihua Wang, Wang Miao, Xiaohua Zhou

from arxiv, Accepted by Statistica Sinica

In a completely randomized experiment, the variances of treatment effect estimators in the finite population are usually not identifiable and hence not estimable. Although some estimable bounds of the variances have been established in the literature, few of them are derived in the presence of covariates. In this paper, the difference-in-means estimator and the Wald estimator are considered in the completely randomized experiment with perfect compliance and noncompliance, respectively. Sharp bounds for the variances of these two estimators are established when covariates are available. Furthermore, consistent estimators for such bounds are obtained, which can be used to shorten the confidence intervals and improve the power of tests. Confidence intervals are constructed based on the consistent estimators of the upper bounds, whose coverage rates are uniformly asymptotically guaranteed. Simulations were conducted to evaluate the proposed methods. The proposed methods are also illustrated with two real data analyses.
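As a point of reference for the abstract above, the difference-in-means estimator can be sketched together with the classical Neyman-style conservative variance estimate. This is not the sharp covariate-based bound the paper derives, whose form the abstract does not give. The function name `difference_in_means` and the toy data in the usage note are illustrative assumptions.

```python
from statistics import mean, variance


def difference_in_means(treated, control):
    """Return (effect estimate, conservative variance estimate).

    Assumption: this uses the classical Neyman upper bound
    s1^2/n1 + s0^2/n0, NOT the sharp bound from the paper.
    """
    tau_hat = mean(treated) - mean(control)
    # Sample variances (n - 1 denominator) of each arm, scaled by arm size.
    var_hat = variance(treated) / len(treated) + variance(control) / len(control)
    return tau_hat, var_hat


if __name__ == "__main__":
    tau, var = difference_in_means([3, 5, 7], [1, 2, 3])
    print(tau, var)
```

A confidence interval built from this variance estimate is conservative in general, which is the slack that sharper bounds aim to reduce.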

Estimation/Estimator · Mixture of experts model · MoDELS · Statistic · Feature selection

September 19, 2022

An $l_1$-oracle inequality for the Lasso in high-dimensional mixtures of experts models

TrungTin Nguyen, Hien D Nguyen, Faicel Chamroukhi, Geoffrey J McLachlan

from arxiv, Added more explanations. Amended title

Mixtures of experts (MoE) models are a popular framework for modeling heterogeneity in data, for both regression and classification problems in statistics and machine learning, due to their flexibility and the abundance of available statistical estimation and model choice tools. Such flexibility comes from allowing the mixture weights (or gating functions) in the MoE model to depend on the explanatory variables, along with the experts (or component densities). This permits the modeling of data arising from more complex data generating processes when compared to the classical finite mixtures and finite mixtures of regression models, whose mixing parameters are independent of the covariates. The use of MoE models in a high-dimensional setting, when the number of explanatory variables can be much larger than the sample size, is challenging from a computational point of view, and in particular from a theoretical point of view, where the literature is still lacking results for dealing with the curse of dimensionality, for both the statistical estimation and feature selection problems. We consider the finite MoE model with soft-max gating functions and Gaussian experts for high-dimensional regression on heterogeneous data, and its $l_1$-regularized estimation via the Lasso. We focus on the Lasso estimation properties rather than its feature selection properties. We provide a lower bound on the regularization parameter of the Lasso function that ensures an $l_1$-oracle inequality satisfied by the Lasso estimator according to the Kullback--Leibler loss.
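The model class described in this abstract can be written out as follows. The notation ($K$ experts, gating parameters $\gamma$, expert parameters $\beta$, $\sigma$, Gaussian density $\phi$, regularization parameter $\lambda$) is assumed here for illustration and may differ from the paper's. Which parameters the $l_1$ penalty covers is also an assumption, since the abstract does not say.

```latex
% Soft-max gated mixture of K Gaussian experts (conditional density of y given x)
s_\psi(y \mid x) = \sum_{k=1}^{K} g_k(x;\gamma)\,
    \phi\!\left(y;\ \beta_{k0} + \beta_k^\top x,\ \sigma_k^2\right),
\qquad
g_k(x;\gamma) = \frac{\exp\!\left(\gamma_{k0} + \gamma_k^\top x\right)}
                     {\sum_{l=1}^{K} \exp\!\left(\gamma_{l0} + \gamma_l^\top x\right)}

% l_1-penalized maximum likelihood (Lasso) estimator; penalized set is an assumption
\hat{\psi}(\lambda) \in \arg\min_{\psi}
\left\{ -\frac{1}{n} \sum_{i=1}^{n} \log s_\psi(y_i \mid x_i)
        + \lambda \sum_{k=1}^{K} \left( \lVert \gamma_k \rVert_1 + \lVert \beta_k \rVert_1 \right) \right\}
```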

Inner product · Estimation/Estimator · Outer product · MoDELS · Analysis

September 18, 2022

Derivation of an Inverse Spatial Autoregressive Model for Estimating Moran's Index

Yanguang Chen

from arxiv, 25 pages, 2 figures, 2 tables

Spatial autocorrelation measures such as Moran's index can be expressed as a pair of equations based on a standardized size variable and a globally normalized weight matrix. One is based on the inner product of the size variable, and the other on its outer product. The inner product equation is actually a spatial autocorrelation model. However, the theoretical basis of the inner product equation for Moran's index is not clear. This paper is devoted to revealing the antecedents and consequences of the inner product equation of Moran's index. The method combines mathematical derivation and empirical analysis. The main results are as follows. First, the inner product equation is derived from a simple spatial autoregressive model, and thus the relation between Moran's index and the spatial autoregressive coefficient is clarified. Second, least squares regression is proved to be one of the effective approaches for estimating the spatial autoregressive coefficient. Third, the value range of the spatial autoregressive coefficient can be identified from three angles. It can be concluded that a spatial autocorrelation model is actually an inverse spatial autoregressive model, and that Moran's index and spatial autoregressive models can be integrated into the same framework through the inner product and outer product equations. This work may be helpful for understanding the connections and differences between spatial autocorrelation measurements and spatial autoregressive modeling.
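The inner product form of Moran's index mentioned above can be sketched as I = zᵀWz, where z is the size variable standardized with the population standard deviation and W is the weight matrix rescaled so that its entries sum to one. The function name `morans_index` and the four-site chain example are illustrative assumptions, and the outer product equation is not shown.

```python
from math import sqrt


def morans_index(x, weights):
    """Moran's I as the quadratic form z^T W z.

    x       : list of observed values, one per spatial unit
    weights : square list-of-lists of non-negative spatial weights
    """
    n = len(x)
    mu = sum(x) / n
    # Population standard deviation (denominator n), as standardization requires.
    sigma = sqrt(sum((v - mu) ** 2 for v in x) / n)
    z = [(v - mu) / sigma for v in x]
    # Global normalization: divide by the sum of all weights.
    total = sum(sum(row) for row in weights)
    quad = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return quad / total


if __name__ == "__main__":
    # Four sites on a line; neighbours share an edge.
    chain = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
    print(morans_index([1, 2, 3, 4], chain))
```

This agrees with the textbook expression (n / S0) · Σ w_ij (x_i − x̄)(x_j − x̄) / Σ (x_i − x̄)², because the standardization absorbs the factor n.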

INFORMS · Learning · Maximum · Regularization term · Feature space

September 16, 2022

Self-Supervised Learning with an Information Maximization Criterion

Serdar Ozsoy, Shadi Hamdan, Sercan Ö. Arik, Deniz Yuret, Alper T. Erdogan

Self-supervised learning allows AI systems to learn effective representations from large amounts of data using tasks that do not require costly labeling. Mode collapse, i.e., the model producing identical representations for all inputs, is a central problem to many self-supervised learning approaches, making self-supervised tasks, such as matching distorted variants of the inputs, ineffective. In this article, we argue that a straightforward application of information maximization among alternative latent representations of the same input naturally solves the collapse problem and achieves competitive empirical results. We propose a self-supervised learning method, CorInfoMax, that uses a second-order statistics-based mutual information measure that reflects the level of correlation among its arguments. Maximizing this correlative information measure between alternative representations of the same input serves two purposes: (1) it avoids the collapse problem by generating feature vectors with non-degenerate covariances; (2) it establishes relevance among alternative representations by increasing the linear dependence among them. An approximation of the proposed information maximization objective simplifies to a Euclidean distance-based objective function regularized by the log-determinant of the feature covariance matrix. The regularization term acts as a natural barrier against feature space degeneracy. Consequently, beyond avoiding complete output collapse to a single point, the proposed approach also prevents dimensional collapse by encouraging the spread of information across the whole feature space. Numerical experiments demonstrate that CorInfoMax achieves better or competitive performance results relative to the state-of-the-art SSL approaches.
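The abstract describes a Euclidean distance objective regularized by the log-determinant of the feature covariance. A minimal sketch of a loss with that shape is given below. It is not the CorInfoMax objective itself, whose exact scaling and covariance updates are not stated in the abstract. The names `sketch_loss`, `alpha`, and `eps` are hypothetical.

```python
from math import log, sqrt


def covariance(rows, eps):
    """Biased sample covariance of the rows, plus eps on the diagonal."""
    n, d = len(rows), len(rows[0])
    mu = [sum(r[k] for r in rows) / n for k in range(d)]
    cov = [[sum((r[a] - mu[a]) * (r[b] - mu[b]) for r in rows) / n
            for b in range(d)] for a in range(d)]
    for k in range(d):
        cov[k][k] += eps  # keeps the matrix positive definite
    return cov


def log_det(m):
    """Log-determinant of a symmetric positive definite matrix via Cholesky."""
    d = len(m)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = sqrt(s) if i == j else s / low[j][j]
    return 2.0 * sum(log(low[i][i]) for i in range(d))


def sketch_loss(z1, z2, eps=1e-3, alpha=1.0):
    """Mean squared distance between paired views minus log-det regularizers.

    Assumption: alpha and eps are illustrative hyperparameters.
    """
    n = len(z1)
    dist = sum(sum((a - b) ** 2 for a, b in zip(r1, r2))
               for r1, r2 in zip(z1, z2)) / n
    return alpha * dist - log_det(covariance(z1, eps)) - log_det(covariance(z2, eps))


if __name__ == "__main__":
    collapsed = [[1.0, 1.0]] * 4
    spread = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
    print(sketch_loss(collapsed, collapsed), sketch_loss(spread, spread))
```

In this sketch, collapsed representations have a near-singular covariance, so the log-determinant term makes their loss much larger than that of well-spread representations. This is the barrier effect the abstract refers to.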

Latent variable/Hidden variable · Inference · Networking · Latent · Neural Networks

September 16, 2022

Mitigating the Effects of Non-Identifiability on Inference for Bayesian Neural Networks with Latent Variables

Yaniv Yacoby, Weiwei Pan, Finale Doshi-Velez

from arxiv, Accepted at JMLR 2022. Previously accepted at ICML's Uncertainty and Robustness in Deep Learning Workshop 2019

Bayesian Neural Networks with Latent Variables (BNN+LVs) capture predictive uncertainty by explicitly modeling model uncertainty (via priors on network weights) and environmental stochasticity (via a latent input noise variable). In this work, we first show that BNN+LV suffers from a serious form of non-identifiability: explanatory power can be transferred between the model parameters and latent variables while fitting the data equally well. We demonstrate that as a result, in the limit of infinite data, the posterior mode over the network weights and latent variables is asymptotically biased away from the ground-truth. Due to this asymptotic bias, traditional inference methods may in practice yield parameters that generalize poorly and misestimate uncertainty. Next, we develop a novel inference procedure that explicitly mitigates the effects of likelihood non-identifiability during training and yields high-quality predictions as well as uncertainty estimates. We demonstrate that our inference method improves upon benchmark methods across a range of synthetic and real data-sets.

Estimation/Estimator · Robustness · Precision/Accuracy · Functional · Precision matrix

September 15, 2022

The Influence Function of Graphical Lasso Estimators

Gaëtan Louvet, Jakob Raymaekers, Germain Van Bever, Ines Wilms

The precision matrix that encodes conditional linear dependency relations among a set of variables forms an important object of interest in multivariate analysis. Sparse estimation procedures for precision matrices such as the graphical lasso (Glasso) gained popularity as they facilitate interpretability, thereby separating pairs of variables that are conditionally dependent from those that are independent (given all other variables). Glasso lacks, however, robustness to outliers. To overcome this problem, one typically applies a robust plug-in procedure where the Glasso is computed from a robust covariance estimate instead of the sample covariance, thereby providing protection against outliers. In this paper, we study such estimators theoretically, by deriving and comparing their influence function, sensitivity curves and asymptotic variances.

Analysis · Invariant · Approximation · Likelihood · Vector space

September 15, 2022

On the detrimental effect of invariances in the likelihood for variational inference

Richard Kurle, Ralf Herbrich, Tim Januschowski, Yuyang Wang, Jan Gasthaus

Variational Bayesian posterior inference often requires simplifying approximations such as mean-field parametrisation to ensure tractability. However, prior work has associated the variational mean-field approximation for Bayesian neural networks with underfitting in the case of small datasets or large model sizes. In this work, we show that invariances in the likelihood function of over-parametrised models contribute to this phenomenon because these invariances complicate the structure of the posterior by introducing discrete and/or continuous modes which cannot be well approximated by Gaussian mean-field distributions. In particular, we show that the mean-field approximation has an additional gap in the evidence lower bound compared to a purpose-built posterior that takes into account the known invariances. Importantly, this invariance gap is not constant; it vanishes as the approximation reverts to the prior. We proceed by first considering translation invariances in a linear model with a single data point in detail. We show that, while the true posterior can be constructed from a mean-field parametrisation, this is achieved only if the objective function takes into account the invariance gap. Then, we transfer our analysis of the linear model to neural networks. Our analysis provides a framework for future work to explore solutions to the invariance problem.

Estimation/Estimator · Kernelization · Nearest neighbor · Analysis · Learning

September 14, 2022

Nearest Neighbor and Kernel Survival Analysis: Nonasymptotic Error Bounds and Strong Consistency Rates

George H. Chen

from arxiv, International Conference on Machine Learning (ICML 2019); this draft includes minor corrections

We establish the first nonasymptotic error bounds for Kaplan-Meier-based nearest neighbor and kernel survival probability estimators where feature vectors reside in metric spaces. Our bounds imply rates of strong consistency for these nonparametric estimators and, up to a log factor, match an existing lower bound for conditional CDF estimation. Our proof strategy also yields nonasymptotic guarantees for nearest neighbor and kernel variants of the Nelson-Aalen cumulative hazards estimator. We experimentally compare these methods on four datasets. We find that for the kernel survival estimator, a good choice of kernel is one learned using random survival forests.
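To make the estimators named in this abstract concrete, the sketch below computes a Kaplan-Meier curve and a k-nearest-neighbor variant that restricts the curve to the k training points closest to a query. It illustrates the general technique only, not the paper's exact estimators or tie-breaking rules. The function names and the `distance` callback are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  : observed times (event or censoring)
    events : 1 if the event was observed at that time, 0 if censored
    """
    curve = []
    surv = 1.0
    for t in sorted({u for u, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve


def knn_kaplan_meier(query, features, times, events, k, distance):
    """Kaplan-Meier curve computed on the k nearest neighbours of `query`.

    distance : any metric on the feature space (an assumed callback).
    """
    nearest = sorted(range(len(features)),
                     key=lambda i: distance(query, features[i]))[:k]
    return kaplan_meier([times[i] for i in nearest],
                        [events[i] for i in nearest])


if __name__ == "__main__":
    print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
    print(knn_kaplan_meier(0.0, [0.0, 0.1, 5.0, 5.1], [1, 2, 3, 4],
                           [1, 1, 1, 1], k=2,
                           distance=lambda a, b: abs(a - b)))
```

Because `distance` is an arbitrary callable, the same sketch applies to any metric space, which matches the setting the abstract describes.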

contrastive · Learned · Performer · Representation learning · Local representation

March 10, 2021

Spatially Consistent Representation Learning

Byungseok Roh, Wuhyun Shin, Ildoo Kim, Sungwoong Kim

from arxiv, Accepted by CVPR 2021

Self-supervised learning has been widely used to obtain transferable representations from unlabeled images. Notably, recent contrastive learning methods have shown impressive performance on downstream image classification tasks. While these contrastive methods mainly focus on generating invariant global representations at the image level under semantic-preserving transformations, they are prone to overlook the spatial consistency of local representations and are therefore limited in pretraining for localization tasks such as object detection and instance segmentation. Moreover, aggressively cropped views used in existing contrastive methods can minimize representation distances between semantically different regions of a single image. In this paper, we propose a spatially consistent representation learning algorithm (SCRL) for multi-object and location-specific tasks. In particular, we devise a novel self-supervised objective that tries to produce coherent spatial representations of a randomly cropped local region according to geometric translations and zooming operations. On various downstream localization tasks with benchmark datasets, the proposed SCRL shows significant performance improvements over image-level supervised pretraining as well as state-of-the-art self-supervised learning methods.

Conditional random field · Random field · INFORMS · Image segmentation · Convolutional neural network

December 27, 2017

Conditional Random Field and Deep Feature Learning for Hyperspectral Image Segmentation

Fahim Irfan Alam, Jun Zhou, Alan Wee-Chung Liew, Xiuping Jia, Jocelyn Chanussot, Yongsheng Gao

from arxiv, Submitted for Journal (Version 2)

Image segmentation is considered to be one of the critical tasks in hyperspectral remote sensing image processing. Recently, the convolutional neural network (CNN) has established itself as a powerful model for segmentation and classification by demonstrating excellent performance. The use of a graphical model such as a conditional random field (CRF) contributes further by capturing contextual information and thus improving segmentation performance. In this paper, we propose a method to segment hyperspectral images by considering both spectral and spatial information via a combined framework consisting of a CNN and a CRF. We use multiple spectral cubes to learn deep features using the CNN, and then formulate a deep CRF with CNN-based unary and pairwise potential functions to effectively extract the semantic correlations between patches consisting of three-dimensional data cubes. Effective piecewise training is applied in order to avoid the computationally expensive iterative CRF inference. Furthermore, we introduce a deep deconvolution network that improves the segmentation masks. We also introduce a new dataset and evaluate our proposed method on it, along with several widely adopted benchmark datasets. By comparing our results with those from several state-of-the-art models, we show the promising potential of our method.