
The identification of dependent components across multiple data sets is a fundamental problem in many practical applications. The challenge in these applications is that the data sets are often high-dimensional with few observations or available samples and contain latent components with unknown probability distributions. A novel mathematical formulation of this problem is proposed, which enables the inference of the underlying correlation structure with strict false positive control. In particular, the false discovery rate is controlled at a pre-defined threshold on two levels simultaneously. The deployed test statistics are derived from the sample coherence matrix. The required probability models are learned from the data using the bootstrap. Local false discovery rates are used to solve the multiple hypothesis testing problem. In contrast to existing techniques in the literature, the developed technique does not assume an a priori correlation structure and works well when the number of data sets is large while the number of observations is small. In addition, it can handle the presence of distributional uncertainties, heavy-tailed noise, and outliers.
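As a rough illustration of the pipeline described above, the following NumPy sketch computes squared sample coherence statistics between two toy data sets, builds a permutation-based null in the spirit of the bootstrap, and applies a Benjamini-Hochberg step-up as a simplified stand-in for the paper's two-level local false discovery rate procedure; all data and parameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_coherence(x, y):
    """Squared sample correlation between two series (a coherence-type statistic)."""
    x = x - x.mean(); y = y - y.mean()
    r = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return r ** 2

# Two toy "data sets": one genuinely correlated channel plus independent noise.
n = 40
z = rng.standard_normal(n)
X = np.c_[z + 0.3 * rng.standard_normal(n), rng.standard_normal(n)]
Y = np.c_[z + 0.3 * rng.standard_normal(n), rng.standard_normal(n)]

stats = np.array([[sample_coherence(X[:, i], Y[:, j]) for j in range(2)]
                  for i in range(2)])

# Resampled null: permuting one series destroys any dependence.
B = 2000
null = np.array([sample_coherence(rng.permutation(X[:, 0]), Y[:, 0])
                 for _ in range(B)])
pvals = np.array([(null >= s).mean() for s in stats.ravel()])

# Benjamini-Hochberg step-up at level q (a stand-in for local fdr control).
q = 0.1
order = np.argsort(pvals)
thresh = q * np.arange(1, pvals.size + 1) / pvals.size
passed = pvals[order] <= thresh
k = passed.nonzero()[0].max() + 1 if passed.any() else 0
print("discovered (flat) component pairs:", order[:k])
```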

Related content

The properties of the generalized Waring distribution defined on the nonnegative integers are reviewed. Formulas for its moments and its mode are given. A construction as a mixture of negative binomial distributions is also presented. We then turn to the Petersen model for estimating the population size $N$ in a two-way capture-recapture experiment. We construct a Bayesian model for $N$ by combining a Waring prior with the hypergeometric distribution for the number of units caught twice in the experiment. Confidence intervals for $N$ are obtained using quantiles of the posterior, a generalized Waring distribution. The standard confidence interval for the population size, constructed using the asymptotic variance of the Petersen estimator, and the 0.5 logit transformed interval are shown to be special cases of the generalized Waring confidence interval. The true coverage of this interval is shown to be greater than or equal to its nominal coverage in small populations, regardless of the capture probabilities. In addition, its length is substantially smaller than that of the 0.5 logit transformed interval. Thus the generalized Waring confidence interval appears to be the best way to quantify the uncertainty of the Petersen estimator for the population size.
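For reference, a minimal sketch of the baseline the proposed interval is compared against: the Petersen estimator $\hat{N} = n_1 n_2 / m$ and the standard Wald interval built from one common textbook form of its asymptotic variance. The generalized Waring posterior interval itself requires the paper's Waring prior and is not reproduced here.

```python
import math

def petersen_ci(n1, n2, m, z=1.959963984540054):
    """Petersen estimate of population size with the standard Wald interval.

    n1: units caught on occasion 1; n2: caught on occasion 2; m: caught twice.
    Uses a common textbook variance estimate n1*n2*(n1-m)*(n2-m)/m^3.
    """
    N_hat = n1 * n2 / m
    var_hat = n1 * n2 * (n1 - m) * (n2 - m) / m ** 3
    half = z * math.sqrt(var_hat)
    return N_hat, (N_hat - half, N_hat + half)

# Hypothetical example: 200 marked, 150 caught on occasion 2, 30 recaptures.
print(petersen_ci(200, 150, 30))  # N_hat = 1000.0 with its 95% Wald interval
```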

This paper deals with the estimation of population sizes for respondent-driven sampling (RDS), a variant of link-tracing sampling that leverages social networks over a number of waves to recruit individuals from hidden populations. The RDS process is mostly controlled by individual participants, who might report on recruitment proposals, or nominations, that they have received or given. By considering all nominations given or received over a time period, one can create a capture-recapture dataset in which units are individuals who have received at least one nomination and capture occasions are either time intervals or recruitment waves, with the goal of estimating the size $N$ of the hidden population. In this paper, we argue that the underlying process that generates the RDS nomination data is that of a capture-recapture experiment. We then propose a methodology for the estimation of the population size and investigate its performance under departures from classical capture-recapture assumptions.
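A minimal sketch of the data construction step, assuming hypothetical nomination records: each individual ever nominated becomes a unit and each wave a capture occasion, yielding the capture-history matrix a capture-recapture model would consume.

```python
import numpy as np

# Toy nomination records as (person_id, wave) pairs, as might be extracted
# from RDS coupon data (hypothetical example data).
records = [("a", 1), ("b", 1), ("a", 2), ("c", 2), ("b", 3), ("c", 3), ("d", 3)]

people = sorted({p for p, _ in records})
waves = sorted({w for _, w in records})

# Capture-history matrix: rows are nominated individuals, columns are waves.
H = np.zeros((len(people), len(waves)), dtype=int)
for p, w in records:
    H[people.index(p), waves.index(w)] = 1

print(people)
print(H)  # feed H into a capture-recapture estimator of N
```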

Quantum neural networks (QNNs) use parameterized quantum circuits with data-dependent inputs and generate outputs through the evaluation of expectation values. Calculating these expectation values necessitates repeated circuit evaluations, thus introducing fundamental finite-sampling noise even on error-free quantum computers. We reduce this noise by introducing variance regularization, a technique for reducing the variance of the expectation value during quantum model training. This technique requires no additional circuit evaluations if the QNN is properly constructed. Our empirical findings demonstrate that the reduced variance speeds up the training, lowers the output noise, and decreases the number of necessary evaluations of gradient circuits. This regularization method is benchmarked on the regression of multiple functions. We show that in our examples it lowers the variance by an order of magnitude on average and leads to a significantly reduced noise level of the QNN. We finally demonstrate QNN training on a real quantum device and evaluate the impact of error mitigation. Here, the optimization is feasible only because of the reduced number of shots required in the gradient evaluation, which results from the reduced variance.
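A classical toy surrogate of the idea, not the authors' quantum implementation: for a Pauli observable, the shot-noise variance of the expectation estimate is $(1 - \langle O\rangle^2)/N_\text{shots}$, so penalizing $1 - f(x)^2$ for a bounded model output $f$ mimics variance regularization without extra evaluations. All functions and hyperparameters below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 50)
y = 0.8 * np.sin(np.pi * x)  # target kept inside [-1, 1]
theta = 0.1 * rng.standard_normal(4)

def model(theta, x):
    # Bounded output in (-1, 1), standing in for a Pauli expectation value.
    return np.tanh(theta[0] + theta[1] * x + theta[2] * np.sin(theta[3] * x))

def loss(theta, alpha=0.05):
    f = model(theta, x)
    mse = np.mean((f - y) ** 2)
    var_penalty = np.mean(1.0 - f ** 2)  # proxy for the shot-noise variance
    return mse + alpha * var_penalty

# Crude finite-difference gradient descent, just to exercise the objective.
for step in range(500):
    g = np.array([(loss(theta + 1e-4 * e) - loss(theta - 1e-4 * e)) / 2e-4
                  for e in np.eye(4)])
    theta -= 0.5 * g
print("final regularized loss:", loss(theta))
```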

The expectation-maximization (EM) algorithm and its variants are widely used in statistics. In high-dimensional mixture linear regression, the model is assumed to be a finite mixture of linear regressions and the number of predictors is much larger than the sample size. The standard EM algorithm, which attempts to find the maximum likelihood estimator, becomes infeasible for such a model. We devise a group lasso penalized EM algorithm and study its statistical properties. Existing theoretical results for regularized EM algorithms often rely on dividing the sample into many independent batches and employing a fresh batch in each iteration of the algorithm. Our algorithm and theoretical analysis do not require sample splitting, and they extend to multivariate response cases. The proposed methods also show encouraging performance in numerical studies.
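A minimal NumPy sketch of one possible group lasso penalized EM, assuming a two-component Gaussian mixture of linear regressions and using a single proximal-gradient step per M-step, with each predictor's coefficients across components forming a group. This is an illustration of the structure, not the paper's algorithm or its theory.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, K = 100, 50, 2
X = rng.standard_normal((n, p))
beta_true = np.zeros((p, K)); beta_true[0, 0], beta_true[1, 1] = 3.0, -3.0
z = rng.integers(0, K, n)
y = np.einsum("ij,jk->ik", X, beta_true)[np.arange(n), z] \
    + 0.5 * rng.standard_normal(n)

beta = 0.01 * rng.standard_normal((p, K))
pi = np.full(K, 1.0 / K); sigma = 1.0
lam, lr = 0.1, 0.05

for it in range(300):
    # E-step: responsibilities under Gaussian component densities.
    resid = y[:, None] - X @ beta                      # (n, K)
    logp = np.log(pi) - 0.5 * (resid / sigma) ** 2
    logp -= logp.max(axis=1, keepdims=True)
    r = np.exp(logp); r /= r.sum(axis=1, keepdims=True)
    # M-step: one proximal-gradient step on the weighted least squares,
    # with a group lasso penalty grouping each predictor across components.
    grad = -(X.T @ (r * resid)) / (n * sigma ** 2)     # (p, K)
    beta -= lr * grad
    norms = np.linalg.norm(beta, axis=1, keepdims=True)
    beta *= np.maximum(0.0, 1.0 - lr * lam / np.maximum(norms, 1e-12))
    pi = r.mean(axis=0)
    sigma = np.sqrt(np.sum(r * resid ** 2) / n)

print("predictor groups kept:",
      np.nonzero(np.linalg.norm(beta, axis=1) > 1e-3)[0])
```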

Consider the community detection problem in random hypergraphs under the non-uniform hypergraph stochastic block model (HSBM), where each hyperedge appears independently with some given probability depending only on the labels of its vertices. We establish, for the first time in the literature, a sharp threshold for exact recovery in this non-uniform case, subject to minor constraints; in particular, we consider the model with multiple communities ($K \geq 2$). One crucial point here is that, by aggregating information from all the uniform layers, we may obtain exact recovery even in cases where it would appear impossible if each layer were considered alone. Two efficient algorithms that successfully achieve exact recovery above the threshold are provided. The theoretical analysis of our algorithms relies on the concentration and regularization of the adjacency matrix for non-uniform random hypergraphs, which could be of independent interest. We also address some open problems regarding parameter knowledge and estimation.
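Not the authors' algorithm, but a minimal sketch of the aggregation idea: sum the pairwise co-occurrence counts from each uniform layer into one weighted adjacency matrix and run a simple spectral step on it (toy parameters, brute-force hyperedge sampling for small $n$).

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)
n, K = 60, 2
labels = np.repeat(np.arange(K), n // K)

def sample_layer(m, p_in, p_out):
    """m-uniform layer: each m-subset is a hyperedge w.p. p_in if all its
    vertices share one label, p_out otherwise (brute force, small n only)."""
    edges = []
    for e in combinations(range(n), m):
        p = p_in if len({labels[v] for v in e}) == 1 else p_out
        if rng.random() < p:
            edges.append(e)
    return edges

# Aggregate all uniform layers into one weighted adjacency matrix.
A = np.zeros((n, n))
for m, p_in, p_out in [(2, 0.20, 0.05), (3, 0.05, 0.005)]:
    for e in sample_layer(m, p_in, p_out):
        for u, v in combinations(e, 2):
            A[u, v] += 1; A[v, u] += 1

# Spectral step: the sign of the second eigenvector splits two communities.
w, V = np.linalg.eigh(A)
pred = (V[:, -2] > 0).astype(int)
acc = max(np.mean(pred == labels), np.mean(pred != labels))
print("recovery accuracy:", acc)
```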

The notion of $c$-differential uniformity has recently received a lot of attention since its proposal~\cite{Ellingsen}, and a characterization of perfect $c$-nonlinear functions in terms of difference sets in some quasigroups was recently obtained in~\cite{AMS22}. Independent of their applications as a measure of certain statistical biases, the construction of functions, especially permutations, with low $c$-differential uniformity is an interesting mathematical problem in this area, and recent work has focused heavily in this direction. We provide a few classes of permutation polynomials with low $c$-differential uniformity. The technique involves handling various Weil sums, as well as analyzing some equations in finite fields, and we believe these tools can be of independent interest.
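For concreteness, the defining quantity can be computed by brute force over a small prime field. The sketch below counts, for the hypothetical example $F(x) = x^3$ over $\mathrm{GF}(11)$ and $c = 2$, the maximum number of solutions $x$ of $F(x+a) - cF(x) = b$ over all pairs $(a, b)$, skipping $a = 0$ only when $c = 1$.

```python
# Brute-force c-differential uniformity of F(x) = x^3 over GF(p), p prime.
p, c = 11, 2

def F(x):
    return pow(x, 3, p)

delta = 0
for a in range(p):
    if c == 1 and a == 0:  # the trivial derivative is excluded when c = 1
        continue
    counts = {}
    for x in range(p):
        b = (F((x + a) % p) - c * F(x)) % p
        counts[b] = counts.get(b, 0) + 1
    delta = max(delta, max(counts.values()))

print(f"{c}-differential uniformity of x^3 over GF({p}) = {delta}")
```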

Causal inference from observational data requires untestable identification assumptions. If these assumptions apply, machine learning (ML) methods can be used to study complex forms of causal effect heterogeneity. Recently, several ML methods were developed to estimate the conditional average treatment effect (CATE). If the features at hand cannot explain all heterogeneity, the individual treatment effects (ITEs) can seriously deviate from the CATE. In this work, we demonstrate how the distributions of the ITE and the CATE can differ when a causal random forest (CRF) is applied. We extend the CRF to estimate the difference in conditional variance between treated and controls. If the ITE distribution equals the CATE distribution, this estimated difference in variance should be small. If they differ, an additional causal assumption is necessary to quantify the heterogeneity not captured by the CATE distribution. The conditional variance of the ITE can be identified when the individual effect is independent of the outcome under no treatment given the measured features. Then, in the cases where the ITE and CATE distributions differ, the extended CRF can appropriately estimate the variance of the ITE distribution while the CRF fails to do so.
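A plug-in sketch of the quantity of interest using generic regression forests rather than the extended CRF: estimate $\mathrm{Var}(Y \mid X, T{=}1) - \mathrm{Var}(Y \mid X, T{=}0)$ via separate forests for the conditional first and second moments. In the simulation, the ITE varies beyond what $X$ explains, and the independence assumption above holds by construction; all data are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(4)
n = 2000
X = rng.standard_normal((n, 3))
T = rng.integers(0, 2, n)
tau = 1.0 + rng.standard_normal(n)   # ITE varies beyond what X explains
Y = X[:, 0] + T * tau + 0.3 * rng.standard_normal(n)

def cond_var(Xa, Ya, Xq):
    """Plug-in conditional variance E[Y^2|X] - E[Y|X]^2 via two forests."""
    m1 = RandomForestRegressor(n_estimators=200, random_state=0).fit(Xa, Ya)
    m2 = RandomForestRegressor(n_estimators=200, random_state=0).fit(Xa, Ya ** 2)
    mu = m1.predict(Xq)
    return np.maximum(m2.predict(Xq) - mu ** 2, 0.0)

v1 = cond_var(X[T == 1], Y[T == 1], X)
v0 = cond_var(X[T == 0], Y[T == 0], X)
# Under the independence assumption, v1 - v0 estimates Var(ITE | X).
print("mean estimated Var(ITE|X):", np.mean(v1 - v0))  # truth is about 1
```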

Recent advances in neuroscientific experimental techniques have enabled us to simultaneously record the activity of thousands of neurons across multiple brain regions. This has led to a growing need for computational tools capable of analyzing how task-relevant information is represented and communicated between several brain regions. Partial information decompositions (PIDs) have emerged as one such tool, quantifying how much unique, redundant and synergistic information two or more brain regions carry about a task-relevant message. However, computing PIDs is computationally challenging in practice, and statistical issues such as the bias and variance of estimates remain largely unexplored. In this paper, we propose a new method for efficiently computing and estimating a PID definition on multivariate Gaussian distributions. We show empirically that our method satisfies an intuitive additivity property, and recovers the ground truth in a battery of canonical examples, even at high dimensionality. We also propose and evaluate, for the first time, a method to correct the bias in PID estimates at finite sample sizes. Finally, we demonstrate that our Gaussian PID effectively characterizes inter-areal interactions in the mouse brain, revealing higher redundancy between visual areas when a stimulus is behaviorally relevant.
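PID definitions differ, and the paper's own definition is not reproduced here. The sketch below shows the closed-form Gaussian mutual information computations such methods build on, together with the simple minimum-mutual-information (MMI) redundancy heuristic as a stand-in decomposition on a toy joint covariance.

```python
import numpy as np

def gaussian_mi(cov, idx_x, idx_y):
    """Mutual information (bits) between two blocks of a joint Gaussian:
    I(X;Y) = 0.5 * log2(det(Sx) * det(Sy) / det(Sxy))."""
    s = lambda idx: cov[np.ix_(idx, idx)]
    joint = s(idx_x + idx_y)
    return 0.5 * np.log2(np.linalg.det(s(idx_x)) * np.linalg.det(s(idx_y))
                         / np.linalg.det(joint))

# Toy joint covariance of (M, X, Y): sources X and Y both carry message M.
cov = np.array([[1.0, 0.7, 0.7],
                [0.7, 1.0, 0.5],
                [0.7, 0.5, 1.0]])
i_xm = gaussian_mi(cov, [1], [0])
i_ym = gaussian_mi(cov, [2], [0])
i_xym = gaussian_mi(cov, [1, 2], [0])

# MMI heuristic: redundancy = min of the single-source informations.
red = min(i_xm, i_ym)
print(dict(redundancy=red, unique_x=i_xm - red, unique_y=i_ym - red,
           synergy=i_xym - i_xm - i_ym + red))
```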

We study the problem of nonparametric classification with repeated observations. Let $\mathbf{X}$ be the $d$-dimensional feature vector and let $Y$ denote the label taking values in $\{1,\dots,M\}$. In contrast to the usual setup with large sample size $n$ and relatively low dimension $d$, this paper deals with the situation where, instead of observing a single feature vector $\mathbf{X}$, we are given $t$ repeated feature vectors $\mathbf{V}_1,\dots,\mathbf{V}_t$. Some simple classification rules are presented such that the conditional error probabilities converge at an exponential rate as $t\to\infty$. In the analysis, we investigate particular models such as robust detection with nominal densities, prototype classification, linear transformations, linear classification, and scaling.
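A minimal simulation of the prototype rule under repeated observations, with hypothetical Gaussian noise: averaging the $t$ feature vectors shrinks the noise (roughly as $1/\sqrt{t}$) before a nearest-prototype decision, so the empirical error rate falls rapidly with $t$.

```python
import numpy as np

rng = np.random.default_rng(5)
d, t, M = 5, 20, 3
protos = 2.0 * rng.standard_normal((M, d))  # class prototypes (nominal means)

def classify(V):
    """Prototype rule on repeated observations: average the t noisy feature
    vectors, then pick the nearest prototype in Euclidean distance."""
    vbar = V.mean(axis=0)
    return np.argmin(np.linalg.norm(protos - vbar, axis=1))

errs = []
for trial in range(2000):
    y = rng.integers(M)
    V = protos[y] + 3.0 * rng.standard_normal((t, d))
    errs.append(classify(V) != y)
print("error rate with t =", t, ":", np.mean(errs))
```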

State-of-the-art convolutional neural networks (CNNs) benefit greatly from multi-task learning (MTL), which learns multiple related tasks simultaneously to obtain shared or mutually related representations for the different tasks. The most widely used MTL CNN structure is based on an empirical or heuristic split at a specific layer (e.g., the last convolutional layer) to minimize the different task-specific losses. However, this heuristic sharing/splitting strategy may be harmful to the final performance of one or more tasks. In this paper, we propose a novel CNN structure for MTL that enables automatic feature fusing at every layer. Specifically, we first concatenate features from different tasks along their channel dimension, and then formulate the feature fusing problem as discriminative dimensionality reduction. We show that this discriminative dimensionality reduction can be accomplished with 1x1 convolution, batch normalization, and weight decay in one CNN, which we refer to as Neural Discriminative Dimensionality Reduction (NDDR). We perform a detailed ablation analysis of different configurations for training the network. The experiments carried out on different network structures and different task sets demonstrate the promising performance and desirable generalizability of our proposed method.
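A minimal PyTorch sketch of an NDDR-style fusion layer for two tasks, assuming equal channel counts; the paper's initialization details and exact configuration are omitted, and weight decay would be supplied through the optimizer (e.g. the `weight_decay` argument of `torch.optim.SGD`).

```python
import torch
import torch.nn as nn

class NDDRLayer(nn.Module):
    """Fuse two task branches: concatenate their feature maps along the
    channel dimension, then reduce back to per-task features with a 1x1
    convolution followed by batch normalization."""
    def __init__(self, channels):
        super().__init__()
        self.reduce = nn.ModuleList([
            nn.Sequential(nn.Conv2d(2 * channels, channels, kernel_size=1),
                          nn.BatchNorm2d(channels))
            for _ in range(2)])

    def forward(self, f1, f2):
        fused = torch.cat([f1, f2], dim=1)  # (B, 2C, H, W)
        return self.reduce[0](fused), self.reduce[1](fused)

# Usage: fuse the two task branches at one stage of a shared backbone.
x1, x2 = torch.randn(4, 64, 32, 32), torch.randn(4, 64, 32, 32)
y1, y2 = NDDRLayer(64)(x1, x2)
print(y1.shape, y2.shape)  # torch.Size([4, 64, 32, 32]) for each task
```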
