
An increasingly important data-analytic challenge is understanding the relationships between subpopulations. Various visualization methods that provide useful insights into these relationships are popular, especially in bioinformatics. This paper proposes a novel and rigorous approach to quantifying subpopulation relationships, called the Population Difference Criterion (PDC). The PDC is simultaneously a quantitative and a visual approach to showing separation of subpopulations. It uses the subpopulation centers, the respective variation about those centers, and the relative subpopulation sizes. This is accomplished by drawing motivation from classical permutation-based hypothesis testing, while taking that type of idea into non-standard conceptual territory. In particular, the domain of very small P-values is seen to provide useful comparisons of data sets. Simulated permutation variation is carefully investigated, and we find that a balanced permutation approach is more informative in high-signal (i.e., large subpopulation difference) contexts than conventional approaches based on all permutations. This result is quite surprising in view of related work done in low-signal contexts, which came to the opposite conclusion. The issue is resolved by the proposal of an appropriate adjustment. Permutation variation is also quantified by a proposed bootstrap confidence interval and demonstrated to be useful in understanding subpopulation relationships in cancer data.
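As a rough illustration of the balanced permutation idea contrasted above with all-permutation schemes, the following Python sketch compares two samples with a mean-difference statistic, forcing each permuted group to draw exactly half of its members from each original sample. The function name and the choice of statistic are illustrative assumptions, not the authors' PDC.

```python
import numpy as np

def balanced_permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Toy balanced two-sample permutation test.

    Each permuted group takes exactly half of its members from x and half
    from y, rather than drawing from all possible relabellings.  This is a
    generic sketch of the balanced-permutation idea, not the authors' PDC.
    """
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    observed = abs(x.mean() - y.mean())
    half_x, half_y = len(x) // 2, len(y) // 2
    count = 0
    for _ in range(n_perm):
        ix, iy = rng.permutation(len(x)), rng.permutation(len(y))
        # group A: half of x plus half of y; group B: the remainders
        a = np.concatenate([x[ix[:half_x]], y[iy[:half_y]]])
        b = np.concatenate([x[ix[half_x:]], y[iy[half_y:]]])
        if abs(a.mean() - b.mean()) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

# Example: two well-separated samples give a very small simulated P-value.
rng = np.random.default_rng(1)
print(balanced_permutation_pvalue(rng.normal(0, 1, 40), rng.normal(3, 1, 40)))
```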

Related Content

Text-to-image models, which can generate high-quality images from textual input, have recently enabled various content-creation tools. Although these models significantly affect a wide range of downstream applications, the distributions of their generated images are still not fully understood, especially with respect to potential stereotypical attributes across genders. In this work, we propose a paradigm (Gender Presentation Differences) that utilizes fine-grained self-presentation attributes to study how gender is presented differently in text-to-image models. By probing gender indicators in the input text (e.g., "a woman" or "a man"), we quantify the frequency differences of presentation-centric attributes (e.g., "a shirt" and "a dress") through human annotation and introduce a novel metric: GEP. Furthermore, we propose an automatic method to estimate such differences. The automatic GEP metric based on our approach yields a higher correlation with human annotations than one based on existing CLIP scores, consistently across three state-of-the-art text-to-image models. Finally, we demonstrate the generalization ability of our metrics in the context of gender stereotypes related to occupations.
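Purely as a hypothetical illustration of comparing per-attribute frequencies between prompts (the paper's actual GEP definition is not reproduced here), a minimal sketch might look like the following; the attribute encoding and the aggregation into a single score are assumptions.

```python
import numpy as np

def attribute_frequency_gap(attrs_woman, attrs_man):
    """Hypothetical presentation-difference score (not the paper's GEP):
    given binary attribute annotations for images generated from "a woman"
    vs. "a man" prompts, compare per-attribute frequencies and aggregate
    the absolute gaps."""
    f_w = np.asarray(attrs_woman, float).mean(axis=0)   # per-attribute freq.
    f_m = np.asarray(attrs_man, float).mean(axis=0)
    return f_w - f_m, np.abs(f_w - f_m).mean()

# Toy annotations for two attributes, e.g. ["a shirt", "a dress"].
gap, score = attribute_frequency_gap([[1, 0], [0, 1], [0, 1]],
                                     [[1, 0], [1, 0], [1, 0]])
print(gap, score)
```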

We analyze to what extent end users can infer information about the level of protection of their data when the data obfuscation mechanism is a priori unknown to them (the so-called "black-box" scenario). In particular, we investigate two notions of local differential privacy (LDP), namely $\epsilon$-LDP and Rényi LDP. On the one hand, we prove that, without any assumption on the underlying distributions, no algorithm can infer the level of data protection with provable guarantees; this result also holds for the central versions of the two notions of DP considered. On the other hand, we demonstrate that, under reasonable assumptions (namely, Lipschitzness of the involved densities on a closed interval), such guarantees exist and can be achieved by a simple histogram-based estimator. We validate our results experimentally and note that, on a particularly well-behaved distribution (namely, the Laplace noise), our method performs even better than expected, in the sense that in practice the number of samples needed to achieve the desired confidence is smaller than the theoretical bound and the estimate of $\epsilon$ is more precise than predicted.
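A minimal sketch of the histogram idea, assuming the black-box mechanism can be queried repeatedly on two fixed inputs and that its outputs lie (mostly) in a known closed interval; the bin count, support, and empty-bin handling are illustrative choices rather than the paper's tuned estimator.

```python
import numpy as np

def estimate_eps_histogram(samples_a, samples_b, bins=30, support=(-10, 10)):
    """Histogram-based estimate of the LDP parameter.

    samples_a / samples_b are i.i.d. outputs of an unknown obfuscation
    mechanism run on two fixed inputs.  Epsilon is estimated as the largest
    absolute log-ratio of the two empirical bin densities.
    """
    edges = np.linspace(*support, bins + 1)
    pa, _ = np.histogram(samples_a, bins=edges, density=True)
    pb, _ = np.histogram(samples_b, bins=edges, density=True)
    mask = (pa > 0) & (pb > 0)          # ignore bins empty in either sample
    return np.max(np.abs(np.log(pa[mask] / pb[mask])))

# Example: Laplace noise of scale 1/eps applied to inputs 0 and 1.
rng = np.random.default_rng(0)
true_eps = 1.0
a = 0.0 + rng.laplace(scale=1.0 / true_eps, size=200_000)
b = 1.0 + rng.laplace(scale=1.0 / true_eps, size=200_000)
print(estimate_eps_histogram(a, b))     # should be close to 1.0
```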

Relying on sheaf theory, we introduce the notions of projected barcodes and projected distances for multi-parameter persistence modules. Projected barcodes are defined as derived pushforwards of persistence modules onto $\mathbb{R}$. Projected distances come in two flavors: the integral sheaf metrics (ISM) and the sliced convolution distances (SCD). We conduct a systematic study of the stability of projected barcodes and show that the fibered barcode is a particular instance of projected barcodes. We prove that the ISM and the SCD provide lower bounds for the convolution distance. Furthermore, we show that the $\gamma$-linear ISM and the $\gamma$-linear SCD, which are projected distances tailored for $\gamma$-sheaves, can be computed using TDA software dedicated to one-parameter persistence modules. Moreover, the time and memory complexity required to compute these two metrics is advantageous, since our approach requires neither computing nor storing an entire $n$-persistence module.

We consider signal source localization from range-difference measurements. First, we give some readily checked conditions on measurement noises and sensor deployment that guarantee the asymptotic identifiability of the model, and we show the consistency and asymptotic normality of the maximum likelihood (ML) estimator. Then, we devise an estimator that attains the same asymptotic properties as the ML one. Specifically, we prove that the negative log-likelihood function converges to a limit function that has a unique minimum and a positive-definite Hessian at the true source position. Hence, it is promising to run local iterations, e.g., the Gauss-Newton (GN) algorithm, from a consistent initial estimate. The main issue is obtaining such a preliminary consistent estimate. To this end, we construct a linear least-squares problem via algebraic manipulation and constraint relaxation and obtain a closed-form solution. We then derive and eliminate the bias of the linear least-squares estimator, which yields an asymptotically unbiased (and thus consistent) estimate. Since the bias is a function of the noise variance, we further devise a consistent noise variance estimator that involves rooting a third-order polynomial. Based on the preliminary consistent location estimate, we prove that a single GN iteration suffices to achieve the same asymptotic properties as the ML estimator. Simulation results demonstrate the superiority of the proposed algorithm in the large-sample regime.
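The refinement stage can be illustrated with a small sketch of one Gauss-Newton step for the range-difference model, assuming a preliminary estimate is already available; the bias-corrected linear least-squares initializer and the noise-variance estimator are not reproduced here, and all names are illustrative.

```python
import numpy as np

def gn_step(anchors, rdiff, s0):
    """One Gauss-Newton step for range-difference (TDOA-style) localization.

    anchors: (m+1, d) sensor positions, anchors[0] being the reference sensor.
    rdiff:   (m,) measured ||s - anchors[i]|| - ||s - anchors[0]||, i = 1..m.
    s0:      preliminary (ideally consistent) source estimate of shape (d,).
    """
    s0 = np.asarray(s0, float)
    ref, others = anchors[0], anchors[1:]
    d_ref = np.linalg.norm(s0 - ref)
    d_i = np.linalg.norm(s0 - others, axis=1)
    resid = rdiff - (d_i - d_ref)
    # Jacobian of the range-difference model with respect to the source.
    J = (s0 - others) / d_i[:, None] - (s0 - ref) / d_ref
    delta, *_ = np.linalg.lstsq(J, resid, rcond=None)
    return s0 + delta

# Example: 5 planar sensors, noiseless measurements, rough initial guess.
anchors = np.array([[0., 0.], [10., 0.], [0., 10.], [10., 10.], [5., -5.]])
source = np.array([3., 4.])
rd = (np.linalg.norm(source - anchors[1:], axis=1)
      - np.linalg.norm(source - anchors[0]))
print(gn_step(anchors, rd, s0=np.array([4., 5.])))   # moves toward [3, 4]
```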

Text-guided generative diffusion models unlock powerful image creation and editing tools. While these models have been extended to video generation, current approaches that edit the content of existing footage while retaining its structure either require expensive re-training for every input or rely on error-prone propagation of image edits across frames. In this work, we present a structure- and content-guided video diffusion model that edits videos based on visual or textual descriptions of the desired output. Conflicts between user-provided content edits and structure representations arise from insufficient disentanglement between the two aspects. As a solution, we show that training on monocular depth estimates with varying levels of detail provides control over structure and content fidelity. Our model is trained jointly on images and videos, which also exposes explicit control of temporal consistency through a novel guidance method. Our experiments demonstrate a wide variety of successes: fine-grained control over output characteristics, customization based on a few reference images, and a strong user preference for results from our model.

Learning the graphical structure of Bayesian networks is key to describing data-generating mechanisms in many complex applications but poses considerable computational challenges. Observational data can only identify the equivalence class of the directed acyclic graph underlying a Bayesian network model, and a variety of methods exist to tackle this problem. Under certain assumptions, the popular PC algorithm can consistently recover the correct equivalence class by reverse-engineering the conditional independence (CI) relationships holding in the variable distribution. The dual PC algorithm is a novel scheme for carrying out the CI tests within the PC algorithm by leveraging the inverse relationship between covariance and precision matrices. By exploiting block matrix inversions, we can simultaneously perform tests on partial correlations with complementary (or dual) conditioning sets. The multiple CI tests of the dual PC algorithm proceed by first considering marginal and full-order CI relationships and progressively moving to central-order ones. Simulation studies show that the dual PC algorithm outperforms the classic PC algorithm both in run time and in recovering the underlying network structure, even in the presence of deviations from Gaussianity. Additionally, we show that the dual PC algorithm applies to Gaussian copula models and demonstrate its performance in that setting.
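The covariance/precision duality can be illustrated numerically: inverting a covariance submatrix over {i, j} ∪ S yields the partial correlation given S, while inverting the corresponding precision submatrix yields the same quantity via the complementary (dual) conditioning set. The sketch below, using a hypothetical 5-variable Gaussian, is not the authors' implementation.

```python
import numpy as np

def pcor_from_cov(Sigma, i, j, S):
    """Partial correlation rho_{ij.S}: invert the covariance submatrix over
    {i, j} | S and normalise the off-diagonal (with a minus sign)."""
    idx = [i, j] + list(S)
    K = np.linalg.inv(Sigma[np.ix_(idx, idx)])
    return -K[0, 1] / np.sqrt(K[0, 0] * K[1, 1])

def pcor_from_prec(Omega, i, j, D):
    """Dual route: inverting the precision submatrix over {i, j} | D gives the
    conditional covariance given the complementary set, from which the same
    partial correlation is read off directly (no further inversion)."""
    idx = [i, j] + list(D)
    C = np.linalg.inv(Omega[np.ix_(idx, idx)])
    return C[0, 1] / np.sqrt(C[0, 0] * C[1, 1])

# Hypothetical 5-variable Gaussian: the two routes agree when D is the dual
# (complement) of the conditioning set S with respect to the other variables.
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
Sigma = A @ A.T + 5 * np.eye(5)          # a positive-definite covariance
Omega = np.linalg.inv(Sigma)             # the precision matrix

i, j, S = 0, 1, [2]
D = [k for k in range(5) if k not in (i, j, *S)]     # dual set {3, 4}
print(pcor_from_cov(Sigma, i, j, S), pcor_from_prec(Omega, i, j, D))
```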

Information from various data sources is increasingly available. However, some sources may produce biased estimates due to commonly encountered biased sampling, population heterogeneity, or model misspecification. This calls for statistical methods that combine information in the presence of biased sources. In this paper, a robust data fusion-extraction method is proposed. The method produces a consistent estimator of the parameter of interest even when many of the data sources are biased. The proposed estimator is easy to compute and employs only summary statistics, and hence can be applied in many different fields, e.g., meta-analysis, Mendelian randomisation, and distributed systems. Moreover, under mild conditions the proposed estimator is asymptotically equivalent to the oracle estimator that uses data only from the unbiased sources. Asymptotic normality of the proposed estimator is also established. In contrast to existing meta-analysis methods, the theoretical properties are guaranteed even if both the number of data sources and the dimension of the parameter diverge as the sample size increases, which ensures the performance of the proposed method over a wide range of settings. The robustness and oracle property are also evaluated via simulation studies. The proposed method is applied to a meta-analysis data set evaluating surgical treatment for moderate periodontal disease and a Mendelian randomization data set studying risk factors for head and neck cancer.

Many problems in computer science reduce to the recovery of an $n$-sparse measure from its (generalized) moments. Sparse measure recovery has been a research focus in super-resolution, tensor decomposition, and learning neural networks. Existing methods use either convex relaxations or overparameterization for recovery. Here, we propose recovery via non-convex optimization without overparameterization. Our algorithm is a (sub)gradient descent method optimizing a non-convex energy function studied in physics. We establish the global convergence of gradient descent on this energy function. This result enables us to solve super-resolution in $O(n^2)$ time, which significantly improves upon the $O(n^3)$ time required to solve the convex relaxations. For a particular neural network, we prove the global convergence of subgradient descent on the population loss without overparameterization. The studied network has zero-one activations and inputs drawn from the unit sphere.
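As a toy stand-in for the energy-minimization viewpoint (the paper's specific physics-inspired energy is not reproduced), the following sketch runs plain gradient descent on a squared moment-matching loss to recover spike locations on the circle; the step size, iteration count, and random initialization are illustrative, and convergence from an arbitrary start is not guaranteed by this simple loss.

```python
import numpy as np

def fit_spikes(moments, freqs, n, steps=5000, lr=0.002, seed=0):
    """Toy gradient-descent recovery of n spike locations on [0, 1) from
    trigonometric moments, minimizing a squared moment-matching error."""
    rng = np.random.default_rng(seed)
    t = rng.uniform(0.0, 1.0, size=n)                    # random initial spikes
    for _ in range(steps):
        E = np.exp(-2j * np.pi * np.outer(freqs, t))     # (K, n)
        resid = E.mean(axis=1) - moments                 # (K,)
        # d/dt_j of E[k, j] is -2i*pi*freqs[k] * E[k, j]
        grad = 2 * np.real(np.conj(resid)[:, None]
                           * (-2j * np.pi * freqs[:, None]) * E / n).sum(axis=0)
        t = np.mod(t - lr * grad, 1.0)                   # stay on the circle
    return np.sort(t)

# Example: three spikes, trigonometric moments at frequencies 0..7.
true_t = np.array([0.15, 0.5, 0.8])
freqs = np.arange(8)
m = np.exp(-2j * np.pi * np.outer(freqs, true_t)).mean(axis=1)
print(fit_spikes(m, freqs, n=3))    # ideally close to the true spike locations
```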

The impact of player age on performance has received attention across sports. Most research has focused on player performance at each age, ignoring the reality that age also influences which players receive opportunities to perform. Our manuscript makes two contributions. First, we highlight how selection bias is linked to both (i) which players receive the opportunity to perform in sport and (ii) the ages at which we observe these players perform. This approach is used to generate underlying distributions of how players move in and out of sport organizations. Second, motivated by methods for missing data, we propose novel methods for estimating age curves that use both observed and unobserved (imputed) data. We use simulations to compare several approaches for estimating aging curves. Imputation-based methods, as well as models that account for individual player skill, tend to yield lower RMSE and age-curve shapes that better match the truth. We implement our approach using data from the National Hockey League.

With continuous outcomes, the average causal effect is typically defined using a contrast of expected potential outcomes. However, in the presence of skewed outcome data, the expectation may no longer be meaningful. In practice, the typical approach is to either "ignore or transform": ignore the skewness altogether or transform the outcome to obtain a more symmetric distribution, although neither approach is entirely satisfactory. Alternatively, the causal effect can be redefined as a contrast of median potential outcomes, yet discussion of confounding-adjustment methods for estimating this parameter is limited. In this study we describe and compare confounding-adjustment methods to address this gap. The methods considered are multivariable quantile regression, an inverse probability weighted (IPW) estimator, weighted quantile regression, and two little-known implementations of g-computation for this problem. Motivated by a cohort investigation in the Longitudinal Study of Australian Children, we conducted a simulation study, which found that the IPW estimator, weighted quantile regression, and the g-computation implementations minimised bias when the relevant models were correctly specified, with g-computation additionally minimising the variance. These methods provide appealing alternatives to the common "ignore or transform" approach and to multivariable quantile regression, enhancing our capability to obtain meaningful causal effect estimates from skewed outcome data.
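A minimal sketch of the IPW idea for median potential outcomes, assuming a correctly specified logistic propensity model: weight each unit by the inverse probability of its received treatment and contrast weighted medians. The variable names and simulated data are illustrative, not the study's analysis.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def weighted_median(y, w):
    """Median of y under weights w: smallest y whose cumulative weight
    reaches half of the total weight."""
    order = np.argsort(y)
    cw = np.cumsum(w[order])
    return y[order][np.searchsorted(cw, 0.5 * cw[-1])]

def ipw_median_effect(y, a, X):
    """IPW-style contrast of median potential outcomes: weight treated units
    by 1/e(X) and controls by 1/(1 - e(X)), with e(X) a logistic propensity
    score, then difference the weighted group medians."""
    e = LogisticRegression(max_iter=1000).fit(X, a).predict_proba(X)[:, 1]
    w = np.where(a == 1, 1.0 / e, 1.0 / (1.0 - e))
    m1 = weighted_median(y[a == 1], w[a == 1])
    m0 = weighted_median(y[a == 0], w[a == 0])
    return m1 - m0

# Simulated skewed (log-normal) outcomes with a single confounder x.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=(n, 1))
a = rng.binomial(1, 1 / (1 + np.exp(-x[:, 0])))
y = np.exp(0.5 * x[:, 0] + 0.4 * a + rng.normal(scale=0.8, size=n))
print(ipw_median_effect(y, a, x))
```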
