
The development of technologies for causal inference with the privacy preservation of distributed data has attracted considerable attention in recent years. To address this issue, we propose a data collaboration quasi-experiment (DC-QE) that enables causal inference from distributed data with privacy preservation. In our method, first, local parties construct dimensionality-reduced intermediate representations from their private data. Second, they share the intermediate representations instead of the private data to preserve privacy. Third, propensity scores are estimated from the shared intermediate representations. Finally, the treatment effects are estimated from the propensity scores. Our method can reduce both random errors and biases, whereas existing methods can only reduce random errors in the estimation of treatment effects. Through numerical experiments on both artificial and real-world data, we confirmed that our method can lead to better estimation results than individual analyses. Dimensionality reduction loses some of the information in the private data and causes performance degradation. However, in the experiments we observed that when intermediate representations are shared among many parties to compensate for the lack of subjects and covariates, our method improved performance enough to overcome the degradation caused by dimensionality reduction. With the spread of our method, intermediate representations can be published as open data to help researchers find causalities and accumulated as a knowledge base.
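The last step of the pipeline, going from propensity scores to a treatment effect, can be illustrated with a minimal sketch. The abstract does not say which estimator DC-QE uses, so the normalized inverse-probability-weighting (IPW) estimator below is an assumption chosen for illustration, and the function name `ipw_ate` is hypothetical.

```python
def ipw_ate(outcomes, treated, propensity):
    """Normalized (Hajek) IPW estimate of the average treatment effect.

    Illustrative assumption: the abstract does not specify this estimator.
    outcomes   -- observed outcome per subject
    treated    -- 1/True if the subject received treatment, else 0/False
    propensity -- estimated P(treated | covariates), strictly inside (0, 1)
    """
    w1 = y1 = w0 = y0 = 0.0
    for y, z, e in zip(outcomes, treated, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly in (0, 1)")
        if z:
            # Treated subjects are weighted by 1 / e.
            w1 += 1.0 / e
            y1 += y / e
        else:
            # Control subjects are weighted by 1 / (1 - e).
            w0 += 1.0 / (1.0 - e)
            y0 += y / (1.0 - e)
    if w1 == 0.0 or w0 == 0.0:
        raise ValueError("need at least one treated and one control subject")
    # Difference of the weighted outcome means of the two groups.
    return y1 / w1 - y0 / w0


if __name__ == "__main__":
    print(ipw_ate([3.0, 1.0, 4.0, 2.0], [1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5]))
```

With constant propensity scores the weights cancel and the estimate reduces to the plain difference in group means; unequal scores reweight subjects so that the two groups resemble each other in covariates.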

Related content

Estimation/estimator


Analysis · INFORMS · Comprehensibility · Knowledge · Extensibility

September 25, 2023

Understanding metric-related pitfalls in image analysis validation

Annika Reinke,Minu D. Tizabi,Michael Baumgartner,Matthias Eisenmann,Doreen Heckmann-Nötzel,A. Emre Kavur,Tim Rädsch,Carole H. Sudre,Laura Acion,Michela Antonelli,Tal Arbel,Spyridon Bakas,Arriel Benis,Matthew Blaschko,Florian Buettner,M. Jorge Cardoso,Veronika Cheplygina,Jianxu Chen,Evangelia Christodoulou,Beth A. Cimini,Gary S. Collins,Keyvan Farahani,Luciana Ferrer,Adrian Galdran,Bram van Ginneken,Ben Glocker,Patrick Godau,Robert Haase,Daniel A. Hashimoto,Michael M. Hoffman,Merel Huisman,Fabian Isensee,Pierre Jannin,Charles E. Kahn,Dagmar Kainmueller,Bernhard Kainz,Alexandros Karargyris,Alan Karthikesalingam,Hannes Kenngott,Jens Kleesiek,Florian Kofler,Thijs Kooi,Annette Kopp-Schneider,Michal Kozubek,Anna Kreshuk,Tahsin Kurc,Bennett A. Landman,Geert Litjens,Amin Madani,Klaus Maier-Hein,Anne L. Martel,Peter Mattson,Erik Meijering,Bjoern Menze,Karel G. M. Moons,Henning Müller,Brennan Nichyporuk,Felix Nickel,Jens Petersen,Susanne M. Rafelski,Nasir Rajpoot,Mauricio Reyes,Michael A. Riegler,Nicola Rieke,Julio Saez-Rodriguez,Clara I. Sánchez,Shravya Shetty,Maarten van Smeden,Ronald M. Summers,Abdel A. Taha,Aleksei Tiulpin,Sotirios A. Tsaftaris,Ben Van Calster,Gaël Varoquaux,Manuel Wiesenfarth,Ziv R. Yaniv,Paul F. Jäger,Lena Maier-Hein

from arxiv, Shared first authors: Annika Reinke, Minu D. Tizabi; shared senior authors: Paul F. Jäger, Lena Maier-Hein

Validation metrics are key for the reliable tracking of scientific progress and for bridging the current chasm between artificial intelligence (AI) research and its translation into practice. However, increasing evidence shows that particularly in image analysis, metrics are often chosen inadequately in relation to the underlying research problem. This could be attributed to a lack of accessibility of metric-related knowledge: While taking into account the individual strengths, weaknesses, and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multi-stage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides the first reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Focusing on biomedical image analysis but with the potential of transfer to other fields, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. To facilitate comprehension, illustrations and specific examples accompany each pitfall. As a structured body of information accessible to researchers of all levels of expertise, this work enhances global comprehension of a key topic in image analysis validation.

Functional · Predictor/decision function · Link function · Estimation/estimator · Regularization term

September 25, 2023

Functional sufficient dimension reduction through distance covariance

Xing Yang,Jianjun Xu

Our research proposes a novel method for reducing the dimensionality of functional data, specifically for the case where the response is a scalar and the predictor is a random function. Our method utilizes distance covariance, and has several advantages over existing methods. Unlike current techniques which require restrictive assumptions such as linear conditional mean and constant covariance, our method has mild requirements on the predictor. Additionally, our method does not involve the use of the unbounded inverse of the covariance operator. The link function between the response and predictor can be arbitrary, and our proposed method maintains the advantage of being model-free, without the need to estimate the link function. Furthermore, our method is naturally suited for sparse longitudinal data. We utilize functional principal component analysis with truncation as a regularization mechanism in the development of our method. We provide justification for the validity of our proposed method, and establish statistical consistency of the estimator under certain regularization conditions. To demonstrate the effectiveness of our proposed method, we conduct simulation studies and real data analysis. The results show improved performance compared to existing methods.
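As a rough illustration of the dependence measure the method builds on, the sketch below computes the squared sample distance covariance of two scalar samples by double-centering their pairwise distance matrices. This is the standard scalar-sample form only, not the functional-data procedure of the paper, and the function names `_centered_distances` and `distance_covariance_sq` are hypothetical.

```python
def _centered_distances(values):
    """Double-centered matrix of pairwise absolute distances."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_mean = [sum(row) / n for row in dist]
    grand_mean = sum(row_mean) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [dist[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_covariance_sq(x, y):
    """Squared sample distance covariance of two equal-length scalar samples."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    a = _centered_distances(x)
    b = _centered_distances(y)
    # Average of the elementwise products of the centered matrices.
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)


if __name__ == "__main__":
    print(distance_covariance_sq([0.0, 1.0], [0.0, 1.0]))
```

The statistic is zero when one sample is constant and positive when the samples vary together, without assuming any particular form for the link between them.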

Estimation/estimator · Loss · Functional · Class · Risk function

September 25, 2023

On improved estimation of the larger location parameter

Naresh Garg,Lakshmi Kanta Patra,Neeraj Misra

This paper investigates the problem of estimating the larger location parameter of two general location families from a decision-theoretic perspective. In this estimation problem, we use the criteria of minimizing the risk function and the Pitman closeness under a general bowl-shaped loss function. Inadmissibility of general location equivariant estimators is established. We prove that a natural estimator (analogue of the BLEE of unordered location parameters) is inadmissible, under certain conditions on the underlying densities, and propose a dominating estimator. We also derive a class of improved estimators using Kubokawa's IERD approach and observe that the boundary estimator of this class is the Brewster-Zidek type estimator. Additionally, under the generalized Pitman criterion, we show that the natural estimator is inadmissible and obtain improved estimators. The results are implemented for different loss functions, and explicit expressions for the dominating estimators are provided. We explore the applications of these results to exponential and normal distributions under specified loss functions. A simulation is also conducted to compare the risk performance of the proposed estimators. Finally, we present a real-life data analysis to illustrate the practical applications of the paper's findings.

Identifiable · Controller · INTERACT · Analysis · Example

September 24, 2023

Interactive identification of individuals with positive treatment effect while controlling false discoveries

Boyan Duan,Larry Wasserman,Aaditya Ramdas

from arxiv, 38 pages, 14 figures

Out of the participants in a randomized experiment with anticipated heterogeneous treatment effects, is it possible to identify which subjects have a positive treatment effect? While subgroup analysis has received attention, claims about individual participants are much more challenging. We frame the problem in terms of multiple hypothesis testing: each individual has a null hypothesis (stating that the potential outcomes are equal, for example) and we aim to identify those for whom the null is false (the treatment potential outcome stochastically dominates the control one, for example). We develop a novel algorithm that identifies such a subset, with nonasymptotic control of the false discovery rate (FDR). Our algorithm allows for interaction -- a human data scientist (or a computer program) may adaptively guide the algorithm in a data-dependent manner to gain power. We show how to extend the methods to observational settings and achieve a type of doubly-robust FDR control. We also propose several extensions: (a) relaxing the null to nonpositive effects, (b) moving from unpaired to paired samples, and (c) subgroup identification. We demonstrate via numerical experiments and theoretical analysis that the proposed method has valid FDR control in finite samples and reasonably high identification power.

Functional · Generalized function · Microsoft Windows · Linear · Stride

September 23, 2023

Enumeration of max-pooling responses with generalized permutohedra

Laura Escobar,Patricio Gallardo,Javier González-Anaya,José L. González,Guido Montúfar,Alejandro H. Morales

from arxiv, 35 pages, 11 figures, 4 tables. V2: Improved exposition, added computations in Section 4, and expanded analysis of data

We investigate the combinatorics of max-pooling layers, which are functions that downsample input arrays by taking the maximum over shifted windows of input coordinates, and which are commonly used in convolutional neural networks. We obtain results on the number of linearity regions of these functions by equivalently counting the number of vertices of certain Minkowski sums of simplices. We characterize the faces of such polytopes and obtain generating functions and closed formulas for the number of vertices and facets in a 1D max-pooling layer depending on the size of the pooling windows and stride, and for the number of vertices in a special case of 2D max-pooling.
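To make the object of study concrete, here is a minimal sketch of a 1D max-pooling layer with a given window size and stride, together with the argmax pattern that records which input attains the maximum in each window. On any set of inputs sharing one argmax pattern the layer simply selects coordinates, so it acts linearly there. The function names are hypothetical and the code does not reproduce the paper's counting results.

```python
def window_starts(length, window, stride):
    """Start indices of the pooling windows that fit entirely in the input."""
    if window < 1 or stride < 1:
        raise ValueError("window and stride must be positive")
    if length < window:
        raise ValueError("input is shorter than the pooling window")
    return range(0, length - window + 1, stride)


def max_pool_1d(values, window, stride):
    """Maximum over each shifted window of the input."""
    return [max(values[s:s + window])
            for s in window_starts(len(values), window, stride)]


def argmax_pattern(values, window, stride):
    """Index of the (first) maximizing input in each window."""
    return tuple(
        max(range(s, s + window), key=lambda i: values[i])
        for s in window_starts(len(values), window, stride)
    )


if __name__ == "__main__":
    x = [1, 3, 2, 5, 4]
    print(max_pool_1d(x, window=2, stride=1))     # [3, 3, 5, 5]
    print(max_pool_1d(x, window=2, stride=2))     # [3, 5]
    print(argmax_pattern(x, window=2, stride=1))  # (1, 1, 3, 3)
```

Not every combination of per-window maximizers can occur at once, because overlapping windows share inputs; this is why the number of linear pieces depends on both the window size and the stride.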

September 22, 2023

Principal variables analysis for non-Gaussian data

Dylan Clark-Boucher,Jeffrey W. Miller

Principal variables analysis (PVA) is a technique for selecting a subset of variables that capture as much of the information in a dataset as possible. Existing approaches for PVA are based on the Pearson correlation matrix, which is not well-suited to describing the relationships between non-Gaussian variables. We propose a generalized approach to PVA enabling the use of different types of correlation, and we explore using Spearman, Gaussian copula, and polychoric correlations as alternatives to Pearson correlation when performing PVA. We compare performance in simulation studies varying the form of the true multivariate distribution over a wide range of possibilities. Our results show that on continuous non-Gaussian data, using generalized PVA with Gaussian copula or Spearman correlations provides a major improvement in performance compared to Pearson. Meanwhile, on ordinal data, generalized PVA with polychoric correlations outperforms the rest by a wide margin. We apply generalized PVA to a dataset of 102 clinical variables measured on individuals with X-linked dystonia parkinsonism (XDP), a rare neurodegenerative disorder, and we find that using different types of correlation yields substantively different sets of principal variables.

Mutually independent · Conditionally independent · Feature selection · Invariant · MoDELS

September 22, 2023

Model-based causal feature selection for general response types

Lucas Kook,Sorawit Saengkyongam,Anton Rask Lundborg,Torsten Hothorn,Jonas Peters

from arxiv, Code available at //github.com/LucasKook/tramicp.git

Discovering causal relationships from observational data is a fundamental yet challenging task. In some applications, it may suffice to learn the causal features of a given response variable, instead of learning the entire underlying causal structure. Invariant causal prediction (ICP, Peters et al., 2016) is a method for causal feature selection which requires data from heterogeneous settings. ICP assumes that the mechanism for generating the response from its direct causes is the same in all settings and exploits this invariance to output a subset of the causal features. The framework of ICP has been extended to general additive noise models and to nonparametric settings using conditional independence testing. However, nonparametric conditional independence testing often suffers from low power (or poor type I error control) and the aforementioned parametric models are not suitable for applications in which the response is not measured on a continuous scale, but rather reflects categories or counts. To bridge this gap, we develop ICP in the context of transformation models (TRAMs), allowing for continuous, categorical, count-type, and uninformatively censored responses (we show that, in general, these model classes do not allow for identifiability when there is no exogenous heterogeneity). We propose TRAM-GCM, a test for invariance of a subset of covariates, based on the expected conditional covariance between environments and score residuals which satisfies uniform asymptotic level guarantees. For the special case of linear shift TRAMs, we propose an additional invariance test, TRAM-Wald, based on the Wald statistic. We implement both proposed methods in the open-source R package "tramicp" and show in simulations that under the correct model specification, our approach empirically yields higher power than nonparametric ICP based on conditional independence testing.

Lecture notes · Decomposition · Graph · Complete graph · binary

September 21, 2023

Enumerating combinatorial resultant decompositions of 2-connected rigidity circuits

Goran Malic,Ileana Streinu

from arxiv, 15 pages, 8 figures

A rigidity circuit (in 2D) is a minimal dependent set in the rigidity matroid, i.e. a minimal graph supporting a non-trivial stress in any generic placement of its vertices in $\mathbb R^2$. Any rigidity circuit on $n\geq 5$ vertices can be obtained from rigidity circuits on fewer vertices by applying the combinatorial resultant (CR) operation. The inverse operation is called a combinatorial resultant decomposition (CR-decomp). Any rigidity circuit on $n\geq 5$ vertices can be successively decomposed into smaller circuits, until the complete graphs $K_4$ are reached. This sequence of CR-decomps has the structure of a rooted binary tree called the combinatorial resultant tree (CR-tree). A CR-tree encodes an elimination strategy for computing circuit polynomials via Sylvester resultants. Different CR-trees lead to elimination strategies that can vary greatly in time and memory consumption. It is an open problem to establish criteria for optimal CR-trees, or at least to characterize those CR-trees that lead to good elimination strategies. In [12] we presented an algorithm for enumerating CR-trees, including algorithms for decomposing 3-connected rigidity circuits in polynomial time. In this paper we focus on those circuits that are not 3-connected, which we simply call 2-connected. In order to enumerate CR-decomps of 2-connected circuits $G$, a brute force exp-time search has to be performed among the subgraphs induced by the subsets of $V(G)$. This exp-time bottleneck is not present in the 3-connected case. In this paper we argue that we do not have to account for all possible CR-decomps of 2-connected rigidity circuits to find a good elimination strategy; we only have to account for those CR-decomps that are a 2-split, all of which can be enumerated in polynomial time. We present algorithms and computational evidence in support of this heuristic.

Estimation/estimator · Functional · Copulas · Continuity · Mutually independent

September 21, 2023

Quantifying and estimating dependence via sensitivity of conditional distributions

Jonathan Ansari,Patrick B. Langthaler,Sebastian Fuchs,Wolfgang Trutschnig

from arxiv, 24 pages, 5 figures, 1 table

Recently established, directed dependence measures for pairs $(X,Y)$ of random variables build upon the natural idea of comparing the conditional distributions of $Y$ given $X=x$ with the marginal distribution of $Y$. They assign pairs $(X,Y)$ values in $[0,1]$; the value is $0$ if and only if $X,Y$ are independent, and it is $1$ exclusively for $Y$ being a function of $X$. Here we show that comparing randomly drawn conditional distributions with each other instead or, equivalently, analyzing how sensitive the conditional distribution of $Y$ given $X=x$ is to $x$, opens the door to constructing novel families of dependence measures $\Lambda_\varphi$ induced by general convex functions $\varphi: \mathbb{R} \rightarrow \mathbb{R}$, containing, e.g., Chatterjee's coefficient of correlation as a special case. After establishing additional useful properties of $\Lambda_\varphi$ we focus on continuous $(X,Y)$, translate $\Lambda_\varphi$ to the copula setting, consider the $L^p$-version and establish an estimator which is strongly consistent in full generality. A real data example and a simulation study illustrate the chosen approach and the performance of the estimator. Complementing the aforementioned results, we show how a slight modification of the construction underlying $\Lambda_\varphi$ can be used to define new measures of explainability generalizing the fraction of explained variance.
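For orientation, the special case named in the abstract, Chatterjee's coefficient of correlation, has a simple sample version when the data contain no ties: sort the pairs by $X$, take the ranks $r_i$ of $Y$ in that order, and compute $1 - 3\sum_i |r_{i+1}-r_i| / (n^2-1)$. The sketch below implements only this well-known no-ties formula; it is not the estimator of $\Lambda_\varphi$ proposed in the paper, and the function name `chatterjee_xi` is hypothetical.

```python
def chatterjee_xi(x, y):
    """Chatterjee's rank correlation for samples without ties."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    if len(set(x)) != n or len(set(y)) != n:
        raise ValueError("this sketch assumes no ties in x or y")
    # Reorder the y values by increasing x.
    order = sorted(range(n), key=lambda i: x[i])
    y_by_x = [y[i] for i in order]
    # Ranks of y (1 = smallest), listed in the x-sorted order.
    rank_of = {value: r for r, value in enumerate(sorted(y), start=1)}
    ranks = [rank_of[value] for value in y_by_x]
    # Sum of absolute differences between consecutive ranks.
    total = sum(abs(ranks[i + 1] - ranks[i]) for i in range(n - 1))
    return 1.0 - 3.0 * total / (n * n - 1)


if __name__ == "__main__":
    print(chatterjee_xi([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
```

For a strictly monotone relationship the sample value is $(n-2)/(n+1)$, which approaches $1$ only as $n$ grows, so small-sample values should be read with that ceiling in mind.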

Approximation · Controller · Tractable · prototype · Robustness

September 14, 2023

Guaranteed approximations of arbitrarily quantified reachability problems

Eric Goubault,Sylvie Putot

We propose an approach to compute inner- and outer-approximations of the sets of values satisfying constraints expressed as arbitrarily quantified formulas. Such formulas arise, for instance, when specifying important problems in control such as robustness, motion planning, or controller comparison. We propose an interval-based method which allows for tractable but tight approximations. We demonstrate its applicability through a series of examples and benchmarks using a prototype implementation.