
Many statistical analyses, in both observational data and randomized control trials, ask: how does the outcome of interest vary with combinations of observable covariates? How do various drug combinations affect health outcomes, or how does technology adoption depend on incentives and demographics? Our goal is to partition this factorial space into ``pools'' of covariate combinations where the outcome differs across the pools (but not within a pool). Existing approaches (i) search for a single ``optimal'' partition under assumptions about the association between covariates or (ii) sample from the entire set of possible partitions. Both these approaches ignore the reality that, especially with correlation structure in covariates, many ways to partition the covariate space may be statistically indistinguishable, despite very different implications for policy or science. We develop an alternative perspective, called Rashomon Partition Sets (RPSs). Each item in the RPS partitions the space of covariates using a tree-like geometry. RPSs incorporate all partitions that have posterior values near the maximum a posteriori partition, even if they offer substantively different explanations, and do so using a prior that makes no assumptions about associations between covariates. This prior is the $\ell_0$ prior, which we show is minimax optimal. Given the RPS we calculate the posterior of any measurable function of the feature effects vector on outcomes, conditional on being in the RPS. We also characterize approximation error relative to the entire posterior and provide bounds on the size of the RPS. Simulations demonstrate this framework allows for robust conclusions relative to conventional regularization techniques. We apply our method to three empirical settings: price effects on charitable giving, chromosomal structure (telomere length), and the introduction of microfinance.
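
A toy sketch of the Rashomon Partition Set idea described above: enumerate all poolings of a handful of covariate cells, score each pooling with a placeholder pooled-Gaussian log-likelihood plus a crude complexity penalty (not the paper's $\ell_0$-prior posterior), and retain every pooling whose score lies within a threshold of the best one. The scoring function, penalty, and threshold are illustrative assumptions.

    import itertools

    def set_partitions(items):
        """Yield all partitions of a list of items (Bell-number many)."""
        if len(items) == 1:
            yield [items]
            return
        first, rest = items[0], items[1:]
        for smaller in set_partitions(rest):
            # put `first` into an existing pool ...
            for i, pool in enumerate(smaller):
                yield smaller[:i] + [[first] + pool] + smaller[i + 1:]
            # ... or open a new pool
            yield [[first]] + smaller

    def log_score(partition, cell_means, cell_counts, sigma2=1.0):
        """Placeholder pooled-Gaussian fit: cells in the same pool share a mean."""
        total = 0.0
        for pool in partition:
            n = sum(cell_counts[c] for c in pool)
            pooled = sum(cell_counts[c] * cell_means[c] for c in pool) / n
            total -= sum(cell_counts[c] * (cell_means[c] - pooled) ** 2
                         for c in pool) / (2 * sigma2)
        return total - 0.5 * len(partition)   # crude complexity penalty

    cell_means = {"a": 0.10, "b": 0.15, "c": 1.90, "d": 2.00}
    cell_counts = {k: 50 for k in cell_means}
    scored = [(log_score(p, cell_means, cell_counts), p)
              for p in set_partitions(list(cell_means))]
    best = max(s for s, _ in scored)
    epsilon = 1.0
    rashomon_set = [p for s, p in scored if s >= best - epsilon]
    print(len(rashomon_set), "near-optimal poolings")

Running this on the toy cells above keeps several poolings that never merge the low-outcome cells with the high-outcome ones, which is the qualitative behavior the RPS is meant to expose.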

Related content

Robust Bayesian analysis has been mainly devoted to detecting and measuring robustness to the prior distribution. Indeed, many contributions in the literature aim to define suitable classes of priors that allow the computation of variations in quantities of interest as the prior changes within those classes. The literature has devoted much less attention to the robustness of Bayesian methods to the likelihood function, both because of mathematical and computational complexity and because the likelihood is often considered, arguably, a more objective choice than the prior. In this contribution, a new approach to Bayesian local robustness to the likelihood function is proposed and then extended to robustness to the prior and to both. This approach is based on the notion of a distortion function, introduced in the literature on risk theory and later successfully adopted to build suitable classes of priors for Bayesian global robustness to the prior. The novel robustness measure is a local sensitivity measure that turns out to be very tractable and easy to compute for certain classes of distortion functions. Asymptotic properties are derived, and numerical experiments illustrate the theory and its applicability for modelling purposes.
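
For context, a minimal illustration of the distortion construction imported from risk theory (the paper's specific local sensitivity measure is not reproduced here): a distortion function $g$ is non-decreasing with $g(0)=0$ and $g(1)=1$, and applying it to the cumulative distribution function $F$ of a baseline prior (or likelihood) yields a distorted distribution

$$ F_g(\theta) \;=\; g\bigl(F(\theta)\bigr), \qquad g(0)=0,\quad g(1)=1,\quad g \text{ non-decreasing}, $$

so that a family of distortions close to the identity generates a neighbourhood of the baseline distribution within which sensitivity can be assessed.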

High-dimensional multivariate longitudinal data, which arise when many outcome variables are measured repeatedly over time, are becoming increasingly common in social, behavioral and health sciences. We propose a latent variable model for drawing statistical inferences on covariate effects and predicting future outcomes based on high-dimensional multivariate longitudinal data. This model introduces unobserved factors to account for the between-variable and across-time dependence and assist the prediction. Statistical inference and prediction tools are developed under a general setting that allows outcome variables to be of mixed types and possibly unobserved for certain time points, for example, due to right censoring. A central limit theorem is established for drawing statistical inferences on regression coefficients. Additionally, an information criterion is introduced to choose the number of factors. The proposed model is applied to customer grocery shopping records to predict and understand shopping behavior.
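
As a purely illustrative sketch of the kind of specification described above (the paper's exact model may differ), a generalized latent factor model for outcome $j$ of subject $i$ at time $t$ links a type-appropriate mean to observed covariates and latent factors:

$$ g_j\bigl(\mathbb{E}[\,y_{ijt} \mid x_{it}, f_{it}\,]\bigr) \;=\; \beta_j^{\top} x_{it} \;+\; \lambda_j^{\top} f_{it}, $$

where $g_j$ is a link function matching the type of outcome $j$ (e.g., identity for continuous, logit for binary), $f_{it}$ are the unobserved factors capturing between-variable and across-time dependence, and $\lambda_j$ are factor loadings.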

Analyzing high-dimensional count data is a challenge, and statistical model-based approaches provide an adequate and efficient framework that preserves explainability. The (multivariate) Poisson-Log-Normal (PLN) model is one such model: it assumes count data are driven by an underlying structured latent Gaussian variable, so that the dependencies between counts stem solely from the latent dependencies. However, PLN does not account for zero-inflation, a feature frequently observed in real-world datasets. Here we introduce the Zero-Inflated PLN (ZIPLN) model, which adds a multivariate zero-inflated component to the model as an additional Bernoulli latent variable. The zero-inflation can be fixed, site-specific, feature-specific, or covariate-dependent. We estimate model parameters using variational inference that scales up to datasets with a few thousand variables and compare two approximations: (i) independent Gaussian and Bernoulli variational distributions, or (ii) a Gaussian variational distribution conditioned on the Bernoulli one. The method is assessed on synthetic data, and the efficiency of ZIPLN is established even when zero-inflation concerns up to $90\%$ of the observed counts. We then apply both ZIPLN and PLN to a cow microbiome dataset in which $90.6\%$ of the counts are zero. Accounting for zero-inflation significantly increases the log-likelihood and reduces dispersion in the latent space, thus leading to improved group discrimination.
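
A minimal generative sketch of the zero-inflated Poisson-log-normal idea described above: counts are driven by a latent Gaussian layer, and an extra Bernoulli layer introduces structural zeros. The dimensions, covariance, and the constant zero-inflation probability are illustrative assumptions, not fitted quantities.

    import numpy as np

    rng = np.random.default_rng(0)
    n, p = 200, 10                                   # samples, count variables
    mu = rng.normal(size=p)
    A = rng.normal(size=(p, p))
    Sigma = A @ A.T / p + 0.1 * np.eye(p)            # latent covariance
    pi = 0.4                                         # fixed zero-inflation probability

    Z = rng.multivariate_normal(mu, Sigma, size=n)   # latent Gaussian layer
    W = rng.binomial(1, pi, size=(n, p))             # Bernoulli zero-inflation layer
    Y = np.where(W == 1, 0, rng.poisson(np.exp(Z)))  # observed zero-inflated counts

    print("observed proportion of zeros:", (Y == 0).mean())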

Current evaluation of text-to-image models typically relies on statistical metrics that inadequately represent real human preferences. Although recent work attempts to learn these preferences from human-annotated images, it reduces the rich tapestry of human preference to a single overall score. However, preferences vary when humans evaluate images along different aspects. Therefore, to learn multi-dimensional human preferences, we propose the Multi-dimensional Preference Score (MPS), the first multi-dimensional preference scoring model for the evaluation of text-to-image models. MPS introduces a preference condition module on top of the CLIP model to learn these diverse preferences. It is trained on our Multi-dimensional Human Preference (MHP) Dataset, which comprises 918,315 human preference choices across four dimensions (i.e., aesthetics, semantic alignment, detail quality and overall assessment) on 607,541 images. The images are generated by a wide range of recent text-to-image models. MPS outperforms existing scoring methods across 3 datasets in 4 dimensions, making it a promising metric for evaluating and improving text-to-image generation.
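
The toy sketch below (with random placeholder embeddings) conveys only the general idea of conditioning a CLIP-style similarity score on a preference dimension; the gating mechanism, the condition vectors, and the function names are hypothetical and are not MPS's actual preference condition module.

    import numpy as np

    rng = np.random.default_rng(0)
    dim = 512
    image_emb = rng.normal(size=dim)          # stand-in for a CLIP image embedding
    text_emb = rng.normal(size=dim)           # stand-in for a CLIP text embedding
    # one (hypothetical) learned condition vector per preference dimension
    conditions = {name: rng.uniform(0, 1, size=dim)
                  for name in ["aesthetics", "alignment", "detail", "overall"]}

    def preference_score(image_emb, text_emb, condition):
        gated = text_emb * condition          # condition-gated text features
        return gated @ image_emb / (np.linalg.norm(gated) * np.linalg.norm(image_emb))

    for name, cond in conditions.items():
        print(name, round(float(preference_score(image_emb, text_emb, cond)), 3))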

This work aims at improving the energy efficiency of decentralized learning by optimizing the mixing matrix, which controls the communication demands during the learning process. Through rigorous analysis based on a state-of-the-art decentralized learning algorithm, the problem is formulated as a bi-level optimization, with the lower level solved by graph sparsification. A solution with guaranteed performance is proposed for the special case of a fully connected base topology, and a greedy heuristic is proposed for the general case. Simulations based on a real topology and dataset show that the proposed solution can lower the energy consumption at the busiest node by 54%-76% while maintaining the quality of the trained model.
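
As background for one step in the pipeline above: once a (possibly sparsified) communication graph is chosen, a symmetric, doubly stochastic mixing matrix can be built from it, for instance with the standard Metropolis-Hastings weights shown below. This is a common textbook construction shown only for context; it is not the paper's bi-level optimization.

    import numpy as np

    def metropolis_mixing_matrix(adj):
        """adj: symmetric 0/1 adjacency matrix without self-loops."""
        n = adj.shape[0]
        deg = adj.sum(axis=1)
        W = np.zeros((n, n))
        for i in range(n):
            for j in range(n):
                if i != j and adj[i, j]:
                    W[i, j] = 1.0 / (1.0 + max(deg[i], deg[j]))
            W[i, i] = 1.0 - W[i].sum()
        return W

    # ring of 5 nodes as a toy sparsified topology
    adj = np.zeros((5, 5), dtype=int)
    for i in range(5):
        adj[i, (i + 1) % 5] = adj[(i + 1) % 5, i] = 1
    W = metropolis_mixing_matrix(adj)
    print(np.allclose(W.sum(axis=1), 1), np.allclose(W, W.T))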

Understanding how genetically encoded rules drive and guide complex neuronal growth processes is essential to comprehending the brain's architecture, and agent-based models (ABMs) offer a powerful simulation approach to further develop this understanding. However, accurately calibrating these models remains a challenge. Here, we present a novel application of Approximate Bayesian Computation (ABC) to address this issue. ABMs are based on parametrized stochastic rules that describe the time evolution of small components -- the so-called agents -- discretizing the system, leading to stochastic simulations that require appropriate treatment. Mathematically, the calibration defines a stochastic inverse problem. We propose to address it in a Bayesian setting using ABC. We facilitate the repeated comparison between data and simulations by quantifying the morphological information of single neurons with so-called morphometrics, and we resort to statistical distances to measure discrepancies between populations thereof. We conduct experiments on synthetic as well as experimental data. We find that ABC utilizing Sequential Monte Carlo sampling and the Wasserstein distance finds accurate posterior parameter distributions for representative ABMs. We further demonstrate that these ABMs capture specific features of pyramidal cells of the hippocampus (CA1). Overall, this work establishes a robust framework for calibrating agent-based neuronal growth models and opens the door for future investigations using Bayesian techniques for model building, verification, and adequacy assessment.
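
A stripped-down sketch in the spirit of the calibration above: a toy stochastic "growth" simulator produces a population of one morphometric, and parameter draws are accepted when the Wasserstein distance to the observed population is small. The simulator, prior, and tolerance are toy assumptions, and plain rejection ABC is used here rather than the paper's Sequential Monte Carlo sampler.

    import numpy as np
    from scipy.stats import wasserstein_distance

    rng = np.random.default_rng(1)

    def simulate_morphometric(branch_rate, n_cells=200):
        # toy stand-in for an agent-based growth model: branch count per cell
        return rng.poisson(branch_rate, size=n_cells)

    observed = simulate_morphometric(branch_rate=5.0)   # pretend this is data

    accepted = []
    tolerance = 0.5
    for _ in range(2000):
        theta = rng.uniform(1.0, 10.0)                  # prior draw
        simulated = simulate_morphometric(theta)
        if wasserstein_distance(observed, simulated) < tolerance:
            accepted.append(theta)

    print(len(accepted), "accepted; approximate posterior mean:", np.mean(accepted))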

Accurate modeling of the diverse and dynamic interests of users remains a significant challenge in the design of personalized recommender systems. Existing user modeling methods, like single-point and multi-point representations, have limitations w.r.t.\ accuracy, diversity, and adaptability. To overcome these deficiencies, we introduce density-based user representations (DURs), a novel method that leverages Gaussian process regression (GPR) for effective multi-interest recommendation and retrieval. Our approach, GPR4DUR, exploits DURs to capture user interest variability without manual tuning, incorporates uncertainty-awareness, and scales well to large numbers of users. Experiments using real-world offline datasets confirm the adaptability and efficiency of GPR4DUR, while online experiments with simulated users demonstrate its ability to address the exploration-exploitation trade-off by effectively utilizing model uncertainty.
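
A minimal sketch of a density-based user representation via Gaussian process regression: fit a GP over the embeddings of items the user interacted with, then score candidate items by posterior mean plus an uncertainty bonus. The embedding dimension, kernel, and UCB-style bonus are illustrative assumptions, not GPR4DUR's exact design.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    rng = np.random.default_rng(0)
    item_emb = rng.normal(size=(500, 8))          # catalog of 500 items, 8-dim embeddings
    history_idx = rng.choice(500, size=30, replace=False)
    affinity = rng.uniform(0, 1, size=30)         # observed user affinities

    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-2)
    gp.fit(item_emb[history_idx], affinity)

    mean, std = gp.predict(item_emb, return_std=True)
    scores = mean + 0.5 * std                     # uncertainty-aware retrieval score
    top10 = np.argsort(-scores)[:10]
    print(top10)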

Self-supervised contrastive learning, which directly extracts inherent data correlations from unlabeled data, has been widely utilized to mitigate the data sparsity issue in sequential recommendation. The majority of existing methods create different augmented views of the same user sequence via random augmentation, and subsequently minimize their distance in the embedding space to enhance the quality of user representations. However, random augmentation often disrupts the semantic information and interest evolution pattern inherent in the user sequence, leading to the generation of semantically distinct augmented views. Promoting similarity of these semantically diverse augmented sequences can render the learned user representations insensitive to variations in user preferences and interest evolution, contradicting the core learning objectives of sequential recommendation. To address this issue, we leverage the inherent characteristics of sequential recommendation and propose the use of context information to generate more reasonable augmented positive samples. Specifically, we introduce a context-aware diffusion-based contrastive learning method for sequential recommendation. Given a user sequence, our method selects certain positions and employs a context-aware diffusion model to generate alternative items for these positions with the guidance of context information. These generated items then replace the corresponding original items, creating a semantically consistent augmented view of the original sequence. Additionally, to maintain representation cohesion, item embeddings are shared between the diffusion model and the recommendation model, and the entire framework is trained in an end-to-end manner. Extensive experiments on five benchmark datasets demonstrate the superiority of our proposed method.
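
A schematic sketch of the augmentation step described above: pick a few positions in a user sequence and replace the items there with items proposed by a context-conditioned generator, yielding a positive view intended to stay semantically consistent. The function `context_aware_generator` is a hypothetical placeholder for the paper's diffusion model; here it simply returns random item ids.

    import numpy as np

    rng = np.random.default_rng(0)

    def context_aware_generator(sequence, position, n_items):
        # placeholder: a real implementation would denoise an item embedding
        # conditioned on the surrounding items and map it back to an item id
        return int(rng.integers(0, n_items))

    def augment(sequence, replace_ratio=0.2, n_items=1000):
        seq = list(sequence)
        k = max(1, int(replace_ratio * len(seq)))
        for pos in rng.choice(len(seq), size=k, replace=False):
            seq[pos] = context_aware_generator(seq, pos, n_items)
        return seq

    user_seq = [13, 542, 77, 901, 5, 318, 64, 250]
    print(augment(user_seq))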

The Fisher-Rao distance between two probability distributions of a statistical model is defined as the Riemannian geodesic distance induced by the Fisher information metric. In order to calculate the Fisher-Rao distance in closed form, we need (1) to elicit a formula for the Fisher-Rao geodesics, and (2) to integrate the Fisher length element along those geodesics. We consider several numerically robust approximation and bounding techniques for the Fisher-Rao distances: First, we report generic upper bounds on Fisher-Rao distances based on closed-form 1D Fisher-Rao distances of submodels. Second, we describe several generic approximation schemes depending on whether the Fisher-Rao geodesics or pregeodesics are available in closed form or not. In particular, we obtain a generic method to guarantee an arbitrarily small additive error on the approximation provided that Fisher-Rao pregeodesics and tight lower and upper bounds are available. Third, we consider the case of Fisher metrics being Hessian metrics, and report generic tight upper bounds on the Fisher-Rao distances using techniques of information geometry. Uniparametric and biparametric statistical models always have Fisher Hessian metrics, and in general a simple test allows one to check whether the Fisher information matrix yields a Hessian metric or not. Fourth, we consider elliptical distribution families and show how to apply the above techniques to these models. We also propose two new distances, based either on the Fisher-Rao lengths of curves serving as proxies for Fisher-Rao geodesics or on the Birkhoff/Hilbert projective cone distance. Last, we consider an alternative group-theoretic approach for statistical transformation models based on the notion of a maximal invariant, which yields insights into the structure of the Fisher-Rao distance formula that may be used fruitfully in applications.
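
A small numerical illustration of one idea above: approximate the Fisher-Rao distance by the Fisher length of a curve joining the two parameters, since the length of any such curve upper-bounds the geodesic distance. The example uses the univariate normal family, whose Fisher information matrix is diag$(1/\sigma^2,\, 2/\sigma^2)$ in the $(\mu, \sigma)$ parametrization; the straight-line curve used here is a crude proxy, not the true geodesic.

    import numpy as np

    def fisher_info_normal(mu, sigma):
        return np.diag([1.0 / sigma**2, 2.0 / sigma**2])

    def curve_length(theta0, theta1, n_steps=5000):
        ts = np.linspace(0.0, 1.0, n_steps + 1)
        pts = np.outer(1 - ts, theta0) + np.outer(ts, theta1)   # straight-line curve
        length = 0.0
        for a, b in zip(pts[:-1], pts[1:]):
            mid = 0.5 * (a + b)
            d = b - a
            length += np.sqrt(d @ fisher_info_normal(*mid) @ d)
        return length

    theta0 = np.array([0.0, 1.0])    # (mu, sigma)
    theta1 = np.array([1.0, 2.0])
    print("Fisher length of straight-line proxy:", curve_length(theta0, theta1))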

Similar to the notion of h-adaptivity, where the discretization resolution is adaptively changed, I propose the notion of model adaptivity, where the underlying model (the governing equations) is adaptively changed in space and time. Specifically, this work introduces a hybrid and adaptive coupling of a 3D bulk fluid flow model with a 2D thin film flow model. As a result, this work extends the applicability of existing thin film flow models to complex scenarios where, for example, bulk flow develops into thin films after striking a surface. At each location in space and time, the proposed framework automatically decides whether a 3D model or a 2D model must be applied. Using a meshless approach for both the 3D and 2D models, the decision to apply a 2D or 3D model at each particle is based on the user-prescribed resolution and a local principal component analysis. When a particle needs to be changed from a 3D model to 2D, or vice versa, the discretization is changed and all relevant data mapping is done on the fly. Appropriate two-way coupling conditions and mass conservation considerations between the 3D and 2D models are also developed. Numerical results show that this model-adaptive framework offers greater flexibility and compares well against finely resolved 3D simulations. In an actual application scenario, a factor-of-three speed-up is obtained while maintaining the accuracy of the solution.
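
A minimal sketch of the local PCA criterion described above: at a particle, look at the neighbouring particle positions, and if the smallest principal spread falls below a thickness threshold tied to the prescribed resolution, flag the particle for the 2D thin film model, otherwise for the 3D bulk model. The threshold factor is an illustrative assumption.

    import numpy as np

    def is_thin_film(neighbor_positions, resolution, thickness_factor=0.5):
        centered = neighbor_positions - neighbor_positions.mean(axis=0)
        cov = centered.T @ centered / len(neighbor_positions)
        eigvals = np.sort(np.linalg.eigvalsh(cov))     # ascending principal variances
        smallest_spread = np.sqrt(eigvals[0])
        return smallest_spread < thickness_factor * resolution

    rng = np.random.default_rng(0)
    sheet = rng.uniform(0, 1, size=(100, 3))
    sheet[:, 2] *= 0.01                                # nearly planar neighbourhood
    blob = rng.uniform(0, 1, size=(100, 3))            # fully 3D neighbourhood
    print(is_thin_film(sheet, resolution=0.1), is_thin_film(blob, resolution=0.1))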
