
We propose a general method for distributed Bayesian model choice, using the marginal likelihood, where a data set is split into non-overlapping subsets. These subsets are accessed only locally by individual workers, and no data is shared between the workers. We approximate the model evidence for the full data set through Monte Carlo sampling from the posterior on every subset, generating a model evidence per subset. The results are combined using a novel approach that corrects for the splitting using summary statistics of the generated samples. Our divide-and-conquer approach enables Bayesian model choice in the large-data setting, exploiting all available information while limiting communication between workers. We derive theoretical error bounds that quantify the resulting trade-off between computational gain and loss in precision. The embarrassingly parallel nature yields important speed-ups when used on massive data sets, as illustrated by our real-world experiments. In addition, we show how the suggested approach can be extended to model choice within a reversible jump setting that explores multiple feature combinations within one run.
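A toy conjugate-Gaussian illustration of why such a correction is needed (the model, shard count, and prior below are arbitrary choices; the paper's summary-statistic correction itself is not reproduced): each worker can compute its subset evidence exactly, but naively summing the subset log evidences re-uses the prior once per subset and so does not equal the full-data log evidence.

```python
import numpy as np
from scipy.stats import norm

def log_evidence(y, sigma2=1.0, mu0=0.0, tau2=10.0):
    """Exact log marginal likelihood for y_i ~ N(theta, sigma2) i.i.d.,
    with prior theta ~ N(mu0, tau2), integrating theta out."""
    n, ybar = len(y), y.mean()
    ss = ((y - ybar) ** 2).sum()
    return (-0.5 * n * np.log(2 * np.pi * sigma2)
            - ss / (2 * sigma2)
            + 0.5 * np.log(2 * np.pi * sigma2 / n)
            + norm.logpdf(ybar, mu0, np.sqrt(tau2 + sigma2 / n)))

rng = np.random.default_rng(0)
y = rng.normal(1.5, 1.0, size=10_000)
shards = np.array_split(y, 8)  # non-overlapping subsets, one per worker

full = log_evidence(y)
naive = sum(log_evidence(s) for s in shards)  # each worker reuses the full prior
print(f"full-data log evidence: {full:.2f}")
print(f"naive sum over shards : {naive:.2f}")  # the gap is what a correction must absorb
```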

Related content

It is common practice to use Laplace approximations to compute marginal likelihoods in Bayesian versions of generalised linear models (GLM). Marginal likelihoods combined with model priors are then used in different search algorithms to compute the posterior marginal probabilities of models and individual covariates. This allows performing Bayesian model selection and model averaging. For large sample sizes, even the Laplace approximation becomes computationally challenging because the optimisation routine involved needs to evaluate the likelihood on the full set of data in multiple iterations. As a consequence, the algorithm is not scalable for large datasets. To address this problem, we suggest using a version of a popular batch stochastic gradient descent (BSGD) algorithm for estimating the marginal likelihood of a GLM by subsampling from the data. We further combine the algorithm with Markov chain Monte Carlo (MCMC) based methods for Bayesian model selection and provide some theoretical results on the convergence of the estimates. Finally, we report results from experiments illustrating the performance of the proposed algorithm.
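A rough sketch of the subsampling idea for a logistic-regression GLM, not the authors' BSGD algorithm: the optimiser only ever sees minibatches, rescaled by n/b so each step uses an unbiased estimate of the full-data log-posterior gradient, and the Laplace approximation is evaluated at the resulting approximate mode (for very large n, the Hessian could itself be subsampled). All step sizes and batch sizes are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def laplace_log_evidence(X, y, prior_var=10.0, batch=256, epochs=20, lr=0.05, seed=0):
    """Approximate log marginal likelihood of a Bayesian logistic GLM:
    find the posterior mode with minibatch SGD, then apply a Laplace
    approximation at that mode."""
    n, d = X.shape
    rng = np.random.default_rng(seed)
    beta = np.zeros(d)
    for _ in range(epochs):
        for idx in np.array_split(rng.permutation(n), n // batch):
            p = sigmoid(X[idx] @ beta)
            # unbiased estimate of the full-data log-posterior gradient
            grad = (n / len(idx)) * X[idx].T @ (y[idx] - p) - beta / prior_var
            beta += lr * grad / n
    # Laplace approximation at the approximate mode
    p = sigmoid(X @ beta)
    H = X.T @ (X * (p * (1 - p))[:, None]) + np.eye(d) / prior_var
    loglik = y @ np.log(p + 1e-12) + (1 - y) @ np.log(1 - p + 1e-12)
    logprior = (-0.5 * beta @ beta / prior_var
                - 0.5 * d * np.log(2 * np.pi * prior_var))
    _, logdet = np.linalg.slogdet(H)
    return loglik + logprior + 0.5 * d * np.log(2 * np.pi) - 0.5 * logdet

rng = np.random.default_rng(1)
n, d = 50_000, 5
X = rng.normal(size=(n, d))
y = (rng.uniform(size=n) < sigmoid(X @ rng.normal(size=d))).astype(float)
print(laplace_log_evidence(X, y))
```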

The Benjamini-Hochberg (BH) procedure is a celebrated method for multiple testing with false discovery rate (FDR) control. In this paper, we consider large-scale distributed networks where each node possesses a large number of p-values and the goal is to achieve the global BH performance in a communication-efficient manner. We propose that every node performs a local test with an adjusted test size according to the (estimated) global proportion of true null hypotheses. With suitable assumptions, our method is asymptotically equivalent to the global BH procedure. Motivated by this, we develop an algorithm for star networks where each node only needs to transmit an estimate of the (local) proportion of nulls and the (local) number of p-values to the center node; the center node then broadcasts a parameter (computed based on the global estimate and test size) to the local nodes. In the experiment section, we utilize existing estimators of the proportion of true nulls and consider various settings to evaluate the performance and robustness of our method.
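A sketch of the star-network protocol as described, with two loudly flagged assumptions: Storey's estimator stands in for the "existing estimators of the proportion of true nulls", and the broadcast parameter is taken to be the adjusted level alpha / pi0_hat.

```python
import numpy as np

def storey_pi0(p, lam=0.5):
    """Storey's estimator of the proportion of true null hypotheses."""
    return min(1.0, (p > lam).mean() / (1.0 - lam))

def local_bh(p, alpha_adj):
    """Standard BH procedure on one node's p-values at the adjusted level."""
    m = len(p)
    order = np.argsort(p)
    below = p[order] <= alpha_adj * np.arange(1, m + 1) / m
    k = below.nonzero()[0].max() + 1 if below.any() else 0
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True
    return reject

rng = np.random.default_rng(1)
nodes = [np.concatenate([rng.uniform(size=900),          # true nulls
                         rng.beta(0.2, 4.0, size=100)])  # non-nulls near zero
         for _ in range(5)]

# Uplink: each node transmits only its p-value count and local pi0 estimate.
msgs = [(len(p), storey_pi0(p)) for p in nodes]
m_total = sum(m for m, _ in msgs)
pi0_global = sum(m * pi0 for m, pi0 in msgs) / m_total

# Downlink: the center broadcasts a single parameter; nodes test locally.
alpha = 0.1
alpha_adj = alpha / pi0_global
rejections = [local_bh(p, alpha_adj) for p in nodes]
print([r.sum() for r in rejections])
```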

Population adjustment methods such as matching-adjusted indirect comparison (MAIC) are increasingly used to compare marginal treatment effects when there are cross-trial differences in effect modifiers and limited patient-level data. MAIC is based on propensity score weighting, which is sensitive to poor covariate overlap and cannot extrapolate beyond the observed covariate space. Current outcome regression-based alternatives can extrapolate but target a conditional treatment effect that is incompatible in the indirect comparison. When adjusting for covariates, one must integrate or average the conditional estimate over the relevant population to recover a compatible marginal treatment effect. We propose a marginalization method based on parametric G-computation that can be easily applied where the outcome regression is a generalized linear model or a Cox model. The approach views the covariate adjustment regression as a nuisance model and separates its estimation from the evaluation of the marginal treatment effect of interest. The method can be implemented in a Bayesian statistical framework, naturally integrating the analysis into a probabilistic setting. A simulation study provides proof of principle and benchmarks the method's performance against MAIC and the conventional outcome regression. Parametric G-computation achieves more precise and more accurate estimates than MAIC, particularly when covariate overlap is poor, and yields unbiased marginal treatment effect estimates when no assumptions are violated. Furthermore, the marginalized regression-adjusted estimates provide greater precision and accuracy than the conditional estimates produced by the conventional outcome regression, which are systematically biased because the measure of effect is non-collapsible.
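A minimal frequentist sketch of the marginalization step for a binary outcome, assuming simulated patient-level data with a single covariate `age` and treatment indicator `trt` (all names and numbers are illustrative): fit the covariate-adjusted logistic outcome model as a nuisance model, predict both potential outcomes over the target population's covariate distribution, average, and contrast the averages on the marginal scale.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Simulated index-trial IPD: one covariate, randomized treatment, binary outcome.
n = 2000
ipd = pd.DataFrame({"age": rng.normal(55, 8, n), "trt": rng.integers(0, 2, n)})
lin = -4 + 0.06 * ipd.age - 0.8 * ipd.trt + 0.02 * ipd.trt * (ipd.age - 55)
ipd["y"] = rng.binomial(1, 1 / (1 + np.exp(-lin)))

# Step 1: covariate-adjusted outcome model, treated purely as a nuisance model.
outcome_model = smf.glm("y ~ age * trt", data=ipd,
                        family=sm.families.Binomial()).fit()

# Step 2: G-computation -- predict both potential outcomes over the *target*
# population's covariate distribution (here a comparator population with a
# different age profile), then average and contrast on the marginal scale.
target = pd.DataFrame({"age": rng.normal(62, 8, 5000)})
p1 = outcome_model.predict(target.assign(trt=1)).mean()
p0 = outcome_model.predict(target.assign(trt=0)).mean()
marginal_log_or = np.log(p1 / (1 - p1)) - np.log(p0 / (1 - p0))
print(f"marginal log odds ratio in target population: {marginal_log_or:.3f}")
```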

Motivation: The branching process model yields unevenly, stochastically distributed data consisting of sparse and dense regions. This work addresses the problem of precise parameter estimation for this type of model. Applying the branching process model to cancer cell evolution presents many difficulties, such as high dimensionality and the rarity of outcomes of interest. Moreover, we tackle the ambitious task of obtaining the model coefficients reflecting the relationship between driver gene mutations and cancer hallmarks on the basis of personal data on variant allele frequencies. Results: We design an approximate Bayesian computation method based on the Isolation kernel. The method includes a transformation of the raw data to a Hilbert space (mapping) and measures the similarity between simulated points and the maxima-weighted Isolation kernel mapping associated with the observed point. We also design a heuristic algorithm for parameter estimation that requires no gradient calculation and is dimension-independent. The advantages of the proposed machine-learning method are demonstrated on multidimensional test data as well as on an example of cancer cell evolution.
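The abstract does not spell out the kernel construction; below is a sketch of the standard Isolation-kernel feature map (the nearest-neighbour, aNNE-style construction of Ting et al.), under which the similarity of two points is the fraction of random partitionings in which they share a cell. The paper's maxima-weighted variant of the mapping is not reproduced here.

```python
import numpy as np

def isolation_kernel_map(X, data, psi=16, t=100, seed=0):
    """aNNE-style Isolation-kernel feature map: for each of t random
    subsamples of size psi, one-hot encode each point by its nearest
    subsample member. Dot products of the maps give the kernel, i.e.
    the fraction of partitionings in which two points share a cell."""
    rng = np.random.default_rng(seed)
    feats = np.zeros((len(X), t * psi))
    for i in range(t):
        centres = data[rng.choice(len(data), size=psi, replace=False)]
        nearest = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(-1).argmin(1)
        feats[np.arange(len(X)), i * psi + nearest] = 1.0
    return feats / np.sqrt(t)

rng = np.random.default_rng(1)
data = rng.normal(size=(500, 3))
phi = isolation_kernel_map(data[:5], data)
print(np.round(phi @ phi.T, 2))  # Gram matrix of the first five points
```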

Estimation of causal effects using machine learning methods has become an active research field in econometrics. In this paper, we study the finite-sample performance of meta-learners for estimating heterogeneous treatment effects when sample-splitting and cross-fitting are used to reduce the overfitting bias. In both synthetic and semi-synthetic simulations, we find that the performance of the meta-learners in finite samples greatly depends on the estimation procedure. The results imply that sample-splitting and cross-fitting are beneficial in large samples for bias reduction and efficiency of the meta-learners, respectively, whereas full-sample estimation is preferable in small samples. Furthermore, we derive practical recommendations for the application of specific meta-learners in empirical studies depending on particular data characteristics such as treatment shares and sample size.
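An illustrative cross-fitted T-learner, one of the meta-learners typically considered (the specific learner, base model, and fold count are assumptions, not the paper's setup): outcome models are fit on one fold and used to predict treatment effects only on the held-out fold.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

def crossfit_t_learner(X, t, y, n_splits=2, seed=0):
    """Cross-fitted T-learner for the CATE tau(x) = E[Y|X=x,T=1] - E[Y|X=x,T=0]:
    fit the two arm-specific outcome models on one fold and predict only on
    the held-out fold, so no observation is scored by a model trained on it."""
    tau = np.zeros(len(y))
    for train, test in KFold(n_splits, shuffle=True, random_state=seed).split(X):
        m1 = RandomForestRegressor(random_state=seed).fit(
            X[train][t[train] == 1], y[train][t[train] == 1])
        m0 = RandomForestRegressor(random_state=seed).fit(
            X[train][t[train] == 0], y[train][t[train] == 0])
        tau[test] = m1.predict(X[test]) - m0.predict(X[test])
    return tau

rng = np.random.default_rng(0)
n = 4000
X = rng.normal(size=(n, 3))
t = rng.integers(0, 2, n)
y = X[:, 0] + t * (1 + X[:, 1]) + rng.normal(size=n)  # true CATE = 1 + X[:, 1]
tau_hat = crossfit_t_learner(X, t, y)
print(f"mean CATE estimate: {tau_hat.mean():.2f} (truth: 1.00)")
```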

Approximate Bayesian computation (ABC) is a popular likelihood-free inference method for models with intractable likelihood functions. As ABC methods usually rely on comparing summary statistics of observed and simulated data, the choice of the statistics is crucial. This choice involves a trade-off between loss of information and dimensionality reduction, and is often determined based on domain knowledge. However, handcrafting and selecting suitable statistics is a laborious task involving multiple trial-and-error steps. In this work, we introduce an active learning method for ABC statistics selection which reduces the domain expert's work considerably. By involving the experts, we are able to handle misspecified models, unlike the existing dimension reduction methods. Moreover, empirical results show better posterior estimates than with existing methods, when the simulation budget is limited.
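For context, a plain ABC rejection sampler making explicit where the summary statistics enter; the toy Gaussian model and the choice of mean and standard deviation as statistics are illustrative, not the paper's method:

```python
import numpy as np

def abc_rejection(y_obs, summarise, simulate, prior_draw, eps, n_sims=100_000):
    """Keep a prior draw theta whenever the summaries of its simulated
    data fall within eps of the observed summaries."""
    s_obs = summarise(y_obs)
    accepted = []
    for _ in range(n_sims):
        theta = prior_draw()
        if np.linalg.norm(summarise(simulate(theta)) - s_obs) < eps:
            accepted.append(theta)
    return np.array(accepted)

rng = np.random.default_rng(0)
y_obs = rng.normal(2.0, 1.0, size=100)
post = abc_rejection(
    y_obs,
    summarise=lambda y: np.array([y.mean(), y.std()]),  # the crucial choice
    simulate=lambda th: rng.normal(th, 1.0, size=100),
    prior_draw=lambda: rng.uniform(-10.0, 10.0),
    eps=0.2,
)
print(f"posterior mean ~= {post.mean():.2f} from {len(post)} accepted draws")
```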

For stochastic models with intractable likelihood functions, approximate Bayesian computation offers a way of approximating the true posterior through repeated comparisons of observations with simulated model outputs in terms of a small set of summary statistics. These statistics need to retain the information that is relevant for constraining the parameters but cancel out the noise. They can thus be seen as thermodynamic state variables for general stochastic models. For many scientific applications, we need strictly more summary statistics than model parameters to reach a satisfactory approximation of the posterior. Therefore, we propose to use the inner dimension of deep neural network-based autoencoders as summary statistics. To create an incentive for the encoder to encode all the parameter-related information but not the noise, we give the decoder access to explicit or implicit information on the noise that has been used to generate the training data. We validate the approach empirically on two types of stochastic models.
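A minimal PyTorch sketch of the architectural idea, with all layer sizes and dimensions chosen arbitrarily: the encoder's low-dimensional code plays the role of the ABC summary statistics, and the decoder is additionally fed the noise realisation used to simulate the input, so the code has no incentive to encode the noise.

```python
import torch
import torch.nn as nn

class NoiseAwareAutoencoder(nn.Module):
    """Encoder output = candidate ABC summary statistics. The decoder also
    receives the noise realisation used to generate the input trajectory,
    so the bottleneck need not spend capacity on reproducing the noise."""

    def __init__(self, data_dim=100, noise_dim=100, code_dim=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(data_dim, 64), nn.ReLU(), nn.Linear(64, code_dim))
        self.decoder = nn.Sequential(
            nn.Linear(code_dim + noise_dim, 64), nn.ReLU(),
            nn.Linear(64, data_dim))

    def forward(self, x, noise):
        code = self.encoder(x)                          # summary statistics
        recon = self.decoder(torch.cat([code, noise], dim=-1))
        return recon, code

# After training with a reconstruction loss on (simulation, noise) pairs,
# only encoder(x) is kept and used as the summary statistics in ABC.
model = NoiseAwareAutoencoder()
x, noise = torch.randn(32, 100), torch.randn(32, 100)
recon, stats = model(x, noise)
print(stats.shape)  # torch.Size([32, 4])
```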

Federated learning is an emerging paradigm that permits a large number of clients with heterogeneous data to coordinate learning of a unified global model without the need to share data amongst each other. Standard federated learning algorithms involve averaging of model parameters or gradient updates to approximate the global model at the server. However, in heterogeneous settings averaging can result in information loss and lead to poor generalization due to the bias induced by dominant clients. We hypothesize that to generalize better across non-i.i.d. datasets, as in FL settings, the algorithms should focus on learning the invariant mechanism that is constant while ignoring spurious mechanisms that differ across clients. Inspired by recent work in the out-of-distribution literature, we propose a gradient-masked averaging approach for federated learning as an alternative to the standard averaging of client updates. This client update aggregation technique can be adapted as a drop-in replacement in most existing federated algorithms. We perform extensive experiments with the gradient-masked approach on multiple FL algorithms with in-distribution, real-world, and out-of-distribution (as the worst-case scenario) test datasets, and show that it provides consistent improvements, particularly in the case of heterogeneous clients.
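One plausible reading of the aggregation rule, inspired by the sign-agreement (AND-mask) idea from the OOD literature that the abstract references; the agreement threshold `tau` and hard masking are illustrative choices, not necessarily the paper's exact rule:

```python
import numpy as np

def masked_average(client_updates, tau=0.4):
    """Keep parameter coordinates whose update signs agree across clients;
    zero out coordinates where clients conflict."""
    U = np.stack(client_updates)                      # (clients, params)
    agreement = np.abs(np.mean(np.sign(U), axis=0))   # 1 = unanimous, 0 = split
    mask = (agreement >= tau).astype(U.dtype)
    return mask * U.mean(axis=0)                      # masked FedAvg-style step

updates = [np.array([0.5, -0.2, 0.10]),
           np.array([0.4,  0.3, 0.20]),
           np.array([0.6, -0.1, 0.15])]
print(masked_average(updates))  # coordinate 1 disagrees in sign -> zeroed
```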

Alternating Direction Method of Multipliers (ADMM) is a widely used tool for machine learning in distributed settings, where a machine learning model is trained over distributed data sources through an interactive process of local computation and message passing. Such an iterative process could cause privacy concerns for data owners. The goal of this paper is to provide differential privacy for ADMM-based distributed machine learning. Prior approaches to differentially private ADMM exhibit low utility under a high privacy guarantee and often assume the objective functions of the learning problems to be smooth and strongly convex. To address these concerns, we propose a novel differentially private ADMM-based distributed learning algorithm called DP-ADMM, which combines an approximate augmented Lagrangian function with time-varying Gaussian noise addition in the iterative process to achieve higher utility for general objective functions under the same differential privacy guarantee. We also apply the moments accountant method to bound the end-to-end privacy loss. The theoretical analysis shows that DP-ADMM can be applied to a wider class of distributed learning problems, is provably convergent, and offers an explicit utility-privacy tradeoff. To our knowledge, this is the first paper to provide explicit convergence and utility properties for differentially private ADMM-based distributed learning algorithms. The evaluation results demonstrate that our approach can achieve good convergence and model accuracy under a high end-to-end differential privacy guarantee.
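A highly simplified sketch of the mechanism on a consensus-ADMM ridge-regression problem: each worker's primal update is perturbed with Gaussian noise whose scale decays over iterations ("time-varying"). The noise schedule, penalty `rho`, and the absence of clipping and privacy accounting are all simplifications; the calibrated DP-ADMM update and the moments-accountant analysis are in the paper.

```python
import numpy as np

def dp_admm_sketch(Xs, ys, lam=1e-2, rho=1.0, T=50, sigma0=0.5, seed=0):
    """Consensus ADMM for distributed ridge regression where each worker's
    primal update is perturbed with Gaussian noise of decaying scale
    sigma0 / t. No clipping or privacy accounting is performed here."""
    rng = np.random.default_rng(seed)
    d = Xs[0].shape[1]
    z = np.zeros(d)                        # global consensus variable
    us = [np.zeros(d) for _ in Xs]         # scaled dual variables
    for t in range(1, T + 1):
        ws = []
        for X, y, u in zip(Xs, ys, us):
            A = X.T @ X + (lam + rho) * np.eye(d)
            w = np.linalg.solve(A, X.T @ y + rho * (z - u))  # local primal step
            w += rng.normal(0.0, sigma0 / t, size=d)         # time-varying noise
            ws.append(w)
        z = np.mean([w + u for w, u in zip(ws, us)], axis=0)  # consensus step
        us = [u + w - z for u, w in zip(us, ws)]              # dual ascent step
    return z

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5, 0.0])
Xs = [rng.normal(size=(200, 4)) for _ in range(5)]
ys = [X @ w_true + rng.normal(size=200) for X in Xs]
print(np.round(dp_admm_sketch(Xs, ys), 2))
```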

In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely where the function $F(x) \triangleq \sum_{i=1}^{m} f_i(x)$ is: strongly convex and smooth, strongly convex, smooth, or just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions of the proposed setup, such as proximal-friendly functions, time-varying graphs, and improved condition numbers.
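The paper's accelerated dual method is not sketched here; instead, a plain decentralised gradient descent round shows where the interaction (gossip) matrix enters, since its spectral gap drives the additional cost mentioned above. The ring topology and quadratic local objectives are illustrative.

```python
import numpy as np

def dgd_round(X_nodes, grad, W, lr):
    """One decentralised gradient descent round: gossip-average the node
    iterates through W, then take local gradient steps."""
    return W @ X_nodes - lr * grad(X_nodes)

# Ring of m nodes, each holding f_i(x) = ||x - c_i||^2 / 2, so the global
# minimiser of F(x) = sum_i f_i(x) is the mean of the c_i.
m, dim, lr = 8, 3, 0.2
rng = np.random.default_rng(0)
C = rng.normal(size=(m, dim))

W = np.zeros((m, m))                      # lazy Metropolis weights on a ring
for i in range(m):
    W[i, i] = 0.5
    W[i, (i - 1) % m] = W[i, (i + 1) % m] = 0.25

X_nodes = np.zeros((m, dim))
for _ in range(300):
    X_nodes = dgd_round(X_nodes, lambda Z: Z - C, W, lr)
# With a constant step size, DGD only reaches a neighbourhood of consensus;
# its size and the iteration count depend on the spectral gap of W.
print(np.abs(X_nodes - C.mean(axis=0)).max())
```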
