
In this paper I describe some substantial extensions to the survsim command for simulating survival data. survsim can now simulate survival data from a parametric distribution, a custom/user-defined distribution, a fitted merlin model, a specified cause-specific hazards competing risks model, or a specified general multi-state model (with multiple timescales). Left truncation (delayed entry) is now also available in all settings. I illustrate the command with some examples, demonstrating the considerable flexibility it offers for evaluating statistical methods.
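As a rough illustration of the underlying mechanics (in Python rather than Stata, and not the survsim implementation itself), the sketch below simulates Weibull event times by inverting the cumulative hazard, with delayed entry and administrative censoring; all parameter values are illustrative assumptions.

```python
# A minimal sketch, not survsim itself: simulate event times from a parametric
# Weibull hazard by inverting the cumulative hazard, with delayed entry and
# administrative right censoring. Parameters and sample size are assumptions.
import numpy as np

rng = np.random.default_rng(7)
n, lam, gamma_shape, cens_time = 1000, 0.1, 1.3, 5.0

# Weibull cumulative hazard H(t) = lam * t**gamma_shape; invert H at -log(U)
u = rng.uniform(size=n)
t_event = (-np.log(u) / lam) ** (1.0 / gamma_shape)

# delayed entry (left truncation): keep only subjects still at risk at entry
entry = rng.uniform(0.0, 1.0, size=n)
at_risk = t_event > entry
t_event, entry = t_event[at_risk], entry[at_risk]

# administrative right censoring at cens_time
time = np.minimum(t_event, cens_time)
event = (t_event <= cens_time).astype(int)
print(f"kept {at_risk.sum()} subjects, {event.mean():.2%} observed events")
```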

Related content

The ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS) is the premier conference series for model-driven software and systems engineering, organized with the support of ACM SIGSOFT and IEEE TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. Participants come from diverse backgrounds, including researchers, academics, engineers, and industrial professionals. MODELS 2019 is a forum where participants can exchange cutting-edge research results and innovative practical experience around modeling and model-driven software and systems. This year's edition offers the modeling community an opportunity to further advance the foundations of modeling and to present innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, open source, and sustainability.
December 17, 2021

Altmetrics are tools for measuring the impact of research beyond scientific communities. In general, they measure online mentions of scholarly outputs, such as on online social networks, blogs, and news sites. Some stakeholders in higher education have championed altmetrics as a new way to understand research impact and as an alternative or supplement to bibliometrics. In contrast, others have criticized altmetrics as ill-conceived and limited in their use. This chapter explores the values and limits of altmetrics, including their role in evaluating, promoting, and disseminating research.

This paper proposes an algorithm to generate random numbers from any member of the truncated multivariate elliptical family of distributions with a strictly decreasing density generating function. Based on Neal (2003) and Ho et al. (2012), we construct an efficient sampling method by means of a slice sampling algorithm with Gibbs sampler steps. We also provide a faster approach to approximate the first and second moments of truncated multivariate elliptical distributions, in which Monte Carlo integration is used for the truncated part and explicit expressions for the non-truncated part (Galarza et al., 2020). Examples and an application to environmental spatial data illustrate its usefulness. Methods are available for free in the new R library elliptical.
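A minimal sketch of the kind of slice-sampling-within-Gibbs scheme described above, applied to a truncated bivariate normal (one member of the elliptical family); this is not the elliptical library's implementation, and the covariance matrix, truncation box, and interval width are assumptions.

```python
# Coordinate-wise slice sampler with shrinkage (Neal 2003) for a bivariate
# normal kernel restricted to a box; illustrative only, not the R package.
import numpy as np

rng = np.random.default_rng(0)
Sigma = np.array([[1.0, 0.6], [0.6, 1.0]])
Prec = np.linalg.inv(Sigma)
lower, upper = np.array([0.0, -1.0]), np.array([2.0, 1.5])  # truncation box

def log_density(x):
    if np.any(x < lower) or np.any(x > upper):
        return -np.inf                      # zero density outside the truncation box
    return -0.5 * x @ Prec @ x              # unnormalized elliptical log-kernel

def slice_gibbs(n_samples, x0, width=1.0, burn=500):
    x = x0.copy()
    out = []
    for it in range(n_samples + burn):
        for i in range(len(x)):
            # auxiliary slice variable: log u = log f(x) + log Uniform(0, 1)
            log_u = log_density(x) + np.log(rng.uniform())
            # randomly position an interval of size `width` around x[i]
            lo = x[i] - width * rng.uniform()
            hi = lo + width
            while True:
                prop = x.copy()
                prop[i] = rng.uniform(lo, hi)
                if log_density(prop) >= log_u:
                    x = prop
                    break
                # shrink the bracket towards the current point and retry
                if prop[i] < x[i]:
                    lo = prop[i]
                else:
                    hi = prop[i]
        if it >= burn:
            out.append(x.copy())
    return np.array(out)

samples = slice_gibbs(2000, x0=np.array([1.0, 0.0]))
print(samples.mean(axis=0))  # Monte Carlo estimate of the truncated mean
```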

In this work, we delve into nonparametric empirical Bayes theory and approximate the classical Bayes estimator by a truncation of the generalized Laguerre series, estimating its coefficients by minimizing the prior risk of the estimator. The minimization process yields a system of linear equations whose size is equal to the truncation level. We focus on the empirical Bayes estimation problem when the mixing distribution, and therefore the prior distribution, has support on the positive real half-line or a subinterval of it. By investigating several common mixing distributions, we develop a strategy for selecting the parameter of the generalized Laguerre function basis so that our estimator possesses a finite variance. We show that our generalized Laguerre empirical Bayes approach is asymptotically optimal in the minimax sense. Finally, our convergence rate is compared and contrasted with several results from the literature.
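As a generic illustration (not the paper's estimator) of the machinery involved, the sketch below fits a truncated generalized Laguerre expansion to a function on the half-line by solving a linear system whose size equals the truncation level; the target function, weight, basis parameter, and truncation level are assumptions.

```python
# Fit a truncated generalized Laguerre series by weighted least squares; the
# normal equations give a linear system of size M (the truncation level).
import numpy as np
from scipy.special import eval_genlaguerre

alpha, M = 1.0, 8                      # basis parameter and truncation level (assumed)
x = np.linspace(0.01, 20.0, 400)       # evaluation grid on the half-line
target = x / (1.0 + x)                 # stand-in for the (unknown) Bayes rule
w = x**alpha * np.exp(-x)              # weight of the generalized Laguerre basis

# Design matrix Phi[j, k] = L_k^{(alpha)}(x_j); normal equations of size M x M
Phi = np.column_stack([eval_genlaguerre(k, alpha, x) for k in range(M)])
A = Phi.T @ (w[:, None] * Phi)         # weighted Gram matrix
b = Phi.T @ (w * target)               # weighted inner products with the target
coef = np.linalg.solve(A, b)           # linear system sized by the truncation level

approx = Phi @ coef
print("max abs error on the grid:", np.max(np.abs(approx - target)))
```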

Multifidelity methods are widely used for estimation of quantities of interest (QoIs) in uncertainty quantification using simulation codes of differing costs and accuracies. Many methods approximate numerical-valued statistics that represent only limited information about the QoIs. In this paper, we generalize the ideas in Xu et al. (2021) to develop a multifidelity method that approximates the distribution of a scalar-valued QoI. Under a linear model hypothesis, we propose an exploration-exploitation strategy to reconstruct the full distribution, not just statistics, of a scalar-valued QoI using samples from a subset of low-fidelity regressors. We derive an informative asymptotic bound for the mean 1-Wasserstein distance between the estimator and the true distribution, and use it to adaptively allocate computational budget for parametric estimation and non-parametric approximation of the probability distribution. Assuming the linear model is correct, we prove that such a procedure is consistent and converges to the optimal policy (and hence optimal computational budget allocation) under an upper bound criterion as the budget goes to infinity. As a corollary, we obtain convergence of the approximated distribution in the mean 1-Wasserstein metric. The major advantages of our approach are that convergence to the full distribution of the output is attained under appropriate assumptions, and that the procedure and implementation require neither a hierarchical model setup, knowledge of cross-model information or correlation, nor a priori known model statistics. Numerical experiments are provided to support our theoretical analysis.
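A toy sketch of the linear-surrogate reconstruction idea, without the paper's bandit-style budget allocation: a high-fidelity QoI is regressed on a cheap low-fidelity regressor from a few paired runs, the full output distribution is reconstructed from many cheap samples, and the match is checked with the 1-Wasserstein distance. The toy models and sample sizes are assumptions.

```python
# Linear surrogate from low- to high-fidelity outputs, then distributional
# reconstruction checked via the empirical 1-Wasserstein distance.
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(1)
def high_fidelity(z):  # expensive reference model (toy)
    return np.sin(z) + 0.1 * z**2
def low_fidelity(z):   # cheap, biased approximation (toy)
    return 0.9 * np.sin(z) + 0.05 * z**2 + 0.1

z_pair = rng.normal(size=50)                       # small "exploration" budget
X = np.column_stack([np.ones(50), low_fidelity(z_pair)])
beta, *_ = np.linalg.lstsq(X, high_fidelity(z_pair), rcond=None)  # linear model

z_cheap = rng.normal(size=20000)                   # large "exploitation" budget
q_reconstructed = beta[0] + beta[1] * low_fidelity(z_cheap)  # surrogate-predicted QoI

q_reference = high_fidelity(rng.normal(size=20000))  # used here only for validation
print("1-Wasserstein distance:", wasserstein_distance(q_reconstructed, q_reference))
```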

The design and operation of modern energy systems are heavily influenced by time-dependent and uncertain parameters, e.g., renewable electricity generation, load demand, and electricity prices. These are typically represented by a set of discrete realizations known as scenarios. A popular scenario generation approach uses deep generative models (DGM) that allow scenario generation without prior assumptions about the data distribution. However, the validation of generated scenarios is difficult, and a comprehensive discussion of appropriate validation methods is currently lacking. To start this discussion, we provide a critical assessment of the validation methods currently used in the energy scenario generation literature. In particular, we assess validation methods based on probability density, auto-correlation, and power spectral density. Furthermore, we propose using multifractal detrended fluctuation analysis (MFDFA) as an additional validation method for non-trivial features like peaks, bursts, and plateaus. As representative examples, we train generative adversarial networks (GANs), Wasserstein GANs (WGANs), and variational autoencoders (VAEs) on two renewable power generation time series (photovoltaic and wind power from Germany, 2013 to 2015) and an intra-day electricity price time series from the European Energy Exchange, 2017 to 2019. We apply the four validation methods to both the historical and the generated data and discuss the interpretation of validation results as well as common mistakes, pitfalls, and limitations of the validation methods. Our assessment shows that no single method sufficiently characterizes a scenario; ideally, validation should include multiple methods and be interpreted carefully in the context of scenarios over short time periods.
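A compact sketch of MFDFA as a validation check is given below; the scales, q-orders, and detrending polynomial order are assumptions, and a dedicated implementation would typically be used in practice. Comparing the generalized Hurst exponents h(q) of historical and generated series is one way to use the output.

```python
# Multifractal detrended fluctuation analysis (MFDFA), minimal version:
# profile -> segment-wise polynomial detrending -> q-th order fluctuation F_q(s).
import numpy as np

def mfdfa(x, scales, qs, order=1):
    """Return the fluctuation function F_q(s) for each q and scale s."""
    profile = np.cumsum(x - np.mean(x))           # step 1: cumulative profile
    F = np.empty((len(qs), len(scales)))
    for j, s in enumerate(scales):
        n_seg = len(profile) // s
        rms = []
        for v in range(n_seg):                    # step 2: detrend each segment
            seg = profile[v * s:(v + 1) * s]
            t = np.arange(s)
            trend = np.polyval(np.polyfit(t, seg, order), t)
            rms.append(np.mean((seg - trend) ** 2))
        rms = np.asarray(rms)
        for i, q in enumerate(qs):                # step 3: q-th order fluctuation
            F[i, j] = np.exp(0.5 * np.mean(np.log(rms))) if q == 0 \
                      else np.mean(rms ** (q / 2.0)) ** (1.0 / q)
    return F

rng = np.random.default_rng(2)
historical = rng.normal(size=4096).cumsum()       # stand-in for a real time series
scales = np.unique(np.logspace(1.2, 3.0, 15).astype(int))
qs = [-4, -2, 2, 4]
F = mfdfa(historical, scales, qs)
h = [np.polyfit(np.log(scales), np.log(F[i]), 1)[0] for i in range(len(qs))]
print(dict(zip(qs, np.round(h, 2))))              # h(q): slope of log F_q vs log s
```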

Intractable posterior distributions of parameters with intractable normalizing constants depending upon the parameters are known as doubly intractable posterior distributions. The terminology itself indicates that obtaining Bayesian inference from such posteriors is doubly difficult compared to traditional intractable posteriors, where the normalizing constants are tractable and admit traditional Markov chain Monte Carlo (MCMC) solutions. As can be anticipated, a plethora of MCMC-based methods have originated in the literature to deal with doubly intractable distributions. Yet it remains very much unclear whether any of these methods can satisfactorily sample from such posteriors, particularly in high-dimensional setups. In this article, we consider efficient Monte Carlo and importance sampling approximations of the intractable normalizing constant for a few values of the parameters, and Gaussian process interpolations for the remaining values of the parameters, using these approximations. We then incorporate this strategy within the exact iid sampling framework developed in Bhattacharya (2021a) and Bhattacharya (2021b), and illustrate the methodology with simulation experiments comprising a two-dimensional normal-gamma posterior, a two-dimensional Ising model posterior, a two-dimensional Strauss process posterior and a 100-dimensional autologistic model posterior. In each case we demonstrate the high accuracy of our methodology, which is also computationally very efficient, often taking only a few minutes to generate 10,000 iid realizations on 80 processors.
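A minimal sketch of the interpolation strategy (illustrative only, not the exact scheme of the paper): estimate the normalizing constant by importance sampling at a few parameter values, then interpolate log Z(theta) elsewhere with a Gaussian process. The unnormalized density, proposal, and parameter grid are toy assumptions.

```python
# Importance-sampling estimates of log Z(theta) on a small grid, then a GP
# interpolation for other parameter values.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(3)

def log_unnorm(x, theta):              # toy unnormalized log-density in x
    return -theta * (x**2 + np.sin(3.0 * x))

def log_Z_importance(theta, n=50000):
    # proposal q = N(0, 1); log Z = log E_q[ f(x | theta) / q(x) ]
    x = rng.normal(size=n)
    log_q = -0.5 * x**2 - 0.5 * np.log(2 * np.pi)
    log_w = log_unnorm(x, theta) - log_q
    m = log_w.max()
    return m + np.log(np.mean(np.exp(log_w - m)))   # log-sum-exp for stability

theta_grid = np.linspace(0.5, 3.0, 8)               # a few training parameter values
log_Z = np.array([log_Z_importance(t) for t in theta_grid])

gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(), normalize_y=True)
gp.fit(theta_grid.reshape(-1, 1), log_Z)

theta_new = np.array([[1.37]])                      # any other parameter value
print("interpolated log Z:", gp.predict(theta_new)[0])
print("direct estimate:   ", log_Z_importance(1.37))
```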

Bhattacharya (2021b) has introduced a novel methodology for generating iid realizations from any target distribution on the Euclidean space, irrespective of dimensionality. In this article, our purpose is two-fold. We first extend the method for obtaining iid realizations from general multimodal distributions, and illustrate with a mixture of two 50-dimensional normal distributions. Then we extend the iid sampling method for fixed-dimensional distributions to variable-dimensional situations and illustrate with a variable-dimensional normal mixture modeling of the well-known "acidity data", with further demonstration of the applicability of the iid sampling method developed for multimodal distributions.

We propose a novel methodology for drawing iid realizations from any target distribution on the Euclidean space of arbitrary dimension. No assumption of compact support is necessary for the validity of our theory and method. Our idea is to construct an appropriate infinite sequence of concentric closed ellipsoids, represent the target distribution as an infinite mixture on the central ellipsoid and the ellipsoidal annuli, and construct efficient perfect samplers for the mixture components. In contrast with most of the existing work on perfect sampling, ours is not only a theoretically valid method; it is practically applicable to all target distributions on Euclidean space of any dimension and very much amenable to parallel computation. We validate the practicality and usefulness of our methodology by generating 10,000 iid realizations from standard distributions such as the normal, Student's t with 5 degrees of freedom and Cauchy, for dimensions d = 1, 5, 10, 50, 100, as well as from a 50-dimensional normal mixture distribution. The implementation time is very reasonable in all cases, often less than a minute in our parallel implementation, and the results turned out to be highly accurate. We also apply our method to draw 10,000 iid realizations from the posterior distributions associated with the well-known Challenger data, a Salmonella dataset and the challenging 160-dimensional spatial example of the radionuclide count data on Rongelap Island. Again, we are able to obtain quite encouraging results with very reasonable computing time.
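The mixture-over-annuli idea can be illustrated in a toy way for a two-dimensional standard normal: spheres rather than general ellipsoids, a finite truncation of the infinite mixture, and plain rejection sampling instead of the paper's perfect samplers. The radii and sample counts below are assumptions.

```python
# Represent a 2-D standard normal as a mixture over concentric annuli, pick an
# annulus by its probability, and rejection-sample within it.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(4)
radii = np.linspace(0.0, 6.0, 25)                 # annulus boundaries (tail truncated)
# weight of each annulus under the target: P(r1 < ||X|| <= r2), ||X||^2 ~ chi2(2)
weights = np.diff(chi2.cdf(radii**2, df=2))
weights /= weights.sum()

def sample_annulus(r1, r2):
    """Rejection-sample the standard normal restricted to r1 < ||x|| <= r2."""
    log_M = -0.5 * r1**2                          # density bound, attained at inner radius
    while True:
        r = np.sqrt(rng.uniform(r1**2, r2**2))    # uniform point in the annulus
        phi = rng.uniform(0.0, 2.0 * np.pi)
        if np.log(rng.uniform()) < -0.5 * r**2 - log_M:
            return r * np.array([np.cos(phi), np.sin(phi)])

n = 10000
ks = rng.choice(len(weights), size=n, p=weights)  # pick annuli by mixture weight
samples = np.array([sample_annulus(radii[k], radii[k + 1]) for k in ks])
print(samples.mean(axis=0), samples.std(axis=0))  # should be near (0, 0) and (1, 1)
```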

The dominating NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications (e.g., sentiment classification, span-prediction based question answering or machine translation). However, it builds upon the assumption that the data distribution is stationary, i.e., that the data is sampled from a fixed distribution both at training and test time. This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information. Moreover, it is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime. The first goal of this thesis is to characterize the different forms this shift can take in the context of natural language processing, and to propose benchmarks and evaluation metrics to measure its effect on current deep learning architectures. We then proceed to take steps to mitigate the effect of distributional shift on NLP models. To this end, we develop methods based on parametric reformulations of the distributionally robust optimization framework. Empirically, we demonstrate that these approaches yield more robust models on a selection of realistic problems. In the third and final part of this thesis, we explore ways of efficiently adapting existing models to new domains or tasks. Our contribution to this topic takes inspiration from information geometry to derive a new gradient update rule which alleviates catastrophic forgetting issues during adaptation.
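As a generic illustration of distributionally robust optimization by group reweighting (not the parametric reformulation developed in the thesis), the sketch below alternates an exponentiated-gradient update on group weights with gradient steps on a toy logistic model; the data, groups, and hyperparameters are assumptions.

```python
# Group-reweighted DRO on a toy logistic-regression problem: the adversary
# up-weights the group with the largest current loss before each model update.
import numpy as np

rng = np.random.default_rng(5)
# Toy data: two "domains" with different input/label relationships
X = [rng.normal(size=(200, 3)), rng.normal(size=(40, 3)) + 1.0]
y = [(X[0] @ np.array([1.0, -1.0, 0.5]) > 0).astype(float),
     (X[1] @ np.array([-0.5, 1.0, 1.0]) > 0).astype(float)]

w = np.zeros(3)                      # logistic-regression weights
q = np.ones(2) / 2                   # distribution over groups (the adversary)
eta_w, eta_q = 0.1, 0.5

def group_loss(w, Xg, yg):
    p = 1.0 / (1.0 + np.exp(-Xg @ w))
    return -np.mean(yg * np.log(p + 1e-9) + (1 - yg) * np.log(1 - p + 1e-9))

for step in range(500):
    losses = np.array([group_loss(w, X[g], y[g]) for g in range(2)])
    q = q * np.exp(eta_q * losses)   # up-weight the worst-performing group
    q /= q.sum()
    # gradient of the q-weighted cross-entropy loss with respect to w
    grad = sum(q[g] * (X[g].T @ ((1 / (1 + np.exp(-X[g] @ w))) - y[g])) / len(y[g])
               for g in range(2))
    w -= eta_w * grad

print("final per-group losses:", [round(group_loss(w, X[g], y[g]), 3) for g in range(2)])
```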

Discrete random structures are important tools in Bayesian nonparametrics, and the resulting models have proven effective in density estimation, clustering, topic modeling and prediction, among others. In this paper, we consider nested processes and study the dependence structures they induce. Dependence ranges between homogeneity, corresponding to full exchangeability, and maximum heterogeneity, corresponding to (unconditional) independence across samples. The popular nested Dirichlet process is shown to degenerate to the fully exchangeable case when there are ties across samples at the observed or latent level. To overcome this drawback, which is inherent to nesting general discrete random measures, we introduce a novel class of latent nested processes. These are obtained by adding common and group-specific completely random measures and then normalizing to yield dependent random probability measures. We provide results on the partition distributions induced by latent nested processes, and develop a Markov chain Monte Carlo sampler for Bayesian inference. A test for distributional homogeneity across groups is obtained as a by-product. The results and their inferential implications are showcased on synthetic and real data.
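A crude sketch of the additive construction: a common and a group-specific completely random measure (finite-atom gamma approximations here) are added and normalized to give dependent random probability measures for two groups. The atom counts, base measure, and total masses are illustrative assumptions.

```python
# Add a shared gamma CRM to group-specific gamma CRMs and normalize; shared
# atoms induce dependence (and possible ties) across the two groups.
import numpy as np

rng = np.random.default_rng(6)

def gamma_crm(n_atoms, total_mass):
    """Finite-atom approximation of a gamma CRM with a N(0, 1) base measure."""
    locations = rng.normal(size=n_atoms)
    jumps = rng.gamma(shape=total_mass / n_atoms, scale=1.0, size=n_atoms)
    return locations, jumps

common = gamma_crm(200, total_mass=1.0)                       # shared across groups
specific = [gamma_crm(200, total_mass=1.0) for _ in range(2)] # group-specific parts

probs = []
for g in range(2):
    locs = np.concatenate([common[0], specific[g][0]])
    jumps = np.concatenate([common[1], specific[g][1]])
    probs.append((locs, jumps / jumps.sum()))      # normalize: random probability measure

# Draws from each group's measure; ties across groups can occur via shared atoms.
draws = [rng.choice(probs[g][0], size=5, p=probs[g][1]) for g in range(2)]
print(draws)
```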
