The Cox model is an indispensable tool for time-to-event analysis, particularly in biomedical research. However, medicine is undergoing a profound transformation, generating data at an unprecedented scale, which opens new frontiers for studying and understanding diseases. With the wealth of data collected, new challenges for statistical inference arise, as datasets are often high dimensional, exhibit an increasing number of measurements at irregularly spaced time points, and are simply too large to fit in memory. Many current implementations for time-to-event analysis are ill-suited for these problems, as inference is computationally demanding and requires access to the full data at once. Here we propose a Bayesian version of the counting process representation of Cox's partial likelihood for efficient inference on large-scale datasets with millions of data points and thousands of time-dependent covariates. Through the combination of stochastic variational inference and a reweighting of the log-likelihood, we obtain an approximation of the posterior distribution that factorizes over subsamples of the data, enabling analysis in big-data settings. Crucially, the method produces viable uncertainty estimates for large-scale and high-dimensional datasets. We show the utility of our method through a simulation study and an application to myocardial infarction in the UK Biobank.
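As a rough illustration of the approach, the sketch below runs stochastic variational inference with a mean-field Gaussian approximation on a Cox-type partial likelihood, reweighting each subsample's log-likelihood by $n/b$ so that its expectation matches the full-data objective. The simulated data, the within-subsample risk sets (a simplification of the counting process construction), and all hyperparameters are illustrative assumptions rather than the exact implementation described above.

    # Sketch: SVI for a Cox-type partial likelihood with subsample reweighting.
    # Assumptions: simulated data, Breslow-style risk sets restricted to the
    # subsample, mean-field Gaussian q(beta), and a naive n/b reweighting.
    import torch

    torch.manual_seed(0)
    n, p, b = 5000, 20, 256                                # subjects, covariates, batch size
    X = torch.randn(n, p)
    beta_true = torch.randn(p) * 0.3
    t = torch.distributions.Exponential(torch.exp(X @ beta_true)).sample()
    event = (torch.rand(n) < 0.7).float()                  # crude ~30% censoring indicator

    mu = torch.zeros(p, requires_grad=True)                # variational mean
    log_sd = torch.full((p,), -2.0, requires_grad=True)    # variational log std
    opt = torch.optim.Adam([mu, log_sd], lr=1e-2)

    def cox_loglik(beta, Xb, tb, db):
        # sum over events of [x_i' beta - log sum_{j: t_j >= t_i} exp(x_j' beta)]
        eta = Xb @ beta
        at_risk = (tb[None, :] >= tb[:, None]).float()     # within-subsample risk sets
        log_denom = torch.logsumexp(eta[None, :] + torch.log(at_risk + 1e-12), dim=1)
        return (db * (eta - log_denom)).sum()

    for step in range(2000):
        idx = torch.randint(0, n, (b,))
        beta = mu + torch.exp(log_sd) * torch.randn(p)                 # reparameterized draw
        ll = (n / b) * cox_loglik(beta, X[idx], t[idx], event[idx])    # reweighted subsample
        kl = 0.5 * (torch.exp(2 * log_sd) + mu**2 - 1 - 2 * log_sd).sum()  # KL to N(0, I) prior
        loss = -(ll - kl)
        opt.zero_grad(); loss.backward(); opt.step()

    print(torch.stack([mu.detach(), beta_true], dim=1)[:5])  # posterior mean vs. truth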
Measuring and evaluating network resilience has become increasingly important, since networks are vulnerable to both uncertain disturbances and malicious attacks. Networked systems are often composed of many dynamic components and change over time, which makes it difficult for existing methods to assess the evolving state of network resilience. This paper establishes a novel quantitative framework for evaluating network resilience using the Dynamic Bayesian Network. The proposed framework can be used to evaluate a network's multi-stage resilience processes as it suffers various attacks and recoveries. First, we define the dynamic capacities of network components and establish the network's five core resilience capabilities to describe the resilient networking stages of preparation, resistance, adaptation, recovery, and evolution; the five core capabilities are rapid response, sustained resistance, continuous running, rapid convergence, and dynamic evolution. Then, we employ a two-time-slice approach based on the Dynamic Bayesian Network to quantify five crucial performance measures of network resilience based on the core capabilities proposed above. The proposed approach ensures the time continuity of resilience evaluation in time-varying networks. Finally, our evaluation framework is applied to different attack and recovery conditions in representative simulations and a real-world network topology. Results and comparisons with extant studies indicate that the proposed method achieves a more accurate and comprehensive evaluation and can be applied to network scenarios under various attack and recovery intensities.
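To make the two-time-slice idea concrete, the sketch below propagates a component's state distribution through alternating attack and recovery phases with a simple transition model; the states, transition probabilities, and performance weighting are hypothetical illustrations, not the framework's actual capability definitions.

    # Sketch of a two-time-slice DBN update: P(state_{t+1}) = P(state_t) @ T_phase.
    # All states, transition matrices, and the performance metric are hypothetical.
    import numpy as np

    belief = np.array([1.0, 0.0, 0.0])            # P(operational, degraded, failed) at t = 0

    T_attack = np.array([[0.70, 0.25, 0.05],      # transition model while under attack
                         [0.00, 0.60, 0.40],
                         [0.00, 0.00, 1.00]])
    T_recover = np.array([[1.00, 0.00, 0.00],     # transition model during recovery
                          [0.50, 0.45, 0.05],
                          [0.10, 0.40, 0.50]])

    timeline = ["attack"] * 5 + ["recover"] * 10  # multi-stage scenario
    performance = []
    for phase in timeline:
        T = T_attack if phase == "attack" else T_recover
        belief = belief @ T                       # two-time-slice propagation
        performance.append(belief[0] + 0.5 * belief[1])  # degraded state counted at 50%

    print(np.round(performance, 3))               # resilience trajectory over time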
We use autoregressive hidden Markov models and a time-frequency approach to create meaningful quantitative descriptions of the behavioral characteristics of cerebellar ataxias from wearable inertial sensor data gathered during movement. Wearable sensor data are relatively easy to collect and provide direct measurements of movement that can be used to develop useful behavioral biomarkers. Sensitive and specific behavioral biomarkers for neurodegenerative diseases are critical to supporting early detection, drug development efforts, and targeted treatments. We create a flexible and descriptive set of features derived from accelerometer and gyroscope data collected from wearable sensors while participants perform clinical assessment tasks, and use them to estimate disease status and severity. A short period of data collection ($<$ 5 minutes) yields enough information to separate patients with ataxia from healthy controls with very high accuracy, to separate ataxia from other neurodegenerative diseases such as Parkinson's disease, and to give estimates of disease severity.
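The sketch below illustrates the general pipeline on synthetic signals: band-power (time-frequency) and autoregressive-coefficient features are extracted from raw accelerometer-like data and fed to an off-the-shelf classifier. It is a simplified stand-in for the AR-HMM based features described above, and the simulated tremor signal is purely illustrative.

    # Sketch: time-frequency + autoregressive features from synthetic sensor signals,
    # then a standard classifier; a stand-in for the paper's AR-HMM features.
    import numpy as np
    from scipy.signal import welch
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    fs = 100.0                                     # assumed sensor sampling rate (Hz)

    def features(sig):
        f, pxx = welch(sig, fs=fs, nperseg=256)    # power spectral density
        bands = [(0.5, 3), (3, 8), (8, 20)]        # coarse frequency bands (Hz)
        bp = [pxx[(f >= lo) & (f < hi)].mean() for lo, hi in bands]
        lags = np.column_stack([sig[i:len(sig) - 3 + i] for i in range(3)])
        ar = np.linalg.lstsq(lags, sig[3:], rcond=None)[0]   # order-3 AR coefficients
        return np.concatenate([bp, ar])

    def simulate(label, n=30000):                  # "ataxia" adds low-frequency tremor
        t = np.arange(n) / fs
        return rng.normal(0, 1, n) + (1.5 * np.sin(2 * np.pi * 2.5 * t) if label else 0.0)

    y = np.array([0, 1] * 40)
    X = np.array([features(simulate(lbl)) for lbl in y])
    print(cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5).mean())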
This paper surveys an important class of methods that combine iterative projection methods and variational regularization methods for large-scale inverse problems. Iterative methods such as Krylov subspace methods are invaluable in the numerical linear algebra community and have proved important in solving inverse problems due to their inherent regularizing properties and their ability to handle large-scale problems. Variational regularization describes a broad and important class of methods that are used to obtain reliable solutions to inverse problems, whereby one solves a modified problem that incorporates prior knowledge. Hybrid projection methods combine iterative projection methods with variational regularization techniques in a synergistic way, providing researchers with a powerful computational framework for solving very large inverse problems. Although the idea of a hybrid Krylov method for linear inverse problems goes back to the 1980s, several recent advances on new regularization frameworks and methodologies have made this field ripe for extensions, further analyses, and new applications. In this paper, we provide a practical and accessible introduction to hybrid projection methods in the context of solving large (linear) inverse problems.
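A minimal sketch of the core construction follows: Golub-Kahan bidiagonalization builds a small projected problem, and Tikhonov regularization is applied to that projected problem. The test problem and the fixed regularization parameter are illustrative assumptions; in practice the parameter is usually chosen adaptively on the projected problem.

    # Sketch of a hybrid projection method: project with Golub-Kahan bidiagonalization,
    # then apply Tikhonov regularization to the small projected least-squares problem.
    import numpy as np

    def hybrid_gk_tikhonov(A, b, k=20, lam=1e-2):
        m, n = A.shape
        U = np.zeros((m, k + 1)); V = np.zeros((n, k))
        B = np.zeros((k + 1, k))                          # lower bidiagonal projected matrix
        beta = np.linalg.norm(b); U[:, 0] = b / beta
        v = A.T @ U[:, 0]; alpha = np.linalg.norm(v); V[:, 0] = v / alpha
        B[0, 0] = alpha
        for i in range(k):
            u = A @ V[:, i] - alpha * U[:, i]
            beta_i = np.linalg.norm(u); U[:, i + 1] = u / beta_i
            B[i + 1, i] = beta_i
            if i + 1 < k:
                v = A.T @ U[:, i + 1] - beta_i * V[:, i]
                alpha = np.linalg.norm(v); V[:, i + 1] = v / alpha
                B[i + 1, i + 1] = alpha
        # Tikhonov on the projected problem: min ||B y - beta e1||^2 + lam^2 ||y||^2
        rhs = np.zeros(k + 1); rhs[0] = np.linalg.norm(b)
        y = np.linalg.solve(B.T @ B + lam**2 * np.eye(k), B.T @ rhs)
        return V @ y

    # tiny ill-posed test problem: Gaussian blur with noisy data
    n = 200
    A = np.exp(-0.5 * ((np.arange(n)[:, None] - np.arange(n)[None, :]) / 5.0) ** 2)
    x_true = np.sin(np.linspace(0, 3 * np.pi, n))
    b = A @ x_true + 1e-2 * np.random.default_rng(0).normal(size=n)
    x = hybrid_gk_tikhonov(A, b, k=30, lam=1e-1)
    print("relative error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))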
Causal inference in longitudinal observational health data often requires accurate estimation of treatment effects on time-to-event outcomes in the presence of time-varying covariates. To tackle this sequential treatment effect estimation problem, we have developed a causal dynamic survival (CDS) model that uses the potential outcomes framework with recurrent sub-networks and random-seed ensembles to estimate the difference in survival curves together with its confidence interval. Using simulated survival datasets, the CDS model has shown good causal effect estimation performance across scenarios varying in sample dimension, event rate, confounding, and overlap. However, increasing the sample size does not alleviate the adverse impact of a high level of confounding. In two large clinical cohort studies, our model identified the expected conditional average treatment effect and detected individual effect heterogeneity over time and across patient subgroups. CDS provides individualised absolute treatment effect estimates to improve clinical decisions.
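The sketch below conveys the ensemble idea on toy data rather than the authors' architecture: small GRU sub-networks, differing only in their random seed, predict discrete-time hazards from covariate history plus a treatment indicator; differencing the implied survival curves under treated versus untreated inputs gives an effect estimate, and the seed-to-seed spread gives a crude uncertainty band.

    # Toy sketch of a random-seed ensemble of recurrent hazard models (not the CDS model).
    import torch, torch.nn as nn

    T, p, n = 10, 5, 400                                 # time steps, covariates, patients
    g = torch.Generator().manual_seed(0)
    X = torch.randn(n, T, p, generator=g)
    A = (torch.rand(n, 1, 1, generator=g) < 0.5).float().expand(n, T, 1)
    hazard = torch.sigmoid(-2.0 + 0.8 * X[..., 0] - 1.0 * A[..., 0])   # treatment lowers hazard
    event = (torch.rand(n, T, generator=g) < hazard).float()           # toy: ignores risk-set removal

    class HazardGRU(nn.Module):
        def __init__(self):
            super().__init__()
            self.gru = nn.GRU(p + 1, 16, batch_first=True)
            self.head = nn.Linear(16, 1)
        def forward(self, x, a):
            h, _ = self.gru(torch.cat([x, a], dim=-1))
            return torch.sigmoid(self.head(h)).squeeze(-1)             # per-interval hazard

    def train_member(seed):
        torch.manual_seed(seed)                                        # only the seed differs
        net = HazardGRU()
        opt = torch.optim.Adam(net.parameters(), lr=1e-2)
        for _ in range(300):
            loss = nn.functional.binary_cross_entropy(net(X, A), event)
            opt.zero_grad(); loss.backward(); opt.step()
        return net

    def survival_curve(net, a_value):                                  # population-average S(t)
        haz = net(X, torch.full((n, T, 1), a_value)).detach()
        return torch.cumprod(1 - haz, dim=1).mean(dim=0)

    effects = torch.stack([survival_curve(m, 1.0) - survival_curve(m, 0.0)
                           for m in (train_member(s) for s in range(5))])
    print("estimated effect on S(t):", effects.mean(0))
    print("ensemble 95% band half-width:", 1.96 * effects.std(0))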
We propose a residual randomization procedure designed for robust Lasso-based inference in the high-dimensional setting. Compared to earlier work that focuses on sub-Gaussian errors, the proposed procedure is designed to work robustly in settings that also include heavy-tailed covariates and errors. Moreover, our procedure can be valid under clustered errors, which is important in practice but has been largely overlooked by earlier work. Through extensive simulations, we illustrate our method's wider range of applicability, as suggested by theory. In particular, we show that our method outperforms state-of-the-art methods in challenging, yet more realistic, settings where the distribution of covariates is heavy-tailed or the sample size is small, while it remains competitive in standard, "well-behaved" settings previously studied in the literature.
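The following is an illustrative simplification of residual randomization for a single coefficient, not the exact procedure proposed here: residuals from a restricted Lasso fit are sign-flipped to build a randomization distribution for the coefficient of interest, with heavy-tailed covariates and errors mimicking the challenging setting.

    # Simplified residual-randomization test of H0: beta_j = 0 for a Lasso coefficient.
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(1)
    n, p, j = 150, 300, 0
    X = rng.standard_t(df=3, size=(n, p))                 # heavy-tailed covariates
    beta = np.zeros(p); beta[:3] = [1.0, 0.5, -0.5]
    y = X @ beta + rng.standard_t(df=3, size=n)           # heavy-tailed errors

    def lasso_coef(Xd, yd, alpha=0.1):
        return Lasso(alpha=alpha, max_iter=10000).fit(Xd, yd).coef_

    obs = lasso_coef(X, y)[j]                             # observed statistic for beta_j
    X_null = np.delete(X, j, axis=1)                      # restricted fit under H0
    fitted_null = X_null @ lasso_coef(X_null, y)
    resid = y - fitted_null

    null_stats = []
    for _ in range(500):
        y_star = fitted_null + rng.choice([-1, 1], size=n) * resid   # sign-flipped residuals
        null_stats.append(lasso_coef(X, y_star)[j])
    pval = np.mean(np.abs(null_stats) >= np.abs(obs))
    print(f"lasso coefficient {obs:.3f}, randomization p-value {pval:.3f}")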
Time-to-event endpoints are increasingly popular in phase II cancer trials. The standard statistical tool for such endpoints in one-armed trials is the one-sample log-rank test. It is widely known that the asymptotics ensuring the validity of this test do not fully take effect for small sample sizes. There have already been some attempts to solve this problem. While some do not allow easy power and sample size calculations, others lack a clear theoretical motivation and require further considerations. The problem itself can partly be attributed to the dependence between the compensated counting process and its variance estimator. We provide a framework in which the variance estimator can be flexibly adapted to the situation at hand while maintaining its asymptotic properties. As an example, we suggest a variance estimator that is uncorrelated with the compensated counting process. Furthermore, we provide sample size and power calculations for any approach fitting into our framework. Finally, we compare several methods via simulation studies and the hypothetical setup of a phase II trial based on real-world data.
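For reference, the sketch below computes the classical one-sample log-rank statistic with the standard variance estimator $V = E$, whose small-sample behaviour motivates the alternatives considered here; the exponential reference hazard and the simulated trial are illustrative.

    # Classical one-sample log-rank test: Z = (O - E) / sqrt(V) with V = E.
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(2)
    n, lam0, lam1, cens = 40, 0.10, 0.06, 0.08       # small phase II-style trial
    T = rng.exponential(1 / lam1, n)                  # true (improved) survival times
    C = rng.exponential(1 / cens, n)                  # censoring times
    time, delta = np.minimum(T, C), (T <= C).astype(float)

    O = delta.sum()                                   # observed events
    E = (lam0 * time).sum()                           # expected events under H0: rate lam0
    Z = (O - E) / np.sqrt(E)                          # compensated counting process / sqrt(V)
    print(f"O = {O:.0f}, E = {E:.1f}, Z = {Z:.2f}, one-sided p = {norm.cdf(Z):.4f}")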
Minimization of a stochastic cost function is commonly used for approximate sampling in high-dimensional Bayesian inverse problems with Gaussian prior distributions and multimodal posterior distributions. The density of the samples generated by minimization is not the desired target density, unless the observation operator is linear, but the distribution of samples is useful as a proposal density for importance sampling or for Markov chain Monte Carlo methods. In this paper, we focus on applications to sampling from multimodal posterior distributions in high dimensions. We first show that sampling from multimodal distributions is improved by computing all critical points instead of only minimizers of the objective function. For applications to high-dimensional geoscience problems, we demonstrate an efficient approximate weighting that uses a low-rank Gauss-Newton approximation of the determinant of the Jacobian. The method is applied to two toy problems with known posterior distributions and a Darcy flow problem with multiple modes in the posterior.
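A toy version of minimization-based sampling is sketched below: each sample minimizes an objective built from a perturbed observation and a prior draw, with multiple starting points probing the two posterior modes of a quadratic forward map. The critical-point enumeration and the low-rank Gauss-Newton determinant weighting described above are not reproduced; the modes are selected without weights.

    # Toy minimization-based sampling for a bimodal posterior (weighting omitted).
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(3)
    G = lambda m: m**2                                 # nonlinear forward map -> bimodal posterior
    d_obs, sig_d, sig_m, m_prior = 1.0, 0.1, 1.0, 0.0

    def sample_by_minimization():
        d_pert = d_obs + sig_d * rng.normal()          # perturbed observation
        m_draw = m_prior + sig_m * rng.normal()        # prior draw
        obj = lambda m: ((G(m[0]) - d_pert) / sig_d)**2 + ((m[0] - m_draw) / sig_m)**2
        cands = [minimize(obj, x0=[x0]).x[0] for x0 in (-2.0, 2.0)]   # probe both modes
        return cands[int(rng.integers(2))]             # unweighted choice between minimizers

    samples = np.array([sample_by_minimization() for _ in range(2000)])
    print("fraction in each mode:", np.mean(samples < 0), np.mean(samples > 0))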
Generative models (GMs) such as Generative Adversarial Networks (GANs) and Variational Auto-Encoders (VAEs) have thrived in recent years and achieved high-quality results in generating new samples. In computer vision especially, GMs have been used for image inpainting, denoising, and completion, which can be treated as inference from observed pixels to corrupted pixels. However, images are hierarchically structured, which is quite different from many real-world inference scenarios with non-hierarchical features. These inference scenarios contain heterogeneous stochastic variables and irregular mutual dependencies. Traditionally, they are modeled with Bayesian Networks (BNs). However, learning and inference in BNs are NP-hard, so the number of stochastic variables a BN can handle is highly constrained. In this paper, we adapt typical GMs to enable heterogeneous learning and inference in polynomial time. We also propose an extended autoregressive (EAR) model and an EAR with adversarial loss (EARA) model and give theoretical results on their effectiveness. Experiments on several BN datasets show that our proposed EAR model achieves the best performance in most cases compared to other GMs. Beyond this black-box analysis, we have also run a series of experiments on Markov border inference with GMs for white-box analysis and give theoretical results.
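As a simple baseline for the autoregressive direction, the sketch below fits the plain factorization $p(x) = \prod_i p(x_i \mid x_{<i})$ over discrete variables with one logistic-regression classifier per variable; it is a stand-in rather than the proposed EAR/EARA networks, and the chain-structured data are synthetic rather than a BN benchmark dataset.

    # Baseline autoregressive factorization over discrete variables (not EAR/EARA).
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(4)
    n, d = 2000, 6
    X = np.zeros((n, d), dtype=int)
    X[:, 0] = rng.integers(0, 2, n)
    for i in range(1, d):                              # synthetic chain-structured data
        X[:, i] = (rng.random(n) < 0.2 + 0.6 * X[:, i - 1]).astype(int)

    train, test = X[:1500], X[1500:]
    models = [LogisticRegression().fit(train[:, :i], train[:, i]) for i in range(1, d)]

    def log_likelihood(x):                             # log p(x) under the AR factorization
        p0 = train[:, 0].mean()
        ll = np.log(p0 if x[0] == 1 else 1 - p0)
        for i, m in enumerate(models, start=1):
            probs = m.predict_proba(x[:i].reshape(1, -1))[0]
            ll += np.log(probs[list(m.classes_).index(x[i])])
        return ll

    print("mean held-out log-likelihood:", np.mean([log_likelihood(x) for x in test]))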
Amortized inference has led to efficient approximate inference for large datasets. The quality of posterior inference is largely determined by two factors: a) the ability of the variational distribution to model the true posterior and b) the capacity of the recognition network to generalize inference over all datapoints. We analyze approximate inference in variational autoencoders in terms of these factors. We find that suboptimal inference is often due to amortizing inference rather than the limited complexity of the approximating distribution. We show that this is due partly to the generator learning to accommodate the choice of approximation. Furthermore, we show that the parameters used to increase the expressiveness of the approximation play a role in generalizing inference rather than simply improving the complexity of the approximation.
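The sketch below shows one way to probe this amortization effect: after training a small VAE, the single-sample ELBO under the encoder's amortized $q(z|x)$ is compared with the ELBO obtained by directly optimizing each datapoint's variational parameters, initialized at the encoder output. The architecture, synthetic data, and step counts are illustrative assumptions.

    # Sketch: measuring the gap between amortized and per-datapoint-optimized ELBOs.
    import torch, torch.nn as nn

    torch.manual_seed(0)
    X = (torch.rand(1000, 20) < torch.rand(20)).float()                 # synthetic binary data
    enc = nn.Sequential(nn.Linear(20, 32), nn.Tanh(), nn.Linear(32, 4)) # -> [mu, log_var], 2-d z
    dec = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 20))

    def elbo(x, mu, log_var):                          # single-sample Monte Carlo ELBO
        z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)
        rec = -nn.functional.binary_cross_entropy_with_logits(dec(z), x, reduction="none").sum(-1)
        kl = 0.5 * (torch.exp(log_var) + mu**2 - 1 - log_var).sum(-1)
        return (rec - kl).mean()

    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
    for _ in range(2000):                              # train with amortized inference
        mu, log_var = enc(X).chunk(2, dim=-1)
        loss = -elbo(X, mu, log_var)
        opt.zero_grad(); loss.backward(); opt.step()

    with torch.no_grad():                              # ELBO under the amortized q(z|x)
        mu0, lv0 = enc(X).chunk(2, dim=-1)
        amortized = elbo(X, mu0, lv0).item()

    mu = mu0.clone().requires_grad_(True)              # per-datapoint variational parameters
    log_var = lv0.clone().requires_grad_(True)
    opt_q = torch.optim.Adam([mu, log_var], lr=1e-2)
    for _ in range(500):                               # decoder parameters are not updated
        loss = -elbo(X, mu, log_var)
        opt_q.zero_grad(); loss.backward(); opt_q.step()
    print(f"amortized ELBO {amortized:.2f} vs optimized ELBO {elbo(X, mu, log_var).item():.2f}")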