The construction of coherent prediction models holds great importance in medical research as such models enable health researchers to gain deeper insights into disease epidemiology and clinicians to identify patients at higher risk of adverse outcomes. One commonly employed approach to developing prediction models is variable selection through penalized regression techniques. Integrating natural variable structures into this process not only enhances model interpretability but can also increase the likelihood of recovering the true underlying model and boost prediction accuracy. However, a challenge lies in determining how to effectively integrate potentially complex selection dependencies into the penalized regression. In this work, we demonstrate how to represent selection dependencies mathematically, provide algorithms for deriving the complete set of potential models, and offer a structured approach for integrating complex rules into variable selection through the latent overlapping group Lasso. To illustrate our methodology, we applied these techniques to construct a coherent prediction model for major bleeding in hypertensive patients recently hospitalized for atrial fibrillation and subsequently prescribed oral anticoagulants. In this application, we account for a proxy of anticoagulant adherence and its interaction with dosage and the type of oral anticoagulants in addition to drug-drug interactions.
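As a small illustration of what "deriving the complete set of potential models" from selection dependencies can mean, the sketch below enumerates every subset of candidate variables by brute force and keeps those that satisfy rules of the form "a term may be selected only if its prerequisites are selected". The variable names and the helper `permissible_models` are hypothetical, and brute-force enumeration is an assumption made for brevity; it is not the algorithm proposed in the paper.

```python
from itertools import combinations

def permissible_models(variables, requires):
    """List every subset of `variables` that respects the selection rules.

    `requires` maps a variable to the variables that must also be selected
    whenever it is selected (for example, an interaction and its main effects).
    """
    models = []
    for size in range(len(variables) + 1):
        for subset in combinations(variables, size):
            chosen = set(subset)
            # Keep the subset only if every chosen variable has all of its
            # prerequisites chosen as well.
            if all(set(requires.get(v, ())) <= chosen for v in chosen):
                models.append(subset)
    return models

# Hypothetical rule: the interaction may enter only with both main effects.
variables = ["adherence", "dose", "adherence:dose"]
requires = {"adherence:dose": ["adherence", "dose"]}
models = permissible_models(variables, requires)
for model in models:
    print(model)
```

Under this single rule, three of the eight possible subsets are excluded because they contain the interaction without both main effects. In a latent overlapping group Lasso the selected support is a union of groups, so such a list can serve as a check that a proposed grouping produces only permissible models.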
Prediction models are increasingly proposed for guiding treatment decisions, but most fail to address the special role of treatments, leading to inappropriate use. This paper highlights the limitations of using standard prediction models for treatment decision support. We identify 'causal blind spots' in three common approaches to handling treatments in prediction modelling and illustrate potential harmful consequences in several medical applications. We advocate for an extension of guidelines for development, reporting, clinical evaluation and monitoring of prediction models to ensure that the intended use of the model is matched to an appropriate risk estimand. For decision support this requires a shift towards developing predictions under the specific treatment options under consideration ('predictions under interventions'). We argue that this will improve the efficacy of prediction models in guiding treatment decisions and prevent potential negative effects on patient outcomes.
Recent technological advancements in artificial intelligence and computer vision have enabled gait analysis on portable devices such as cell phones. However, most state-of-the-art vision-based systems still impose numerous constraints for capturing a patient's video, such as using a static camera and maintaining a specific distance from it. While these constraints are manageable under professional observation, they pose challenges in home settings. Another issue with most vision-based systems is their output, typically a classification label and confidence value, whose reliability is often questioned by medical professionals. This paper addresses these challenges by presenting a novel system for gait analysis robust to camera movements and providing explanations for its output. The study utilizes a dataset comprising videos of subjects wearing two types of Knee Ankle Foot Orthosis (KAFO), namely "Locked Knee" and "Semi-flexion," for mobility, along with metadata and ground truth for explanations. The ground truth highlights the statistical significance of seven features captured using motion capture systems to differentiate between the two gaits. To address camera movement challenges, the proposed system employs super-resolution and pose estimation during pre-processing. It then identifies the seven features - Stride Length, Step Length and Duration of single support of orthotic and non-orthotic leg, Cadence, and Speed - using the skeletal output of pose estimation. These features train a multi-layer perceptron, with its output explained by highlighting the features' contribution to classification. While most state-of-the-art systems struggle with processing the video or training on the proposed dataset, our system achieves an average accuracy of 94%. The model's explainability is validated using ground truth and can be considered reliable.
Operator-based neural network architectures such as DeepONets have emerged as a promising tool for the surrogate modeling of physical systems. For operator surrogate modeling, the training data are typically generated by solving the PDEs with techniques such as the Finite Element Method (FEM). The computationally intensive nature of data generation is one of the biggest bottlenecks in deploying these surrogate models for practical applications. In this study, we propose a novel methodology to alleviate the computational burden associated with training data generation for DeepONets. Unlike existing literature, the proposed framework for data generation does not use any partial differential equation integration strategy, thereby significantly reducing the computational cost associated with generating the training dataset for DeepONets. In the proposed strategy, first, the output field is generated randomly, satisfying the boundary conditions using Gaussian Process Regression (GPR). From the output field, the input source field can be calculated easily using finite difference techniques. The proposed methodology can be extended to other operator learning methods, making the approach widely applicable. To validate the proposed approach, we employ the heat equation as the model problem and develop the surrogate model for numerous boundary value problems.
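The two-step data generation described above (random output field first, source field second) can be sketched for a one-dimensional steady heat problem, -u''(x) = f(x) on [0, 1] with prescribed boundary values. This is a minimal sketch under stated assumptions: the model problem, the squared-exponential kernel, the length scale, the jitter, and all function names are illustrative choices, not details taken from the paper.

```python
import numpy as np

def rbf_kernel(xa, xb, length_scale):
    """Squared-exponential covariance between two sets of 1D points."""
    diff = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def sample_output_field(x, u_left, u_right, rng, length_scale=0.2, jitter=1e-10):
    """Draw one GP sample on grid x, conditioned on the two boundary values."""
    x_bc = np.array([x[0], x[-1]])
    u_bc = np.array([u_left, u_right])
    k_bb = rbf_kernel(x_bc, x_bc, length_scale) + jitter * np.eye(2)
    k_xb = rbf_kernel(x, x_bc, length_scale)
    # Standard GP regression posterior mean and covariance.
    mean = k_xb @ np.linalg.solve(k_bb, u_bc)
    cov = rbf_kernel(x, x, length_scale) - k_xb @ np.linalg.solve(k_bb, k_xb.T)
    # Sample through an eigendecomposition; clip tiny negative eigenvalues
    # that arise from round-off.
    eigvals, eigvecs = np.linalg.eigh((cov + cov.T) / 2)
    scale = np.sqrt(np.clip(eigvals, 0.0, None))
    return mean + eigvecs @ (scale * rng.standard_normal(len(x)))

def source_from_output(u, h):
    """Interior source f = -u'' from second-order central differences."""
    return -(u[2:] - 2.0 * u[1:-1] + u[:-2]) / h**2

def generate_pair(n_points=51, u_left=0.0, u_right=1.0, seed=0):
    """Return the grid, one random output field, and its matching source."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)
    u = sample_output_field(x, u_left, u_right, rng)
    f = source_from_output(u, x[1] - x[0])
    return x, u, f

x, u, f = generate_pair()
print(u[0], u[-1], f.shape)
```

No PDE solve appears anywhere in this pipeline: each (f, u) training pair costs one Gaussian sample and one finite-difference stencil. The boundary values are met up to the small jitter used to stabilise the conditioning.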
In the past four decades, research on count time series has made significant progress, but research on $\mathbb{Z}$-valued time series is relatively rare. Existing $\mathbb{Z}$-valued models are mainly of autoregressive structure, where the use of the rounding operator is very natural. Because of the discontinuity of the rounding operator, the formulation of the corresponding model identifiability conditions and the computation of parameter estimators need special attention. It is also difficult to derive closed-form formulae for crucial stochastic properties. We rediscover a stochastic rounding operator, referred to as mean-preserving rounding, which overcomes the above drawbacks. Then, a novel class of $\mathbb{Z}$-valued ARMA models based on the new operator is proposed, and the existence of stationary solutions of the models is established. Stochastic properties including closed-form formulae for (conditional) moments, autocorrelation function, and conditional distributions are obtained. The advantages of our novel model class compared to existing ones are demonstrated. In particular, our model construction avoids identifiability issues such that maximum likelihood estimation is possible. A simulation study is provided, and the appealing performance of the new models is shown by several real-world data sets.
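To make the idea of a mean-preserving stochastic rounding operator concrete, here is a minimal sketch. It assumes the usual stochastic-rounding construction, in which a real number x is rounded up to floor(x) + 1 with probability equal to its fractional part and down to floor(x) otherwise, so that the expected value of the result is x. Whether this matches the paper's exact definition is an assumption, and the function name is hypothetical.

```python
import math
import random

def mean_preserving_round(x, rng=random):
    """Round x to floor(x) or floor(x) + 1 so that the expected result equals x."""
    lower = math.floor(x)
    frac = x - lower  # fractional part, in [0, 1)
    # Round up with probability `frac`, down otherwise.
    return lower + (1 if rng.random() < frac else 0)

rng = random.Random(0)
draws = [mean_preserving_round(2.3, rng) for _ in range(100_000)]
print(sum(draws) / len(draws))  # empirical mean, close to 2.3
```

Unlike deterministic rounding, this operator leaves integers unchanged with probability one and has an expectation that is linear in x, which is the property the abstract credits with yielding closed-form moment formulae.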
While score-based generative models (SGMs) have achieved remarkable success in a wide range of image generation tasks, their mathematical foundations remain limited. In this paper, we analyze the approximation and generalization of SGMs in learning a family of sub-Gaussian probability distributions. We introduce a notion of complexity for probability distributions in terms of their relative density with respect to the standard Gaussian measure. We prove that if the log-relative density can be locally approximated by a neural network whose parameters can be suitably bounded, then the distribution generated by empirical score matching approximates the target distribution in total variation with a dimension-independent rate. We illustrate our theory through examples, which include certain mixtures of Gaussians. An essential ingredient of our proof is to derive a dimension-free deep neural network approximation rate for the true score function associated with the forward process, which is interesting in its own right.
Dose-finding trials are a key component of the drug development process and rely on a statistical design to help inform dosing decisions. Triallists wishing to choose a design require knowledge of operating characteristics of competing methods. This is often assessed using a large-scale simulation study with multiple designs and configurations investigated, which can be time-consuming and therefore limits the scope of the simulation. We introduce a new approach to the design of simulation studies of dose-finding trials. The approach simulates all potential outcomes that individuals could experience at each dose level in the trial. Datasets are simulated in advance and then the same datasets are applied to each of the competing methods to enable a more efficient head-to-head comparison. In two case-studies we show sizeable reductions in Monte Carlo error for comparing a performance metric between two competing designs. Efficiency gains depend on the similarity of the designs. Comparing two Phase I/II design variants, with high correlation of recommending the same optimal biologic dose, we show that the new approach requires a simulation study that is approximately 30 times smaller than the conventional approach. Furthermore, advance-simulated trial datasets can be reused to assess the performance of designs across multiple configurations. We recommend researchers consider this more efficient simulation approach in their dose-finding studies and we have updated the R package escalation to help facilitate implementation.
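The idea of simulating all potential outcomes in advance and feeding the same datasets to each competing design can be sketched as follows. This is an illustrative toy, not the implementation in the R package escalation: the latent-tolerance mechanism, the two escalation rules, the toxicity probabilities, and all function names are assumptions made for the example.

```python
import random

def simulate_potential_outcomes(true_tox_probs, n_patients, seed):
    """One row per patient: the toxicity outcome (0/1) they would have at each dose."""
    rng = random.Random(seed)
    table = []
    for _ in range(n_patients):
        tolerance = rng.random()  # latent patient-level threshold
        # The patient has a toxicity at every dose whose probability exceeds
        # their threshold, so outcomes are monotone in dose.
        table.append([1 if tolerance < p else 0 for p in true_tox_probs])
    return table

def run_design(table, escalate_rule, cohort_size=3):
    """Run a simple rule-based design on a pre-simulated table; return the final dose index."""
    dose, n_doses = 0, len(table[0])
    for start in range(0, len(table), cohort_size):
        cohort = table[start:start + cohort_size]
        n_tox = sum(row[dose] for row in cohort)  # look up outcomes at the assigned dose
        dose = min(max(dose + escalate_rule(n_tox), 0), n_doses - 1)
    return dose

# Two hypothetical designs that differ only in their escalation rule.
cautious = lambda n_tox: 1 if n_tox == 0 else (-1 if n_tox >= 2 else 0)
bold = lambda n_tox: 1 if n_tox <= 1 else -1

table = simulate_potential_outcomes([0.10, 0.20, 0.35, 0.50], n_patients=30, seed=1)
dose_cautious = run_design(table, cautious)
dose_bold = run_design(table, bold)
print(dose_cautious, dose_bold)
```

Because both designs read from the same table, any difference in their recommendations comes from the designs themselves and not from independent sampling noise, which is the mechanism behind the reduced Monte Carlo error for head-to-head comparisons.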
Adaptive enrichment allows for pre-defined patient subgroups of interest to be investigated throughout the course of a clinical trial. Many trials that measure a long-term time-to-event endpoint also routinely collect repeated measures on biomarkers which may be predictive of the primary endpoint. Although these data may not be leveraged directly to support subgroup selection decisions and early stopping decisions, we aim to make greater use of these data to increase efficiency and improve interim decision making. In this work, we present a joint model for longitudinal and time-to-event data and two methods for creating standardised statistics based on this joint model. We can use the estimates to define enrichment rules and efficacy and futility early stopping rules for a flexible efficient clinical trial with possible enrichment. Under this framework, we show asymptotically that the familywise error rate is protected in the strong sense. To assess the results, we consider a trial for the treatment of metastatic breast cancer where repeated ctDNA measurements are available and the subgroup criteria are defined by patients' ER and HER2 status. Using simulation, we show that incorporating biomarker information leads to accurate subgroup identification and increases in power.
We propose a novel algorithm for the support estimation of partially known Gaussian graphical models that incorporates prior information about the underlying graph. In contrast to classical approaches that provide a point estimate based on a maximum likelihood or a maximum a posteriori criterion using (simple) priors on the precision matrix, we consider a prior on the graph and rely on annealed Langevin diffusion to generate samples from the posterior distribution. Since the Langevin sampler requires access to the score function of the underlying graph prior, we use graph neural networks to effectively estimate the score from a graph dataset (either available beforehand or generated from a known distribution). Numerical experiments demonstrate the benefits of our approach.
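For readers unfamiliar with annealed Langevin diffusion, the toy sketch below shows the sampling loop on a one-dimensional Gaussian target whose noise-perturbed score is known in closed form. It is only an illustration of the sampler: the paper's setting (a posterior over graphs, with the prior score estimated by a graph neural network) is replaced here by an analytic score, and the target, noise schedule, step-size rule, and function names are all assumptions.

```python
import math
import random

def perturbed_score(x, sigma, mu=2.0, s=0.5):
    """Score of N(mu, s^2) convolved with Gaussian noise of scale sigma."""
    return -(x - mu) / (s**2 + sigma**2)

def annealed_langevin(score, sigmas, steps_per_level=20, eps=1e-3, rng=random):
    """Run Langevin updates while lowering the noise level from sigmas[0] to sigmas[-1]."""
    x = rng.gauss(0.0, sigmas[0])  # start from a broad distribution
    for sigma in sigmas:
        # The step size shrinks together with the noise level.
        alpha = eps * sigma**2 / sigmas[-1] ** 2
        for _ in range(steps_per_level):
            x += 0.5 * alpha * score(x, sigma) + math.sqrt(alpha) * rng.gauss(0.0, 1.0)
    return x

# Geometric noise schedule from 3.0 down to 0.05.
sigmas = [3.0 * (0.05 / 3.0) ** (i / 9) for i in range(10)]
rng = random.Random(0)
samples = [annealed_langevin(perturbed_score, sigmas, rng=rng) for _ in range(500)]
mean = sum(samples) / len(samples)
var = sum((v - mean) ** 2 for v in samples) / len(samples)
print(mean, var)
```

Large noise levels let the chain move quickly towards the bulk of the distribution, and small noise levels refine the samples. With the short runs used here the samples only approximately follow the target, so the printed variance is somewhat larger than the target variance of 0.25.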
Mendelian randomization uses genetic variants as instrumental variables to make causal inferences about the effects of modifiable risk factors on diseases from observational data. One of the major challenges in Mendelian randomization is that many genetic variants are only modestly or even weakly associated with the risk factor of interest, a setting known as many weak instruments. Many existing methods, such as the popular inverse-variance weighted (IVW) method, could be biased when the instrument strength is weak. To address this issue, the debiased IVW (dIVW) estimator, which is shown to be robust to many weak instruments, was recently proposed. However, this estimator still has non-ignorable bias when the effective sample size is small. In this paper, we propose a modified debiased IVW (mdIVW) estimator obtained by multiplying the original dIVW estimator by a modification factor. After this simple correction, we show that the bias of the mdIVW estimator converges to zero at a faster rate than that of the dIVW estimator under some regularity conditions. Moreover, the mdIVW estimator has smaller variance than the dIVW estimator. We further extend the proposed method to account for the presence of instrumental variable selection and balanced horizontal pleiotropy. We demonstrate the improvement of the mdIVW estimator over the dIVW estimator through extensive simulation studies and real data analysis.
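For context, the two baseline estimators named above can be written in a few lines from two-sample summary statistics. The sketch uses the commonly stated forms: IVW is a weighted ratio of variant-outcome to variant-exposure associations, and dIVW subtracts the squared exposure standard error from each squared exposure effect in the denominator. The modification factor that defines mdIVW is not given in the abstract and is deliberately not reproduced. Function names, argument names, and the input numbers are hypothetical.

```python
def ivw_estimate(exposure_effects, outcome_effects, se_outcome):
    """Inverse-variance weighted estimate of the causal effect."""
    num = sum(g * G / sy**2
              for g, G, sy in zip(exposure_effects, outcome_effects, se_outcome))
    den = sum(g**2 / sy**2 for g, sy in zip(exposure_effects, se_outcome))
    return num / den

def divw_estimate(exposure_effects, outcome_effects, se_exposure, se_outcome):
    """Debiased IVW: remove the measurement-error variance from the denominator."""
    num = sum(g * G / sy**2
              for g, G, sy in zip(exposure_effects, outcome_effects, se_outcome))
    den = sum((g**2 - sx**2) / sy**2
              for g, sx, sy in zip(exposure_effects, se_exposure, se_outcome))
    return num / den

# Made-up summary statistics for two variants, for illustration only.
exposure_effects = [0.1, 0.2]
outcome_effects = [0.05, 0.1]
se_exposure = [0.05, 0.05]
se_outcome = [0.1, 0.1]

beta_ivw = ivw_estimate(exposure_effects, outcome_effects, se_outcome)
beta_divw = divw_estimate(exposure_effects, outcome_effects, se_exposure, se_outcome)
print(beta_ivw, beta_divw)
```

The squared estimated exposure effects overstate the squared true effects by roughly the squared standard error, which inflates the IVW denominator and pulls the estimate towards zero when instruments are weak. The dIVW denominator removes that inflation, so in this example the dIVW estimate is larger in magnitude than the IVW estimate.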
Finite sample inference for Cox models is an important problem in many settings, such as clinical trials. Bayesian procedures provide a means for finite sample inference and incorporation of prior information if MCMC algorithms and posteriors are well behaved. On the other hand, estimation procedures should also retain inferential properties in high dimensional settings. In addition, estimation procedures should be able to incorporate constraints and multilevel modeling such as cure models and frailty models in a straightforward manner. In order to tackle these modeling challenges, we propose a uniformly ergodic Gibbs sampler for a broad class of convex set constrained multilevel Cox models. We develop two key strategies. First, we exploit a connection between Cox models and negative binomial processes through the Poisson process to reduce Bayesian computation to iterative Gaussian sampling. Next, we appeal to sufficient dimension reduction to address the difficult computation of nonparametric baseline hazards, allowing for the collapse of the Markov transition operator within the Gibbs sampler based on sufficient statistics. We demonstrate our approach using open source data and simulations.