The recognition that personalised treatment decisions lead to better clinical outcomes has sparked recent research activity in two domains: policy learning and the estimation of heterogeneous treatment effects. Policy learning focuses on finding optimal treatment rules (OTRs), which express whether an individual would be better off with or without treatment, given their measured characteristics. OTRs optimize a pre-set population criterion, but do not provide insight into the extent to which treatment benefits or harms individual subjects. Estimates of conditional average treatment effects (CATEs) do offer such insights, but valid inference is currently difficult to obtain when data-adaptive methods are used. Moreover, clinicians are (rightly) hesitant to blindly adopt OTR or CATE estimates, not least since both may represent complicated functions of patient characteristics that provide little insight into the key drivers of heterogeneity. To address these limitations, we introduce novel nonparametric treatment effect variable importance measures (TE-VIMs). TE-VIMs extend recent regression-VIMs, viewed as nonparametric analogues to ANOVA statistics. Because they are not tied to a particular model, they are amenable to data-adaptive (machine learning) estimation of the CATE, itself an active area of research. Estimators for the proposed statistics are derived from their efficient influence curves and illustrated through a simulation study and an applied example.
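As a rough illustration of the plug-in idea behind such a TE-VIM (the share of estimated CATE variance left unexplained when a covariate is dropped), the following sketch uses a T-learner CATE estimate and a linear projection; it is not the efficient influence-curve estimator described above, and all data, learners and tuning choices are illustrative assumptions:

```python
# Minimal plug-in sketch of a TE-VIM-style quantity: the share of estimated CATE variance
# not explained once a covariate is dropped. T-learner + linear projection are illustrative
# choices; this is not the efficient influence-curve estimator from the abstract.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, p = 4000, 5
X = rng.normal(size=(n, p))
A = rng.binomial(1, 0.5, size=n)               # randomised treatment
tau = 1.0 + 2.0 * X[:, 0]                      # true CATE depends on X0 only
Y = X[:, 1] + A * tau + rng.normal(size=n)

X_tr, X_te, A_tr, A_te, Y_tr, Y_te = train_test_split(X, A, Y, test_size=0.5, random_state=0)

# T-learner: separate outcome regressions under treatment and control, evaluated on held-out data.
m1 = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr[A_tr == 1], Y_tr[A_tr == 1])
m0 = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr[A_tr == 0], Y_tr[A_tr == 0])
cate_hat = m1.predict(X_te) - m0.predict(X_te)

def te_vim(j):
    """Variance of the estimated CATE left unexplained after dropping covariate j."""
    X_minus_j = np.delete(X_te, j, axis=1)
    proj = LinearRegression().fit(X_minus_j, cate_hat).predict(X_minus_j)
    return np.mean((cate_hat - proj) ** 2) / np.var(cate_hat)

print([round(te_vim(j), 2) for j in range(p)])  # importance should concentrate on X0
```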
Robust inferential methods based on divergence measures have shown an appealing trade-off between efficiency and robustness in many different statistical models. In this paper, minimum density power divergence estimators (MDPDEs) for the scale and shape parameters of the log-logistic distribution are considered. The log-logistic is a versatile distribution for modeling lifetime data, commonly adopted in survival analysis and reliability engineering studies when the hazard rate initially increases and then decreases after some point. Further, it is shown that the classical maximum likelihood estimator (MLE) is included as a particular case of the MDPDE family. Moreover, the corresponding influence function of the MDPDE is obtained, and its boundedness is proved, thus leading to robust estimators. A simulation study is carried out to illustrate the slight loss in efficiency of the MDPDE with respect to the MLE and, besides, the considerable gain in robustness.
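For illustration, the sketch below numerically minimizes the standard density power divergence objective for the log-logistic (Fisk) distribution; the tuning parameters, sample size and contamination level are illustrative assumptions, not the paper's estimators or simulation design:

```python
# Hedged sketch of a minimum density power divergence estimator (MDPDE) for the log-logistic
# (Fisk) distribution, using the standard DPD objective with tuning parameter alpha > 0
# (alpha -> 0 recovers the MLE). Illustrative only; not the paper's implementation.
import numpy as np
from scipy import stats, optimize, integrate

def dpd_objective(params, x, alpha):
    shape, scale = np.exp(params)                       # optimise on the log scale for positivity
    pdf = lambda t: stats.fisk.pdf(t, shape, scale=scale)
    # First term: integral of f^(1+alpha); second term: empirical mean of f^alpha.
    integral, _ = integrate.quad(lambda t: pdf(t) ** (1 + alpha), 0, np.inf)
    return integral - (1 + 1 / alpha) * np.mean(pdf(x) ** alpha)

rng = np.random.default_rng(1)
x = stats.fisk.rvs(2.5, scale=3.0, size=300, random_state=rng)
x[:10] = 100.0                                          # inject gross outliers

for alpha in (0.1, 0.5):
    res = optimize.minimize(dpd_objective, x0=np.log([1.0, 1.0]),
                            args=(x, alpha), method="Nelder-Mead")
    print(alpha, np.exp(res.x))                         # MDPDE shape/scale, little affected by outliers

mle_shape, _, mle_scale = stats.fisk.fit(x, floc=0)     # MLE for comparison (sensitive to outliers)
print("MLE", mle_shape, mle_scale)
```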
Monitoring the correctness of distributed cyber-physical systems is essential. Detecting possible safety violations can be hard when some samples are uncertain or missing. We monitor here a black-box cyber-physical system, with logs that are uncertain in both the state and timestamp dimensions: that is, not only is the logged value known with some uncertainty, but the time at which the log was made is uncertain too. In addition, we make use of an over-approximated yet expressive model, given by a non-linear extension of dynamical systems. Given an offline log, our approach is able to monitor the log against safety specifications with a limited number of false alarms. As a second contribution, we show that our approach can be used online to minimize the number of sample triggers, with the aim of energy efficiency. We apply our approach to three benchmarks: an anesthesia model, an adaptive cruise controller and an aircraft orbiting system.
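As a simplified stand-in for this kind of monitoring (not the paper's reachability-based construction), the sketch below checks an offline log with interval-valued timestamps and values against a safety threshold, using an assumed bound on the signal's rate of change as a crude over-approximation of the dynamics; all numbers and names are illustrative:

```python
# Simplified offline-monitoring sketch under strong assumptions: each log entry carries an
# interval for its timestamp and its value, and the black-box dynamics are over-approximated
# by a bound L on |dv/dt|. This Lipschitz-cone check is only a stand-in for the paper's
# reachability-based analysis.
L_BOUND = 2.0          # assumed bound on |dv/dt| from the over-approximated model
THRESHOLD = 10.0       # safety specification: v(t) < THRESHOLD at all times

# Each log entry: (t_lo, t_hi, v_lo, v_hi) -- uncertain timestamp and uncertain value.
log = [(0.0, 0.1, 3.0, 3.4), (1.0, 1.2, 5.0, 5.6), (2.0, 2.1, 8.5, 9.1), (3.0, 3.3, 7.0, 7.8)]

def possible_violations(log, L, threshold):
    """Return sample gaps over which the over-approximated value tube may cross the threshold."""
    alarms = []
    for (t0l, t0h, v0l, v0h), (t1l, t1h, v1l, v1h) in zip(log, log[1:]):
        gap = max(t1h - t0l, 0.0)                          # largest possible elapsed time
        peak = max(v0h, v1h, (v0h + v1h + L * gap) / 2.0)  # apex of the Lipschitz cone between samples
        if peak >= threshold:
            alarms.append((t0l, t1h, peak))
    return alarms

print(possible_violations(log, L_BOUND, THRESHOLD))        # empty list: no possible violation found
```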
Guided ultrasonic wave based structural health monitoring has been of interest for decades. However, the influence of pre-stress states on the propagation of Lamb waves in thin-walled structures is not yet fully understood. So far, experimental work presented in the literature focuses only on a few individual frequencies, which does not allow a comprehensive verification of the numerous numerical investigations. Furthermore, most work is based on the strain-energy density function by Murnaghan. To validate the common modeling approach and to investigate the suitability of other non-linear strain-energy density functions, an extensive experimental and numerical investigation covering a large frequency range is presented here. The numerical simulation comprises the use of the Neo-Hooke as well as the Murnaghan material model. It is found that these two material models show qualitatively similar results. Furthermore, the comparison with the experimental results reveals that the Neo-Hooke material model reproduces the effect of pre-stress on the difference in the Lamb wave phase velocity very well in most cases. For the $A_0$ wave mode at higher frequencies, however, the sign of this difference is only correctly predicted by the Murnaghan model. In contrast, the Murnaghan material model fails to predict the sign change for the $S_0$ wave mode.
The solution to a stochastic optimal control problem can be determined by computing the value function from a discretisation of the associated Hamilton-Jacobi-Bellman equation. Alternatively, the problem can be reformulated in terms of a pair of forward-backward SDEs, which makes Monte Carlo techniques applicable. More recently, the problem has also been viewed from the perspective of forward and reverse time SDEs and their associated Fokker-Planck equations. This approach is closely related to techniques used in score-based generative models. Forward and reverse time formulations express the value function as the ratio of two probability density functions; one stemming from a forward McKean-Vlasov SDE and another one from a reverse McKean-Vlasov SDE. In this note, we extend this approach to a more general class of stochastic optimal control problems and combine it with ensemble Kalman filter-type and diffusion map approximation techniques in order to obtain efficient and robust particle-based algorithms.
The Horvitz-Thompson (H-T) estimator is widely used for estimating various types of average treatment effects under network interference. We systematically investigate the optimality properties of the H-T estimator under network interference by embedding it in the class of all linear estimators. In particular, we show that, in the presence of any kind of network interference, the H-T estimator is inadmissible in the class of all linear estimators under a completely randomized design and a Bernoulli design. We also show that the H-T estimator becomes admissible under certain restricted randomization schemes termed ``fixed exposure designs''. We give examples of such fixed exposure designs. It is well known that the H-T estimator is unbiased when correct weights are specified. Here, we derive the weights for unbiased estimation of various causal effects, and illustrate how they depend not only on the design but, more importantly, on the assumed form of interference (which in many real-world situations is unknown at the design stage) and the causal effect of interest.
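For a concrete, simplified illustration (not the paper's derivations), the sketch below computes an H-T contrast between two exposure conditions under a Bernoulli design, with exposure probabilities approximated by simulating the design; the exposure mapping, graph and outcome model are assumptions:

```python
# Sketch of a Horvitz-Thompson contrast under network interference with an illustrative
# exposure mapping (treated vs. untreated-with-no-treated-neighbours). Exposure probabilities
# under a Bernoulli(0.5) design are approximated by Monte Carlo. Illustrative only.
import numpy as np

rng = np.random.default_rng(2)
n = 300
A = np.triu((rng.random((n, n)) < 0.01).astype(float), 1)
A = A + A.T                                    # symmetric adjacency matrix, no self-loops

def exposure(z):
    """1 = treated; 0 = untreated with no treated neighbours; -1 = any other exposure."""
    any_treated_neigh = A @ z > 0
    return np.where(z == 1, 1, np.where(~any_treated_neigh, 0, -1))

# Approximate per-unit exposure probabilities by redrawing the Bernoulli design.
draws = np.array([exposure(rng.binomial(1, 0.5, n)) for _ in range(5000)])
pi1 = np.clip((draws == 1).mean(axis=0), 1 / len(draws), None)
pi0 = np.clip((draws == 0).mean(axis=0), 1 / len(draws), None)

z = rng.binomial(1, 0.5, n)                    # realised Bernoulli assignment
Y = 1.0 + 2.0 * z + rng.normal(size=n)         # toy outcomes with a direct effect of 2
e = exposure(z)

# Inverse-probability-weighted (H-T) means of the two exposure conditions.
tau_ht = (Y[e == 1] / pi1[e == 1]).sum() / n - (Y[e == 0] / pi0[e == 0]).sum() / n
print(tau_ht)                                  # point estimate; its target value is 2 in this toy setup
```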
Prediction models are popular in medical research and practice. By predicting an outcome of interest for specific patients, these models may help inform difficult treatment decisions, and are often hailed as the poster children for personalized, data-driven healthcare. We show, however, that using prediction models for decision making can lead to harmful decisions, even when the predictions exhibit good discrimination after deployment. These models are harmful self-fulfilling prophecies: their deployment harms a group of patients, but the worse outcomes of these patients do not invalidate the predictive power of the model. Our main result is a formal characterization of a set of such prediction models. Next we show that models that are well calibrated before and after deployment are useless for decision making, as their deployment causes no change in the data distribution. These results point to the need to revise standard practices for the validation, deployment and evaluation of prediction models used in medical decision making.
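A toy simulation may help illustrate the phenomenon (an illustration only, not the paper's formal characterization; the withholding policy, effect size and variable names are assumptions):

```python
# Toy simulation of a harmful self-fulfilling prophecy: a risk model guides withholding of
# treatment, which harms the flagged group, yet the model still discriminates well after
# deployment because the harm "confirms" its predictions. Illustrative assumptions throughout.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
n = 20000
x = rng.uniform(size=n)                        # patient severity
risk_pred = x                                  # model predicts risk of death under current care

# Before deployment: everyone is treated, P(death) = x.
y_before = rng.binomial(1, x)

# After deployment: treatment is withheld when predicted risk > 0.5, raising P(death) by 0.3.
withheld = risk_pred > 0.5
p_after = np.clip(x + 0.3 * withheld, 0, 1)
y_after = rng.binomial(1, p_after)

print("AUC before:", roc_auc_score(y_before, risk_pred))
print("AUC after: ", roc_auc_score(y_after, risk_pred))       # discrimination stays good
print("death rate in flagged group, before vs after:",
      y_before[withheld].mean(), y_after[withheld].mean())    # the flagged group is harmed
```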
Understanding whether and how treatment effects vary across subgroups is crucial to inform clinical practice and recommendations. Accordingly, the assessment of heterogeneous treatment effects (HTE) based on pre-specified potential effect modifiers has become a common goal in modern randomized trials. However, when one or more potential effect modifiers are missing, complete-case analysis may lead to bias and under-coverage. While statistical methods for handling missing data have been proposed and compared for individually randomized trials with missing effect modifier data, few guidelines exist for the cluster-randomized setting, where intracluster correlations in the effect modifiers, outcomes, or even missingness mechanisms may introduce further threats to the accurate assessment of HTE. In this article, the performance of several missing data methods is compared through a simulation study of cluster-randomized trials with a continuous outcome and missing binary effect modifier data, and further illustrated using real data from the Work, Family, and Health Study. Our results suggest that multilevel multiple imputation (MMI) and Bayesian MMI perform better than other available methods, and that Bayesian MMI has lower bias and coverage closer to nominal than standard MMI when there are model specification or compatibility issues.
In a longitudinal clinical registry, different measurement instruments might have been used for assessing individuals at different time points. To combine them, we investigate deep learning techniques for obtaining a joint latent representation, to which the items of different measurement instruments are mapped. This corresponds to domain adaptation, an established concept in computer science for image data. Using the proposed approach as an example, we evaluate the potential of domain adaptation in a longitudinal cohort setting with a rather small number of time points, motivated by an application with different motor function measurement instruments in a registry of spinal muscular atrophy (SMA) patients. There, we model trajectories in the latent representation by ordinary differential equations (ODEs), where person-specific ODE parameters are inferred from baseline characteristics. The goodness of fit and complexity of the ODE solutions then allow us to judge the measurement instrument mappings. We subsequently explore how alignment can be improved by incorporating corresponding penalty terms into model fitting. To systematically investigate the effect of differences between measurement instruments, we consider several scenarios based on modified SMA data, including scenarios where a mapping should be feasible in principle and scenarios where no perfect mapping is available. While misalignment increases in more complex scenarios, some structure is still recovered, even if the availability of measurement instruments depends on patient state. A reasonable mapping is feasible also in the more complex real SMA dataset. These results indicate that domain adaptation might be more generally useful in statistical modeling for longitudinal registry data.
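A minimal sketch of one way such a setup could look (an assumed architecture, not the authors' implementation): instrument-specific encoders map items to a shared latent space, a person-specific ODE rate is predicted from baseline covariates, and an alignment penalty is added to the fitting criterion; all dimensions, data and names are placeholders:

```python
# Minimal sketch of an assumed latent-ODE alignment setup for two measurement instruments.
# Latent trajectories follow z(t) = z0 * exp(theta * t) with a person-specific rate theta
# predicted from baseline covariates; a penalty encourages the two instrument encodings to align.
import torch
import torch.nn as nn

n, t_pts = 64, 4
items_a = torch.randn(n, t_pts, 10)            # instrument A: 10 items per visit (synthetic)
items_b = torch.randn(n, t_pts, 6)             # instrument B: 6 items per visit (synthetic)
baseline = torch.randn(n, 3)                   # baseline characteristics
times = torch.linspace(0.0, 3.0, t_pts)

enc_a, enc_b = nn.Linear(10, 1), nn.Linear(6, 1)   # instrument-specific mappings to the latent space
rate_net = nn.Linear(3, 2)                         # predicts (z0, theta) from baseline

opt = torch.optim.Adam([*enc_a.parameters(), *enc_b.parameters(), *rate_net.parameters()], lr=1e-2)
for step in range(200):
    z0, theta = rate_net(baseline).chunk(2, dim=1)
    theta = torch.tanh(theta)                      # bound the rate for numerical stability
    z_traj = z0 * torch.exp(theta * times)         # analytic solution of dz/dt = theta * z
    za, zb = enc_a(items_a).squeeze(-1), enc_b(items_b).squeeze(-1)
    fit = ((za - z_traj) ** 2).mean() + ((zb - z_traj) ** 2).mean()
    align = ((za - zb) ** 2).mean()                # alignment penalty between instrument encodings
    loss = fit + 0.1 * align
    opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```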
In the field of medical imaging, the scarcity of large-scale datasets due to privacy restrictions stands as a significant barrier to developing large models for medical applications. To address this issue, we introduce SynFundus-1M, a high-quality synthetic dataset with over 1 million retinal fundus images and extensive disease and pathology annotations, generated by a Denoising Diffusion Probabilistic Model. The SynFundus-Generator and SynFundus-1M achieve superior Frechet Inception Distance (FID) scores compared to existing methods on mainstream public real datasets. Furthermore, an evaluation by ophthalmologists validates the difficulty of discerning these synthetic images from real ones, confirming the authenticity of SynFundus-1M. Through extensive experiments, we demonstrate that both CNNs and ViTs can benefit from SynFundus-1M by pretraining or training directly on it. Compared to datasets like ImageNet or EyePACS, models trained on SynFundus-1M not only achieve better performance but also converge faster on various downstream tasks.
This research aims to take advantage of artificial intelligence techniques in producing student assessments that are compatible with the different academic accreditations of the same program. The possibility of using generative artificial intelligence technology was studied to produce tests compliant with the academic accreditation requirements of the National Center for Academic Accreditation of the Kingdom of Saudi Arabia and the Accreditation Board for Engineering and Technology. A novel method was introduced to map the verbs used to create the questions introduced in the tests. The method makes it possible to use generative artificial intelligence technology to produce questions that measure educational outcomes and to check their validity. A questionnaire was distributed to ensure that the use of generative artificial intelligence to create exam questions is acceptable to faculty members, as well as to ask about the acceptance of assistance in validating questions submitted by faculty members and amending them in accordance with academic accreditations. The questionnaire was distributed to faculty members of different majors in the Kingdom of Saudi Arabia's universities. One hundred twenty responses were obtained, with an 85% approval rate for generating complete exam questions by generative artificial intelligence, whereas the approval rate for editing and improving already existing questions was 98%.