The effects of treatments may differ between persons with different characteristics. Addressing such treatment heterogeneity is crucial for investigating whether patients with specific characteristics are likely to benefit from a new treatment. The current paper presents a novel Bayesian method for superiority decision-making in the context of randomized controlled trials with multivariate binary responses and heterogeneous treatment effects. The framework rests on three elements: a) Bayesian multivariate logistic regression analysis with a P\'olya-Gamma expansion; b) a transformation procedure that maps the obtained regression coefficients onto a more intuitive multivariate probability scale (i.e., success probabilities and the differences between them); and c) a compatible decision procedure for treatment comparison with prespecified decision error rates. Procedures for a priori sample size estimation under a non-informative prior distribution are included. A numerical evaluation demonstrated that decisions based on a priori sample size estimation achieved the anticipated error rates in the trial population as well as in subpopulations. Further, average and conditional treatment effect parameters were estimated without bias when the sample was sufficiently large. Illustration with the International Stroke Trial dataset revealed a trend towards heterogeneous effects among stroke patients: a finding that would have remained undetected had the analyses been limited to average treatment effects.
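For readers unfamiliar with element (a), the P\'olya-Gamma expansion rests on the following integral identity (Polson, Scott \& Windle, 2013); the notation here is generic rather than taken from the paper. For a binary logistic likelihood it yields conditionally Gaussian updates for the regression coefficients:

\[
\frac{(e^{\psi})^{a}}{(1+e^{\psi})^{b}} = 2^{-b} e^{\kappa\psi} \int_{0}^{\infty} e^{-\omega\psi^{2}/2}\, p(\omega)\, d\omega, \qquad \kappa = a - b/2,
\]

where $p(\omega)$ is the density of a $\mathrm{PG}(b, 0)$ random variable. With a Gaussian prior $\beta \sim \mathcal{N}(b_0, B)$, the resulting Gibbs sampler alternates

\[
\omega_i \mid \beta \sim \mathrm{PG}(1, x_i^{\top}\beta), \qquad
\beta \mid \omega, y \sim \mathcal{N}(m_\omega, V_\omega),
\]
\[
V_\omega = (X^{\top}\Omega X + B^{-1})^{-1}, \qquad
m_\omega = V_\omega (X^{\top}\kappa + B^{-1} b_0), \qquad \kappa_i = y_i - \tfrac{1}{2},
\]

with $\Omega = \operatorname{diag}(\omega)$.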
Accurate diagnostic tests are crucial for effective treatment, screening, and surveillance of diseases. However, the limited accuracy of individual biomarkers often hinders comprehensive screening. The heterogeneity of many diseases, particularly cancer, calls for combining several biomarkers into a composite diagnostic test. In this paper, we present a novel multivariate model that optimally combines multiple biomarkers using the likelihood ratio function. The model's parameters translate directly into computationally simple diagnostic accuracy measures. Additionally, our method allows for reliable predictions even when specific biomarker measurements are unavailable and can guide the selection of biomarker combinations under resource constraints. We conduct simulation studies to compare its performance with popular classification and discriminant analysis methods. We apply the approach to construct an optimal diagnostic test for hepatocellular carcinoma, a cancer type known for the absence of a single ideal marker. An accompanying R implementation is made available for reproducing all results.
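As a minimal illustration of the likelihood-ratio principle underlying the combination rule (the paper's model and estimators are more general), suppose the biomarker vector is multivariate normal within each disease status; by the Neyman-Pearson lemma, thresholding the likelihood ratio is then the optimal composite test. The Python sketch below uses hypothetical data and simplifying Gaussian assumptions not taken from the paper.

\begin{verbatim}
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)

# Hypothetical training data: two biomarkers in diseased and healthy groups.
x_dis = rng.multivariate_normal([1.0, 0.8], [[1.0, 0.3], [0.3, 1.0]], size=200)
x_hlt = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.3], [0.3, 1.0]], size=200)

# Fit a Gaussian model per class (a simplifying assumption, not the paper's model).
f_dis = multivariate_normal(x_dis.mean(axis=0), np.cov(x_dis, rowvar=False))
f_hlt = multivariate_normal(x_hlt.mean(axis=0), np.cov(x_hlt, rowvar=False))

def likelihood_ratio(x):
    """Combine the biomarkers into one score: LR(x) = f_diseased(x) / f_healthy(x)."""
    return f_dis.pdf(x) / f_hlt.pdf(x)

# Classify a new subject by thresholding the LR (threshold 1 under equal priors).
x_new = np.array([0.9, 0.5])
print("LR =", likelihood_ratio(x_new))
\end{verbatim}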
In social choice theory with ordinal preferences, a voting method satisfies the axiom of positive involvement if adding to a preference profile a voter who ranks an alternative uniquely first cannot cause that alternative to go from winning to losing. In this note, we prove a new impossibility theorem concerning this axiom: there is no ordinal voting method satisfying positive involvement that also satisfies the Condorcet winner and loser criteria, resolvability, and a common invariance property for Condorcet methods, namely that the choice of winners depends only on the ordering of majority margins by size.
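To make the invariance property concrete, the following sketch (not from the paper) computes pairwise majority margins from a small preference profile; a method satisfying the stated invariance property may consult only the ordering of these margins by size, not their magnitudes.

\begin{verbatim}
from itertools import combinations

# A profile: each ballot is a linear order; counts give the number of such voters.
profile = {("a", "b", "c"): 3, ("b", "c", "a"): 2, ("c", "a", "b"): 2}
alts = {"a", "b", "c"}

def margin(x, y):
    """Majority margin of x over y: #(voters ranking x above y) minus #(y above x)."""
    return sum(n if b.index(x) < b.index(y) else -n for b, n in profile.items())

# Condorcet winner: an alternative that beats every other one head-to-head.
winners = [x for x in alts if all(margin(x, y) > 0 for y in alts - {x})]
print("Condorcet winner:", winners or "none (majority cycle)")

# The invariance property concerns only this ordering of margins by size.
pairs = sorted(combinations(sorted(alts), 2), key=lambda p: abs(margin(*p)))
print("pairs ordered by margin size:", pairs)
\end{verbatim}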
Heteroskedasticity testing in nonparametric regression is a classic statistical problem with important practical applications, yet its fundamental limits are unknown. Adopting a minimax perspective, this article considers the testing problem in the context of an $\alpha$-H\"{o}lder mean and a $\beta$-H\"{o}lder variance function. For $\alpha > 0$ and $\beta \in (0, 1/2)$, the sharp minimax separation rate $n^{-4\alpha} + n^{-4\beta/(4\beta+1)} + n^{-2\beta}$ is established. To achieve this rate, a kernel-based statistic using first-order squared differences is developed. Notably, the statistic estimates a proxy rather than the natural quadratic functional (the squared distance between the variance function and its best $L^2$ approximation by a constant) suggested in previous work. The setting where no smoothness is assumed on the variance function is also studied; the variance profile across the design points can be arbitrary. Despite this lack of structure, consistent testing turns out to still be possible by exploiting the Gaussian character of the noise, and the minimax rate is shown to be $n^{-4\alpha} + n^{-1/2}$. Exploiting noise information is in fact a fundamental necessity: consistent testing is impossible if nothing more than zero mean and unit variance is known about the noise distribution. Furthermore, in the setting where the variance function is $\beta$-H\"{o}lder but heteroskedasticity is measured only with respect to the design points, the minimax separation rate is shown to be $n^{-4\alpha} + n^{-\left((1/2) \vee (4\beta/(4\beta+1))\right)}$ when the noise is Gaussian and $n^{-4\alpha} + n^{-4\beta/(4\beta+1)} + n^{-2\beta}$ when the noise distribution is unknown.
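The differencing idea can be illustrated as follows (an illustrative Python sketch only; the paper's kernel-based statistic and the proxy functional it estimates are defined more carefully): when the mean is smooth, half a first-order squared difference, $(Y_{i+1} - Y_i)^2/2$, is nearly unbiased for the local variance, so comparing local averages of these quantities against their global average probes heteroskedasticity.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
n = 2000
x = np.sort(rng.uniform(size=n))
y = np.sin(2 * np.pi * x) + (0.5 + 0.4 * x) * rng.standard_normal(n)

# Half squared first-order differences: the smooth mean nearly cancels,
# leaving approximately unbiased local variance estimates.
d = 0.5 * np.diff(y) ** 2
xm = 0.5 * (x[:-1] + x[1:])

# Compare kernel-local averages of d with the global average (bandwidth is
# a hypothetical choice, not the rate-optimal one from the paper).
h = 0.1
grid = np.linspace(0.1, 0.9, 9)
local = np.array([d[np.abs(xm - t) < h].mean() for t in grid])
stat = np.max(np.abs(local - d.mean()))
print("sup-discrepancy statistic:", stat)  # large values suggest heteroskedasticity
\end{verbatim}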
This study focuses on how different modalities of human communication can be used to distinguish between healthy controls and subjects with schizophrenia who exhibit strong positive symptoms. We developed a multi-modal schizophrenia classification system using audio, video, and text. Facial action units and vocal tract variables were extracted as low-level features from video and audio, respectively, and were then used to compute high-level coordination features that served as inputs to the video and audio modalities. Context-independent text embeddings extracted from transcriptions of speech served as input for the text modality. The multi-modal system was developed by fusing a segment-to-session-level classifier for the video and audio modalities with a text model based on a Hierarchical Attention Network (HAN) with cross-modal attention. The proposed multi-modal system outperforms the previous state-of-the-art multi-modal system by 8.53% in weighted average F1 score.
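Coordination features of this kind are often computed from time-delayed correlations among low-level channels; the Python sketch below (delays and dimensions hypothetical, not the paper's exact pipeline) builds a channel-delay correlation matrix from frame-level time series and uses its normalized eigenvalue spectrum as the high-level feature vector.

\begin{verbatim}
import numpy as np

def coordination_features(ts, delays=(0, 1, 3, 7)):
    """Eigenvalue-spectrum coordination features from low-level time series.

    ts: array of shape (n_frames, n_channels), e.g. facial action unit
    intensities or vocal tract variables per frame."""
    stacked = np.hstack([np.roll(ts, -d, axis=0) for d in delays])
    stacked = stacked[: len(ts) - max(delays)]   # drop wrapped-around rows
    corr = np.corrcoef(stacked, rowvar=False)    # (channels*delays)-square matrix
    eigvals = np.linalg.eigvalsh(corr)[::-1]     # descending eigenvalues
    return eigvals / eigvals.sum()               # normalized spectrum

# Example: 500 frames of 10 hypothetical action-unit intensities.
rng = np.random.default_rng(2)
ts = rng.standard_normal((500, 10)).cumsum(axis=0)
print(coordination_features(ts)[:5])
\end{verbatim}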
We consider the problem of learning support vector machines robust to uncertainty. It has been established in the literature that typical loss functions, including the hinge loss, are sensitive to data perturbations and outliers, and thus perform poorly in the setting considered. In contrast, using the 0-1 loss or a suitable non-convex approximation results in robust estimators, at the expense of large computational costs. In this paper we use mixed-integer optimization techniques to derive a new loss function that better approximates the 0-1 loss than existing alternatives, while preserving the convexity of the learning problem. In our computational results, we show that the proposed estimator is competitive with standard SVMs with the hinge loss in outlier-free regimes and superior in the presence of outliers.
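The motivation is visible directly in the losses (a toy numpy illustration; the paper's new convex loss is derived via mixed-integer optimization and is not reproduced here): the hinge loss grows without bound in the margin violation, so a single mislabeled outlier can dominate the objective, whereas the 0-1 loss caps each point's contribution at one.

\begin{verbatim}
import numpy as np

# Margins y_i * f(x_i) for four points; the last is a badly mislabeled outlier.
margins = np.array([2.0, 0.5, -0.2, -8.0])

hinge = np.maximum(0.0, 1.0 - margins)     # convex but unbounded: outlier adds 9.0
zero_one = (margins <= 0).astype(float)    # robust, capped at 1, but non-convex

print("hinge losses:", hinge, "total:", hinge.sum())
print("0-1 losses:  ", zero_one, "total:", zero_one.sum())
\end{verbatim}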
Most existing Mendelian randomization (MR) methods are limited by the assumption of linear causality between exposure and outcome, making the development of non-linear MR methods highly desirable. We introduce two-stage prediction estimation and control function estimation from econometrics to MR and extend them to non-linear causality. We give conditions for parameter identification and theoretically prove the consistency and asymptotic normality of the estimates. We compare the two methods theoretically under both linear and non-linear causality. We also extend control function estimation to a more flexible semi-parametric framework that requires no detailed parametric specification of the causal relationship. Extensive simulations numerically corroborate our theoretical results. Application to UK Biobank data reveals non-linear causal relationships between sleep duration and systolic/diastolic blood pressure.
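In its simplest parametric form (a sketch under strong assumptions, not the paper's semi-parametric estimator), control function estimation first regresses the exposure on the instrument and then includes the first-stage residual as an extra regressor in a non-linear outcome model; the residual absorbs the unmeasured confounding.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
n = 5000
g = rng.binomial(2, 0.3, size=n)          # instrument (e.g. a genotype dosage)
u = rng.standard_normal(n)                # unmeasured confounder
x = 0.5 * g + u + rng.standard_normal(n)  # exposure
y = x + 0.5 * x**2 + u + rng.standard_normal(n)  # quadratic causal effect

# Stage 1: regress exposure on instrument; keep residuals (the control function).
Z = np.column_stack([np.ones(n), g])
res = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]

# Stage 2: outcome on (x, x^2) plus the control function.
W = np.column_stack([np.ones(n), x, x**2, res])
beta = np.linalg.lstsq(W, y, rcond=None)[0]
print("estimated causal coefficients (x, x^2):", beta[1], beta[2])  # ~1.0, ~0.5
\end{verbatim}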
Manufacturing assembly tasks vary in complexity and level of automation. Achieving full automation can be challenging and inefficient, particularly due to the complexity of certain assembly operations; human-robot collaborative work, which leverages the strengths of human labor alongside the capabilities of robots, offers a way to enhance efficiency. This paper introduces the CT benchmark, a benchmark and model set designed to facilitate the testing and evaluation of human-robot collaborative assembly scenarios. It enables the comparison of manual and automatic processes using metrics such as assembly time and human workload. The components of the model set can be assembled through the most common assembly tasks, each with varying levels of difficulty. The CT benchmark focuses on applicability in human-robot collaborative environments and aims to ensure the reproducibility and replicability of experiments. Experiments assessed assembly performance in three setups (manual, automatic, and collaborative), measuring assembly time and the workload on human operators. The results suggest that the collaborative approach takes 70.8% longer than fully manual assembly. However, users reported a lower overall workload, as well as reduced mental demand, physical demand, and effort according to the NASA-TLX questionnaire.
Coronary artery disease (CAD) remains the leading cause of death globally, and invasive coronary angiography (ICA) is considered the gold standard of anatomical imaging evaluation when CAD is suspected. However, risk evaluation based on ICA has several limitations, such as visual assessment of stenosis severity, which suffers from significant interobserver variability. This motivates the development of a lesion classification system that can support specialists in their clinical procedures. Although deep learning classification methods are well developed in other areas of medical imaging, ICA image classification is still at an early stage, largely because of the lack of available, high-quality open-access datasets. In this paper, we report CADICA, a new annotated ICA image dataset that provides the research community with a comprehensive and rigorous collection of coronary angiography data, consisting of acquired patient videos and associated disease-related metadata. The dataset can be used by clinicians to train their skills in angiographic assessment of CAD severity and by computer scientists to create computer-aided diagnostic systems to support such assessment. In addition, baseline classification methods are proposed and analyzed, validating the functionality of CADICA and giving the scientific community a starting point for improving CAD detection.
Predicting the long-term success of endovascular interventions in the clinical management of cerebral aneurysms requires detailed insight into the patient-specific physiological conditions. In this work, we not only propose numerical representations of endovascular medical devices such as coils, flow diverters, or Woven EndoBridge devices, but also outline numerical models for predicting blood flow patterns in the aneurysm cavity immediately after a surgical intervention. Detailed knowledge about the post-surgical state then provides the basis for assessing the chances of the stable occlusion of the aneurysm required for long-term treatment success. To this end, we propose mathematical and mechanical models of endovascular medical devices made of thin metal wires. These can then be used for fully resolved flow simulations of the post-surgical blood flow, which in this work are performed by means of a Lattice Boltzmann method applied to the incompressible Navier-Stokes equations and patient-specific geometries. To probe the suitability of homogenized models, we also investigate poro-elastic models for representing such medical devices. In particular, we examine the validity of this modeling approach for flow diverter placement across the opening of the aneurysm cavity. For both approaches, physiologically meaningful boundary conditions are provided by reduced-order models of the vascular system. The present study demonstrates our capability to predict the post-surgical state and lays a solid foundation for tackling, as a next step, the prediction of thrombus formation and, thus, aneurysm occlusion.
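For reference, the core update of a lattice Boltzmann method with the common BGK collision operator reads (standard textbook form, independent of the specific implementation used in this work):

\[
f_i(\mathbf{x} + \mathbf{c}_i \Delta t,\, t + \Delta t) = f_i(\mathbf{x}, t) - \frac{\Delta t}{\tau}\left[f_i(\mathbf{x}, t) - f_i^{\mathrm{eq}}(\mathbf{x}, t)\right],
\]
\[
f_i^{\mathrm{eq}} = w_i \rho \left(1 + \frac{\mathbf{c}_i \cdot \mathbf{u}}{c_s^{2}} + \frac{(\mathbf{c}_i \cdot \mathbf{u})^{2}}{2 c_s^{4}} - \frac{\mathbf{u} \cdot \mathbf{u}}{2 c_s^{2}}\right),
\]

where the macroscopic fields are recovered as $\rho = \sum_i f_i$ and $\rho\mathbf{u} = \sum_i f_i \mathbf{c}_i$, and the relaxation time $\tau$ sets the kinematic viscosity via $\nu = c_s^{2}(\tau - \Delta t/2)$.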
Effect modification occurs when the impact of a treatment on an outcome varies with the levels of other covariates, known as effect modifiers. Modeling these effect differences is important for etiological understanding and for optimizing treatment. Structural nested mean models (SNMMs) are useful causal models for estimating the potentially heterogeneous effect of a time-varying exposure on the mean of an outcome in the presence of time-varying confounding. In longitudinal health studies, information on many demographic, behavioural, biological, and clinical covariates may be available, some of which might cause heterogeneous treatment effects. A data-driven approach to selecting the effect modifiers of an exposure may be necessary when these effect modifiers are \textit{a priori} unknown and need to be identified. Although variable selection techniques are available for estimating conditional average treatment effects using marginal structural models, or for estimating optimal dynamic treatment regimens, all of these methods consider an outcome measured at a single point in time. In the context of an SNMM for repeated outcomes, we propose a doubly robust penalized G-estimator for the causal effect of a time-varying exposure with simultaneous selection of effect modifiers, and we prove the oracle property of our estimator. We conduct a simulation study to evaluate the performance of the proposed estimator in finite samples and to verify its double-robustness property. Our work is motivated by a study of hemodiafiltration for treating patients with end-stage renal disease at the Centre Hospitalier de l'Universit\'e de Montr\'eal.
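For intuition, here is a minimal doubly robust G-estimator for the drastically simplified case of a single point exposure with blip model $\gamma(L; \psi) = A(\psi_0 + \psi_1 L)$ and no penalization (the paper handles time-varying exposures, repeated outcomes, and simultaneous modifier selection; all names below are hypothetical).

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(4)
n = 10000
L = rng.standard_normal(n)                 # candidate effect modifier
pi = 1 / (1 + np.exp(-0.5 * L))            # true propensity score
A = rng.binomial(1, pi)                    # exposure
psi = np.array([1.0, 0.6])                 # blip: A * (psi0 + psi1 * L)
Y = 2.0 + L + A * (psi[0] + psi[1] * L) + rng.standard_normal(n)

# Nuisance models (both correct here; double robustness means the estimator
# remains consistent if either one, but not both, is misspecified).
pi_hat = pi                                # treatment model
s_hat = 2.0 + L                            # treatment-free outcome model

# G-estimating equation, linear in psi with m(L) = (1, L):
#   sum_i (A_i - pi_i) m(L_i) (Y_i - s(L_i) - A_i m(L_i)' psi) = 0
M = np.column_stack([np.ones(n), L])
w = (A - pi_hat)[:, None] * M
psi_hat = np.linalg.solve(w.T @ (A[:, None] * M), w.T @ (Y - s_hat))
print("psi_hat:", psi_hat)                 # approximately [1.0, 0.6]
\end{verbatim}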