Information theory, which describes the transmission of signals in the presence of noise, has enabled the development of reliable communication systems that underlie the modern world. Imaging systems can also be viewed as a form of communication, in which information about the object is "transmitted" through images. However, the application of information theory to imaging systems has been limited by the challenges of accounting for their physical constraints. Here, we introduce a framework that addresses these limitations by modeling the probabilistic relationship between objects and their measurements. Using this framework, we develop a method to estimate information using only a dataset of noisy measurements, without making any assumptions about the image formation process. We demonstrate that these estimates comprehensively quantify measurement quality across a diverse range of imaging systems and applications. Furthermore, we introduce Information-Driven Encoder Analysis Learning (IDEAL), a technique to optimize the design of imaging hardware for maximum information capture. This work provides new insights into the fundamental performance limits of imaging systems and offers powerful new tools for their analysis and design.
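As a toy illustration of estimating information from noisy measurements alone, the sketch below uses a strong simplifying assumption: the measurements are modeled as multivariate Gaussian and the noise is additive Gaussian with known variance, so that I(object; measurement) = H(Y) - H(Y|X). The synthetic patches and the Gaussian model are illustrative stand-ins for a real measurement dataset and for the estimators developed in the paper.

```python
# Minimal sketch: information estimate under a Gaussian model of the measurements.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "measurements": 5000 correlated 8x8 patches plus additive sensor noise.
n, d, sigma_noise = 5000, 64, 0.05
latent = rng.normal(size=(n, 8)) @ rng.normal(size=(8, d))   # structured signal
patches = latent + sigma_noise * rng.normal(size=(n, d))

# H(Y): differential entropy of a Gaussian fit to the noisy measurements (nats).
cov_y = np.cov(patches, rowvar=False) + 1e-9 * np.eye(d)
h_y = 0.5 * (d * np.log(2 * np.pi * np.e) + np.linalg.slogdet(cov_y)[1])

# H(Y | X): entropy of the known additive Gaussian noise.
h_y_given_x = 0.5 * d * np.log(2 * np.pi * np.e * sigma_noise**2)

mutual_info_per_pixel = (h_y - h_y_given_x) / d
print(f"estimated information: {mutual_info_per_pixel:.2f} nats/pixel")
```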
Understanding causal relationships in dynamic systems is essential for numerous scientific fields, including epidemiology, economics, and biology. While causal inference methods have been extensively studied, they often rely on fully specified causal graphs, which may not always be available or practical in complex dynamic systems. Partially specified causal graphs, such as summary causal graphs (SCGs), provide a simplified representation of causal relationships, omitting temporal information and focusing on high-level causal structures. This simplification introduces new challenges concerning the types of queries of interest: macro queries, which involve relationships between clusters represented as vertices in the graph, and micro queries, which pertain to relationships between the underlying variables and are therefore not directly visible through the vertices of the graph. In this paper, we first clearly distinguish between macro conditional independencies and micro conditional independencies, and between macro total effects and micro total effects. Then, we demonstrate that d-separation is sound and complete for identifying macro conditional independencies in SCGs. Furthermore, we establish that the do-calculus is sound and complete for identifying macro total effects in SCGs. Conversely, we also show through various examples that these results do not hold when considering micro conditional independencies and micro total effects.
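For intuition, here is a toy macro conditional-independence query checked with ordinary d-separation via networkx; the graph, the cluster names, and the restriction to an acyclic example are illustrative assumptions, not the paper's general SCG setting.

```python
import networkx as nx

# Hypothetical summary causal graph: vertices are clusters of time series.
scg = nx.DiGraph([
    ("Weather", "Traffic"),
    ("Traffic", "Pollution"),
    ("Weather", "Admissions"),
    ("Pollution", "Admissions"),
])

# Macro query: is "Traffic" independent of "Admissions" given {"Pollution", "Weather"}?
print(nx.is_d_separator(scg, {"Traffic"}, {"Admissions"}, {"Pollution", "Weather"}))  # True
# In networkx < 3.3 the same check is spelled nx.d_separated(scg, ...).
```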
Despite the wide usage of parametric point processes in theory and applications, a sound goodness-of-fit procedure for testing whether a given parametric model is appropriate for data coming from a self-exciting point process has been missing from the literature. In this work, we establish a bootstrap-based goodness-of-fit test that works empirically for a wide range of self-exciting point processes (and even beyond). In an infill-asymptotic setting, we also prove its asymptotic consistency, albeit only in the particular case where the underlying point process is an inhomogeneous Poisson process.
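As a rough illustration of the bootstrap idea (not the authors' procedure), the sketch below runs a parametric bootstrap goodness-of-fit test for an inhomogeneous Poisson model with rate lambda(t) = exp(a + b*t), the special case for which consistency is proved; the KS statistic on time-rescaled points, the thinning simulator, and the exponential rate form are all illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import kstest

rng = np.random.default_rng(0)
T = 100.0  # observation window [0, T]

def neg_loglik(params, times):
    a, b = params
    integral = np.exp(a) * np.expm1(b * T) / b if abs(b) > 1e-8 else np.exp(a) * T
    return -(np.sum(a + b * times) - integral)

def fit(times):
    return minimize(neg_loglik, x0=[0.0, 0.0], args=(times,), method="Nelder-Mead").x

def simulate(a, b):
    # Thinning with a constant dominating rate.
    lam_max = np.exp(a + b * T) if b > 0 else np.exp(a)
    cand = np.sort(rng.uniform(0, T, size=rng.poisson(lam_max * T)))
    keep = rng.uniform(size=cand.size) < np.exp(a + b * cand) / lam_max
    return cand[keep]

def ks_statistic(times, a, b):
    # Time-rescaling: given the count, Lambda(t_i)/Lambda(T) are iid Uniform(0, 1).
    Lam = lambda t: np.exp(a) * np.expm1(b * t) / b if abs(b) > 1e-8 else np.exp(a) * t
    return kstest(Lam(times) / Lam(T), "uniform").statistic

obs = simulate(0.5, 0.01)               # stand-in for the observed point pattern
a_hat, b_hat = fit(obs)
stat_obs = ks_statistic(obs, a_hat, b_hat)

# Parametric bootstrap: simulate from the fitted model, refit, recompute the statistic.
B, stats = 200, []
for _ in range(B):
    boot = simulate(a_hat, b_hat)
    a_b, b_b = fit(boot)
    stats.append(ks_statistic(boot, a_b, b_b))
p_value = (1 + np.sum(np.array(stats) >= stat_obs)) / (B + 1)
print(f"bootstrap p-value: {p_value:.3f}")
```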
Variance reduction for causal inference in the presence of network interference is often achieved through either outcome modeling, which is typically analyzed under unit-randomized Bernoulli designs, or clustered experimental designs, which are typically analyzed without strong parametric assumptions. In this work, we study the intersection of these two approaches and consider the problem of estimation in low-order outcome models using data from a general experimental design. Our contributions are threefold. First, we present an estimator of the total treatment effect (also called the global average treatment effect) in a low-degree outcome model when the data are collected under general experimental designs, generalizing previous results for Bernoulli designs. We refer to this estimator as the pseudoinverse estimator and give bounds on its bias and variance in terms of properties of the experimental design. Second, we evaluate these bounds for the case of cluster randomized designs with both Bernoulli and complete randomization. For clustered Bernoulli randomization, we find that our estimator is always unbiased and that its variance scales like the smaller of the variance obtained from a low-order assumption and the variance obtained from cluster randomization, showing that combining these variance reduction strategies is preferable to using either individually. For clustered complete randomization, we find a notable bias-variance trade-off mediated by specific features of the clustering. Third, when choosing a clustered experimental design, our bounds can be used to select a clustering from a set of candidate clusterings. Across a range of graphs and clustering algorithms, we show that our method consistently selects clusterings that perform well on a range of response models, suggesting that our bounds are useful to practitioners.
Credible causal effect estimation requires treated subjects and controls to be otherwise similar. In observational settings, such as analysis of electronic health records, this is not guaranteed. Investigators must balance background variables so they are similar in treated and control groups. Common approaches include matching (grouping individuals into small homogeneous sets) or weighting (upweighting or downweighting individuals) to create similar profiles. However, creating identical distributions may be impossible if many variables are measured, and not all variables are of equal importance to the outcome. The joint variable importance plot (jointVIP) package guides decisions about which variables to prioritize for adjustment by quantifying and visualizing each variable's relationship to both treatment and outcome.
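For intuition, a conceptual Python sketch of the underlying idea follows (the jointVIP package itself is an R package with additional features such as bias curves); the variable names and data below are synthetic.

```python
# Each background variable is placed on a plane: treatment imbalance (standardized
# mean difference) versus association with the outcome (correlation in controls).
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
n, variables = 500, ["age", "bmi", "sbp", "hba1c"]
X = rng.normal(size=(n, len(variables)))
treat = rng.binomial(1, 1 / (1 + np.exp(-0.8 * X[:, 0])))     # treatment depends on "age"
y = X @ np.array([0.1, 0.0, 0.5, 0.9]) + rng.normal(size=n)   # outcome driven by "sbp", "hba1c"

smd, outcome_corr = [], []
for j, name in enumerate(variables):
    x_t, x_c = X[treat == 1, j], X[treat == 0, j]
    pooled_sd = np.sqrt((x_t.var(ddof=1) + x_c.var(ddof=1)) / 2)
    smd.append((x_t.mean() - x_c.mean()) / pooled_sd)
    outcome_corr.append(np.corrcoef(x_c, y[treat == 0])[0, 1])

plt.scatter(np.abs(smd), np.abs(outcome_corr))
for name, x_, y_ in zip(variables, np.abs(smd), np.abs(outcome_corr)):
    plt.annotate(name, (x_, y_))
plt.xlabel("|standardized mean difference| (treatment imbalance)")
plt.ylabel("|correlation with outcome| (control group)")
plt.title("Joint variable importance (conceptual sketch)")
plt.show()
```

Variables in the upper-right corner (imbalanced and strongly related to the outcome) are the ones to prioritize for adjustment.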
We present a Bayesian method for multivariate changepoint detection that allows for simultaneous inference on the location of a changepoint and the coefficients of a logistic regression model for distinguishing pre-changepoint data from post-changepoint data. In contrast to many methods for multivariate changepoint detection, the proposed method is applicable to data of mixed type and avoids strict assumptions regarding the distribution of the data and the nature of the change. The regression coefficients provide an interpretable description of a potentially complex change. For posterior inference, the model admits a simple Gibbs sampling algorithm based on P\'olya-gamma data augmentation. We establish conditions under which the proposed method is guaranteed to recover the true underlying changepoint. As a testing ground for our method, we consider the problem of detecting topological changes in time series of images. We demonstrate that our proposed method $\mathtt{bclr}$, combined with a topological feature embedding, performs well on both simulated and real image data. The method also successfully recovers the location and nature of changes in more traditional changepoint tasks.
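A minimal sketch of a Gibbs sampler of this flavor is shown below; it is a conceptual re-implementation rather than the authors' $\mathtt{bclr}$ code, it assumes the `polyagamma` Python package for the P\'olya-gamma draws, and it uses synthetic two-dimensional data with a single mean shift.

```python
import numpy as np
from polyagamma import random_polyagamma

rng = np.random.default_rng(2)

# Synthetic data: 2-d series whose second coordinate shifts after time 60.
n, d = 100, 2
X = rng.normal(size=(n, d))
X[60:, 1] += 2.0

B0_inv = np.eye(d) / 10.0          # prior precision: beta ~ N(0, 10 I)
beta = np.zeros(d)
k = n // 2                         # initial changepoint
draws = []

for it in range(2000):
    y = (np.arange(n) >= k).astype(float)          # labels induced by the changepoint

    # 1) Polya-gamma latents: omega_i ~ PG(1, x_i' beta).
    omega = random_polyagamma(1.0, X @ beta)

    # 2) beta | omega, y, k ~ N(m, V) with V = (X' Omega X + B0_inv)^{-1}, kappa = y - 1/2.
    V = np.linalg.inv(X.T @ (omega[:, None] * X) + B0_inv)
    m = V @ (X.T @ (y - 0.5))
    beta = rng.multivariate_normal(m, V)

    # 3) k | beta, X: discrete conditional over candidate changepoints (uniform prior).
    eta = X @ beta
    log_probs = []
    for cand in range(1, n):
        y_c = (np.arange(n) >= cand).astype(float)
        log_probs.append(np.sum(y_c * eta - np.log1p(np.exp(eta))))
    log_probs = np.array(log_probs)
    probs = np.exp(log_probs - log_probs.max())
    k = 1 + rng.choice(n - 1, p=probs / probs.sum())

    if it >= 500:                                  # discard burn-in
        draws.append(k)

print("posterior mode of changepoint:", np.bincount(draws).argmax())
```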
In the literature on spatial point processes, there is an emerging challenge of studying marked point processes whose points are labelled by functions. In this paper, we focus on point processes living on linear networks and, from distinct points of view, propose several mark summary characteristics for studying the average association and dispersion of the function-valued marks. Through a simulation study, we evaluate the performance of our proposed mark summary characteristics, both when the marks are independent and when some form of spatial dependence is present among them. Finally, we employ our proposed mark summary characteristics to study the spatial structure of urban cycling profiles in Vancouver, Canada.
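As a simplified illustration of one such summary, the sketch below computes a mark-variogram-type curve for function-valued marks by averaging the squared L2 distance between marks over pairs of points binned by spatial distance; it uses Euclidean distances between planar points and no edge correction, whereas the setting above calls for shortest-path distances along the linear network and suitable corrections.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic marked pattern: 200 points, each carrying a function observed on a grid.
n_pts, grid = 200, np.linspace(0, 1, 50)
coords = rng.uniform(0, 10, size=(n_pts, 2))
marks = np.sin(2 * np.pi * grid[None, :] * (1 + 0.1 * coords[:, :1])) \
        + 0.2 * rng.normal(size=(n_pts, grid.size))

# Pairwise spatial distances and pairwise (discretized) L2 distances between marks.
d_space = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
d_marks = ((marks[:, None, :] - marks[None, :, :]) ** 2).mean(axis=-1)

# Bin pairs by spatial distance and average the mark dissimilarity in each bin.
bins = np.linspace(0, 5, 11)
iu = np.triu_indices(n_pts, k=1)
which = np.digitize(d_space[iu], bins)
gamma = [0.5 * d_marks[iu][which == b].mean() for b in range(1, len(bins))]
print(np.round(gamma, 3))
```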
Analyzing longitudinal data in health studies is challenging due to sparse and error-prone measurements, strong within-individual correlation, missing data, and varied trajectory shapes. While mixed-effects models (MM) effectively address these challenges, they remain parametric and can be computationally costly. In contrast, Functional Principal Component Analysis (FPCA) is a non-parametric approach developed for regular and dense functional data that flexibly describes temporal trajectories at a potentially lower computational cost. This paper presents an empirical simulation study evaluating the behaviour of FPCA with sparse and error-prone repeated measures and its robustness under different missing-data schemes, in comparison with MM. The results show that FPCA is well suited in the presence of data missing at random due to dropout, except in scenarios with the most frequent and systematic dropout. Like MM, FPCA fails under a missing-not-at-random mechanism. FPCA was then applied to describe the trajectories of four cognitive functions before clinical dementia and to contrast them with those of matched controls in a case-control study nested in a population-based aging cohort. The average cognitive decline of future dementia cases diverged suddenly from that of their matched controls, with a sharp acceleration 5 to 2.5 years prior to diagnosis.
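For reference, a minimal dense-grid FPCA can be written as an eigendecomposition of the pointwise sample covariance, as sketched below; sparse and error-prone measures like those studied here instead require smoothing-based (PACE-type) estimation of the mean and covariance.

```python
import numpy as np

rng = np.random.default_rng(4)
n_subj, grid = 300, np.linspace(0, 10, 40)

# Synthetic trajectories: random intercept + random slope + measurement noise.
scores = rng.normal(size=(n_subj, 2)) * np.array([2.0, 0.5])
curves = scores[:, :1] + scores[:, 1:] * grid[None, :] + 0.3 * rng.normal(size=(n_subj, grid.size))

mean_curve = curves.mean(axis=0)
centered = curves - mean_curve
cov = centered.T @ centered / (n_subj - 1)          # pointwise sample covariance

eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()
print("variance explained by first two components:", np.round(explained[:2], 3))

# Individual FPC scores: projections of the centered curves onto the eigenfunctions.
fpc_scores = centered @ eigvecs[:, :2]
print(fpc_scores.shape)
```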
A new approach based on censoring and a moment criterion is introduced for parameter estimation of count distributions whose probability generating function is available, even when a closed-form probability mass function and/or finite moments do not exist.
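As a generic illustration of probability-generating-function-based estimation (not the exact censored-moment criterion proposed here), one can match the empirical pgf to the model pgf on a grid of evaluation points. The discrete stable family, whose pmf has no closed form and whose mean is infinite for $\gamma < 1$, is a natural target; the demo below uses its Poisson boundary case ($\gamma = 1$) only so that data are easy to simulate.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)

# Demo data from the gamma = 1 boundary case (discrete stable reduces to Poisson(lam)).
x = rng.poisson(3.0, size=2000)

# Empirical pgf G_hat(t) = mean(t^X) on a grid of points in (0, 1).
t_grid = np.linspace(0.1, 0.9, 9)
emp_pgf = np.array([(t ** x).mean() for t in t_grid])

def criterion(params):
    lam, gamma = params
    model_pgf = np.exp(-lam * (1 - t_grid) ** gamma)   # discrete stable pgf
    return np.sum((emp_pgf - model_pgf) ** 2)

res = minimize(criterion, x0=[1.0, 0.5], bounds=[(1e-6, None), (1e-6, 1.0)], method="L-BFGS-B")
print("estimated (lam, gamma):", np.round(res.x, 3))   # should be close to (3, 1)
```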
A key requirement for the success of supervised deep learning is a large labeled dataset - a condition that is difficult to meet in medical image analysis. Self-supervised learning (SSL) can help in this regard by providing a strategy to pre-train a neural network with unlabeled data, followed by fine-tuning for a downstream task with limited annotations. Contrastive learning, a particular variant of SSL, is a powerful technique for learning image-level representations. In this work, we propose strategies for extending the contrastive learning framework for segmentation of volumetric medical images in the semi-supervised setting with limited annotations, by leveraging domain-specific and problem-specific cues. Specifically, we propose (1) novel contrasting strategies that leverage structural similarity across volumetric medical images (domain-specific cue) and (2) a local version of the contrastive loss to learn distinctive representations of local regions that are useful for per-pixel segmentation (problem-specific cue). We carry out an extensive evaluation on three Magnetic Resonance Imaging (MRI) datasets. In the limited annotation setting, the proposed method yields substantial improvements compared to other self-supervision and semi-supervised learning techniques. When combined with a simple data augmentation technique, the proposed method reaches within 8% of benchmark performance using only two labeled MRI volumes for training, corresponding to only 4% (for ACDC) of the training data used to train the benchmark.
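A conceptual PyTorch sketch of a local contrastive loss of this kind is given below; the pooling grid, temperature, and the use of 2D feature maps are illustrative choices rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def local_contrastive_loss(feat1, feat2, grid=4, temperature=0.1):
    """feat1, feat2: (B, C, H, W) feature maps from two augmented views of the same image."""
    B = feat1.shape[0]
    # Pool each feature map into a grid x grid set of local region embeddings.
    z1 = F.adaptive_avg_pool2d(feat1, grid).flatten(2).transpose(1, 2)   # (B, R, C), R = grid*grid
    z2 = F.adaptive_avg_pool2d(feat2, grid).flatten(2).transpose(1, 2)
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)

    # Within each image, region i of view 1 should match region i of view 2;
    # all other regions act as negatives.
    logits = torch.bmm(z1, z2.transpose(1, 2)) / temperature             # (B, R, R)
    labels = torch.arange(grid * grid, device=feat1.device).expand(B, -1)
    return F.cross_entropy(logits.flatten(0, 1), labels.flatten())

# Toy usage with random "encoder" outputs standing in for real feature maps.
f1, f2 = torch.randn(2, 32, 64, 64), torch.randn(2, 32, 64, 64)
print(local_contrastive_loss(f1, f2))
```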
Recent advances in 3D fully convolutional networks (FCN) have made it feasible to produce dense voxel-wise predictions of volumetric images. In this work, we show that a multi-class 3D FCN trained on manually labeled CT scans of several anatomical structures (ranging from large organs to thin vessels) can achieve competitive segmentation results, while avoiding the need for handcrafting features or training class-specific models. To this end, we propose a two-stage, coarse-to-fine approach that first uses a 3D FCN to roughly define a candidate region, which is then used as input to a second 3D FCN. This reduces the number of voxels the second FCN has to classify to ~10% and allows it to focus on more detailed segmentation of the organs and vessels. We utilize training and validation sets consisting of 331 clinical CT images and test our models on a completely unseen data collection acquired at a different hospital that includes 150 CT scans, targeting three anatomical organs (liver, spleen, and pancreas). In challenging organs such as the pancreas, our cascaded approach improves the mean Dice score from 68.5% to 82.2%, achieving the highest reported average score on this dataset. We compare with a 2D FCN method on a separate dataset of 240 CT scans with 18 classes and achieve significantly higher performance on small organs and vessels. Furthermore, we explore fine-tuning our models to different datasets. Our experiments illustrate the promise and robustness of current 3D FCN based semantic segmentation of medical images, achieving state-of-the-art results. Our code and trained models are available for download: //github.com/holgerroth/3Dunet_abdomen_cascade.
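A schematic PyTorch sketch of the coarse-to-fine cascade follows; the tiny single-convolution "networks" stand in for full 3D FCNs, and the cropping logic is an illustrative simplification of the pipeline described above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

coarse_net = nn.Conv3d(1, 2, kernel_size=3, padding=1)   # placeholder stage-1 FCN
fine_net = nn.Conv3d(1, 2, kernel_size=3, padding=1)     # placeholder stage-2 FCN

def cascade_segment(volume, margin=8):
    """volume: (1, 1, D, H, W) CT volume; returns a voxel-wise label map."""
    # Stage 1: coarse prediction on a downsampled copy of the volume.
    low = F.interpolate(volume, scale_factor=0.5, mode="trilinear", align_corners=False)
    coarse = coarse_net(low).argmax(dim=1, keepdim=True)
    coarse = F.interpolate(coarse.float(), size=volume.shape[2:], mode="nearest")

    # Candidate region: bounding box of the coarse foreground, plus a safety margin.
    idx = coarse[0, 0].nonzero()
    if idx.numel() == 0:
        return coarse_net(volume).argmax(dim=1)           # fall back to a single stage
    lo = (idx.min(dim=0).values - margin).clamp(min=0)
    hi = idx.max(dim=0).values + margin

    # Stage 2: fine segmentation restricted to the cropped candidate region.
    crop = volume[:, :, lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
    fine = fine_net(crop).argmax(dim=1)

    out = torch.zeros_like(volume[:, 0], dtype=torch.long)
    out[:, lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]] = fine
    return out

pred = cascade_segment(torch.randn(1, 1, 64, 96, 96))
print(pred.shape)
```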