Medical studies of chronic disease are often concerned with the relation between longitudinal risk factor profiles and individuals' later-life disease outcomes. These profiles are typically subject to intermediate structural changes due to treatment or environmental influences. Such studies can be analysed within the joint model framework. However, current joint models consider neither structural changes in the residual variability of the risk profile nor the influence of subject-specific residual variability on the time-to-event outcome. In the present paper, we extend the joint model framework to accommodate these two sources of heterogeneous intra-individual variability. A Bayesian approach is used to estimate the unknown parameters, and simulation studies are conducted to investigate the performance of the method. The proposed joint model is applied to the Framingham Heart Study to investigate the influence of anti-hypertensive medication on systolic blood pressure variability together with its effect on the risk of developing cardiovascular disease. We show that anti-hypertensive medication is associated with elevated systolic blood pressure variability and that increased variability elevates the risk of developing cardiovascular disease.
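As a hedged illustration of how the two variability components could enter such a model, the sketch below writes a mixed-effects location-scale longitudinal submodel whose log residual variance depends on a structural-change covariate (e.g. a medication indicator) and on a subject-specific random effect that is shared with the hazard; the symbols and linear predictors are ours, not the authors' exact specification.

\[
y_{ij} = \mathbf{x}_{ij}^\top \boldsymbol{\beta} + \mathbf{z}_{ij}^\top \mathbf{b}_i + e_{ij}, \qquad e_{ij} \sim N(0, \sigma_{ij}^2),
\]
\[
\log \sigma_{ij}^2 = \mathbf{w}_{ij}^\top \boldsymbol{\tau} + \omega_i, \qquad
h_i(t) = h_0(t) \exp\{\gamma_1\, m_i(t) + \gamma_2\, \omega_i\},
\]

where $m_i(t)$ denotes subject $i$'s current underlying risk factor level, $\mathbf{w}_{ij}$ may include the treatment indicator driving structural changes in variability, and $\omega_i$ is a subject-specific variability effect that also enters the survival submodel.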
Peptides are formed by the dehydration condensation of multiple amino acids. The primary structure of a peptide can be represented either as an amino acid sequence or as a molecular graph consisting of atoms and chemical bonds. Previous studies have indicated that deep learning models tailored to the sequential and graphical forms of peptides achieve comparable performance on downstream tasks. Although these models learn representations of the same underlying peptides, we find that they explain their predictions differently. Viewing the sequential and graphical models as two experts making inferences from different perspectives, we fuse their expert knowledge to enrich the learned representations and improve discriminative performance. To this end, we propose a peptide co-modeling method, RepCon, which employs a contrastive learning-based framework to enhance the mutual information between representations from decoupled sequential and graphical end-to-end models. RepCon treats the representations produced by the sequential encoder and the graphical encoder for the same peptide as a positive pair, and learns to align the representations of positive pairs while repelling those of negative pairs. Empirical studies of RepCon and other co-modeling methods are conducted on open-source discriminative datasets, including aggregation propensity, retention time, antimicrobial peptide prediction, and family classification from the Peptide Database. Our results demonstrate the superiority of the co-modeling approach over independent modeling, as well as the superiority of RepCon over other methods under the co-modeling framework. In addition, attribution analysis of RepCon further corroborates the validity of the approach at the level of model explanation.
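A minimal sketch of the kind of contrastive alignment loss described here, written in PyTorch in an InfoNCE style; the function name, temperature, and symmetric cross-entropy form are illustrative assumptions, not RepCon's exact objective.

```python
# Hypothetical sketch: contrastive alignment between a sequential and a
# graphical peptide encoder. Matching rows of the two batches are positive
# pairs; all other pairings in the batch act as negatives.
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(h_seq, h_graph, temperature=0.1):
    """h_seq, h_graph: (batch, dim) representations of the same peptides
    from the sequence encoder and the graph encoder, respectively."""
    z_seq = F.normalize(h_seq, dim=-1)
    z_graph = F.normalize(h_graph, dim=-1)
    logits = z_seq @ z_graph.t() / temperature          # (batch, batch) similarities
    targets = torch.arange(z_seq.size(0), device=z_seq.device)
    # Symmetric cross-entropy: pull diagonal (positive pairs) together,
    # push off-diagonal (negative pairs) apart.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```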
NVIDIA researchers have pioneered an explicit method, position-based dynamics (PBD), for simulating systems with contact forces, and it has gained widespread use in computer graphics and animation. While the method yields visually compelling real-time simulations with surprising numerical stability, its scientific validity has been questioned due to a lack of rigorous analysis. In this paper, we introduce a new mathematical convergence analysis specifically tailored to PBD applied to first-order dynamics. Utilizing newly derived bounds for projections onto uniformly prox-regular sets, our proof extends classical compactness arguments. Our work paves the way for the reliable application of PBD in various scientific and engineering fields, including particle simulations with volume exclusion, agent-based models in mathematical biology, and inequality-constrained gradient-flow models.
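For readers unfamiliar with PBD, the sketch below shows one way a position-based step for first-order particle dynamics with volume exclusion could look: an unconstrained update followed by iterative projection onto pairwise non-overlap constraints. The function names and the simple Gauss-Seidel loop are our own illustration, not the paper's implementation.

```python
# Illustrative position-based-dynamics-style step for first-order dynamics
# with volume exclusion (non-overlapping particles of equal radius).
import numpy as np

def pbd_step(x, forces, radius, dt, iters=10):
    """x: (n, d) particle positions; forces: callable x -> (n, d);
    radius: particle radius; dt: time step."""
    x_pred = x + dt * forces(x)              # unconstrained first-order update
    n = x_pred.shape[0]
    for _ in range(iters):                   # project onto non-overlap constraints
        for i in range(n):
            for j in range(i + 1, n):
                d = x_pred[i] - x_pred[j]
                dist = np.linalg.norm(d)
                overlap = 2 * radius - dist
                if overlap > 0 and dist > 1e-12:
                    corr = 0.5 * overlap * d / dist
                    x_pred[i] += corr        # split the correction symmetrically
                    x_pred[j] -= corr
    return x_pred
```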
Sewer pipe network systems are an important part of civil infrastructure, and to find a good trade-off between maintenance costs and system performance, reliable sewer pipe degradation models are essential. In this paper, we present a large-scale case study in the city of Breda in the Netherlands. Our dataset covers sewer pipes built since the 1920s and includes several covariates. Several types of damage are recorded, but we focus on infiltrations, surface damage, and cracks, each with an associated severity index ranging from 1 to 5. To account for the characteristics of the sewer pipes, we define six cohorts of interest. Two types of discrete-time Markov chains (DTMCs), which we call Chain `Multi' and Chain `Single' (where Chain `Multi' allows additional transitions compared to Chain `Single'), are commonly used to model sewer pipe degradation at the pipeline level, and we evaluate which better suits our case study. To calibrate the DTMCs, we set up an optimization procedure using Sequential Least-Squares Programming to find the DTMC parameters that minimize the root mean weighted square error. Our results show that, for our case study, there is no substantial difference between Chain `Multi' and Chain `Single', but the latter has fewer parameters and is easier to train. Our DTMCs are useful for comparing the cohorts via expected values, e.g., concrete pipes carrying mixed and waste content reach severe levels of surface damage more quickly than concrete pipes carrying rainwater, a phenomenon commonly observed in practice.
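A hedged sketch of how such a calibration could be set up with SciPy's SLSQP solver for a `Single'-type chain (only one-step transitions to the next severity state). The objective, weights, toy data, and parameterization are illustrative assumptions, not the case study's exact setup.

```python
# Calibrating a "Single"-type DTMC over severity states 1..5 by minimizing a
# root mean weighted squared error between predicted and observed severity
# distributions at the inspected pipe ages.
import numpy as np
from scipy.optimize import minimize

N_STATES = 5  # severity index 1..5, state 5 treated as absorbing

def transition_matrix(theta):
    """theta[k] = probability of moving from severity k+1 to k+2 in one year."""
    P = np.zeros((N_STATES, N_STATES))
    for k in range(N_STATES - 1):
        P[k, k] = 1.0 - theta[k]
        P[k, k + 1] = theta[k]
    P[-1, -1] = 1.0
    return P

def objective(theta, ages, observed, weights):
    P = transition_matrix(theta)
    err = 0.0
    for age, obs, w in zip(ages, observed, weights):
        pred = np.linalg.matrix_power(P, age)[0]   # pipes start in state 1
        err += w * np.sum((pred - obs) ** 2)
    return np.sqrt(err / len(ages))

# Toy inspection data for illustration only.
ages = [10, 30, 60]
observed = [np.array([0.80, 0.15, 0.05, 0.00, 0.00]),
            np.array([0.40, 0.30, 0.20, 0.08, 0.02]),
            np.array([0.10, 0.20, 0.30, 0.25, 0.15])]
weights = [1.0, 1.0, 1.0]

res = minimize(objective, x0=np.full(N_STATES - 1, 0.05),
               args=(ages, observed, weights),
               method="SLSQP", bounds=[(0.0, 1.0)] * (N_STATES - 1))
print(res.x)
```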
Many natural systems are observed as point patterns in time, space, or space-time. Examples include plant and cellular systems, animal colonies, earthquakes, and wildfires. In practice, the locations of the points are not always observed correctly. However, in the point process literature, relatively little attention has been paid to errors in the locations of points. In this paper, we discuss how the observed point pattern may deviate from the actual point pattern and review methods and models that exist to handle such deviations. The discussion is supplemented with several scientific illustrations.
This article is concerned with a regularity analysis of parametric operator equations with a perspective on uncertainty quantification. We study the regularity of mappings between Banach spaces near branches of isolated solutions that are implicitly defined by a residual equation. Under $s$-Gevrey assumptions on the residual equation, we establish $s$-Gevrey bounds on the Fr\'echet derivatives of the local data-to-solution mapping. This abstract framework is illustrated in a proof of regularity bounds for a semilinear elliptic partial differential equation with parametric and random field input.
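For orientation, one common form of the kind of $s$-Gevrey derivative bound referred to here, stated for a data-to-solution map $S$ between Banach spaces $X$ and $Y$ (the constants and exact norms are illustrative, not the paper's statement), is

\[
\big\| d^{k} S(y) \big\|_{\mathcal{L}^{k}(X, Y)} \;\le\; C\, R^{k}\, (k!)^{s}, \qquad k \in \mathbb{N}_0,
\]

uniformly for $y$ in a neighbourhood of the solution branch; the case $s = 1$ corresponds to analytic dependence on the data.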
One of the challenges of studying common neurological disorders is disease heterogeneity, including differences in causes, neuroimaging characteristics, comorbidities, and genetic variation. Normative modelling has become a popular method for studying such cohorts: the 'normal' behaviour of a physiological system is modelled and can be used at the subject level to detect deviations related to disease pathology. For many heterogeneous diseases, we expect to observe abnormalities across a range of neuroimaging and biological variables. However, thus far, normative models have largely been developed for a single imaging modality. We aim to develop a multi-modal normative modelling framework in which abnormality is aggregated across variables from multiple modalities and which detects deviations better than uni-modal baselines. We propose two multi-modal VAE normative models to detect subject-level deviations across T1 and DTI data. Our proposed models were better able to detect diseased individuals, capture disease severity, and correlate with patient cognition than baseline approaches. We also propose a multivariate latent deviation metric, measuring deviations from the joint latent space, which outperformed feature-based metrics.
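One plausible form such a multivariate latent deviation metric could take is a Mahalanobis distance in the joint latent space of the multi-modal VAE, computed against the healthy training cohort; the sketch below is an assumption-laden illustration, not the authors' exact metric.

```python
# Illustrative multivariate latent deviation: Mahalanobis distance of each
# test subject's latent code from the distribution of healthy-control codes.
import numpy as np

def latent_deviation(z_train, z_test):
    """z_train: (n_train, k) latent means of healthy controls;
    z_test: (n_test, k) latent means of test subjects.
    Returns one deviation score per test subject."""
    mu = z_train.mean(axis=0)
    cov = np.cov(z_train, rowvar=False)
    cov_inv = np.linalg.pinv(cov)             # pseudo-inverse for numerical stability
    diff = z_test - mu
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))
```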
Temporal analysis of products (TAP) reactors enable experiments that probe numerous kinetic processes within a single set of experimental data through variations in pulse intensity, delay, or temperature. Selecting additional TAP experiments, however, often relies on arbitrary choices of reaction conditions or on chemical intuition. To make experiment selection in TAP more robust, we explore the efficacy of model-based design of experiments (MBDoE) for precision in TAP reactor kinetic modeling. We successfully applied this approach to a case study of synthetic oxidative propane dehydrogenation (OPDH) involving pulses of propane and oxygen. We found that experiments identified as optimal through MBDoE for precision generally reduce parameter uncertainties more than alternative experiments. The performance of MBDoE for model divergence was also explored for OPDH, with the relevant active sites (catalyst structure) being unknown. An experiment that maximized the divergence between the three proposed mechanisms was identified and led to clear mechanism discrimination. However, re-optimization of the kinetic parameters eliminated the ability to discriminate. These findings yield insight into the prospects and limitations of MBDoE for TAP and transient kinetic experiments.
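A hedged sketch of the precision-oriented MBDoE idea: rank candidate experiments by a D-optimality score built from parameter sensitivities of the simulated response. The sensitivity inputs and candidate structure are illustrative assumptions, not the study's actual workflow.

```python
# Rank candidate TAP experiments by the (log-)determinant of the Fisher
# information matrix assembled from model sensitivities.
import numpy as np

def d_optimality(sensitivities, sigma2=1.0):
    """sensitivities: (n_obs, n_params) Jacobian of the simulated outlet flux
    with respect to the kinetic parameters for one candidate experiment."""
    fim = sensitivities.T @ sensitivities / sigma2   # Fisher information matrix
    sign, logdet = np.linalg.slogdet(fim)
    return logdet if sign > 0 else -np.inf

def select_experiment(candidate_jacobians):
    """Pick the candidate whose information matrix has the largest determinant,
    i.e. the experiment expected to shrink parameter uncertainty the most."""
    scores = [d_optimality(J) for J in candidate_jacobians]
    return int(np.argmax(scores))
```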
We investigate a class of parametric elliptic semilinear partial differential equations of second order with homogeneous essential boundary conditions, where the coefficients and the right-hand side (and hence the solution) may depend on a parameter. This model can be seen as a reaction-diffusion problem with a polynomial nonlinearity in the reaction term. The efficiency of various numerical approximations across the entire parameter space is closely related to the regularity of the solution with respect to the parameter. We show that if the coefficients and the right-hand side are analytic or Gevrey-class regular with respect to the parameter, the same type of parametric regularity holds for the solution. The key ingredient of the proof is the combination of the alternative-to-factorial technique from our previous work [1] with a novel argument for the treatment of the power-type nonlinearity in the reaction term. As an application of this abstract result, we obtain rigorous convergence estimates for numerical integration of semilinear reaction-diffusion problems with random coefficients using Gaussian and quasi-Monte Carlo quadrature. Our theoretical findings are confirmed in numerical experiments.
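As a hedged illustration of the quasi-Monte Carlo side of such an application, the sketch below estimates an expectation of a quantity of interest over uniform parameters with a randomly shifted rank-1 lattice rule; the generating vector and the quantity-of-interest callable are placeholders, not taken from the paper.

```python
# Randomly shifted rank-1 lattice QMC estimate of E[Q(u(y))] over y in [0,1]^d.
import numpy as np

def qmc_expectation(qoi, dim, n_points=2**10, n_shifts=8, seed=None):
    """qoi: callable mapping a parameter vector y in [0,1]^dim to a scalar,
    e.g. a functional of the PDE solution computed for that parameter."""
    rng = np.random.default_rng(seed)
    z = rng.integers(1, n_points, size=dim)        # placeholder generating vector
    k = np.arange(n_points)[:, None]
    lattice = (k * z[None, :] / n_points) % 1.0    # rank-1 lattice points
    estimates = []
    for _ in range(n_shifts):                      # random shifts give an error estimate
        shift = rng.random(dim)
        pts = (lattice + shift) % 1.0
        estimates.append(np.mean([qoi(y) for y in pts]))
    return float(np.mean(estimates)), float(np.std(estimates) / np.sqrt(n_shifts))
```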
Artificial neural networks thrive at solving classification problems for a particular, rigidly defined task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, and endeavours to extend this knowledge without targeting the original task result in catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task-incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions are 1) a taxonomy and extensive overview of the state-of-the-art, 2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner, and 3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks: Tiny Imagenet, the large-scale unbalanced iNaturalist dataset, and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which tasks are presented, and we qualitatively compare methods in terms of required memory, computation time, and storage.