We present a novel computational model for the dynamics of alveolar recruitment/derecruitment (RD), which reproduces the underlying characteristics typically observed in injured lungs. The basic idea is a pressure- and time-dependent variation of the stress-free reference volume in reduced dimensional viscoelastic elements representing the acinar tissue. We choose a variable reference volume triggered by critical opening and closing pressures in a time-dependent manner from a straightforward mechanical point of view. In the case of (partially and progressively) collapsing alveolar structures, the volume available for expansion during breathing reduces and vice versa, eventually enabling consideration of alveolar collapse and reopening in our model. We further introduce a method for patient-specific determination of the underlying critical parameters of the new alveolar RD dynamics when integrated into the tissue elements, referred to as terminal units, of a spatially resolved physics-based lung model that simulates the human respiratory system in an anatomically correct manner. Relevant patient-specific parameters of the terminal units are herein determined based on medical image data and the macromechanical behavior of the lung during artificial ventilation. We test the whole modeling approach for a real-life scenario by applying it to the clinical data of a mechanically ventilated patient. The generated lung model is capable of reproducing clinical measurements such as tidal volume and pleural pressure during various ventilation maneuvers. We conclude that this new model is an important step toward personalized treatment of ARDS patients by considering potentially harmful mechanisms - such as cyclic RD and overdistension - and might help in the development of relevant protective ventilation strategies to reduce ventilator-induced lung injury (VILI).
We systematically study a wide variety of generative models spanning semantically-diverse image datasets to understand and improve the feature extractors and metrics used to evaluate them. Using best practices in psychophysics, we measure human perception of image realism for generated samples by conducting the largest experiment evaluating generative models to date, and find that no existing metric strongly correlates with human evaluations. Comparing to 17 modern metrics for evaluating the overall performance, fidelity, diversity, rarity, and memorization of generative models, we find that the state-of-the-art perceptual realism of diffusion models as judged by humans is not reflected in commonly reported metrics such as FID. This discrepancy is not explained by diversity in generated samples, though one cause is over-reliance on Inception-V3. We address these flaws through a study of alternative self-supervised feature extractors, find that the semantic information encoded by individual networks strongly depends on their training procedure, and show that DINOv2-ViT-L/14 allows for much richer evaluation of generative models. Next, we investigate data memorization, and find that generative models do memorize training examples on simple, smaller datasets like CIFAR10, but not necessarily on more complex datasets like ImageNet. However, our experiments show that current metrics do not properly detect memorization: none in the literature is able to separate memorization from other phenomena such as underfitting or mode shrinkage. To facilitate further development of generative models and their evaluation we release all generated image datasets, human evaluation data, and a modular library to compute 17 common metrics for 9 different encoders at //github.com/layer6ai-labs/dgm-eval.
Student simulation presents a transformative approach to enhance learning outcomes, advance educational research, and ultimately shape the future of effective pedagogy. We explore the feasibility of using large language models (LLMs), a remarkable achievement in AI, to simulate student learning behaviors. Unlike conventional machine learning based prediction, we leverage LLMs to instantiate virtual students with specific demographics and uncover intricate correlations among learning experiences, course materials, understanding levels, and engagement. Our objective is not merely to predict learning outcomes but to replicate learning behaviors and patterns of real students. We validate this hypothesis through three experiments. The first experiment, based on a dataset of N = 145, simulates student learning outcomes from demographic data, revealing parallels with actual students concerning various demographic factors. The second experiment (N = 4524) results in increasingly realistic simulated behaviors with more assessment history for virtual students modelling. The third experiment (N = 27), incorporating prior knowledge and course interactions, indicates a strong link between virtual students' learning behaviors and fine-grained mappings from test questions, course materials, engagement and understanding levels. Collectively, these findings deepen our understanding of LLMs and demonstrate its viability for student simulation, empowering more adaptable curricula design to enhance inclusivity and educational effectiveness.
The rising popularity of deep learning (DL) methods and techniques has invigorated interest in the topic of SE4DL, the application of software engineering (SE) practices on deep learning software. Despite the novel engineering challenges brought on by the data-driven and non-deterministic paradigm of DL software, little work has been invested into developing AI-targeted SE tools. On the other hand, tools tackling more general engineering issues in DL are actively used and referred to under the umbrella term of ``MLOps tools''. Furthermore, the available literature supports the utility of conventional SE tooling in DL software development. Building upon previous MSR research on tool usage in open-source software works, we identify conventional and MLOps tools adopted in popular applied DL projects that use Python as the main programming language. About 70% of the GitHub repositories mined contained at least one conventional SE tool. Software configuration management tools are the most adopted, while the opposite applies to maintenance tools. Substantially fewer MLOps tools were in use, with only 9 tools out of a sample of 80 used in at least one repository. The majority of them were open-source rather than proprietary. One of these tools, TensorBoard, was found to be adopted in about half of the repositories in our study. Consequently, the use of conventional SE tooling demonstrates its relevance to DL software. Further research is recommended on the adoption of MLOps tooling by open-source projects, focusing on the relevance of particular tool types, the development of required tools, as well as ways to promote the use of already available tools.
For multi-scale problems, the conventional physics-informed neural networks (PINNs) face some challenges in obtaining available predictions. In this paper, based on PINNs, we propose a practical deep learning framework for multi-scale problems by reconstructing the loss function and associating it with special neural network architectures. New PINN methods derived from the improved PINN framework differ from the conventional PINN method mainly in two aspects. First, the new methods use a novel loss function by modifying the standard loss function through a (grouping) regularization strategy. The regularization strategy implements a different power operation on each loss term so that all loss terms composing the loss function are of approximately the same order of magnitude, which makes all loss terms be optimized synchronously during the optimization process. Second, for the multi-frequency or high-frequency problems, in addition to using the modified loss function, new methods upgrade the neural network architecture from the common fully-connected neural network to special network architectures such as the Fourier feature architecture, and the integrated architecture developed by us. The combination of the above two techniques leads to a significant improvement in the computational accuracy of multi-scale problems. Several challenging numerical examples demonstrate the effectiveness of the proposed methods. The proposed methods not only significantly outperform the conventional PINN method in terms of computational efficiency and computational accuracy, but also compare favorably with the state-of-the-art methods in the recent literature. The improved PINN framework facilitates better application of PINNs to multi-scale problems.
Stress prediction in porous materials and structures is challenging due to the high computational cost associated with direct numerical simulations. Convolutional Neural Network (CNN) based architectures have recently been proposed as surrogates to approximate and extrapolate the solution of such multiscale simulations. These methodologies are usually limited to 2D problems due to the high computational cost of 3D voxel based CNNs. We propose a novel geometric learning approach based on a Graph Neural Network (GNN) that efficiently deals with three-dimensional problems by performing convolutions over 2D surfaces only. Following our previous developments using pixel-based CNN, we train the GNN to automatically add local fine-scale stress corrections to an inexpensively computed coarse stress prediction in the porous structure of interest. Our method is Bayesian and generates densities of stress fields, from which credible intervals may be extracted. As a second scientific contribution, we propose to improve the extrapolation ability of our network by deploying a strategy of online physics-based corrections. Specifically, we condition the posterior predictions of our probabilistic predictions to satisfy partial equilibrium at the microscale, at the inference stage. This is done using an Ensemble Kalman algorithm, to ensure tractability of the Bayesian conditioning operation. We show that this innovative methodology allows us to alleviate the effect of undesirable biases observed in the outputs of the uncorrected GNN, and improves the accuracy of the predictions in general.
The joint analysis of multimodal neuroimaging data is critical in the field of brain research because it reveals complex interactive relationships between neurobiological structures and functions. In this study, we focus on investigating the effects of structural imaging (SI) features, including white matter micro-structure integrity (WMMI) and cortical thickness, on the whole brain functional connectome (FC) network. To achieve this goal, we propose a network-based vector-on-matrix regression model to characterize the FC-SI association patterns. We have developed a novel multi-level dense bipartite and clique subgraph extraction method to identify which subsets of spatially specific SI features intensively influence organized FC sub-networks. The proposed method can simultaneously identify highly correlated structural-connectomic association patterns and suppress false positive findings while handling millions of potential interactions. We apply our method to a multimodal neuroimaging dataset of 4,242 participants from the UK Biobank to evaluate the effects of whole-brain WMMI and cortical thickness on the resting-state FC. The results reveal that the WMMI on corticospinal tracts and inferior cerebellar peduncle significantly affect functional connections of sensorimotor, salience, and executive sub-networks with an average correlation of 0.81 (p<0.001).
The eigenvalue decomposition (EVD) of (a batch of) Hermitian matrices of order two has a role in many numerical algorithms, of which the one-sided Jacobi method for the singular value decomposition (SVD) is the prime example. In this paper the batched EVD is vectorized, with a vector-friendly data layout and the AVX-512 SIMD instructions of Intel CPUs, alongside other key components of a real and a complex OpenMP-parallel Jacobi-type SVD method, inspired by the sequential xGESVJ routines from LAPACK. These vectorized building blocks should be portable to other platforms that support similar vector operations. Unconditional numerical reproducibility is guaranteed for the batched EVD, sequential or threaded, and for the column transformations, that are, like the scaled dot-products, presently sequential but can be threaded if nested parallelism is desired. No avoidable overflow of the results can occur with the proposed EVD or the whole SVD. The measured accuracy of the proposed EVD often surpasses that of the xLAEV2 routines from LAPACK. While the batched EVD outperforms the matching sequence of xLAEV2 calls, speedup of the parallel SVD is modest but can be improved and is already beneficial with enough threads. Regardless of their number, the proposed SVD method gives identical results, but of somewhat lower accuracy than xGESVJ.
This work presents a comparative study to numerically compute impulse approximate controls for parabolic equations with various boundary conditions. Theoretical controllability results have been recently investigated using a logarithmic convexity estimate at a single time based on a Carleman commutator approach. We propose a numerical algorithm for computing the impulse controls with minimal $L^2$-norms by adapting a penalized Hilbert Uniqueness Method (HUM) combined with a Conjugate Gradient (CG) method. We consider static boundary conditions (Dirichlet and Neumann) and dynamic boundary conditions. Some numerical experiments based on our developed algorithm are given to validate and compare the theoretical impulse controllability results.
We consider certain large random matrices, called random inner-product kernel matrices, which are essentially given by a nonlinear function $f$ applied entrywise to a sample-covariance matrix, $f(X^TX)$, where $X \in \mathbb{R}^{d \times N}$ is random and normalized in such a way that $f$ typically has order-one arguments. We work in the polynomial regime, where $N \asymp d^\ell$ for some $\ell > 0$, not just the linear regime where $\ell = 1$. Earlier work by various authors showed that, when the columns of $X$ are either uniform on the sphere or standard Gaussian vectors, and when $\ell$ is an integer (the linear regime $\ell = 1$ is particularly well-studied), the bulk eigenvalues of such matrices behave in a simple way: They are asymptotically given by the free convolution of the semicircular and Mar\v{c}enko-Pastur distributions, with relative weights given by expanding $f$ in the Hermite basis. In this paper, we show that this phenomenon is universal, holding as soon as $X$ has i.i.d. entries with all finite moments. In the case of non-integer $\ell$, the Mar\v{c}enko-Pastur term disappears (its weight in the free convolution vanishes), and the spectrum is just semicircular.
In an era where scientific experiments can be very costly, multi-fidelity emulators provide a useful tool for cost-efficient predictive scientific computing. For scientific applications, the experimenter is often limited by a tight computational budget, and thus wishes to (i) maximize predictive power of the multi-fidelity emulator via a careful design of experiments, and (ii) ensure this model achieves a desired error tolerance with some notion of confidence. Existing design methods, however, do not jointly tackle objectives (i) and (ii). We propose a novel stacking design approach that addresses both goals. A multi-level reproducing kernel Hilbert space (RKHS) interpolator is first introduced to build the emulator, under which our stacking design provides a sequential approach for designing multi-fidelity runs such that a desired prediction error of $\epsilon > 0$ is met under regularity assumptions. We then prove a novel cost complexity theorem that, under this multi-level interpolator, establishes a bound on the computation cost (for training data simulation) needed to achieve a prediction bound of $\epsilon$. This result provides novel insights on conditions under which the proposed multi-fidelity approach improves upon a conventional RKHS interpolator which relies on a single fidelity level. Finally, we demonstrate the effectiveness of stacking designs in a suite of simulation experiments and an application to finite element analysis.