Irksome is a library based on the Unified Form Language (UFL) that enables automated generation of Runge--Kutta methods for time-stepping finite element spatial discretizations of partial differential equations (PDEs). Allowing users to express semidiscrete forms of PDEs, it generates UFL representations for the stage-coupled variational problems to be solved at each time step. The Firedrake package then generates efficient code for evaluating these variational problems and gives users a wide range of options for deploying efficient algebraic solvers in PETSc. In this paper, we describe several recent advances in Irksome. These include alternate formulations of the Runge--Kutta time-stepping methods and optimized support for diagonally implicit (DIRK) methods. Additionally, we present new and improved tools for building preconditioners for the resulting linear and linearized systems, demonstrating that these can lead to efficient approaches for solving fully implicit Runge--Kutta discretizations. The new features are illustrated through a sequence of computational examples that show the high-level interface and the solver performance obtained.
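To make the phrase "stage-coupled" concrete, the following minimal sketch applies a fully implicit Runge--Kutta method (the two-stage Gauss--Legendre scheme) to the scalar test equation u' = lam * u. It does not use Irksome, UFL, Firedrake, or PETSc; the function names `gauss2_step` and `integrate` and the test problem are assumptions chosen for illustration. The point is only that both stages must be solved for simultaneously at each time step.

```python
import math

# Butcher matrix A and weights B of the two-stage Gauss-Legendre method,
# a fully implicit Runge-Kutta scheme of order four.
S3 = math.sqrt(3.0)
A = [[0.25, 0.25 - S3 / 6.0],
     [0.25 + S3 / 6.0, 0.25]]
B = [0.5, 0.5]


def gauss2_step(u, lam, dt):
    """Advance u' = lam * u by one step of size dt (illustrative helper).

    The stages satisfy k_i = lam * (u + dt * sum_j A[i][j] * k_j), which is
    the coupled linear system (I - dt * lam * A) k = lam * u * [1, 1].
    """
    m00 = 1.0 - dt * lam * A[0][0]
    m01 = -dt * lam * A[0][1]
    m10 = -dt * lam * A[1][0]
    m11 = 1.0 - dt * lam * A[1][1]
    rhs = lam * u
    det = m00 * m11 - m01 * m10
    # Cramer's rule for the 2x2 stage system.
    k1 = (rhs * m11 - m01 * rhs) / det
    k2 = (m00 * rhs - m10 * rhs) / det
    return u + dt * (B[0] * k1 + B[1] * k2)


def integrate(u0, lam, dt, n_steps):
    """Take n_steps steps of size dt starting from u0."""
    u = u0
    for _ in range(n_steps):
        u = gauss2_step(u, lam, dt)
    return u


approx = integrate(1.0, -1.0, 0.1, 10)  # approximates exp(-1) at t = 1
```

For a PDE, each scalar entry of the 2x2 system above becomes a finite element operator block, which is why preconditioning the coupled stage system matters.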
Recent advancements in Spatial Transcriptomics (ST) technology have facilitated detailed gene expression analysis within tissue contexts. However, the high costs and methodological limitations of ST necessitate a more robust predictive model. In response, this paper introduces TRIPLEX, a novel deep learning framework designed to predict spatial gene expression from Whole Slide Images (WSIs). TRIPLEX uniquely harnesses multi-resolution features, capturing cellular morphology at individual spots, the local context around these spots, and the global tissue organization. By integrating these features through an effective fusion strategy, TRIPLEX achieves accurate gene expression prediction. Our comprehensive benchmark study, conducted on three public ST datasets and supplemented with Visium data from 10X Genomics, demonstrates that TRIPLEX outperforms current state-of-the-art models in Mean Squared Error (MSE), Mean Absolute Error (MAE), and Pearson Correlation Coefficient (PCC). The model's predictions align closely with ground truth gene expression profiles and tumor annotations, underscoring TRIPLEX's potential in advancing cancer diagnosis and treatment.
As one of the emerging challenges in Automated Machine Learning, Hardware-aware Neural Architecture Search (HW-NAS) tasks can be treated as black-box multi-objective optimization problems (MOPs). An important application of HW-NAS is real-time semantic segmentation, which plays a pivotal role in autonomous driving scenarios. HW-NAS for real-time semantic segmentation inherently needs to balance multiple optimization objectives, including model accuracy, inference speed, and hardware-specific considerations. Despite its importance, benchmarks have yet to be developed to frame such a challenging task as multi-objective optimization. To bridge the gap, we introduce a tailored streamline to transform the task of HW-NAS for real-time semantic segmentation into standard MOPs. Building upon the streamline, we present a benchmark test suite, CitySeg/MOP, comprising fifteen MOPs derived from the Cityscapes dataset. The CitySeg/MOP test suite is integrated into the EvoXBench platform to provide seamless interfaces with various programming languages (e.g., Python and MATLAB) for instant fitness evaluations. We comprehensively assessed the CitySeg/MOP test suite on various multi-objective evolutionary algorithms, showcasing its versatility and practicality. Source code is available at https://github.com/EMI-Group/evoxbench.
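As a small illustration of what treating architecture search as a multi-objective problem means, the sketch below filters a set of candidates down to its Pareto-optimal (non-dominated) subset. It does not use EvoXBench or CitySeg/MOP; the helper names and the (segmentation error, latency in ms) pairs are made-up values for illustration only, and both objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the non-dominated subset of points, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (segmentation error, latency in ms) pairs, not benchmark data.
candidates = [
    (0.30, 12.0),
    (0.25, 20.0),
    (0.28, 25.0),
    (0.35, 10.0),
    (0.25, 18.0),
]

front = pareto_front(candidates)
```

No single candidate on the front is best in both objectives, which is why such tasks are evaluated with multi-objective algorithms rather than a single scalar score.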
In this paper, we develop uniform- and variable-time-step weighted and shifted BDF2 (WSBDF2) methods for the anisotropic Cahn-Hilliard (CH) model, combining the scalar auxiliary variable (SAV) approach with two types of stabilization techniques. Using the concept of $G$-stability, the uniform-time-step WSBDF2 method is proved to be energy-stable. For the variable-time-step WSBDF2 method, the relevant $G$-stability properties are inapplicable, so a different technique is adopted to demonstrate its energy stability. In addition, both numerical schemes are mass-conservative. Finally, numerous numerical simulations are presented to demonstrate the stability and accuracy of these schemes.
In (Dzanic, J. Comp. Phys., 508:113010, 2024), a limiting approach for high-order discontinuous Galerkin schemes was introduced which allowed for imposing constraints on the solution continuously (i.e., everywhere within the element). While exact for linear constraint functionals, this approach only imposed a sufficient (but not the minimum necessary) amount of limiting for nonlinear constraint functionals. This short note shows how this limiting approach can be extended to allow exactness for general nonlinear quasiconcave constraint functionals through a nonlinear limiting procedure, reducing unnecessary numerical dissipation. Some examples are shown for nonlinear pressure and entropy constraints in the compressible gas dynamics equations, where both analytic and iterative approaches are used.
This paper presents TimelinePTC, a web-based tool developed to improve the collection and analysis of Pathways to Care (PTC) data in first episode psychosis (FEP) research. Accurately measuring the duration of untreated psychosis (DUP) is essential for effective FEP treatment, requiring detailed understanding of the patient's journey to care. However, traditional PTC data collection methods, mainly manual and paper-based, are time-consuming and often fail to capture the full complexity of care pathways. TimelinePTC addresses these limitations by providing a digital platform for collaborative, real-time data entry and visualization, thereby enhancing data accuracy and collection efficiency. Initially created for the Specialized Treatment Early in Psychosis (STEP) program in New Haven, Connecticut, its design allows for straightforward adaptation to other healthcare contexts, facilitated by its open-source codebase. The tool significantly simplifies the data collection process, making it more efficient and user-friendly. It automates the conversion of collected data into a format ready for analysis, reducing manual transcription errors and saving time. By enabling more detailed and consistent data collection, TimelinePTC has the potential to improve healthcare access research, supporting the development of targeted interventions to reduce DUP and improve patient outcomes.
This work presents GAL{\AE}XI as a novel, energy-efficient flow solver for the simulation of compressible flows on unstructured meshes leveraging the parallel computing power of modern Graphics Processing Units (GPUs). GAL{\AE}XI implements the high-order Discontinuous Galerkin Spectral Element Method (DGSEM) using shock capturing with a finite-volume subcell approach to ensure the stability of the high-order scheme near shocks. This work provides details on the general code design, the parallelization strategy, and the implementation approach for the compute kernels with a focus on the element-local mappings between volume and surface data due to the unstructured mesh. GAL{\AE}XI exhibits excellent strong scaling properties up to 1024 GPUs if each GPU is assigned a minimum of one million degrees of freedom. To verify its implementation, a convergence study is performed that recovers the theoretical order of convergence of the implemented numerical schemes. Moreover, the solver is validated using both the incompressible and compressible formulations of the Taylor-Green vortex at Mach numbers of 0.1 and 1.25, respectively. A mesh convergence study shows that the results converge to the high-fidelity reference solution and that the results match the original CPU implementation. Finally, GAL{\AE}XI is applied to a large-scale wall-resolved large eddy simulation of a linear cascade of the NASA Rotor 37. Here, the supersonic region and shocks at the leading edge are captured accurately and robustly by the implemented shock-capturing approach. It is demonstrated that GAL{\AE}XI requires less than half of the energy to carry out this simulation in comparison to the reference CPU implementation. This makes GAL{\AE}XI a potent tool for accurate and efficient simulations of compressible flows in the realm of exascale computing and the associated new HPC architectures.
We introduce a framework rooted in a rate distortion problem for Markov chains, and show how a suite of commonly used Markov Chain Monte Carlo (MCMC) algorithms are specific instances within it, where the target stationary distribution is controlled by the distortion function. Our approach offers a unified variational view on the optimality of algorithms such as Metropolis-Hastings, Glauber dynamics, the swapping algorithm and Feynman-Kac path models. Along the way, we analyze factorizability and geometry of multivariate Markov chains. Specifically, we demonstrate that induced chains on factors of a product space can be regarded as information projections with respect to a particular divergence. This perspective yields Han--Shearer type inequalities for Markov chains as well as applications in the context of large deviations and mixing time comparison.
We propose a novel method (floZ), based on normalizing flows, for estimating the Bayesian evidence (and its numerical uncertainty) from a set of samples drawn from the unnormalized posterior distribution. We validate it on distributions whose evidence is known analytically, up to 15 parameter space dimensions, and compare with two state-of-the-art techniques for estimating the evidence: nested sampling (which computes the evidence as its main target) and a k-nearest-neighbors technique that produces evidence estimates from posterior samples. Provided representative samples from the target posterior are available, our method is more robust to posterior distributions with sharp features, especially in higher dimensions. It has wide applicability, e.g., to estimate the evidence from variational inference, Markov-chain Monte Carlo samples, or any other method that delivers samples from the unnormalized posterior density.
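The sketch below illustrates the identity that makes evidence estimation from posterior samples possible: if q is a normalized density matching the posterior, then the unnormalized posterior divided by q equals the evidence Z at every point. This is not the floZ implementation. A one-dimensional Gaussian fitted by moments is swapped in for the normalizing flow, and the toy target, the function names, and the use of the sample spread as a crude error indicator are all assumptions made for illustration.

```python
import math
import random


def log_unnorm_posterior(theta):
    """Toy unnormalized posterior: 3 * exp(-(theta - 1)^2 / (2 * 0.5^2))."""
    return math.log(3.0) - (theta - 1.0) ** 2 / (2.0 * 0.5 ** 2)


def fit_gaussian(samples):
    """Fit a normalized 1D Gaussian density (the stand-in for a flow)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n

    def log_q(theta):
        return -0.5 * math.log(2.0 * math.pi * var) - (theta - mean) ** 2 / (2.0 * var)

    return log_q


def estimate_log_evidence(samples, log_p_tilde, log_q):
    """Average log(p_tilde / q) over the samples; also return its spread."""
    ratios = [log_p_tilde(s) - log_q(s) for s in samples]
    mean = sum(ratios) / len(ratios)
    spread = math.sqrt(sum((r - mean) ** 2 for r in ratios) / len(ratios))
    return mean, spread


rng = random.Random(0)
# Samples from the toy posterior, which is a Gaussian with mean 1 and std 0.5.
samples = [rng.gauss(1.0, 0.5) for _ in range(5000)]
log_q = fit_gaussian(samples)
log_z, spread = estimate_log_evidence(samples, log_unnorm_posterior, log_q)

# Analytic evidence of the toy target: 3 * 0.5 * sqrt(2 * pi).
log_z_exact = math.log(3.0 * 0.5 * math.sqrt(2.0 * math.pi))
```

A Gaussian fit only works here because the toy posterior is itself Gaussian; a more flexible density model is needed when the posterior has sharp or multimodal features.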
The Hierarchy Of Time-Surfaces (HOTS) algorithm, a neuromorphic approach for feature extraction from event data, presents promising capabilities but faces challenges in accuracy and compatibility with neuromorphic hardware. In this paper, we introduce Sup3r, a Semi-Supervised algorithm aimed at addressing these challenges. Sup3r enhances sparsity, stability, and separability in the HOTS networks. It enables end-to-end online training of HOTS networks replacing external classifiers, by leveraging semi-supervised learning. Sup3r learns class-informative patterns, mitigates confounding features, and reduces the number of processed events. Moreover, Sup3r facilitates continual and incremental learning, allowing adaptation to data distribution shifts and learning new tasks without forgetting. Preliminary results on N-MNIST demonstrate that Sup3r achieves comparable accuracy to similarly sized Artificial Neural Networks trained with back-propagation. This work showcases the potential of Sup3r to advance the capabilities of HOTS networks, offering a promising avenue for neuromorphic algorithms in real-world applications.
We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. It is a simple residual network that alternates (i) a linear layer in which image patches interact, independently and identically across channels, and (ii) a two-layer feed-forward network in which channels interact independently per patch. When trained with a modern training strategy using heavy data-augmentation and optionally distillation, it attains surprisingly good accuracy/complexity trade-offs on ImageNet. We will share our code based on the Timm library and pre-trained models.
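The two alternating sublayers described above can be sketched in a few lines of dependency-free Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it omits biases and any normalization or scaling layers, uses GELU as the nonlinearity, and all function names, shapes, and random weights are hypothetical. The input `x` is a matrix with one row per patch and one column per channel.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def add(a, b):
    """Elementwise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def gelu(x):
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def cross_patch(x, w_patch):
    """Residual linear layer mixing patches; the same w_patch acts on every channel.

    x has shape (patches, channels); w_patch has shape (patches, patches).
    """
    return add(x, matmul(w_patch, x))


def cross_channel(x, w1, w2):
    """Residual two-layer MLP mixing channels, applied to each patch (row) separately."""
    hidden = [[gelu(v) for v in row] for row in matmul(x, w1)]
    return add(x, matmul(hidden, w2))


def resmlp_block(x, w_patch, w1, w2):
    """One block: cross-patch sublayer followed by cross-channel sublayer."""
    return cross_channel(cross_patch(x, w_patch), w1, w2)


def random_matrix(rng, rows, cols, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_patches, n_channels, n_hidden = 4, 3, 6
x = random_matrix(rng, n_patches, n_channels, scale=1.0)
w_patch = random_matrix(rng, n_patches, n_patches)
w1 = random_matrix(rng, n_channels, n_hidden)
w2 = random_matrix(rng, n_hidden, n_channels)
y = resmlp_block(x, w_patch, w1, w2)  # same shape as x
```

Multiplying by `w_patch` on the left mixes rows (patches) without mixing columns, while multiplying by `w1` and `w2` on the right mixes columns (channels) without mixing rows, which is the separation the abstract describes.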