Species transport models typically combine partial differential equations (PDEs) with relations from hindered transport theory to quantify electromigrative, convective, and diffusive transport through complex nanoporous systems; however, these formulations are frequently substantial simplifications of the governing dynamics, which leads to poor generalization performance of PDE-based models. Given the growing interest in deep learning methods for the physical sciences, we develop a machine learning-based approach to characterize ion transport across nanoporous membranes. Our proposed framework centers around attention-enhanced neural differential equations that incorporate electroneutrality-based inductive biases to improve generalization performance relative to conventional PDE-based methods. In addition, we study the role of the attention mechanism in illuminating physically meaningful ion-pairing relationships across diverse mixture compositions. Further, we investigate the importance of pre-training on simulated data from PDE-based models, as well as the performance benefits from hard vs. soft inductive biases. Our results indicate that physics-informed deep learning solutions can outperform their classical PDE-based counterparts and provide promising avenues for modelling complex transport phenomena across diverse applications.
State-of-the-art methods for Bayesian inference in state-space models are (a) conditional sequential Monte Carlo (CSMC) algorithms; (b) sophisticated 'classical' MCMC algorithms like MALA, or mGRAD from Titsias and Papaspiliopoulos (2018, arXiv:1610.09641v3 [stat.ML]). The former propose $N$ particles at each time step to exploit the model's 'decorrelation-over-time' property and thus scale favourably with the time horizon, $T$, but break down if the dimension of the latent states, $D$, is large. The latter leverage gradient-/prior-informed local proposals to scale favourably with $D$ but exhibit sub-optimal scalability with $T$ due to a lack of model-structure exploitation. We introduce methods which combine the strengths of both approaches. The first, Particle-MALA, spreads $N$ particles locally around the current state using gradient information, thus extending MALA to $T > 1$ time steps and $N > 1$ proposals. The second, Particle-mGRAD, additionally incorporates (conditionally) Gaussian prior dynamics into the proposal, thus extending the mGRAD algorithm to $T > 1$ time steps and $N > 1$ proposals. We prove that Particle-mGRAD interpolates between CSMC and Particle-MALA, resolving the 'tuning problem' of choosing between CSMC (superior for highly informative prior dynamics) and Particle-MALA (superior for weakly informative prior dynamics). We similarly extend other 'classical' MCMC approaches like auxiliary MALA, aGRAD, and preconditioned Crank-Nicolson-Langevin (PCNL) to $T > 1$ time steps and $N > 1$ proposals. In experiments, for both highly and weakly informative prior dynamics, our methods substantially improve upon both CSMC and sophisticated 'classical' MCMC approaches.
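For orientation, the building block that Particle-MALA extends is the plain MALA update, i.e. the case $T = 1$, $N = 1$. The following is only a minimal sketch of that classical step, not of the particle methods themselves. The function name `mala_step`, the step size, and the standard Gaussian toy target are assumptions made for illustration.

```python
import numpy as np

def mala_step(x, log_target, grad_log_target, step, rng):
    """One Metropolis-adjusted Langevin (MALA) update for a single state x."""
    # Gradient-informed Gaussian proposal centred on a Langevin drift step.
    mean_fwd = x + 0.5 * step * grad_log_target(x)
    prop = mean_fwd + np.sqrt(step) * rng.standard_normal(x.shape)
    # Mean of the reverse proposal, needed for the Metropolis-Hastings ratio.
    mean_bwd = prop + 0.5 * step * grad_log_target(prop)
    log_q_fwd = -np.sum((prop - mean_fwd) ** 2) / (2.0 * step)
    log_q_bwd = -np.sum((x - mean_bwd) ** 2) / (2.0 * step)
    log_alpha = log_target(prop) - log_target(x) + log_q_bwd - log_q_fwd
    if np.log(rng.uniform()) < log_alpha:
        return prop, True
    return x, False

# Toy target (an assumption for this sketch): standard Gaussian in D = 2.
def log_target(x):
    return -0.5 * np.sum(x ** 2)

def grad_log_target(x):
    return -x

rng = np.random.default_rng(0)
x = np.zeros(2)
chain, n_accepted = [], 0
for _ in range(5000):
    x, accepted = mala_step(x, log_target, grad_log_target, 0.5, rng)
    chain.append(x)
    n_accepted += accepted
samples = np.array(chain)
acc_rate = n_accepted / len(chain)
```

In this single-proposal form the whole state is updated at once. The particle variants described above instead spread $N$ such gradient-informed proposals per time step.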
A new sparse semiparametric model is proposed, which incorporates the influence of two functional random variables on a scalar response in a flexible and interpretable manner. One of the functional covariates is included through a single-index structure, while the other is included linearly through the high-dimensional vector formed by its discretised observations. For this model, two new algorithms are presented for selecting relevant variables in the linear part and estimating the model. Both procedures utilise the functional origin of the linear covariates. Finite sample experiments demonstrate the scope of application of both algorithms: the first method is a fast algorithm that provides a solution (without loss in predictive ability) for the significant computational time required by standard variable selection methods for estimating this model, and the second algorithm completes the set of relevant linear covariates provided by the first, thus improving its predictive efficiency. Some asymptotic results theoretically support both procedures. A real data application demonstrates the applicability of the presented methodology from a predictive perspective in terms of the interpretability of outputs and low computational cost.
We present a manifold-based autoencoder method for learning nonlinear dynamics in time, notably partial differential equations (PDEs), in which the manifold latent space evolves according to Ricci flow. This can be accomplished by simulating Ricci flow in a physics-informed setting, and manifold quantities can be matched so that Ricci flow is empirically achieved. With our methodology, the manifold is learned as part of the training procedure, so ideal geometries may be discerned, while the evolution simultaneously induces a more accommodating latent representation than static methods. We demonstrate our method on a range of numerical experiments consisting of PDEs that encompass desirable characteristics such as periodicity and randomness, and report errors in both in-distribution and extrapolation scenarios.
Random fields are ubiquitous mathematical structures in physics, with applications ranging from thermodynamics and statistical physics to quantum field theory and cosmology. Recent works on the information geometry of Gaussian random fields proposed mathematical expressions for the components of the metric tensor of the underlying parametric space, allowing the computation of the Gaussian curvature at each point of the manifold that represents the space of all possible parameter values defining such a mathematical model. A key result in the dynamics of these random fields concerns the curvature effect, a series of variations in the curvature of the parametric space that occurs when there is a significant increase or decrease in the inverse temperature parameter. In this paper, we propose a numerical algorithm for the computation of geodesic curves in the Gaussian random fields manifold by deriving the 27 Christoffel symbols of the metric required for the definition of the Euler-Lagrange equations. The fourth-order Runge-Kutta method is applied to solve the Euler-Lagrange equations using an iterative approach based on Markov Chain Monte Carlo simulation. Our results reveal that, when the system undergoes phase transitions, the geodesic dispersion phenomenon emerges: the geodesic curve obtained by reversing the system of differential equations in time diverges from the original geodesic curve as we move from zero curvature configurations (Euclidean geometry) to negative curvature configurations (hyperbolic-like geometry), and vice versa. This phenomenon suggests that time irreversibility in random field dynamics can be a direct consequence of the geometry of the underlying parametric space.
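As a rough illustration of the numerical ingredient mentioned above, the sketch below applies the fourth-order Runge-Kutta method to geodesic equations. The random field metric and its 27 Christoffel symbols are not reproduced here. As a stand-in, the sketch assumes the two-dimensional hyperbolic upper half-plane with metric $ds^2 = (dx^2 + dy^2)/y^2$, whose nonzero Christoffel symbols are $\Gamma^x_{xy} = \Gamma^x_{yx} = -1/y$, $\Gamma^y_{xx} = 1/y$ and $\Gamma^y_{yy} = -1/y$. All function names are hypothetical.

```python
import numpy as np

def geodesic_rhs(state):
    """Geodesic equations on the upper half-plane as a first-order system."""
    x, y, vx, vy = state
    ax = 2.0 * vx * vy / y            # x'' = -2 * Gamma^x_{xy} * x' * y'
    ay = (vy ** 2 - vx ** 2) / y      # y'' = -Gamma^y_{xx} x'^2 - Gamma^y_{yy} y'^2
    return np.array([vx, vy, ax, ay])

def rk4_step(f, state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(state)
    k2 = f(state + 0.5 * h * k1)
    k3 = f(state + 0.5 * h * k2)
    k4 = f(state + h * k3)
    return state + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def integrate_geodesic(state0, h=0.01, n_steps=100):
    """Integrate a geodesic from state0 = (x, y, x', y') for n_steps steps."""
    state = np.asarray(state0, dtype=float)
    for _ in range(n_steps):
        state = rk4_step(geodesic_rhs, state, h)
    return state

# Unit-speed geodesic starting at (0, 1) and heading in the +x direction.
end = integrate_geodesic([0.0, 1.0, 1.0, 0.0])
```

For this stand-in metric the exact solution is known, $x(t) = \tanh t$ and $y(t) = 1/\cosh t$, a unit semicircle centred on the $x$-axis, which gives a simple check on the integrator before a more complicated metric is substituted.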
Pre-trained on a large and diverse dataset, the segment anything model (SAM) is the first promptable foundation model in computer vision aimed at object segmentation tasks. In this work, we evaluate the performance of SAM on nuclear instance segmentation under zero-shot learning and finetuning. We compare SAM with other representative methods in nuclear instance segmentation, especially in the context of model generalisability. To achieve automatic nuclear instance segmentation, we propose using a nuclei detection model to provide bounding boxes or central points of nuclei as visual prompts for SAM in generating nuclear instance masks from histology images.
We present a novel combination of dynamic embedded topic models and change-point detection to explore diachronic change of lexical semantic modality in classical and early Christian Latin. We demonstrate several methods for finding and characterizing patterns in the output, and relating them to traditional scholarship in Comparative Literature and Classics. This simple approach to unsupervised models of semantic change can be applied to any suitable corpus, and we conclude with future directions and refinements aiming to allow noisier, less-curated materials to meet that threshold.
The problem of straggler mitigation in distributed matrix multiplication (DMM) is considered for a large number of worker nodes and a fixed small finite field. Polynomial codes and matdot codes are generalized by making use of algebraic function fields (i.e., algebraic functions over an algebraic curve) over a finite field. The construction of optimal solutions is translated to a combinatorial problem on the Weierstrass semigroups of the corresponding algebraic curves. Optimal or almost optimal solutions are provided. These have the same computational complexity per worker as classical polynomial and matdot codes, and their recovery thresholds are almost optimal in the asymptotic regime (growing number of workers and a fixed finite field).
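To make the starting point concrete, the following sketch shows a classical polynomial code for distributed matrix multiplication over a small prime field. It does not implement the algebraic function field construction. The field size, matrix sizes, block splitting, and choice of stragglers are all assumptions made for illustration, and `inv_mod_matrix` is a hypothetical helper.

```python
import numpy as np

P = 7  # small prime field GF(7), chosen only for illustration

def inv_mod_matrix(M, p):
    """Invert a square matrix modulo a prime p by Gauss-Jordan elimination."""
    n = len(M)
    A = [[M[i][j] % p for j in range(n)] + [int(i == j) for j in range(n)]
         for i in range(n)]
    for col in range(n):
        piv = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        inv = pow(A[col][col], -1, p)          # modular inverse (Python 3.8+)
        A[col] = [(v * inv) % p for v in A[col]]
        for r in range(n):
            if r != col and A[r][col]:
                f = A[r][col]
                A[r] = [(a - f * b) % p for a, b in zip(A[r], A[col])]
    return [row[n:] for row in A]

A = np.array([[1, 2], [3, 4], [5, 6], [0, 1]]) % P   # 4 x 2
B = np.array([[1, 0, 2, 3], [4, 5, 6, 1]]) % P       # 2 x 4
A0, A1 = A[:2], A[2:]          # split A into two row blocks
B0, B1 = B[:, :2], B[:, 2:]    # split B into two column blocks

# Each worker holds one distinct evaluation point x and multiplies the
# encoded blocks (A0 + A1 x)(B0 + B1 x^2), a degree-3 matrix polynomial.
points = [1, 2, 3, 4, 5, 6]
worker_out = {x: ((A0 + A1 * x) @ (B0 + B1 * x * x)) % P for x in points}

# Any 4 responses suffice; suppose the workers at x = 2 and x = 5 straggle.
fast = [1, 3, 4, 6]
V = [[pow(x, k, P) for k in range(4)] for x in fast]   # Vandermonde matrix
Vinv = np.array(inv_mod_matrix(V, P))
stacked = np.stack([worker_out[x] for x in fast])      # shape (4, 2, 2)
coeffs = np.tensordot(Vinv, stacked, axes=1) % P       # coeffs[k] multiplies x^k

# Coefficients are A0B0, A1B0, A0B1, A1B1 for x^0, x^1, x^2, x^3.
C = np.block([[coeffs[0], coeffs[2]], [coeffs[1], coeffs[3]]])
```

Because the evaluation points must be distinct field elements, this classical scheme supports at most as many workers as the field has elements. That appears to be the restriction that matters when the number of workers is large and the field is small and fixed.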
The celebrated Kleene fixed point theorem is crucial in the mathematical modelling of recursive specifications in Denotational Semantics. In this paper we discuss whether the hypotheses of the aforementioned result can be weakened. An affirmative answer is provided, yielding a characterization of those properties that a self-mapping must satisfy in order to guarantee that its set of fixed points is non-empty when no notion of completeness is assumed to be satisfied by the partially ordered set. Moreover, the case in which the partially ordered set arises from a quasi-metric space is treated in depth. Finally, an application of the theory is given. Concretely, we present a mathematical method to discuss the asymptotic complexity of those algorithms whose running time of computing fulfills a recurrence equation. This method recovers the fixed point based methods that appear in the literature for asymptotic complexity analysis of algorithms. Moreover, it improves on them because it imposes fewer requirements than those that have been assumed in the literature and, in addition, it allows one to state simultaneously upper and lower asymptotic bounds for the running time of computing.
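For readers unfamiliar with the classical statement, Kleene's theorem obtains the least fixed point of a suitable monotone self-mapping as the supremum of the chain $\bot \le f(\bot) \le f(f(\bot)) \le \dots$. The sketch below illustrates only this classical iteration, on a finite powerset lattice where the chain must stabilise. It does not illustrate the weakened hypotheses discussed above. The graph and the function names are assumptions made for illustration.

```python
def kleene_lfp(f, bottom, max_iter=10_000):
    """Iterate f from the bottom element until the ascending chain stabilises."""
    x = bottom
    for _ in range(max_iter):
        nxt = f(x)
        if nxt == x:
            return x
        x = nxt
    raise RuntimeError("chain did not stabilise within max_iter iterations")

# Hypothetical directed graph: node -> set of successor nodes.
edges = {0: {1}, 1: {2}, 2: {0}, 3: {4}, 4: set()}

def step(s):
    """Monotone map on sets of nodes: node 0 plus all successors of s."""
    return frozenset({0}) | frozenset(t for n in s for t in edges[n])

# Least fixed point of `step` = set of nodes reachable from node 0.
reachable = kleene_lfp(step, frozenset())
```

Here the poset is the powerset of nodes ordered by inclusion, the bottom element is the empty set, and the iteration passes through $\{0\}$, $\{0, 1\}$ and $\{0, 1, 2\}$ before stopping.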
We study the sharp interface limit of the stochastic Cahn-Hilliard equation with cubic double-well potential and additive space-time white noise $\epsilon^{\sigma}\dot{W}$ where $\epsilon>0$ is an interfacial width parameter. We prove that, for sufficiently large scaling constant $\sigma >0$, the stochastic Cahn-Hilliard equation converges to the deterministic Mullins-Sekerka/Hele-Shaw problem for $\epsilon\rightarrow 0$. The convergence is shown in suitable fractional Sobolev norms as well as in the $L^p$-norm for $p\in (2, 4]$ in spatial dimension $d=2,3$. This generalizes the existing result for the space-time white noise to dimension $d=3$ and improves the existing results for smooth noise, which were so far limited to $p\in \left(2, \frac{2d+8}{d+2}\right]$ in spatial dimension $d=2,3$. As a byproduct of the analysis of the stochastic problem with space-time white noise, we identify minimal regularity requirements on the noise which allow convergence to the sharp interface limit in the $\mathbb{H}^1$-norm and also provide improved convergence estimates for the sharp interface limit of the deterministic problem.
The Sinkhorn algorithm is a numerical method for the solution of optimal transport problems. Here, I give a brief survey of this algorithm, with a strong emphasis on its geometric origin: it is natural to view it as a discretization, by standard methods, of a non-linear integral equation. In the appendix, I also provide a short summary of an early result of Beurling on product measures, directly related to the Sinkhorn algorithm.
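A minimal sketch of the iteration in its common discrete, entropically regularised form is given below, assuming two histograms `a` and `b` and a cost matrix `C`. The regularisation strength `eps`, the iteration count, and the toy data are assumptions made for illustration.

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.5, n_iter=200):
    """Alternately rescale rows and columns of the Gibbs kernel exp(-C / eps)."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = a / (K @ v)        # enforce the row marginals
        v = b / (K.T @ u)      # enforce the column marginals
    return u[:, None] * K * v[None, :]   # transport plan diag(u) K diag(v)

a = np.array([0.5, 0.5])                 # source histogram
b = np.array([0.25, 0.75])               # target histogram
C = np.array([[0.0, 1.0], [1.0, 0.0]])   # cost of moving mass between bins
plan = sinkhorn(a, b, C)
```

The returned plan has row sums close to `a` and column sums close to `b`. Smaller values of `eps` bring the plan closer to an unregularised optimal transport plan, at the price of slower convergence and possible numerical underflow in the kernel.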