Nanostructured materials exhibiting a polydisperse pore space, such as aerogels, have a very open-porous and fine-grained morphology. Therefore, a simulation of the deformation of a large aerogel structure that resolves the nanostructure would be extremely expensive, and multi-scale or homogenization approaches have to be considered. Here, a computational scale-bridging approach based on the FE$^2$ method is suggested, in which the macroscopic scale is discretized using finite elements while the microstructure of the open-porous material is resolved as a network of Euler-Bernoulli beams. The beam-frame-based RVEs (representative volume elements) have pores whose size distribution follows the measured values for a specific material; this is a well-established approach to modeling aerogel structures. For the computational homogenization, an approach to averaging the first Piola-Kirchhoff stresses in a beam frame by neglecting rotational moments is suggested. To overcome the computationally most expensive part of the homogenization method, namely solving the RVEs and averaging their stress fields, a surrogate model based on neural networks is introduced. The network's input is the localized deformation gradient on the macroscopic scale and its output is the averaged stress for the specific material; it is trained on data generated by the beam-frame-based approach. The efficiency and robustness of both homogenization approaches are shown numerically, and the approximation properties of the surrogate model are verified for different macroscopic problems and discretizations. Different (quasi-)Newton solvers are considered on the macroscopic scale and compared with respect to their convergence properties.
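A minimal sketch of such a stress surrogate is given below, assuming a plain fully connected network that maps the nine components of the macroscopic deformation gradient to the nine components of the averaged first Piola-Kirchhoff stress; the layer sizes, activation, and training loop are illustrative assumptions, not the architecture from the paper.

```python
import torch
import torch.nn as nn

class StressSurrogate(nn.Module):
    """Hypothetical surrogate: deformation gradient F -> averaged stress P."""
    def __init__(self, hidden=64):
        super().__init__()
        # 9 components of F in, 9 components of P out (sizes are assumptions).
        self.net = nn.Sequential(
            nn.Linear(9, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 9),
        )

    def forward(self, F):
        # F: (batch, 3, 3) deformation gradients -> (batch, 3, 3) stresses.
        return self.net(F.reshape(-1, 9)).reshape(-1, 3, 3)

# Training on (F, P) pairs precomputed by solving beam-frame RVEs:
model = StressSurrogate()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
F_batch = torch.randn(32, 3, 3)   # placeholder macroscopic inputs
P_batch = torch.randn(32, 3, 3)   # placeholder averaged RVE stresses
loss = nn.functional.mse_loss(model(F_batch), P_batch)
loss.backward(); opt.step()
```

In the FE$^2$ loop, evaluating this network replaces the RVE solve at each macroscopic quadrature point, which is where the speedup comes from.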
The long-term dynamics of particles involved in an incompressible flow with a small viscosity ($\epsilon>0$) and slow chemical reactions is described by a class of stochastic reaction-diffusion-advection (RDA) equations with a fast advection term of magnitude $1/\epsilon$. It has been shown in [7] that the fast advection asymptotics of the stochastic RDA equation in $\mathbb{R}^2$ can be characterized through a stochastic partial differential equation (SPDE) on the graph associated with a certain Hamiltonian. To simulate these fast advection asymptotics, we introduce and study an asymptotic-preserving (AP) exponential Euler approximation for the multiscale stochastic RDA equation. There are three key ingredients in proving the asymptotic-preserving property of the proposed approximation. First, a strong error estimate, which depends on $1/\epsilon$ linearly, is obtained via a variational argument. Second, we prove the consistency of the exponential Euler approximations on the fast advection asymptotics between the original problem and the SPDE on the graph. Last, a graph-weighted space is introduced to quantify the approximation error for the SPDE on the graph, which avoids possible singularities near the vertices. Numerical experiments are carried out to support the theoretical results.
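For orientation, here is a sketch of one common exponential Euler variant for a semilinear SPDE $du = (Au + f(u))\,dt + dW$ on $(0,1)$ with homogeneous Dirichlet conditions, namely $u_{n+1} = e^{\Delta t A}(u_n + \Delta t f(u_n) + \Delta W_n)$, using the sine transform to diagonalize $A$. The grid size, reaction term, and noise model are illustrative assumptions and do not reflect the multiscale RDA setting or its $1/\epsilon$ advection.

```python
import numpy as np
from scipy.fft import dst, idst

rng = np.random.default_rng(0)
N, dt = 127, 1e-3
lam = -(np.pi * np.arange(1, N + 1)) ** 2   # eigenvalues of A = Laplacian
f = lambda u: u - u ** 3                    # example reaction term

def exp_euler_step(u_hat):
    # One step u_{n+1} = e^{dt A}(u_n + dt f(u_n) + dW_n) in sine space.
    fu_hat = dst(f(idst(u_hat, type=1)), type=1)
    dW_hat = np.sqrt(dt) * rng.standard_normal(N)
    return np.exp(dt * lam) * (u_hat + dt * fu_hat + dW_hat)

x = np.linspace(0, 1, N + 2)[1:-1]
u_hat = dst(np.sin(np.pi * x), type=1)      # initial condition in sine space
for _ in range(1000):
    u_hat = exp_euler_step(u_hat)
u = idst(u_hat, type=1)                     # sample path at t = 1
```

The exact exponential integration of the stiff linear part is what allows step sizes independent of the stiffness, the starting point for the AP analysis.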
Activity cliff (AC) is a phenomenon in which a pair of similar molecules differ by only a small structural alteration yet exhibit a large difference in their biochemical activities. The AC of small molecules has been extensively investigated, but limited knowledge has been accumulated about the AC phenomenon in peptides composed of canonical amino acids. This study introduces AMPCliff, a quantitative definition and benchmarking framework for the AC phenomenon in antimicrobial peptides (AMPs) composed of canonical amino acids. A comprehensive analysis of the existing AMP dataset reveals a significant prevalence of AC within AMPs. AMPCliff quantifies the activities of AMPs by the metric minimum inhibitory concentration (MIC) and defines 0.9 as the minimum threshold for the normalized BLOSUM62 similarity score between a pair of aligned peptides with at least a two-fold MIC change. This study establishes a benchmark dataset of paired AMPs against Staphylococcus aureus from the publicly available AMP dataset GRAMPA and conducts a rigorous procedure to evaluate various AMP AC prediction models, including nine machine learning algorithms, four deep learning algorithms, four masked language models, and four generative language models. Our analysis reveals that these models are capable of detecting AMP AC events, and the pre-trained protein language model ESM2 demonstrates superior performance across the evaluations. The predictive performance on AMP activity cliffs remains to be further improved, considering that ESM2 with 33 layers only achieves a Spearman correlation coefficient of 0.50 for the regression task of the MIC values on the benchmark dataset. Source code and additional resources are available at //www.healthinformaticslab.org/supp/ or //github.com/Kewei2023/AMPCliff-generation.
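A normalized BLOSUM62 similarity of the kind used in the 0.9 threshold can be sketched with Biopython as below; the gap penalties and the normalization by the geometric mean of the self-alignment scores are assumptions for illustration and may differ from AMPCliff's exact definition.

```python
from Bio.Align import PairwiseAligner, substitution_matrices

# Global alignment scored with BLOSUM62 (gap penalties are assumed values).
aligner = PairwiseAligner()
aligner.substitution_matrix = substitution_matrices.load("BLOSUM62")
aligner.open_gap_score = -10
aligner.extend_gap_score = -0.5

def normalized_blosum62(a: str, b: str) -> float:
    # Normalize by the geometric mean of self-alignment scores so that
    # identical peptides score exactly 1.0.
    return aligner.score(a, b) / (aligner.score(a, a) * aligner.score(b, b)) ** 0.5

# Example: magainin 2 versus a two-substitution variant.
print(normalized_blosum62("GIGKFLHSAKKFGKAFVGEIMNS",
                          "GIGKFLHSAKKFAKAFVAEIMNS"))
```

A pair would then count as an activity cliff if this score is at least 0.9 while the measured MICs differ by a factor of two or more.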
Despite the widespread application of machine learning force fields (MLFF) to solids and small molecules, there is a notable gap in applying MLFF to complex liquid electrolytes. In this work, we introduce BAMBOO (ByteDance AI Molecular Simulation Booster), a novel framework for molecular dynamics (MD) simulations, with a demonstration of its capabilities in the context of liquid electrolytes for lithium batteries. We design a physics-inspired graph equivariant transformer architecture as the backbone of BAMBOO to learn from quantum mechanical simulations. Additionally, we pioneer an ensemble knowledge distillation approach and apply it to MLFFs to improve the stability of MD simulations. Finally, we propose a density alignment algorithm to align BAMBOO with experimental measurements. BAMBOO demonstrates state-of-the-art accuracy in predicting key electrolyte properties such as density, viscosity, and ionic conductivity across various solvent and salt combinations. Our current model, trained on more than 15 chemical species, achieves an average density error of 0.01 g/cm^3 on various compositions compared with experimental data. Moreover, our model demonstrates transferability to molecules not included in the quantum mechanical dataset. We envision this work as paving the way to a "universal MLFF" capable of simulating properties of common organic liquids.
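The ensemble knowledge distillation idea can be sketched as follows: a student force field is trained to match the ensemble-mean energy and force predictions of several independently trained teachers, which smooths out disagreements that destabilize MD. The models below are plain MLPs on flattened coordinates purely for illustration; BAMBOO's graph equivariant transformer is not shown, and the loss weighting is an assumption.

```python
import torch
import torch.nn as nn

def mlp():   # stand-in for a trained force-field model
    return nn.Sequential(nn.Linear(30, 64), nn.SiLU(), nn.Linear(64, 1))

def energy_and_forces(model, pos):
    e = model(pos)
    # Forces are the negative gradient of the predicted energy.
    f = -torch.autograd.grad(e.sum(), pos, create_graph=True)[0]
    return e, f

teachers = [mlp() for _ in range(4)]   # independently trained ensemble
student = mlp()
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

pos = torch.randn(8, 30, requires_grad=True)   # batch of configurations
e_t, f_t = zip(*(energy_and_forces(t, pos) for t in teachers))
e_mean = torch.stack(e_t).mean(0).detach()     # ensemble-mean energy
f_mean = torch.stack(f_t).mean(0).detach()     # ensemble-mean forces

e_s, f_s = energy_and_forces(student, pos)
loss = nn.functional.mse_loss(e_s, e_mean) + nn.functional.mse_loss(f_s, f_mean)
loss.backward(); opt.step()
```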
Complex mechanical systems often exhibit strongly nonlinear behavior due to the presence of nonlinearities in the energy dissipation mechanisms, material constitutive relationships, or geometric/connectivity mechanics. Numerical modeling of these systems leads to nonlinear full-order models that possess an underlying Lagrangian structure. This work proposes a Lagrangian operator inference method enhanced with structure-preserving machine learning to learn nonlinear reduced-order models (ROMs) of nonlinear mechanical systems. This two-step approach first learns the best-fit linear Lagrangian ROM via Lagrangian operator inference and then applies a structure-preserving machine learning method to learn the nonlinearities in the reduced space. The proposed approach can learn a structure-preserving nonlinear ROM purely from data, unlike existing operator inference approaches that require knowledge about the mathematical form of the nonlinear terms. From a machine learning perspective, it accelerates the training of the structure-preserving neural network by providing an informed prior, and it reduces the computational cost of the network training by operating on the reduced space. The method is first demonstrated on two simulated examples: a conservative nonlinear rod model and a two-dimensional nonlinear membrane with nonlinear internal damping. Finally, the method is demonstrated on an experimental dataset consisting of digital image correlation measurements taken from a lap-joint beam structure, from which a predictive model is learned that accurately captures amplitude-dependent frequency and damping characteristics. The numerical results demonstrate that the proposed approach yields generalizable nonlinear ROMs that exhibit bounded energy error, capture the nonlinear characteristics reliably, and provide accurate long-time predictions outside the training data regime.
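The first (linear) step can be sketched as a least-squares fit of reduced stiffness and damping operators from reduced trajectory data, with symmetrization to retain the Lagrangian structure. Everything below is an illustrative assumption: identity reduced mass matrix, synthetic data, and finite-difference derivatives rather than the paper's actual formulation.

```python
import numpy as np

r, m = 5, 200
t = np.linspace(0, 10, m)
# Placeholder reduced coordinates (columns are time snapshots).
Q = np.stack([np.cos((k + 1) * t) for k in range(r)])
Qd = np.gradient(Q, t, axis=1)      # reduced velocities
Qdd = np.gradient(Qd, t, axis=1)    # reduced accelerations

# Fit  qdd = -K q - C qd  in the least-squares sense.
D = np.vstack([Q, Qd]).T            # (m, 2r) data matrix
O, *_ = np.linalg.lstsq(D, -Qdd.T, rcond=None)
K_hat, C_hat = O[:r].T, O[r:].T
K_hat = 0.5 * (K_hat + K_hat.T)     # enforce symmetric stiffness
C_hat = 0.5 * (C_hat + C_hat.T)     # enforce symmetric damping
```

The structure-preserving network of the second step would then only have to learn the residual nonlinearity on top of this informed linear prior.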
A grid-overlay finite difference method is proposed for the numerical approximation of the fractional Laplacian on arbitrary bounded domains. The method uses an unstructured simplicial mesh and an overlay uniform grid for the underlying domain and constructs the approximation based on a uniform-grid finite difference approximation and a data transfer from the unstructured mesh to the uniform grid. The method takes full advantage of both the uniform-grid finite difference approximation, which permits efficient matrix-vector multiplication via the fast Fourier transform, and unstructured meshes, which accommodate complex geometries and mesh adaptation. It is shown that its stiffness matrix is similar to a symmetric and positive definite matrix and thus invertible if the data transfer has full column rank and positive column sums. Piecewise linear interpolation is studied as a special example for the data transfer. It is proved that the full column rank and positive column sums of linear interpolation are guaranteed if the spacing of the uniform grid is smaller than or equal to a positive bound proportional to the minimum element height of the unstructured mesh. Moreover, a sparse preconditioner is proposed for the iterative solution of the resulting linear system for the homogeneous Dirichlet problem of the fractional Laplacian. Numerical examples demonstrate that the new method has similar convergence behavior to existing finite difference and finite element methods and that the sparse preconditioning is effective. Furthermore, the new method can readily be incorporated with existing mesh adaptation strategies. Numerical results obtained by combining it with the so-called MMPDE moving mesh method are also presented.
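The FFT-based matrix-vector multiplication that makes the uniform-grid operator cheap to apply rests on a standard identity: a symmetric Toeplitz matrix embeds in a circulant matrix, which is diagonalized by the FFT. A 1D sketch follows; the stencil weights are placeholders, not the actual fractional Laplacian coefficients.

```python
import numpy as np

def toeplitz_matvec(c, x):
    """Apply the symmetric Toeplitz matrix with first column c in O(n log n)."""
    n = len(x)
    # First column of the 2n x 2n circulant embedding.
    circ = np.concatenate([c, [0.0], c[-1:0:-1]])
    y = np.fft.ifft(np.fft.fft(circ) * np.fft.fft(x, 2 * n))
    return y[:n].real

n = 8
c = 1.0 / (1.0 + np.arange(n)) ** 2    # placeholder symmetric stencil
x = np.random.rand(n)

# Check against the dense Toeplitz product.
T = np.array([[c[abs(i - j)] for j in range(n)] for i in range(n)])
assert np.allclose(toeplitz_matvec(c, x), T @ x)
```

In two or three dimensions the same idea applies blockwise, so each application of the uniform-grid operator costs O(N log N) instead of O(N^2).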
We propose and analyse numerical schemes for a system of quasilinear, degenerate evolution equations modelling biofilm growth as well as other processes such as flow through porous media and the spreading of wildfires. The first equation in the system is parabolic and exhibits degenerate and singular diffusion, while the second is either uniformly parabolic or an ordinary differential equation. First, we introduce a semi-implicit time discretisation that has the benefit of decoupling the equations. We prove the positivity, boundedness, and convergence of the time-discrete solutions to the time-continuous solution. Then, we introduce an iterative linearisation scheme to solve the resulting nonlinear time-discrete problems. Under weak assumptions on the time-step size, we prove that the scheme converges irrespective of the space discretisation and mesh. Moreover, if the problem is non-degenerate, the convergence becomes faster as the time-step size decreases. Finally, employing the finite element method for the spatial discretisation, we study the behaviour of the scheme, and compare its performance to other commonly used schemes. These tests confirm that the proposed scheme is robust and fast.
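As a rough illustration of what such an iterative linearisation can look like (the abstract does not fix the scheme, so this is an assumed L-scheme-type iteration, not necessarily the authors'), consider one implicit time step of $u_t = (\beta(u))_{xx}$: find $v$ with $v - \Delta t\,(\beta(v))_{xx} = u^n$. Each iterate solves a linear tridiagonal system, with a constant $L$ dominating $\sup \beta'$; all parameter values below are illustrative.

```python
import numpy as np
from scipy.sparse import diags, identity
from scipy.sparse.linalg import spsolve

n, dt, L = 100, 1e-2, 1.0
h = 1.0 / (n + 1)
beta = lambda u: 0.5 * u**2   # degenerate nonlinearity, beta'(u) = u
A = diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n)) / h**2  # Dirichlet Laplacian
I = identity(n)

u_n = np.sin(np.pi * np.linspace(h, 1 - h, n))   # previous time level
v = u_n.copy()
for _ in range(200):
    # Linearize beta(v_new) ~ beta(v) + L (v_new - v), giving
    # (I - dt*L*A) v_new = u_n + dt * A @ (beta(v) - L*v).
    rhs = u_n + dt * (A @ (beta(v) - L * v))
    v_new = spsolve((I - dt * L * A).tocsc(), rhs)
    if np.linalg.norm(v_new - v) < 1e-8:
        v = v_new
        break
    v = v_new
```

The appeal of such schemes, as the abstract indicates, is that each iteration is linear and convergence holds irrespective of the spatial mesh.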
Domain decomposition (DD) methods are a natural way to take advantage of parallel computers when solving large-scale linear systems. Their scalability depends on the design of the coarse space used in the two-level method. The analysis of adaptive coarse spaces we present here is quite general, since it applies to symmetric and non-symmetric problems, to symmetric preconditioners such as the additive Schwarz method (ASM) and to the non-symmetric restricted additive Schwarz (RAS) preconditioner, as well as to exact or inexact subdomain solves. The coarse space is built by solving generalized eigenvalue problems in the subdomains and applying a well-chosen operator to the selected eigenvectors.
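For reference, the application of a two-level additive Schwarz preconditioner $M^{-1}r = Z(Z^TAZ)^{-1}Z^Tr + \sum_i R_i^T (R_i A R_i^T)^{-1} R_i r$ can be sketched densely as below; the overlapping partition and the coarse basis $Z$ (which in the adaptive setting would come from the subdomain generalized eigenproblems) are placeholders.

```python
import numpy as np

def two_level_asm(A, r, subdomains, Z):
    # Coarse correction: Z (Z^T A Z)^{-1} Z^T r.
    z = Z @ np.linalg.solve(Z.T @ A @ Z, Z.T @ r)
    # Local subdomain corrections: sum_i R_i^T A_ii^{-1} R_i r.
    for idx in subdomains:
        A_ii = A[np.ix_(idx, idx)]
        z[idx] += np.linalg.solve(A_ii, r[idx])
    return z

n = 12
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1D Laplacian
subdomains = [np.arange(0, 7), np.arange(5, 12)]        # overlapping pieces
Z = np.ones((n, 1))              # trivial coarse space for illustration
print(two_level_asm(A, np.random.rand(n), subdomains, Z))
```

Enriching `Z` with the selected eigenvectors is what restores scalability as the number of subdomains grows.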
In many communication contexts, the capabilities of the involved actors cannot be known beforehand, whether the recipient is a cell, a plant, an insect, or even a life form unknown to Earth. Regardless of the recipient, the message's space and time scales could be too fast, too slow, too large, or too small, and the message may never be decoded. Therefore, it pays to devise a way to encode messages agnostic of space and time scales. We propose the use of fractal functions as self-executable infinite-frequency carriers for sending messages, given their properties of structural self-similarity and scale invariance; we call this `fractal messaging'. Starting from a spatial embedding, we introduce a framework for a space-time scale-free messaging approach to this challenge. In such a space- and time-agnostic framework, a message should be encoded so that it can be decoded at several spatio-temporal scales. Hence, the core idea of the framework proposed herein is to encode a binary message as waves along infinitely many frequencies (in power-like distributions) and amplitudes, transmit the message, and then decode and reproduce it. To do so, the components of the Weierstrass function, a classical fractal, are used as carriers of the message. Each component has its amplitude modulated to embed the binary stream, allowing for a space-time-agnostic approach to messaging.
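A toy version of this amplitude modulation is sketched below: each component of a truncated Weierstrass-type sum $W(x)=\sum_n s_n a^n \cos(b^n \pi x)$ carries one bit via its scale factor $s_n$. The modulation depths and the correlation-based decoder are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

a, b = 0.5, 3                 # Weierstrass parameters (carriers cos(b^n pi x))
bits = [1, 0, 1, 1, 0, 1]
x = np.linspace(0, 2, 200000)

# Encode: bit 1 -> full amplitude, bit 0 -> half amplitude (assumed depths).
scales = [1.0 if bit else 0.5 for bit in bits]
signal = sum(s * a**n * np.cos(b**n * np.pi * x)
             for n, s in enumerate(scales))

# Decode: correlate with each carrier; cos(b^n pi x) are orthogonal on [0, 2]
# for integer b, so each projection recovers its component's amplitude.
recovered = []
for n in range(len(bits)):
    carrier = np.cos(b**n * np.pi * x)
    s_hat = np.dot(signal, carrier) / np.dot(carrier, carrier)
    recovered.append(1 if s_hat / a**n > 0.75 else 0)
assert recovered == bits
```

Because the carriers span a geometric ladder of frequencies, a receiver observing only a narrow band, at whatever scale, still sees a self-similar copy of the modulation pattern.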
We hypothesize that due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate the model's dependence on each modality, we compute the gain in accuracy when the model has access to that modality in addition to another. We refer to this gain as the conditional utilization rate. In our experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since the conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.
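The accuracy gain behind the conditional utilization rate can be estimated as sketched below, where dropping a modality is emulated by zeroing its input; the model interface, the zeroing-based ablation, and the stand-in loader are assumptions for illustration.

```python
import torch

@torch.no_grad()
def accuracy(model, loader, use_m1=True, use_m2=True):
    correct = total = 0
    for x1, x2, y in loader:
        if not use_m1:
            x1 = torch.zeros_like(x1)   # ablate modality 1
        if not use_m2:
            x2 = torch.zeros_like(x2)   # ablate modality 2
        pred = model(x1, x2).argmax(dim=-1)
        correct += (pred == y).sum().item()
        total += y.numel()
    return correct / total

def utilization_gains(model, loader):
    # Gain from adding m1 on top of m2, and from adding m2 on top of m1.
    both = accuracy(model, loader)
    only_m2 = accuracy(model, loader, use_m1=False)
    only_m1 = accuracy(model, loader, use_m2=False)
    return both - only_m2, both - only_m1

# Example with random stand-ins:
loader = [(torch.randn(4, 8), torch.randn(4, 8), torch.randint(0, 3, (4,)))]
model = lambda x1, x2: torch.randn(x1.shape[0], 3)
print(utilization_gains(model, loader))
```

A large imbalance between the two gains is the signature of greedy learning that the conditional-learning-speed proxy is designed to detect cheaply during training.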
The growing energy and performance costs of deep learning have driven the community to reduce the size of neural networks by selectively pruning components. Similarly to their biological counterparts, sparse networks generalize just as well as, if not better than, the original dense networks. Sparsity can reduce the memory footprint of regular networks to fit mobile devices, as well as shorten training time for ever-growing networks. In this paper, we survey prior work on sparsity in deep learning and provide an extensive tutorial on sparsification for both inference and training. We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice. Our work distills ideas from more than 300 research papers and provides guidance to practitioners who wish to utilize sparsity today, as well as to researchers whose goal is to push the frontier forward. We include the necessary background on mathematical methods in sparsification, describe phenomena such as early structure adaptation and the intricate relations between sparsity and the training process, and show techniques for achieving acceleration on real hardware. We also define a metric of pruned parameter efficiency that could serve as a baseline for the comparison of different sparse networks. We close by speculating on how sparsity can improve future workloads and outline major open problems in the field.
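One of the simplest sparsification primitives such surveys cover is global magnitude pruning, sketched below: the fraction `sparsity` of weights with the smallest absolute value across the whole model is zeroed. This is a generic illustration, not the survey's own code, and for simplicity it prunes biases along with weights.

```python
import torch

def magnitude_prune(model: torch.nn.Module, sparsity: float):
    # Find the global magnitude threshold across all parameters.
    weights = torch.cat([p.detach().abs().flatten()
                         for p in model.parameters()])
    threshold = torch.quantile(weights, sparsity)
    masks = {}
    with torch.no_grad():
        for name, p in model.named_parameters():
            masks[name] = (p.abs() > threshold).float()
            p.mul_(masks[name])   # zero out the pruned entries
    return masks                  # reapply after updates to keep them pruned

model = torch.nn.Sequential(torch.nn.Linear(10, 10), torch.nn.Linear(10, 2))
masks = magnitude_prune(model, sparsity=0.9)
```

Realizing actual speedups from such masks on hardware is a separate problem, which is exactly the practice-oriented gap the survey addresses.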