We applied physics-informed neural networks to solve the constitutive relations for nonlinear, path-dependent material behavior. As a result, the trained network not only satisfies all thermodynamic constraints but also instantly provides information about the current material state (i.e., free energy, stress, and the evolution of internal variables) under any given loading scenario without requiring initial data. One advantage of this work is that it bypasses the repetitive Newton iterations needed to solve nonlinear equations in complex material models. Additionally, strategies are provided to reduce the required order of derivative for obtaining the tangent operator. The trained model can be directly used in any finite element package (or other numerical methods) as a user-defined material model. However, challenges remain in the proper definition of collocation points and in integrating several non-equality constraints that become active or non-active simultaneously. We tested this methodology on rate-independent processes such as the classical von Mises plasticity model with a nonlinear hardening law, as well as local damage models for interface cracking behavior with a nonlinear softening law. In order to demonstrate the applicability of the methodology in handling complex path dependency in a three-dimensional (3D) scenario, we tested the approach using the equations governing a damage model for a three-dimensional interface model. Such models are frequently employed for intergranular fracture at grain boundaries. We have observed a perfect agreement between the results obtained through the proposed methodology and those obtained using the classical approach. Furthermore, the proposed approach requires significantly less effort in terms of implementation and computing time compared to the traditional methods.
Recalling the most relevant visual memories for localisation or understanding a priori the likely outcome of localisation effort against a particular visual memory is useful for efficient and robust visual navigation. Solutions to this problem should be divorced from performance appraisal against ground truth - as this is not available at run-time - and should ideally be based on generalisable environmental observations. For this, we propose applying the recently developed Visual DNA as a highly scalable tool for comparing datasets of images - in this work, sequences of map and live experiences. In the case of localisation, important dataset differences impacting performance are modes of appearance change, including weather, lighting, and season. Specifically, for any deep architecture which is used for place recognition by matching feature volumes at a particular layer, we use distribution measures to compare neuron-wise activation statistics between live images and multiple previously recorded past experiences, with a potentially large seasonal (winter/summer) or time of day (day/night) shift. We find that differences in these statistics correlate to performance when localising using a past experience with the same appearance gap. We validate our approach over the Nordland cross-season dataset as well as data from Oxford's University Parks with lighting and mild seasonal change, showing excellent ability of our system to rank actual localisation performance across candidate experiences.
The possibility of recognizing diverse aspects of human behavior and environmental context from passively captured data motivates its use for mental health assessment. In this paper, we analyze the contribution of different passively collected sensor data types (WiFi, GPS, Social interaction, Phone Log, Physical Activity, Audio, and Academic features) to predict daily selfreport stress and PHQ-9 depression score. First, we compute 125 mid-level features from the original raw data. These 125 features include groups of features from the different sensor data types. Then, we evaluate the contribution of each feature type by comparing the performance of Neural Network models trained with all features against Neural Network models trained with specific feature groups. Our results show that WiFi features (which encode mobility patterns) and Phone Log features (which encode information correlated with sleep patterns), provide significative information for stress and depression prediction.
In this article, an efficient numerical method for computing finite-horizon controllability Gramians in Cholesky-factored form is proposed. The method is applicable to general dense matrices of moderate size and produces a Cholesky factor of the Gramian without computing the full product. In contrast to other methods applicable to this task, the proposed method is a generalization of the scaling-and-squaring approach for approximating the matrix exponential. It exploits a similar doubling formula for the Gramian, and thereby keeps the required computational effort modest. Most importantly, a rigorous backward error analysis is provided, which guarantees that the approximation is accurate to the round-off error level in double precision. This accuracy is illustrated in practice on a large number of standard test examples. The method has been implemented in the Julia package FiniteHorizonGramians.jl, which is available online under the MIT license. Code for reproducing the experimental results is included in this package, as well as code for determining the optimal method parameters. The analysis can thus easily be adapted to a different finite-precision arithmetic.
We propose a neural network-based meta-learning method to efficiently solve partial differential equation (PDE) problems. The proposed method is designed to meta-learn how to solve a wide variety of PDE problems, and uses the knowledge for solving newly given PDE problems. We encode a PDE problem into a problem representation using neural networks, where governing equations are represented by coefficients of a polynomial function of partial derivatives, and boundary conditions are represented by a set of point-condition pairs. We use the problem representation as an input of a neural network for predicting solutions, which enables us to efficiently predict problem-specific solutions by the forwarding process of the neural network without updating model parameters. To train our model, we minimize the expected error when adapted to a PDE problem based on the physics-informed neural network framework, by which we can evaluate the error even when solutions are unknown. We demonstrate that our proposed method outperforms existing methods in predicting solutions of PDE problems.
Differential geometric approaches are ubiquitous in several fields of mathematics, physics and engineering, and their discretizations enable the development of network-based mathematical and computational frameworks, which are essential for large-scale data science. The Forman-Ricci curvature (FRC) - a statistical measure based on Riemannian geometry and designed for networks - is known for its high capacity for extracting geometric information from complex networks. However, extracting information from dense networks is still challenging due to the combinatorial explosion of high-order network structures. Motivated by this challenge we sought a set-theoretic representation theory for high-order network cells and FRC, as well as their associated concepts and properties, which together provide an alternative and efficient formulation for computing high-order FRC in complex networks. We provide a pseudo-code, a software implementation coined FastForman, as well as a benchmark comparison with alternative implementations. Crucially, our representation theory reveals previous computational bottlenecks and also accelerates the computation of FRC. As a consequence, our findings open new research possibilities in complex systems where higher-order geometric computations are required.
Physics-informed neural networks have emerged as a coherent framework for building predictive models that combine statistical patterns with domain knowledge. The underlying notion is to enrich the optimization loss function with known relationships to constrain the space of possible solutions. Hydrodynamic simulations are a core constituent of modern cosmology, while the required computations are both expensive and time-consuming. At the same time, the comparatively fast simulation of dark matter requires fewer resources, which has led to the emergence of machine learning algorithms for baryon inpainting as an active area of research; here, recreating the scatter found in hydrodynamic simulations is an ongoing challenge. This paper presents the first application of physics-informed neural networks to baryon inpainting by combining advances in neural network architectures with physical constraints, injecting theory on baryon conversion efficiency into the model loss function. We also introduce a punitive prediction comparison based on the Kullback-Leibler divergence, which enforces scatter reproduction. By simultaneously extracting the complete set of baryonic properties for the Simba suite of cosmological simulations, our results demonstrate improved accuracy of baryonic predictions based on dark matter halo properties, successful recovery of the fundamental metallicity relation, and retrieve scatter that traces the target simulation's distribution.
Rational function approximations provide a simple but flexible alternative to polynomial approximation, allowing one to capture complex non-linearities without oscillatory artifacts. However, there have been few attempts to use rational functions on noisy data due to the likelihood of creating spurious singularities. To avoid the creation of singularities, we use Bernstein polynomials and appropriate conditions on their coefficients to force the denominator to be strictly positive. While this reduces the range of rational polynomials that can be expressed, it keeps all the benefits of rational functions while maintaining the robustness of polynomial approximation in noisy data scenarios. Our numerical experiments on noisy data show that existing rational approximation methods continually produce spurious poles inside the approximation domain. This contrasts our method, which cannot create poles in the approximation domain and provides better fits than a polynomial approximation and even penalized splines on functions with multiple variables. Moreover, guaranteeing pole-free in an interval is critical for estimating non-constant coefficients when numerically solving differential equations using spectral methods. This provides a compact representation of the original differential equation, allowing numeric solvers to achieve high accuracy quickly, as seen in our experiments.
The proximal Galerkin finite element method is a high-order, low iteration complexity, nonlinear numerical method that preserves the geometric and algebraic structure of pointwise bound constraints in infinite-dimensional function spaces. This paper introduces the proximal Galerkin method and applies it to solve free boundary problems, enforce discrete maximum principles, and develop a scalable, mesh-independent algorithm for optimal design problems with pointwise bound constraints. This paper also provides a derivation of the latent variable proximal point (LVPP) algorithm, an unconditionally stable alternative to the interior point method. LVPP is an infinite-dimensional optimization algorithm that may be viewed as having an adaptive barrier function that is updated with a new informative prior at each (outer loop) optimization iteration. One of its main benefits is witnessed when analyzing the classical obstacle problem. Therein, we find that the original variational inequality can be replaced by a sequence of partial differential equations (PDEs) that are readily discretized and solved with, e.g., high-order finite elements. Throughout this work, we arrive at several unexpected contributions that may be of independent interest. These include (1) a semilinear PDE we refer to as the entropic Poisson equation; (2) an algebraic/geometric connection between high-order positivity-preserving discretizations and certain infinite-dimensional Lie groups; and (3) a gradient-based, bound-preserving algorithm for two-field density-based topology optimization. The complete latent variable proximal Galerkin methodology combines ideas from nonlinear programming, functional analysis, tropical algebra, and differential geometry and can potentially lead to new synergies among these areas as well as within variational and numerical analysis.
Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, with endeavours to extend this knowledge without targeting the original task resulting in a catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions concern 1) a taxonomy and extensive overview of the state-of-the-art, 2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner, 3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks, considering Tiny Imagenet and large-scale unbalanced iNaturalist and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time, and storage.
Graph representation learning for hypergraphs can be used to extract patterns among higher-order interactions that are critically important in many real world problems. Current approaches designed for hypergraphs, however, are unable to handle different types of hypergraphs and are typically not generic for various learning tasks. Indeed, models that can predict variable-sized heterogeneous hyperedges have not been available. Here we develop a new self-attention based graph neural network called Hyper-SAGNN applicable to homogeneous and heterogeneous hypergraphs with variable hyperedge sizes. We perform extensive evaluations on multiple datasets, including four benchmark network datasets and two single-cell Hi-C datasets in genomics. We demonstrate that Hyper-SAGNN significantly outperforms the state-of-the-art methods on traditional tasks while also achieving great performance on a new task called outsider identification. Hyper-SAGNN will be useful for graph representation learning to uncover complex higher-order interactions in different applications.