In this work we begin a theoretical and numerical investigation on the spectra of evolution operators of neutral renewal equations, with the stability of equilibria and periodic orbits in mind. We start from the simplest form of linear periodic equation with one discrete delay and fully characterize the spectrum of its monodromy operator. We perform numerical experiments discretizing the evolution operators via pseudospectral collocation, confirming the theoretical results and giving perspectives on the generalization to systems and to multiple delays. Although we do not attempt to perform a rigorous numerical analysis of the method, we give some considerations on a possible approach to the problem.
Many stochastic processes in the physical and biological sciences can be modelled using Brownian dynamics with multiplicative noise. However, numerical integrators for these processes can lose accuracy or even fail to converge when the diffusion term is configuration-dependent. One remedy is to construct a transform to a constant-diffusion process and sample the transformed process instead. In this work, we explain how coordinate-based and time-rescaling-based transforms can be used either individually or in combination to map a general class of variable-diffusion Brownian motion processes into constant-diffusion ones. The transforms are invertible, thus allowing recovery of the original dynamics. We motivate our methodology using examples in one dimension before then considering multivariate diffusion processes. We illustrate the benefits of the transforms through numerical simulations, demonstrating how the right combination of integrator and transform can improve computational efficiency and the order of convergence to the invariant distribution. Notably, the transforms that we derive are applicable to a class of multibody, anisotropic Stokes-Einstein diffusion that has applications in biophysical modelling.
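As a minimal sketch of the coordinate-based transform described above, consider a one-dimensional Itô process $dX = a(X)\,dt + \sigma(X)\,dW$. The map $y = \phi(x) = \int_{x_0}^{x} du/\sigma(u)$, commonly known as the Lamperti transform, yields a process with unit diffusion and drift $a/\sigma - \sigma'/2$. The function names, the Simpson quadrature and the test diffusion $\sigma(x) = x$ below are illustrative assumptions and are not taken from the paper.

```python
import math

def lamperti_map(sigma, x0, x, n=2000):
    """Approximate y = integral_{x0}^{x} du / sigma(u) with composite Simpson's rule.

    Assumes sigma is positive and smooth on [x0, x] (hypothetical helper).
    """
    if n % 2:  # Simpson's rule needs an even number of subintervals
        n += 1
    h = (x - x0) / n
    total = 1.0 / sigma(x0) + 1.0 / sigma(x)
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0
        total += weight / sigma(x0 + i * h)
    return total * h / 3.0

def transformed_drift(a, sigma, dsigma, x):
    """Drift of Y = phi(X) from Ito's formula: a/sigma - sigma'/2, evaluated at x."""
    return a(x) / sigma(x) - 0.5 * dsigma(x)

# Illustrative case: sigma(x) = x, for which phi(x) = log(x / x0).
y = lamperti_map(lambda u: u, 1.0, math.e)
```

For $\sigma(x) = x$ the map reduces to the logarithm, so the transformed process has additive noise and can be sampled with a standard constant-diffusion integrator before mapping back through the inverse.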
We present an exponentially convergent numerical method to approximate the solution of the Cauchy problem for the inhomogeneous fractional differential equation with an unbounded operator coefficient and Caputo fractional derivative in time. The numerical method is based on a newly obtained solution formula that consolidates the mild solution representations of sub-parabolic, parabolic and sub-hyperbolic equations with sectorial operator coefficient $A$ and non-zero initial data. The integral operators involved are approximated using sinc-quadrature formulas that are tailored to the spectral parameters of $A$, the fractional order $\alpha$ and the smoothness of the first initial condition, as well as to the properties of the equation's right-hand side $f(t)$. The resulting method converges exponentially for positive sectorial $A$, any finite $t$, including $t = 0$, and the whole range $\alpha \in (0,2)$. It is suitable for the practically important case in which no knowledge of $f(t)$ is available outside the considered interval $t \in [0, T]$. The algorithm admits multi-level parallelism. We provide numerical examples that confirm the theoretical error estimates.
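The exponential convergence of sinc quadrature can be illustrated on a scalar integral. The sketch below applies the basic rule $\int_{-\infty}^{\infty} g(x)\,dx \approx h \sum_{k=-N}^{N} g(kh)$ to a Gaussian. It is a generic illustration only: the integrand, step sizes and function names are assumptions, and the operator-valued, parameter-tailored quadratures of the paper are not reproduced here.

```python
import math

def sinc_quadrature(g, h, n):
    """Truncated sinc (trapezoidal) rule on the real line: h * sum_{k=-n}^{n} g(k*h)."""
    return h * sum(g(k * h) for k in range(-n, n + 1))

def gaussian(x):
    """Test integrand (an assumption); its exact integral over R is sqrt(pi)."""
    return math.exp(-x * x)

exact = math.sqrt(math.pi)
coarse_error = abs(sinc_quadrature(gaussian, 1.0, 20) - exact)
fine_error = abs(sinc_quadrature(gaussian, 0.5, 20) - exact)
```

Halving the step size here takes the error from roughly $10^{-4}$ down to the level of floating-point rounding, which is the kind of behaviour that exponential error estimates predict for integrands analytic in a strip around the real axis.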
Graph neural networks (GNNs) have become increasingly popular for classification tasks on graph-structured data. Yet, the interplay between graph topology and feature evolution in GNNs is not well understood. In this paper, we focus on node-wise classification, illustrated with community detection on stochastic block model graphs, and explore the feature evolution through the lens of the "Neural Collapse" (NC) phenomenon. When training instance-wise deep classifiers (e.g. for image classification) beyond the zero training error point, NC manifests as a reduction in the deepest features' within-class variability and an increased alignment of their class means to certain symmetric structures. We start with an empirical study showing that a decrease in within-class variability is also prevalent in the node-wise classification setting, although not to the extent observed in the instance-wise case. Then, we theoretically study this distinction. Specifically, we show that even an "optimistic" mathematical model requires that the graphs obey a strict structural condition in order to possess a minimizer with exact collapse. Interestingly, this condition is also viable for heterophilic graphs and relates to recent empirical studies on settings with improved GNN generalization. Furthermore, by studying the gradient dynamics of the theoretical model, we provide reasoning for the partial collapse observed empirically. Finally, we present a study on the evolution of within- and between-class feature variability across layers of a well-trained GNN and contrast the behavior with spectral methods.
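One simple way to quantify within-class variability relative to between-class variability is the ratio of the corresponding scatter traces. The sketch below is a simplified proxy chosen for illustration; it is not claimed to be the exact metric used in the paper, and the function name and toy data are assumptions.

```python
import numpy as np

def variability_ratio(features, labels):
    """Ratio trace(within-class scatter) / trace(between-class scatter).

    A value near zero indicates that features have collapsed to their class means.
    Requires at least two classes with distinct means (otherwise the denominator is zero).
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    global_mean = features.mean(axis=0)
    within = 0.0
    between = 0.0
    for c in np.unique(labels):
        block = features[labels == c]          # features of the nodes in class c
        class_mean = block.mean(axis=0)
        within += ((block - class_mean) ** 2).sum()
        between += len(block) * ((class_mean - global_mean) ** 2).sum()
    return within / between

# Toy features for four nodes in two classes (illustrative values only).
labels = [0, 0, 1, 1]
collapsed = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [1.0, 1.0]]
spread = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]
```

Evaluating such a ratio on the features of each layer gives a rough picture of how variability evolves with depth.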
The goal of this work is to study waves interacting with partially immersed objects allowed to move freely in the vertical direction, in a regime in which the propagation of the waves is described by the one-dimensional Boussinesq-Abbott system. The problem can be reduced to a transmission problem for this Boussinesq system, in which the transmission conditions between the components of the domain to the left and to the right of the object are determined by solving coupled forced ODEs in time satisfied by the vertical displacement of the object and the average discharge in the portion of the fluid located under the object. We propose a new extended formulation in which these ODEs are complemented by two other forced ODEs satisfied by the trace of the surface elevation at the contact points. The interest of this new extended formulation is that the forcing terms are easy to compute numerically and that the surface elevation at the contact points is obtained at no additional cost. Based on this formulation, we propose a second-order scheme that involves a generalization of the MacCormack scheme with nonlocal flux and a source term, which is coupled to a second-order Heun scheme for the ODEs. In order to validate this scheme, several explicit solutions for this wave-structure interaction problem are derived; they can serve as benchmarks for future codes. As a byproduct, our method provides a second-order scheme for the generation of waves at the entrance of the numerical domain for the Boussinesq-Abbott system.
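The second-order Heun scheme mentioned above can be sketched for a generic forced ODE $y' = f(t, y)$. The test equation $y' = -y$ and all function names below are illustrative assumptions; the coupling with the MacCormack-type scheme for the Boussinesq-Abbott system is not reproduced.

```python
import math

def heun_step(f, t, y, h):
    """One Heun step: explicit Euler predictor followed by a trapezoidal corrector."""
    k1 = f(t, y)
    k2 = f(t + h, y + h * k1)
    return y + 0.5 * h * (k1 + k2)

def integrate(f, y0, t0, t1, n):
    """Advance y' = f(t, y) from t0 to t1 using n uniform Heun steps."""
    h = (t1 - t0) / n
    t, y = t0, y0
    for _ in range(n):
        y = heun_step(f, t, y, h)
        t += h
    return y

def decay(t, y):
    """Test right-hand side (an assumption): y' = -y, exact solution exp(-t)."""
    return -y

# Errors at t = 1 for two step sizes; second-order accuracy predicts a ratio near 4.
err_coarse = abs(integrate(decay, 1.0, 0.0, 1.0, 100) - math.exp(-1.0))
err_fine = abs(integrate(decay, 1.0, 0.0, 1.0, 200) - math.exp(-1.0))
```

Comparing the errors for two step sizes in this way is a simple check that an implementation attains the expected order of convergence.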
Drug development is becoming more and more complex and resource-intensive. To reduce the costs and the time-to-market, the pharmaceutical industry employs cutting-edge automation solutions. Supportive robotics technologies, such as stationary and mobile manipulators, exist in various laboratory settings. However, they still lack the mobility and dexterity to navigate and operate in human-centered environments. We evaluate the feasibility of quadruped robots for the specific use case of remote inspection, utilizing the out-of-the-box capabilities of Boston Dynamics' Spot platform. We also provide an outlook on the newest technological advancements and the future applications these are anticipated to enable.
The volume function $V(t)$ of a compact set $S \subset \mathbb{R}^d$ is the Lebesgue measure of the set of points whose distance to $S$ is at most $t$. According to some classical results in geometric measure theory, the volume function is a polynomial, at least on a finite interval, under an intuitive and easily interpreted sufficient condition (called ``positive reach'') which can be seen as an extension of the notion of convexity. However, many other simple sets that do not fulfil the positive reach condition also have a polynomial volume function. To our knowledge, there is no general, simple geometric description of such sets. Still, the polynomial character of $V(t)$ has some relevant consequences, since the polynomial coefficients carry useful geometric information. In particular, the constant term is the volume of $S$ and the first-order coefficient is the boundary measure (in Minkowski's sense). This paper focuses on sets whose volume function is polynomial on some interval starting at zero, whose length (which we call ``polynomial reach'') might be unknown. Our main goal is to approximate this polynomial reach by statistical means, using only a large enough random sample of points inside $S$. The practical motivation is simple: when the value of the polynomial reach, or rather a lower bound for it, is approximately known, the polynomial coefficients can be estimated from the sample points by using standard methods in polynomial approximation. As a result, we get a quite general method to estimate the volume and boundary measure of the set, relying only on an inner sample of points and not requiring the use of any smoothing parameter. This paper explores the theoretical and practical aspects of this idea.
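To illustrate how the polynomial coefficients carry geometric information, the sketch below fits a polynomial to values of $V(t)$ and reads off the volume and boundary measure from the two lowest-order coefficients. It assumes the values of $V(t)$ are already available; here they come from the closed form $V(t) = 1 + 4t + \pi t^2$ for the unit square in $\mathbb{R}^2$ (the Steiner formula for a convex planar set), not from a random sample as in the paper. The function name and the grid are hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def fit_volume_polynomial(t_values, v_values, degree):
    """Least-squares polynomial fit of V(t); coefficients are returned lowest order first."""
    return P.polyfit(t_values, v_values, degree)

# Unit square in the plane: area 1, perimeter 4, so V(t) = 1 + 4 t + pi t^2.
t_grid = np.linspace(0.0, 0.5, 20)
v_grid = 1.0 + 4.0 * t_grid + np.pi * t_grid ** 2

coeffs = fit_volume_polynomial(t_grid, v_grid, degree=2)
volume_estimate = coeffs[0]      # constant term: volume of S
boundary_estimate = coeffs[1]    # first-order coefficient: boundary measure
```

In a statistical setting the values of $V(t)$ would themselves be estimates, and the fit should be restricted to an interval no longer than the polynomial reach, which is why a lower bound for that quantity matters.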
Graph Neural Networks (GNNs) have been successfully used in many problems involving graph-structured data, achieving state-of-the-art performance. GNNs typically employ a message-passing scheme, in which every node aggregates information from its neighbors using a permutation-invariant aggregation function. Standard well-examined choices such as the mean or sum aggregation functions have limited capabilities, as they are not able to capture interactions among neighbors. In this work, we formalize these interactions using an information-theoretic framework that notably includes synergistic information. Driven by this definition, we introduce the Graph Ordering Attention (GOAT) layer, a novel GNN component that captures interactions between nodes in a neighborhood. This is achieved by learning local node orderings via an attention mechanism and processing the ordered representations using a recurrent neural network aggregator. This design allows us to make use of a permutation-sensitive aggregator while maintaining the permutation-equivariance of the proposed GOAT layer. The GOAT model demonstrates its increased performance in modeling graph metrics that capture complex information, such as the betweenness centrality and the effective size of a node. In practical use-cases, its superior modeling capability is confirmed through its success in several real-world node classification benchmarks.
Understanding causality helps to structure interventions to achieve specific goals and enables predictions under interventions. With the growing importance of learning causal relationships, causal discovery has moved from traditional methods that infer potential causal structures from observational data towards pattern-recognition approaches based on deep learning. The rapid accumulation of massive data has promoted the emergence of highly scalable causal search methods. Existing surveys of causal discovery methods mainly focus on traditional methods based on constraints, scores and FCMs; they lack a thorough organization and discussion of deep learning-based methods, and they give little consideration to causal discovery methods from the perspective of variable paradigms. Therefore, we divide the possible causal discovery tasks into three types according to the variable paradigm and define each of the three tasks, define and instantiate the relevant datasets for each task together with the final causal model constructed, and then review the main existing causal discovery methods for the different tasks. Finally, we propose some roadmaps from different perspectives for the current research gaps in the field of causal discovery and point out future research directions.
This book develops an effective theory approach to understanding deep neural networks of practical relevance. Beginning from a first-principles component-level picture of networks, we explain how to determine an accurate description of the output of trained networks by solving layer-to-layer iteration equations and nonlinear learning dynamics. A main result is that the predictions of networks are described by nearly-Gaussian distributions, with the depth-to-width aspect ratio of the network controlling the deviations from the infinite-width Gaussian description. We explain how these effectively-deep networks learn nontrivial representations from training and more broadly analyze the mechanism of representation learning for nonlinear models. From a nearly-kernel-methods perspective, we find that the dependence of such models' predictions on the underlying learning algorithm can be expressed in a simple and universal way. To obtain these results, we develop the notion of representation group flow (RG flow) to characterize the propagation of signals through the network. By tuning networks to criticality, we give a practical solution to the exploding and vanishing gradient problem. We further explain how RG flow leads to near-universal behavior and lets us categorize networks built from different activation functions into universality classes. Altogether, we show that the depth-to-width ratio governs the effective model complexity of the ensemble of trained networks. By using information-theoretic techniques, we estimate the optimal aspect ratio at which we expect the network to be practically most useful and show how residual connections can be used to push this scale to arbitrary depths. With these tools, we can learn in detail about the inductive bias of architectures, hyperparameters, and optimizers.
High spectral dimensionality and the shortage of annotations make hyperspectral image (HSI) classification a challenging problem. Recent studies suggest that convolutional neural networks can learn discriminative spatial features, which play a paramount role in HSI interpretation. However, most of these methods ignore the distinctive spectral-spatial characteristics of hyperspectral data. In addition, a large amount of unlabeled data remains an unexploited gold mine for efficient data use. Therefore, we propose an integration of generative adversarial networks (GANs) and probabilistic graphical models for HSI classification. Specifically, we use a spectral-spatial generator and a discriminator to identify land cover categories of hyperspectral cubes. Moreover, to take advantage of a large amount of unlabeled data, we adopt a conditional random field to refine the preliminary classification results generated by GANs. Experimental results obtained using two commonly studied datasets demonstrate that the proposed framework achieves encouraging classification accuracy using a small amount of training data.