Random fields are ubiquitous mathematical structures in physics, with applications ranging from thermodynamics and statistical physics to quantum field theory and cosmology. Recent works on the information geometry of Gaussian random fields proposed mathematical expressions for the components of the metric tensor of the underlying parametric space, allowing the computation of the curvature at each point of the manifold. In this study, our hypothesis is that time irreversibility in Gaussian random field dynamics is a direct consequence of intrinsic geometric properties (curvature) of their parametric space. To validate this hypothesis, we compute the components of the metric tensor and derive the twenty-seven Christoffel symbols of the metric to define the Euler-Lagrange equations, a system of second-order differential equations whose solutions are the geodesic curves of the Riemannian manifold. We then numerically build geodesic curves, starting from an arbitrary initial point in the manifold, by combining the fourth-order Runge-Kutta method with Markov chain Monte Carlo simulation. The results show that, when the system undergoes phase transitions, the geodesic curve obtained by reversing the computational simulation in time diverges from the original curve, an effect we call the geodesic dispersion phenomenon; this suggests that time irreversibility in random fields is related to the intrinsic geometry of their parametric space.
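To make the numerical backbone of such a pipeline concrete, the sketch below integrates the geodesic equation $d^2\theta^k/dt^2 = -\Gamma^k_{ij}\,\dot{\theta}^i\dot{\theta}^j$ with a classical fourth-order Runge-Kutta scheme. It assumes a user-supplied christoffel(theta) routine returning the twenty-seven symbols as a 3x3x3 array; the actual symbols derived for the Gaussian random field manifold, and the MCMC part of the simulation, are not reproduced here.

```python
import numpy as np

def geodesic_rhs(state, christoffel):
    """Right-hand side of the geodesic equations written as a first-order system.

    state = (theta, v), with theta the parameters and v = dtheta/dt;
    the geodesic equation is d2theta^k/dt2 = -Gamma^k_ij v^i v^j.
    """
    theta, v = state
    gamma = christoffel(theta)            # shape (3, 3, 3): gamma[k, i, j] = Gamma^k_ij
    acc = -np.einsum('kij,i,j->k', gamma, v, v)
    return (v, acc)

def rk4_geodesic(theta0, v0, christoffel, dt=1e-3, n_steps=10_000):
    """Integrate a geodesic with the classical fourth-order Runge-Kutta scheme."""
    theta, v = np.asarray(theta0, float), np.asarray(v0, float)
    path = [theta.copy()]
    for _ in range(n_steps):
        k1 = geodesic_rhs((theta, v), christoffel)
        k2 = geodesic_rhs((theta + 0.5 * dt * k1[0], v + 0.5 * dt * k1[1]), christoffel)
        k3 = geodesic_rhs((theta + 0.5 * dt * k2[0], v + 0.5 * dt * k2[1]), christoffel)
        k4 = geodesic_rhs((theta + dt * k3[0], v + dt * k3[1]), christoffel)
        theta = theta + dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        v = v + dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        path.append(theta.copy())
    return np.array(path)
```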
Due to the complexity of order statistics, the finite-sample behavior of robust statistics is generally not analytically tractable. While the Monte Carlo method can provide approximate solutions, its convergence rate is typically very slow, making the computational cost of achieving the desired accuracy prohibitive for ordinary users. In this paper, we propose an approach, analogous to the Fourier transform, for decomposing the finite-sample structure of the uniform distribution. By obtaining sets of sequences that are consistent with parametric distributions in their first four sample moments, we can approximate the finite-sample behavior of other estimators at significantly reduced computational cost. This article reveals the underlying structure of randomness and presents a novel approach to integrating multiple assumptions.
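The abstract does not detail the moment-matched sequence construction, but the baseline it targets is easy to state: plain Monte Carlo with its $O(n_{\mathrm{rep}}^{-1/2})$ convergence. A minimal sketch, with the sample median as an illustrative robust estimator and all sample sizes chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_sampling_distribution(estimator, n, n_rep=100_000, sampler=rng.standard_normal):
    """Brute-force Monte Carlo approximation of an estimator's finite-sample
    distribution: draw n_rep samples of size n and apply the estimator to each.
    The O(n_rep**-0.5) convergence is what makes this approach expensive."""
    stats = np.array([estimator(sampler(n)) for _ in range(n_rep)])
    return stats.mean(), stats.var(ddof=1)

# Example: finite-sample mean and variance of the sample median for n = 20.
mean_med, var_med = mc_sampling_distribution(np.median, n=20)
```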
Association schemes play an important role in algebraic combinatorics and have important applications in coding theory, graph theory and design theory. Methods for constructing association schemes from bent functions have been extensively studied. Recently, in [13], {\"O}zbudak and Pelen constructed infinite families of symmetric association schemes of classes $5$ and $6$ by using ternary non-weakly regular bent functions. They also stated that constructing $2p$-class association schemes from $p$-ary non-weakly regular bent functions is an interesting problem, where $p>3$ is an odd prime. In this paper, using non-weakly regular bent functions, we construct infinite families of symmetric association schemes of classes $2p$, $(2p+1)$ and $\frac{3p+1}{2}$ for any odd prime $p$. Fusing those association schemes, we also obtain $t$-class symmetric association schemes, where $t=4,5,6,7$. In addition, we give necessary and sufficient conditions for the partitions $P$, $D$, $T$, $U$ and $V$ (defined in this paper) to induce symmetric association schemes.
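The constructions themselves are algebraic, but the underlying object is easy to compute with. As a small illustration (not part of the paper's machinery), the sketch below evaluates the $p$-ary Walsh transform and checks the bentness criterion $|W_f(b)| = p^{n/2}$ for the classical ternary example $f(x_1, x_2) = x_1 x_2$:

```python
import itertools
import numpy as np

def walsh_spectrum(f, p, n):
    """p-ary Walsh transform W_f(b) = sum_x omega^(f(x) - <b, x>), omega = exp(2*pi*i/p).
    A function f : F_p^n -> F_p is bent iff |W_f(b)| = p^(n/2) for every b."""
    omega = np.exp(2j * np.pi / p)
    points = list(itertools.product(range(p), repeat=n))
    return {
        b: sum(omega ** ((f(x) - sum(bi * xi for bi, xi in zip(b, x))) % p) for x in points)
        for b in points
    }

# f(x1, x2) = x1 * x2 over F_3^2 is a (weakly regular) ternary bent function.
spec = walsh_spectrum(lambda x: (x[0] * x[1]) % 3, p=3, n=2)
assert all(np.isclose(abs(w), 3.0) for w in spec.values())   # 3^(n/2) = 3
```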
In probabilistic modelling, joint distributions are often of more interest than their marginals, but the standard composition of stochastic channels is defined by marginalization. Recently, the notion of 'copy-composition' was introduced in order to circumvent this problem and express the chain rule of the relative entropy fibrationally, but while that goal was achieved, copy-composition lacked a satisfactory origin story. Here, we supply such a story for two standard probabilistic tools: directed and undirected graphical models. We explain that (directed) Bayesian networks may be understood as "stochastic terms" of product type, in which context copy-composition amounts to a pull-push operation. Likewise, we show that (undirected) factor graphs compose by copy-composition. In each case, our construction yields a double fibration of decorated (co)spans. Along the way, we introduce a useful bifibration of measure kernels, to provide semantics for the notion of stochastic term, which allows us to generalize probabilistic modelling from product to dependent types.
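As a toy illustration of the problem the paper starts from (not of its categorical solution): composing finite stochastic channels by matrix multiplication marginalizes out the intermediate variable, whereas retaining that variable yields the joint kernel that copy-composition is designed to keep track of. All matrices below are arbitrary placeholders.

```python
import numpy as np

# Stochastic channels as column-stochastic matrices: c[y, x] = P(y | x), d[z, y] = P(z | y).
rng = np.random.default_rng(1)
c = rng.random((3, 2)); c /= c.sum(axis=0)   # channel X -> Y
d = rng.random((4, 3)); d /= d.sum(axis=0)   # channel Y -> Z

# Ordinary (Chapman-Kolmogorov) composition marginalizes out Y:
#   (d . c)[z, x] = sum_y d[z, y] c[y, x]
ordinary = d @ c                              # shape (4, 2): P(z | x)

# Keeping the intermediate variable instead: a kernel X -> (Y, Z) with entries
# P(y, z | x) = d[z, y] * c[y, x]; marginalizing over y recovers the ordinary composite.
joint = np.einsum('zy,yx->yzx', d, c)         # shape (3, 4, 2): P(y, z | x)
assert np.allclose(joint.sum(axis=0), ordinary)
```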
Tree shape statistics, particularly measures of tree (im)balance, play an important role in the analysis of the shape of phylogenetic trees. With applications ranging from testing evolutionary models to studying the impact of fertility inheritance and selection, tumor development, and language evolution, the assessment of tree balance is crucial. At least 30 (im)balance indices can currently be found in the literature, alongside numerous other tree shape statistics. This diversity prompts essential questions: How can we minimize the number of indices used in order to mitigate the challenges of multiple testing? Is there a preeminent balance index tailored to specific tasks? Previous studies comparing the statistical power of indices in detecting trees deviating from the Yule model have been limited in scope, using only a subset of indices and alternative tree models. This research expands the examination of index power, encompassing all established indices and a broader array of alternative models. Our investigation reveals distinct groups of balance indices better suited to different tree models, suggesting that decisions on balance index selection can be improved with prior knowledge. Furthermore, we present the \textsf{R} software package \textsf{poweRbal}, which allows the inclusion of new indices and models, thus facilitating future research.
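For readers unfamiliar with (im)balance indices, here is a minimal sketch of one classical example, the Colless index, evaluated on a tree simulated under the Yule model; the \textsf{poweRbal} interface itself is not reproduced, and the sample size of 50 leaves is arbitrary.

```python
import random

def yule_tree(n_leaves, rng=random.Random(0)):
    """Grow a rooted binary tree under the Yule (pure-birth) model:
    start from a single leaf and repeatedly split a uniformly chosen leaf."""
    root = {'children': []}
    leaves = [root]
    while len(leaves) < n_leaves:
        node = leaves.pop(rng.randrange(len(leaves)))
        node['children'] = [{'children': []}, {'children': []}]
        leaves.extend(node['children'])
    return root

def colless(node):
    """Return (number of leaves, Colless index) of the subtree rooted at node.
    The Colless index sums |n_left - n_right| over all internal nodes."""
    if not node['children']:
        return 1, 0
    (nl, cl), (nr, cr) = (colless(c) for c in node['children'])
    return nl + nr, cl + cr + abs(nl - nr)

n, c = colless(yule_tree(50))
print(f"{n} leaves, Colless index {c}")
```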
Nonparametric estimators for the mean and the covariance functions of functional data are proposed. The setup covers a wide range of practical situations. The random trajectories are not necessarily differentiable, have unknown regularity, and are measured with error at discrete design points. The measurement error may be heteroscedastic, and the design points may be either randomly drawn or common to all curves. The estimators depend on the local regularity of the stochastic process generating the functional data. We consider a simple estimator of this local regularity which exploits the replication and regularization features of functional data. Next, we use the ``smoothing first, then estimate'' approach for the mean and the covariance functions. The resulting estimators can be applied to both sparsely and densely sampled curves, are easy to calculate and to update, and perform well in simulations. Simulations built upon a real data example illustrate the effectiveness of the new approach.
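A minimal sketch of the ``smoothing first, then estimate'' idea for the mean function only: each discretely observed, noisy curve is presmoothed onto a common grid with a local linear smoother, and the smoothed curves are averaged pointwise. The fixed Gaussian-kernel bandwidth is a placeholder; the paper's estimator instead adapts it to the estimated local regularity.

```python
import numpy as np

def local_linear_smooth(t_obs, y_obs, t_grid, bandwidth):
    """Local linear smoother: fit a kernel-weighted straight line around each grid point."""
    y_hat = np.empty(len(t_grid), dtype=float)
    for j, t0 in enumerate(t_grid):
        w = np.exp(-0.5 * ((t_obs - t0) / bandwidth) ** 2)   # Gaussian kernel weights
        X = np.column_stack([np.ones_like(t_obs), t_obs - t0])
        beta, *_ = np.linalg.lstsq(X * np.sqrt(w)[:, None], y_obs * np.sqrt(w), rcond=None)
        y_hat[j] = beta[0]                                    # fitted value at t0
    return y_hat

def mean_function(curves, t_grid, bandwidth):
    """'Smooth first, then estimate': presmooth each noisy, discretely observed
    curve onto a common grid, then average the smoothed curves pointwise."""
    smoothed = [local_linear_smooth(t, y, t_grid, bandwidth) for t, y in curves]
    return np.mean(smoothed, axis=0)
```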
Data sets of multivariate normal distributions abound in many scientific areas such as diffusion tensor imaging, structure tensor computer vision, radar signal processing, and machine learning, to name a few. In order to process those normal data sets for downstream tasks like filtering, classification or clustering, one needs to define proper notions of dissimilarities between normals and of paths joining them. The Fisher-Rao distance, defined as the Riemannian geodesic distance induced by the Fisher information metric, is one such principled metric distance; however, it is not known in closed form except for a few particular cases. In this work, we first report a fast and robust method to approximate the Fisher-Rao distance between multivariate normal distributions to arbitrary precision. Second, we introduce a class of distances based on diffeomorphic embeddings of the normal manifold into a submanifold of the higher-dimensional symmetric positive-definite cone corresponding to the manifold of centered normal distributions. We show that the projective Hilbert distance on the cone yields a metric on the embedded normal submanifold, and we pull back that cone distance, with its associated straight-line Hilbert cone geodesics, to obtain a distance and smooth paths between normal distributions. Compared to the Fisher-Rao distance approximation, the pullback Hilbert cone distance is computationally light, since it only requires computing the extreme minimal and maximal eigenvalues of matrices. Finally, we show how to use those distances in clustering tasks.
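For the centered case, where normals are identified with their SPD covariance matrices, the Hilbert projective distance on the cone reduces to the log-ratio of the extreme generalized eigenvalues. A minimal sketch (the embedding of general, non-centered normals into the higher-dimensional cone is not reproduced here):

```python
import numpy as np
from scipy.linalg import eigh

def hilbert_spd_distance(P, Q):
    """Hilbert projective distance between two SPD matrices:
    log(lambda_max / lambda_min), where the lambdas are the extreme
    generalized eigenvalues of P v = lambda Q v.  Only the two extreme
    eigenvalues are actually needed."""
    lam = eigh(P, Q, eigvals_only=True)      # generalized eigenvalues, ascending
    return float(np.log(lam[-1] / lam[0]))

# The distance is projective: rescaling either argument leaves it unchanged.
A = np.array([[2.0, 0.3], [0.3, 1.0]])
B = np.array([[1.0, -0.2], [-0.2, 3.0]])
assert np.isclose(hilbert_spd_distance(A, B), hilbert_spd_distance(5.0 * A, B))
```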
Neural operators have emerged as a powerful tool for learning the mapping between infinite-dimensional parameter and solution spaces of partial differential equations (PDEs). In this work, we focus on multiscale PDEs, which have important applications such as reservoir modeling and turbulence prediction. We demonstrate that for such PDEs, the spectral bias towards low-frequency components presents a significant challenge for existing neural operators. To address this challenge, we propose a hierarchical attention neural operator (HANO) inspired by the hierarchical matrix approach. HANO features a scale-adaptive interaction range and self-attention over a hierarchy of levels, enabling nested feature computation with controllable linear cost and encoding/decoding of the multiscale solution space. We also incorporate an empirical $H^1$ loss function to enhance the learning of high-frequency components. Our numerical experiments demonstrate that HANO outperforms state-of-the-art (SOTA) methods for representative multiscale problems.
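The abstract does not spell out the empirical $H^1$ loss; a generic discrete version, sketched below on a uniform 1D grid and assuming equal weighting of the $L^2$ term and the gradient (seminorm) term, illustrates why such a loss emphasizes high-frequency mismatch.

```python
import torch

def h1_loss(pred, target, dx):
    """Empirical H^1-type loss on a uniform 1D grid: the usual mean-squared
    error plus the mean-squared error of first-order finite-difference
    gradients, which penalizes mismatch in high-frequency content.
    (Sketch only; the weighting and dimensionality used by HANO may differ.)"""
    l2 = torch.mean((pred - target) ** 2)
    dpred = (pred[..., 1:] - pred[..., :-1]) / dx
    dtarget = (target[..., 1:] - target[..., :-1]) / dx
    h1_semi = torch.mean((dpred - dtarget) ** 2)
    return l2 + h1_semi
```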
A micromechanically motivated phase-field damage model is proposed to investigate the fracture behaviour of crosslinked polyurethane adhesives. Crosslinked polyurethane adhesives typically show viscoelastic behaviour with geometric nonlinearity. The finite-strain viscoelastic behaviour is modelled using a micromechanical network model that accounts for the distribution of shorter and longer chain lengths. The micromechanical viscoelastic network model also considers the softening due to breakage/debonding of the short chains with increasing deformation. The micromechanical model is coupled with the phase-field damage model to investigate crack initiation and propagation. The critical energy release rate is required as a material property to solve the phase-field equation; here, it is formulated based on the polymer chain network. The numerical investigation is performed using the finite element method. The force-displacement curves from the numerical analysis and experiments are compared to validate the proposed material model.
We study an interacting particle method (IPM) for computing the large deviation rate function of entropy production for diffusion processes, with emphasis on the vanishing-noise limit and high dimensions. The crucial ingredient to obtain the rate function is the computation of the principal eigenvalue $\lambda$ of elliptic, non-self-adjoint operators. We show that this principal eigenvalue can be approximated in terms of the spectral radius of a discretized evolution operator obtained from an operator splitting scheme and an Euler--Maruyama scheme with a small time step size, and that this spectral radius can be accessed through a large number of iterations of the discretized semigroup, which is well suited to the IPM. The IPM applies naturally to problems in unbounded domains, scales easily to high dimensions, and adapts to singular behaviors in the vanishing-noise limit. We present numerical examples in dimensions up to 16. The numerical results show that our numerical approximation of $\lambda$ converges to the analytical vanishing-noise limit within visual tolerance with a fixed number of particles and a fixed time step size. Our paper appears to be the first to obtain numerical results for principal eigenvalue problems of non-self-adjoint operators in such high dimensions.
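For orientation (this is not the paper's scheme, which involves operator splitting and the entropy-production tilting), here is a minimal 1D sketch of a generic interacting-particle/cloning estimator for the principal eigenvalue of a tilted generator $L + V$, with $Lf = b f' + \tfrac{\varepsilon}{2} f''$; the drift $b$, potential $V$, noise level $\varepsilon$ and all numerical parameters below are placeholders.

```python
import numpy as np

def ipm_principal_eigenvalue(b, V, eps, x0, dt=1e-3, n_steps=50_000, n_particles=1_000, seed=0):
    """Interacting-particle (cloning) estimate of the principal eigenvalue of
    the tilted generator L + V, with L f = b f' + (eps/2) f''.
    Each step: Euler-Maruyama move, Feynman-Kac weighting by exp(V dt),
    multinomial resampling; the eigenvalue is the average log growth rate
    of the total weight."""
    rng = np.random.default_rng(seed)
    x = np.full(n_particles, float(x0))
    log_growth = 0.0
    for _ in range(n_steps):
        x = x + b(x) * dt + np.sqrt(eps * dt) * rng.standard_normal(n_particles)
        w = np.exp(V(x) * dt)
        log_growth += np.log(w.mean())
        # Resample proportionally to the weights (keeps the population size fixed).
        idx = rng.choice(n_particles, size=n_particles, p=w / w.sum())
        x = x[idx]
    return log_growth / (n_steps * dt)

# Placeholder drift and tilting potential; eps plays the role of the noise level.
lam = ipm_principal_eigenvalue(b=lambda x: -x, V=lambda x: np.cos(x), eps=0.1, x0=0.0)
```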
Multiscale problems are challenging for neural network-based discretizations of differential equations, such as physics-informed neural networks (PINNs). This can be (partly) attributed to the so-called spectral bias of neural networks. To improve the performance of PINNs for time-dependent problems, a combination of multifidelity stacking PINNs and domain decomposition-based finite basis PINNs (FBPINNs) is employed. In particular, a domain decomposition in time is employed to learn the high-fidelity part of the multifidelity model. The performance is investigated for a pendulum problem, a two-frequency problem, and the Allen-Cahn equation. The domain decomposition approach clearly improves on the plain PINN and stacking PINN approaches. Finally, it is demonstrated that the FBPINN approach can be extended to multifidelity physics-informed deep operator networks.
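A minimal sketch of the general idea of domain decomposition in time for a PINN (not the paper's multifidelity stacking or FBPINN construction): train one small network per time window for the pendulum equation $u'' + \omega^2 \sin(u) = 0$, handing the value and velocity at each window boundary to the next window. The frequency, window boundaries, network size and training schedule are placeholders.

```python
import torch

torch.manual_seed(0)
omega = 2.0                       # placeholder pendulum frequency

def make_net():
    return torch.nn.Sequential(
        torch.nn.Linear(1, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 1))

def train_window(net, t0, t1, u0, v0, n_col=64, n_iter=2000):
    """Train one PINN on the time window [t0, t1] for u'' + omega^2 sin(u) = 0,
    matching the value u0 and velocity v0 handed over from the previous window."""
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(n_iter):
        t = torch.rand(n_col, 1) * (t1 - t0) + t0
        t.requires_grad_(True)
        u = net(t)
        du = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
        d2u = torch.autograd.grad(du.sum(), t, create_graph=True)[0]
        residual = d2u + omega ** 2 * torch.sin(u)
        ta = torch.tensor([[t0]], requires_grad=True)       # window start
        ua = net(ta)
        va = torch.autograd.grad(ua.sum(), ta, create_graph=True)[0]
        loss = (residual ** 2).mean() + (ua - u0).pow(2).sum() + (va - v0).pow(2).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return net

# Sequential training over time windows, passing the state across interfaces.
u0, v0 = torch.tensor(1.0), torch.tensor(0.0)
windows = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]
nets = []
for t0, t1 in windows:
    net = train_window(make_net(), t0, t1, u0, v0)
    te = torch.tensor([[t1]], requires_grad=True)
    ue = net(te)
    ve = torch.autograd.grad(ue.sum(), te)[0]
    u0, v0 = ue.detach().squeeze(), ve.detach().squeeze()   # hand over to next window
    nets.append(net)
```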