The multiscale hybrid-mixed (MHM) method is a multi-level strategy for approximating the solution of boundary value problems with heterogeneous coefficients. In this context, we propose a family of low-order finite elements for the linear elasticity problem which are free from Poisson locking. The finite elements rely on face degrees of freedom associated with multiscale bases obtained from local Neumann problems with piecewise polynomial interpolations on the faces. We establish sufficient refinement levels of the fine-scale mesh such that the MHM method is well-posed, optimally convergent under local regularity conditions, and locking-free. Two-dimensional numerical tests corroborate the theoretical results.
We propose the new concept of a codivergence, which quantifies the similarity between two probability measures $P_1, P_2$ relative to a reference probability measure $P_0$. In a neighborhood of the reference measure $P_0$, a codivergence behaves like an inner product between the measures $P_1 - P_0$ and $P_2 - P_0$. Codivergences of covariance type and correlation type are introduced and studied, with a focus on two specific correlation-type codivergences: the $\chi^2$-codivergence and the Hellinger codivergence. We derive explicit expressions for these codivergences for several common parametric families of probability distributions. Moreover, for a codivergence we introduce the divergence matrix as an analogue of the Gram matrix. We show that the $\chi^2$-divergence matrix satisfies a data-processing inequality.
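For intuition, a natural form consistent with this inner-product behaviour (our notation; the paper's exact definition may differ) is
$$\chi^2(P_1, P_2 \mid P_0) \;=\; \int \frac{(\mathrm{d}P_1 - \mathrm{d}P_0)\,(\mathrm{d}P_2 - \mathrm{d}P_0)}{\mathrm{d}P_0},$$
which is bilinear in $P_1 - P_0$ and $P_2 - P_0$ and reduces to the classical $\chi^2$-divergence when $P_1 = P_2$.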
When modeling a vector of risk variables, extreme scenarios are often of special interest. The peaks-over-thresholds method hinges on the notion that, asymptotically, the excesses over a vector of high thresholds follow a multivariate generalized Pareto distribution. However, the existing literature has primarily concentrated on the setting where all risk variables are always large simultaneously. In reality, this assumption is often not met, especially in high dimensions. In response to this limitation, we study scenarios where distinct groups of risk variables may exhibit joint extremes while others do not. These discernible groups are derived from the angular measure inherent in the corresponding max-stable distribution, whence the term \emph{extreme direction}. We explore such extreme directions within the framework of multivariate generalized Pareto distributions, with a focus on their probability density functions with respect to an appropriate dominating measure. Furthermore, we provide a stochastic construction that allows any prespecified set of risk groups to constitute the distribution's extreme directions. This construction takes the form of a smoothed max-linear model and accommodates the full spectrum of conceivable max-stable dependence structures. Additionally, we introduce a generic simulation algorithm tailored to multivariate generalized Pareto distributions, offering specific implementations for extensions of the logistic and H\"usler-Reiss families capable of carrying arbitrary extreme directions.
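As a hedged sketch of the generic construction that such simulation algorithms build on: a standardized (exponential-scale) multivariate generalized Pareto vector admits the representation $E\mathbf{1} + T - \max_j T_j$, with $E \sim \mathrm{Exp}(1)$ independent of a generator vector $T$. The Gumbel generator below is only an illustrative stand-in, not the paper's logistic or H\"usler-Reiss extensions.

```python
import numpy as np

def rmgpd_std(n, sample_T, rng):
    """Draw n standardized (exponential-scale) multivariate generalized Pareto
    vectors via X = E*1 + T - max_j T_j, where E ~ Exp(1) is independent of
    the generator vector T and sample_T(n, rng) returns an (n, d) array."""
    T = sample_T(n, rng)
    E = rng.exponential(size=(n, 1))
    return E + T - T.max(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rmgpd_std(5, lambda n, r: r.gumbel(size=(n, 3)), rng)
print(X.max(axis=1))  # row-wise maxima equal E, hence positive, as an MGPD requires
```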
We propose a computationally efficient and systematically convergent approach for elastodynamics simulations. We recast the second-order dynamical equation of elastodynamics into an equivalent first-order system of coupled equations, so as to express the solution in the form of a Magnus expansion. For any spatial discretization, this entails computing the exponential of a matrix acting upon a vector. We employ an adaptive Krylov subspace approach to inexpensively and accurately evaluate the action of the exponential matrix on a vector. In particular, we use an a priori error estimate to predict the optimal Krylov subspace size required for each time-step size. We show that the Magnus expansion truncated after its first term provides quadratic and superquadratic convergence in the time-step for nonlinear and linear elastodynamics, respectively. We demonstrate the accuracy and efficiency of the proposed method for one linear (linear cantilever beam) and three nonlinear (nonlinear cantilever beam, soft tissue elastomer, and hyperelastic rubber) benchmark systems. For a desired accuracy in energy, displacement, and velocity, our method allows for $10-100\times$ larger time-steps than conventional time-marching schemes such as the Newmark-$\beta$ method. Computationally, this translates into a $\sim$$1000\times$ and $\sim$$10-100\times$ speed-up over conventional time-marching schemes for linear and nonlinear elastodynamics, respectively.
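As a minimal illustration of the computational kernel on a toy 1-D problem (sizes and discretization are our own; SciPy's expm_multiply uses an Al-Mohy-Higham truncated-Taylor scheme rather than the paper's adaptive Krylov method):

```python
import numpy as np
from scipy.sparse import bmat, diags, identity
from scipy.sparse.linalg import expm_multiply

# Toy linear elastodynamics after spatial discretization: M u'' + K u = 0,
# recast as the first-order system y' = A y with y = (u, v) and v = u'.
n = 200
K = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n)) * n**2  # 1-D stiffness
Minv = identity(n)                                             # lumped unit mass
A = bmat([[None, identity(n)], [-Minv @ K, None]], format="csr")

y0 = np.concatenate([np.sin(np.pi * np.linspace(0.0, 1.0, n)), np.zeros(n)])
dt = 1e-3
# For the linear problem the first Magnus term is exact: y(t + dt) = exp(dt A) y0.
# expm_multiply evaluates this action without ever forming the matrix exponential.
y1 = expm_multiply(dt * A, y0)
```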
Popular artificial neural networks (ANNs) optimize parameters for unidirectional value propagation, assuming some fixed parametrization type such as the Multi-Layer Perceptron (MLP) or Kolmogorov-Arnold Network (KAN). In contrast, for biological neurons, e.g., "it is not uncommon for axonal propagation of action potentials to happen in both directions" \cite{axon}, suggesting they are optimized to continuously operate in a multidirectional way. Additionally, the statistical dependencies a single neuron could model are not just (expected) value dependencies, but entire joint distributions, including higher moments. Such an agnostic joint-distribution neuron would allow for multidirectional propagation (of distributions or values), e.g., $\rho(x|y,z)$ or $\rho(y,z|x)$, by substituting into $\rho(x,y,z)$ and normalizing. We discuss Hierarchical Correlation Reconstruction (HCR) as such a neuron model: assuming a $\rho(x,y,z)=\sum_{ijk} a_{ijk} f_i(x) f_j(y) f_k(z)$ parametrization of the joint distribution with a polynomial basis $f_i$, which allows for flexible, inexpensive processing including nonlinearities, direct model estimation and updating, and training through standard backpropagation or novel methods for such a structure, up to tensor decomposition. Using only pairwise (input-output) dependencies, its expected-value prediction becomes KAN-like, with trained activation functions as polynomials; it can be extended by adding higher-order dependencies through the included products, in a controlled, interpretable way, allowing for multidirectional propagation of both values and probability densities.
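A minimal sketch of this estimation under our own simplifications (data already normalized to $[0,1]^3$, degree-2 rescaled-Legendre basis): for an orthonormal basis, the coefficients $a_{ijk}$ are simply sample means of basis-function products.

```python
import numpy as np

# Orthonormal rescaled Legendre basis on [0, 1]: f_0, f_1, f_2.
F = [lambda x: np.ones_like(x),
     lambda x: np.sqrt(3.0) * (2.0 * x - 1.0),
     lambda x: np.sqrt(5.0) * (6.0 * x**2 - 6.0 * x + 1.0)]

def fit_hcr(X):
    """Estimate a_ijk = E[f_i(x) f_j(y) f_k(z)] by sample means; for an
    orthonormal basis this is the least-squares fit of the model
    rho(x,y,z) = sum_ijk a_ijk f_i(x) f_j(y) f_k(z)."""
    x, y, z = X.T
    m = len(F)
    a = np.empty((m, m, m))
    for i in range(m):
        for j in range(m):
            for k in range(m):
                a[i, j, k] = np.mean(F[i](x) * F[j](y) * F[k](z))
    return a

def rho(a, x, y, z):
    """Evaluate the density model; conditioning, e.g. rho(x | y, z), fixes
    y and z in this expression and renormalizes over x."""
    m = len(F)
    return sum(a[i, j, k] * F[i](x) * F[j](y) * F[k](z)
               for i in range(m) for j in range(m) for k in range(m))
```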
This paper analyzes the properties of two hybrid discretization methods for Gaussian derivatives, based on convolution with either the normalized sampled Gaussian kernel or the integrated Gaussian kernel, followed by central differences. The motivation for studying these discretization methods is that, when multiple spatial derivatives of different orders are needed at the same scale level, they can be computed significantly more efficiently than with more direct derivative approximations based on explicit convolutions with either sampled or integrated Gaussian derivative kernels. These computational benefits also hold for the genuinely discrete approach of computing discrete analogues of Gaussian derivatives, based on convolution with the discrete analogue of the Gaussian kernel followed by central differences. However, the underlying mathematical primitives for the discrete analogue of the Gaussian kernel, in terms of modified Bessel functions of integer order, may not be available in certain frameworks for image processing, such as when performing deep learning based on scale-parameterized filters in terms of Gaussian derivatives, with learning of the scale levels. We characterize the properties of these hybrid discretization methods in terms of quantitative performance measures concerning the amount of spatial smoothing that they imply, as well as the relative consistency of scale estimates obtained from scale-invariant feature detectors with automatic scale selection. The emphasis is on the behaviour for very small values of the scale parameter, which may differ significantly both from the corresponding results of the fully continuous scale-space theory and between the different types of discretization methods.
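A minimal 1-D sketch of the first hybrid scheme (function names and the truncation-radius heuristic are our own): the signal is smoothed once with the normalized sampled Gaussian kernel, after which each derivative order costs only a small central-difference stencil.

```python
import numpy as np
from scipy.ndimage import correlate1d

def sampled_gaussian(sigma, radius):
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    return g / g.sum()  # normalize the sampled kernel to unit sum

def hybrid_derivatives(signal, sigma, orders=(1, 2)):
    """Smooth once, then obtain each requested derivative with small
    central-difference stencils, so the costly smoothing convolution is
    shared across all derivative orders."""
    signal = np.asarray(signal, dtype=float)
    radius = int(np.ceil(4.0 * sigma)) + 1  # truncation heuristic
    L = correlate1d(signal, sampled_gaussian(sigma, radius), mode="nearest")
    d1 = np.array([-0.5, 0.0, 0.5])  # first-order central difference
    d2 = np.array([1.0, -2.0, 1.0])  # second-order central difference
    out = {}
    for order in orders:
        D = L
        for _ in range(order // 2):
            D = correlate1d(D, d2, mode="nearest")
        if order % 2:
            D = correlate1d(D, d1, mode="nearest")
        out[order] = D
    return out
```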
Adaptive finite elements combined with geometric multigrid solvers are among the most efficient numerical methods for problems such as the instationary Navier-Stokes equations. Yet despite their efficiency, computations remain expensive, and the simulation of, for example, complex flow problems can take many hours or days. GPUs provide an interesting avenue for speeding up the calculations due to their very large theoretical peak performance. However, the large degree of parallelism and the non-standard API make the use of GPUs in scientific computing challenging. In this work, we develop a GPU acceleration for the adaptive finite element library Gascoigne and study its effectiveness for different systems of partial differential equations. By systematically formulating all computations as linear algebra operations, we can employ GPU-accelerated linear algebra libraries, which simplifies the implementation and ensures the maintainability of the code while achieving very efficient GPU utilization. Our results for a transport-diffusion equation, linear elasticity, and the instationary Navier-Stokes equations show substantial speedups of up to 20x compared to multi-core CPU implementations.
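A minimal sketch of the underlying design idea (not Gascoigne's actual interface): once a smoothing sweep is written purely as vector and matrix operations, the same code runs on a NumPy CPU backend or a CuPy GPU backend.

```python
import numpy as np
try:
    import cupy as xp  # GPU backend when available ...
except ImportError:
    xp = np            # ... otherwise fall back to the CPU

def smooth(A, dinv, b, x, omega=0.8):
    # One damped-Jacobi sweep x <- x + omega * D^{-1} (b - A x),
    # expressed purely as linear algebra so it is backend-agnostic.
    return x + omega * dinv * (b - A @ x)

n = 2048
A = 2.0 * xp.eye(n) - xp.eye(n, k=1) - xp.eye(n, k=-1)  # dense toy operator
dinv = 1.0 / xp.diagonal(A)
x = smooth(A, dinv, xp.ones(n), xp.zeros(n))
```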
Partial differential equations (PDEs) involving high-contrast and oscillating coefficients are common in scientific and industrial applications. Numerical approximation of these PDEs is a challenging task that can be addressed, for example, by multi-scale finite element analysis. For linear problems, the multi-scale finite element method (MsFEM) is well established, and some viable extensions to non-linear PDEs are known. However, some features of the method seem to be intrinsically tied to linearity. In particular, traditional MsFEM relies on the reuse of computations: for example, the stiffness matrix can be calculated just once while being used for several right-hand sides, or as part of a multi-level iterative algorithm. Roughly speaking, the offline phase of the method amounts to pre-assembling the local linear Dirichlet-to-Neumann (DtN) operators. We present some preliminary results concerning the combination of MsFEM with machine learning tools. The extension of MsFEM to nonlinear problems is achieved by means of learning local nonlinear DtN maps. The resulting learning-based multi-scale method is tested on a set of model nonlinear PDEs involving the $p$-Laplacian and degenerate nonlinear diffusion.
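A hedged sketch of the offline learning stage (every name here, including local_fine_solve, is a hypothetical placeholder): random Dirichlet traces are pushed through precomputed local fine-scale solves, and a small network regresses the resulting Neumann data.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_training_data(local_fine_solve, n_samples, n_trace_dofs, rng):
    G = rng.standard_normal((n_samples, n_trace_dofs))  # random Dirichlet traces
    N = np.stack([local_fine_solve(g) for g in G])      # offline fine-scale solves
    return G, N

rng = np.random.default_rng(0)
toy_solve = lambda g: np.tanh(g) - g.mean()  # stand-in for a local p-Laplacian solve
G, N = make_training_data(toy_solve, 2000, 16, rng)
dtn = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500).fit(G, N)
# Online stage: dtn.predict replaces the local nonlinear solves when the
# multiscale system is assembled for each new right-hand side.
```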
The multigrid V-cycle method is a popular method for solving systems of linear equations. It computes an approximate solution by smoothing on the fine levels and solving a system of linear equations on the coarsest level. The choice of coarsest-level solver depends on the size and difficulty of the problem. If the size permits, it is typical to use a direct method based on LU or Cholesky decomposition. In settings with large coarsest-level problems, approximate solvers, such as iterative Krylov subspace methods or direct methods based on low-rank approximation, are often used instead. The accuracy of the coarsest-level solver is typically chosen based on the users' experience with the concrete problems and methods. In this paper, we present an approach to analyzing the effects of approximate coarsest-level solves on the convergence of the V-cycle method for symmetric positive definite problems. Using these results, we derive a coarsest-level stopping criterion through which we may control the difference between the approximation computed by a V-cycle method with an approximate coarsest-level solver and the approximation that would be computed if the coarsest-level problems were solved exactly. The coarsest-level stopping criterion can thus be set such that the V-cycle method converges to a chosen finest-level accuracy in (nearly) the same number of V-cycle iterations as the V-cycle method with an exact coarsest-level solver. We also use the theoretical results to discuss how the convergence of the V-cycle method may be affected by the choice of the tolerance in a coarsest-level stopping criterion based on the relative residual norm.
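A minimal sketch of where such a criterion enters (data structures are our own; the analysis concerns how the coarsest-level tolerance propagates to the finest-level accuracy):

```python
import numpy as np
from scipy.sparse.linalg import cg

def jacobi(A, b, x, sweeps=2, omega=0.8):
    dinv = omega / A.diagonal()
    for _ in range(sweeps):
        x = x + dinv * (b - A @ x)
    return x

def v_cycle(levels, b, x, coarse_rtol):
    """One V-cycle on a hierarchy where levels[k] = (A_k, P_k) holds the level
    operator and the prolongation from level k+1; the last entry is (A_coarsest,).
    The coarsest system is solved only approximately, by CG with relative
    residual tolerance coarse_rtol (the quantity the stopping criterion sets)."""
    A = levels[0][0]
    if len(levels) == 1:
        xc, _ = cg(A, b, rtol=coarse_rtol)  # approximate coarsest-level solve
        return xc
    x = jacobi(A, b, x)                     # pre-smoothing
    P = levels[0][1]
    r_coarse = P.T @ (b - A @ x)            # restrict the residual
    e = v_cycle(levels[1:], r_coarse, np.zeros_like(r_coarse), coarse_rtol)
    x = x + P @ e                           # coarse-grid correction
    return jacobi(A, b, x)                  # post-smoothing
```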
Semi-algebraic priors are ubiquitous in signal processing and machine learning. Prevalent examples include a) linear models, where the signal lies in a low-dimensional subspace; b) sparse models, where the signal can be represented by only a few coefficients under a suitable basis; and c) a large family of neural network generative models. In this paper, we prove a transversality theorem for semi-algebraic sets in orthogonal or unitary representations of groups: with a suitable dimension bound, a generic translate of any semi-algebraic set is transverse to the orbits of the group action. This, in turn, implies that if a signal lies in a low-dimensional semi-algebraic set, then it can be recovered uniquely from measurements that separate orbits. As an application, we consider the implications of the transversality theorem for the problem of recovering signals that are translated by random group actions from their second moment. As a special case, we discuss cryo-EM, a leading technology for determining the spatial structure of biological molecules, which serves as our prime motivation. In particular, we derive explicit bounds for recovering a molecular structure from the second moment under a semi-algebraic prior and deduce information-theoretic implications. We also obtain information-theoretic bounds for three additional applications: factoring Gram matrices, multi-reference alignment, and phase retrieval. Finally, we deduce bounds for designing permutation-invariant separators in machine learning.
This manuscript derives locally weighted ensemble Kalman methods from the point of view of ensemble-based function approximation. This is done by using pointwise evaluations to build up a local linear or quadratic approximation of a function, tapering off the effect of distant particles via local weighting. This yields a candidate method (the locally weighted ensemble Kalman method for inversion) motivated by combining some of the strengths of the particle filter (its ability to cope with nonlinear maps and non-Gaussian distributions) with those of the ensemble Kalman filter (no filter degeneracy).
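A minimal sketch of the local approximation step (names are our own, and the Gaussian taper is one assumed weighting choice): a weighted least-squares linear model of the forward map around a point $x_0$, with distant ensemble members down-weighted.

```python
import numpy as np

def local_linear_fit(X, FX, x0, ell):
    """Weighted least-squares linear model of a map F around x0, built from
    ensemble evaluations (X[i], FX[i]); the Gaussian taper downweights
    distant particles. Returns estimates of F(x0) and of the local slope."""
    w = np.exp(-np.sum((X - x0)**2, axis=1) / (2.0 * ell**2))  # taper weights
    D = np.hstack([np.ones((len(X), 1)), X - x0])              # design [1, x - x0]
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(D * sw[:, None], FX * sw, rcond=None)
    return beta[0], beta[1:]
```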