Algebraic multigrid (AMG) methods are among the most efficient solvers for large sparse linear systems and are widely used to solve problems stemming from the discretization of Partial Differential Equations (PDEs). Their most severe limitation is a dependence on parameters that must be fine-tuned. In particular, the strong threshold parameter is the most relevant, since it governs the construction of the hierarchy of successively coarser grids on which AMG relies. We introduce a novel Deep Learning algorithm that minimizes the computational cost of the AMG method when used as a finite element solver, and we show that it requires only minimal changes to existing code. The proposed Artificial Neural Network (ANN) tunes the strong threshold parameter by interpreting the sparse matrix of the linear system as a black-and-white image and applying a pooling operator that transforms it into a small multi-channel image. We show experimentally that this pooling drastically reduces the cost of processing a large sparse matrix while preserving the features needed for the regression task at hand. We train the proposed algorithm on a large dataset containing diffusion problems with a highly heterogeneous diffusion coefficient, defined on different three-dimensional geometries and discretized with unstructured grids, as well as linear elasticity problems with a highly heterogeneous Young's modulus. When tested on problems with coefficients or geometries not present in the training dataset, our approach reduces the computational time by up to 30%.
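As a hedged illustration of the pooling step described above (the paper's exact pooling operator and channels are not specified here; nonzero count and mean magnitude per block are assumptions for the sketch), a sparse matrix can be reduced to a fixed-size multi-channel image as follows:

```python
import numpy as np
from scipy import sparse

def pool_sparse_matrix(A, out_size=32):
    """Pool an n-by-n sparse matrix into a small multi-channel image.

    Each output pixel summarizes one block of A with two illustrative
    channels: nonzero count and mean absolute value of the block entries.
    """
    A = sparse.coo_matrix(A)
    n = A.shape[0]
    # Map each nonzero (i, j) to a pixel of the out_size x out_size image.
    rows = np.minimum(A.row.astype(np.int64) * out_size // n, out_size - 1)
    cols = np.minimum(A.col.astype(np.int64) * out_size // n, out_size - 1)
    flat = rows * out_size + cols

    counts = np.bincount(flat, minlength=out_size**2).astype(float)
    sums = np.bincount(flat, weights=np.abs(A.data), minlength=out_size**2)
    means = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)

    # Shape (channels, out_size, out_size), ready to feed a small CNN.
    return np.stack([counts, means]).reshape(2, out_size, out_size)

# Example: pool a large 1D Laplacian matrix into a 32x32, 2-channel image.
A = sparse.diags([-1, 2, -1], [-1, 0, 1], shape=(10_000, 10_000))
print(pool_sparse_matrix(A).shape)  # (2, 32, 32)
```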
We present a novel method for recovering the absolute pose and shape of a human in a pre-scanned scene from a single image. Unlike previous methods that perform scene-aware mesh optimization, we propose to first estimate the absolute position and dense scene contacts with a sparse 3D CNN, and then enhance a pretrained human mesh recovery network through cross-attention with the derived 3D scene cues. Joint learning on images and scene geometry enables our method to reduce the ambiguity caused by depth and occlusion, resulting in more plausible global postures and contacts. Encoding scene-aware cues in the network also makes the proposed method optimization-free and opens up the opportunity for real-time applications. Experiments show that the proposed network recovers accurate and physically plausible meshes in a single forward pass and outperforms state-of-the-art methods in both accuracy and speed.
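A minimal sketch of the cross-attention mechanism by which scene cues can condition image features, assuming single-head attention with random (untrained) projections and hypothetical token counts and dimensions; the paper's actual architecture will differ:

```python
import numpy as np

def cross_attention(img_feats, scene_feats, d_k=64, rng=np.random.default_rng(0)):
    """Single-head cross-attention: image tokens query 3D scene cues.

    img_feats:   (n_img, d)   features from the mesh-recovery backbone
    scene_feats: (n_scene, d) features derived from the sparse 3D CNN
    The projection weights would normally be learned; random here.
    """
    d = img_feats.shape[1]
    Wq, Wk, Wv = (rng.standard_normal((d, d_k)) / np.sqrt(d) for _ in range(3))
    Q, K, V = img_feats @ Wq, scene_feats @ Wk, scene_feats @ Wv
    scores = Q @ K.T / np.sqrt(d_k)                      # (n_img, n_scene)
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)              # softmax over scene cues
    return attn @ V                                      # scene-conditioned tokens

# 196 image tokens attend to 512 scene-cue tokens, both 256-dimensional.
out = cross_attention(np.random.randn(196, 256), np.random.randn(512, 256))
print(out.shape)  # (196, 64)
```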
A well-established approach for inferring full displacement and stress fields from possibly sparse data is to calibrate the parameters of a given constitutive model using a Bayesian update. After calibration, a (stochastic) forward simulation is conducted with the identified model parameters to resolve physical fields in regions that were not accessible to the measurement device. A shortcoming of model calibration is the assumption that the model best represents reality, which is not always the case, especially in the context of aging structures and materials. While this issue is often addressed by repeated model calibration, the recently proposed statistical Finite Element Method (statFEM) follows a different approach. Instead of using Bayes' theorem to update model parameters, the displacement is chosen as the stochastic prior and updated to fit the measurement data more closely. For this purpose, the statFEM framework introduces a so-called model-reality mismatch, parametrized by only three hyperparameters. This makes the inference of full-field data computationally efficient in an online stage: if the stochastic prior can be computed offline, solving the underlying partial differential equation (PDE) online is unnecessary. Compared to solving a PDE, identifying only three hyperparameters and conditioning the state on the sensor data requires far fewer computational resources. This paper makes two contributions to the existing statFEM approach: first, we use a non-intrusive polynomial chaos method to compute the prior, enabling the use of complex mechanical models in deterministic formulations. Second, we examine the influence of the prior material model (linear elastic and St. Venant-Kirchhoff material with uncertain Young's modulus) on the updated solution. We present statFEM results for 1D and 2D examples; the extension to 3D is straightforward.
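The conditioning step can be sketched with plain Gaussian identities, assuming the commonly used statFEM observation model y = rho*P*u + d + e, where the mismatch d is a zero-mean GP and (rho, sigma_d, ell) are the three hyperparameters; the sensor layout and all numbers below are illustrative:

```python
import numpy as np

def statfem_update(mu_u, C_u, y, P, rho, sigma_d, ell, sigma_e, x_sensors):
    """Condition a stochastic FE prior on sensor data (statFEM-style sketch).

    mu_u, C_u : prior mean/covariance of the FE solution (e.g., from PCE)
    P         : observation matrix mapping dofs to sensor locations
    """
    r2 = ((x_sensors[:, None] - x_sensors[None, :]) / ell) ** 2
    C_d = sigma_d**2 * np.exp(-0.5 * r2)          # squared-exponential mismatch
    C_e = sigma_e**2 * np.eye(len(y))             # sensor noise
    S = rho**2 * P @ C_u @ P.T + C_d + C_e        # marginal covariance of y
    K = np.linalg.solve(S, rho * P @ C_u).T       # Kalman-type gain
    mu_post = mu_u + K @ (y - rho * P @ mu_u)
    C_post = C_u - K @ (rho * P @ C_u)
    return mu_post, C_post

# Toy 1D usage: 20-dof prior, 5 sensors observing dofs 2, 6, 10, 14, 18.
n, sensors = 20, [2, 6, 10, 14, 18]
x = np.linspace(0, 1, n)
C_u = 0.05 * np.exp(-0.5 * ((x[:, None] - x[None, :]) / 0.3) ** 2)
P = np.eye(n)[sensors]
y = np.sin(np.pi * x[sensors])                    # synthetic sensor readings
mu, C = statfem_update(np.zeros(n), C_u, y, P, rho=1.0,
                       sigma_d=0.1, ell=0.2, sigma_e=0.01, x_sensors=x[sensors])
```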
Spiking neural networks (SNNs) have been widely used due to their strong biological interpretability and high energy efficiency. With the introduction of the backpropagation algorithm and surrogate gradients, the structure of spiking neural networks has become more complex, and the performance gap with artificial neural networks has gradually narrowed. However, most SNN hardware implementations on field-programmable gate arrays (FPGAs) cannot meet arithmetic or memory efficiency requirements, which significantly restricts the development of SNNs: they either do not exploit the arithmetic operations between binary spikes and synaptic weights, or they assume unlimited on-chip RAM by using overly expensive devices for small tasks. To improve arithmetic efficiency, we analyze the neural dynamics of spiking neurons, generalize the SNN arithmetic operation to a multiplex-accumulate operation, and propose a high-performance implementation of this operation using the DSP48E2 hard block in Xilinx UltraScale FPGAs. To improve memory efficiency, we design a memory system that enables efficient access to synaptic weights and membrane voltages with reasonable on-chip RAM consumption. Combining these two improvements, we propose an FPGA accelerator that processes spikes generated by firing neurons on the fly (FireFly). FireFly is the first SNN accelerator to incorporate DSP optimization techniques into SNN synaptic operations. Implemented on several FPGA edge devices with limited resources, FireFly still guarantees a peak performance of 5.53 TOP/s at 300 MHz, and as a lightweight accelerator it achieves the highest computational density efficiency compared with existing research using large FPGA devices.
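The multiplex-accumulate idea can be emulated in software: because spikes are binary, each spike selects a weight (multiplex) rather than multiplying it, and the selected weights are accumulated into the membrane potential. The following sketch uses illustrative LIF dynamics (leak, threshold, reset values are assumptions, not FireFly's configuration):

```python
import numpy as np

def multiplex_accumulate(spikes, weights, v_membrane, v_threshold=1.0, leak=0.9):
    """Software emulation of one multiplex-accumulate timestep of a LIF layer.

    spikes:  (n_in,)        binary input spike vector for one timestep
    weights: (n_out, n_in)  signed synaptic weights
    """
    # Multiplex: where spike == 1 pass the weight through, else 0; accumulate.
    selected = np.where(spikes[None, :] == 1, weights, 0)
    v_membrane = leak * v_membrane + selected.sum(axis=1)
    out_spikes = (v_membrane >= v_threshold).astype(np.int8)
    v_membrane = np.where(out_spikes == 1, 0.0, v_membrane)  # reset after firing
    return out_spikes, v_membrane

rng = np.random.default_rng(0)
s = (rng.random(128) < 0.2).astype(np.int8)            # sparse input spikes
W = rng.standard_normal((64, 128)).astype(np.float32)
spikes_out, v = multiplex_accumulate(s, W, np.zeros(64, dtype=np.float32))
```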
In this paper we propose a new finite element method for solving elliptic optimal control problems with pointwise state constraints, covering distributed controls as well as Dirichlet and Neumann boundary controls. The main idea is to use energy space regularizations in the objective functional; the equivalent representations of the energy space norms, i.e., the $H^{-1}(\Omega)$-norm for the distributed control, the $H^{1/2}(\Gamma)$-norm for the Dirichlet control, and the $H^{-1/2}(\Gamma)$-norm for the Neumann control, enable us to transform the optimal control problem into an elliptic variational inequality involving only the state variable. The resulting variational inequalities are of second order in all three cases and include additional equality constraints for the Dirichlet and Neumann boundary control problems. Standard $C^0$ finite elements can be used to solve the resulting variational inequality. We provide preliminary a priori error estimates for the new algorithm applied to distributed control problems. Extensive numerical experiments validate the accuracy of the new algorithm.
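As a hedged illustration of the distributed-control case, assume the state equation is the Poisson problem $-\Delta y = u$ with homogeneous Dirichlet conditions and an obstacle-type state constraint $y \le \psi$ (the paper's precise setting may differ). The identity $\|u\|_{H^{-1}(\Omega)} = |y|_{H^1(\Omega)}$, which holds since $u = -\Delta y$, eliminates the control from the $H^{-1}(\Omega)$-regularized objective
$$\min_{y \in K} \ \tfrac{1}{2}\|y - y_d\|_{L^2(\Omega)}^2 + \tfrac{\beta}{2}\,|y|_{H^1(\Omega)}^2, \qquad K = \bigl\{ v \in H^1_0(\Omega) : v \le \psi \ \text{a.e. in } \Omega \bigr\},$$
whose first-order optimality condition is the second-order elliptic variational inequality in the state alone: find $y \in K$ such that
$$(y - y_d,\, v - y)_{L^2(\Omega)} + \beta\,(\nabla y,\, \nabla (v - y))_{L^2(\Omega)} \ \ge\ 0 \qquad \forall\, v \in K.$$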
While constraints arise naturally in many physical models, their treatment in mathematical and numerical models varies widely, depending on the nature of the constraint and the availability of simulation tools to enforce it. In this paper, we consider the solution of discretized PDE models that have a natural constraint on the positivity (or non-negativity) of the solution. While discretizations of such models often offer analogous positivity properties for their exact solutions, the use of approximate solution algorithms (and the unavoidable effects of floating-point arithmetic) often destroys any guarantee that the computed approximate solution satisfies the (discretized form of the) physical constraints, unless the discrete model is solved to much higher precision than discretization error would dictate. Here, we introduce a class of iterative solution algorithms, based on the unigrid variant of multigrid methods, in which such positivity constraints are preserved throughout the approximate solution process. Numerical results for one- and two-dimensional model problems show both the effectiveness of the approach and the trade-off required to ensure positivity of the approximate solution throughout the solution process.
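One simple way to preserve positivity in a unigrid-style sweep is to compute the usual energy-minimizing step along each search direction and then scale it back just enough that the iterate stays non-negative. The sketch below makes this concrete for a 1D Laplacian; it uses only unit vectors as directions (a full unigrid method would also sweep interpolated coarse-grid basis vectors), and is an illustration of the clipping idea, not the paper's algorithm:

```python
import numpy as np

def positive_unigrid_sweep(A, b, u, directions):
    """One unigrid-style sweep that keeps u >= 0 componentwise.

    For each direction d, take the energy-minimizing step
    alpha = <r, d> / <A d, d>, then clip it so u + t*alpha*d >= 0.
    """
    for d in directions:
        r = b - A @ u
        alpha = (r @ d) / (d @ (A @ d))
        s = alpha * d
        neg = s < 0
        # Largest admissible fraction of the step keeping u non-negative.
        t = min(1.0, np.min(u[neg] / -s[neg])) if neg.any() else 1.0
        u = u + t * s
    return u

# 1D Laplacian with a forcing term that would pull parts of u negative.
n = 63
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.full(n, 0.01); b[n // 2] = -1.0
u = positive_unigrid_sweep(A, b, np.zeros(n), list(np.eye(n)))
assert (u >= 0).all()
```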
Recently it has been shown that using diffusion models for inverse problems can lead to remarkable results. However, these approaches require a closed-form expression of the degradation model and cannot handle complex degradations. To overcome this limitation, we propose a method (INDigo) that combines invertible neural networks (INNs) and diffusion models for general inverse problems. Specifically, we train the forward process of the INN to simulate an arbitrary degradation process and use its inverse as a reconstruction process. During diffusion sampling, we impose an additional data-consistency step that, at every iteration, minimizes the distance between the intermediate result and the INN-optimized result, where the INN-optimized image combines the coarse information given by the observed degraded image with the details generated by the diffusion process. With the help of the INN, our algorithm effectively estimates the details lost in the degradation process and is no longer limited by the need for a closed-form expression of the degradation model. Experiments demonstrate that our algorithm achieves results that are competitive with recent leading methods, both quantitatively and visually. Moreover, it performs well on more complex degradation models and on real-world low-quality images.
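The data-consistency idea can be sketched as follows: push the denoised diffusion iterate through the INN, replace its coarse channels with the observed degraded image, invert, and blend. The toy "INN" below is a fixed orthogonal transform split into coarse/detail halves, a stand-in for the trained network; the blending weight and shapes are assumptions:

```python
import numpy as np

def data_consistency_step(x_denoised, y_obs, inn_forward, inn_inverse, lam=0.5):
    """Blend the diffusion iterate with an INN-consistent reconstruction.

    Coarse part from the observation, details from the diffusion iterate.
    """
    _, detail = inn_forward(x_denoised)
    x_consistent = inn_inverse(y_obs, detail)   # observed coarse + generated detail
    return (1 - lam) * x_denoised + lam * x_consistent

# Toy "INN": a fixed orthogonal transform split into coarse/detail halves.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((64, 64)))
fwd = lambda x: ((Q @ x)[:32], (Q @ x)[32:])
inv = lambda c, d: Q.T @ np.concatenate([c, d])

x = rng.standard_normal(64)                     # current denoised diffusion iterate
y = fwd(x)[0] + 0.1 * rng.standard_normal(32)   # noisy "degraded" observation
x_new = data_consistency_step(x, y, fwd, inv)
```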
Defect reconstruction is essential in non-destructive testing and structural health monitoring with guided ultrasonic waves. This paper presents an algorithm for reconstructing notches in steel plates, which serve as artificial defects representing cracks, by comparing measured results with those of a simulation model. The model contains a parameterized notch whose geometrical parameters are to be reconstructed. While the algorithm is formulated and presented in a generalized form applicable to many different defect types, a special case of guided wave propagation is used so that one of the simplest possible simulation models, discretizing only the cross-section of the steel plate, can be investigated. An efficient simulation model of the plate cross-section is obtained by the semi-analytical Scaled Boundary Finite Element Method. The reconstruction algorithm is gradient-based, with the gradient computed by Algorithmic Differentiation. The dedicated experimental setup excites nearly plane wave fronts propagating orthogonally to the notch, and a scanning Laser Doppler Vibrometer records the velocity field at selected points on the plate surface as input to the reconstruction algorithm. Using two plates with notches of different depths, we demonstrate that accurate geometry reconstruction is possible.
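The overall loop of gradient-based geometry reconstruction can be sketched in a few lines. The forward model below is a toy analytic stand-in for the SBFEM simulation (a Gaussian dip mimicking the notch's effect on the measured field), and complex-step differentiation stands in for true Algorithmic Differentiation; both substitutions are assumptions for the sake of a self-contained example:

```python
import numpy as np

def misfit(params, x_meas, v_meas):
    """Toy forward model + least-squares misfit for a notch (depth, width)."""
    depth, width = params
    v_sim = 1.0 - depth * np.exp(-(x_meas / width) ** 2)
    return np.sum((v_sim - v_meas) ** 2)

def grad_complex_step(f, params, *args, h=1e-20):
    """Machine-precision derivatives via complex-step differentiation,
    standing in for algorithmic differentiation of the simulation."""
    g = np.zeros(len(params))
    for i in range(len(params)):
        p = np.array(params, dtype=complex)
        p[i] += 1j * h
        g[i] = f(p, *args).imag / h
    return g

x = np.linspace(-2, 2, 50)
v_meas = 1.0 - 0.3 * np.exp(-(x / 0.5) ** 2)   # synthetic "measured" field
p = np.array([0.1, 1.0])                        # initial (depth, width) guess
for _ in range(500):                            # plain gradient descent
    p -= 0.01 * grad_complex_step(misfit, p, x, v_meas)
print(p)  # should approach (0.3, 0.5); behavior depends on the step size
```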
Tensors, also known as multidimensional arrays, are useful data structures in machine learning and statistics. In recent years, Bayesian methods have emerged as a popular direction for analyzing tensor-valued data since they provide a convenient way to introduce sparsity into the model and conduct uncertainty quantification. In this article, we provide an overview of frequentist and Bayesian methods for solving tensor completion and regression problems, with a focus on Bayesian methods. We review common Bayesian tensor approaches including model formulation, prior assignment, posterior computation, and theoretical properties. We also discuss potential future directions in this field.
Gaussian process (GP) based methods for solving partial differential equations (PDEs) show great promise by bridging the gap between the theoretical rigor of traditional numerical algorithms and the flexible design of machine learning solvers. The main bottleneck of GP methods lies in the inversion of a covariance matrix, whose cost grows cubically with the number of samples. Drawing inspiration from neural networks, we propose a mini-batch algorithm for GP-based solvers of nonlinear PDEs. The algorithm takes a mini-batch of samples at each step to update the GP model, so the computational cost is amortized over the iterations. Using stability analysis and convexity arguments, we show that the mini-batch method steadily reduces a natural measure of error towards zero at a rate of O(1/K + 1/M), where K is the number of iterations and M is the batch size. Numerical results show that smooth problems benefit from a small batch size, while less regular problems require careful sample selection for optimal accuracy.
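The flavor of a mini-batch update can be illustrated on a 1D Poisson problem: represent the solution by a kernel expansion and update its weights from a fresh mini-batch of collocation points at every step. This simplified gradient update is a stand-in for the paper's GP update; kernel, centers, batch size, and step size are all illustrative choices:

```python
import numpy as np

# Solve -u'' = f on (0,1), u(0) = u(1) = 0, exact solution sin(pi x).
rng = np.random.default_rng(0)
ell = 0.2
k   = lambda x, z: np.exp(-(x[:, None] - z[None, :])**2 / (2 * ell**2))
kxx = lambda x, z: k(x, z) * (((x[:, None] - z[None, :])**2 / ell**2 - 1) / ell**2)

Z = np.linspace(0, 1, 40)                       # expansion centers (fixed)
alpha = np.zeros(len(Z))                        # representer weights
f = lambda x: np.pi**2 * np.sin(np.pi * x)
xb_bc = np.array([0.0, 1.0])                    # boundary points

for it in range(5000):
    xb = rng.random(16)                         # mini-batch of collocation points
    r_pde = -(kxx(xb, Z) @ alpha) - f(xb)       # PDE residual on the batch
    r_bc = k(xb_bc, Z) @ alpha                  # boundary residual
    grad = -kxx(xb, Z).T @ r_pde / len(xb) + k(xb_bc, Z).T @ r_bc
    alpha -= 1e-4 * grad                        # illustrative step size

x_test = np.linspace(0, 1, 101)
print(np.max(np.abs(k(x_test, Z) @ alpha - np.sin(np.pi * x_test))))
```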
Deep neural networks have achieved remarkable success in computer vision tasks. Existing neural networks mainly operate in the spatial domain with fixed input sizes; in practical applications, images are usually large and must be downsampled to the predetermined input size of the network. Although downsampling reduces computation and the required communication bandwidth, it removes both redundant and salient information indiscriminately, which degrades accuracy. Inspired by digital signal processing theory, we analyze the spectral bias from a frequency perspective and propose a learning-based frequency selection method that identifies the trivial frequency components that can be removed without accuracy loss. The proposed frequency-domain learning method reuses the structures of well-known neural networks, such as ResNet-50, MobileNetV2, and Mask R-CNN, while accepting frequency-domain information as input. Experimental results show that learning in the frequency domain with static channel selection achieves higher accuracy than the conventional spatial downsampling approach while further reducing the input data size. Specifically, for ImageNet classification with the same input size, the proposed method achieves 1.41% and 0.66% top-1 accuracy improvements on ResNet-50 and MobileNetV2, respectively. Even with half the input size, the proposed method still improves the top-1 accuracy on ResNet-50 by 1%. In addition, we observe a 0.8% average precision improvement on Mask R-CNN for instance segmentation on the COCO dataset.
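A hedged sketch of the frequency-domain input representation: split the image into 8x8 blocks, apply a 2D DCT per block, and regroup each DCT coefficient across blocks into one channel at 1/8 spatial resolution. The fixed low-frequency-first selection below stands in for the learned channel selection described above; block size and channel count are assumptions:

```python
import numpy as np
from scipy.fft import dctn

def to_frequency_channels(img, block=8, keep=24):
    """Rearrange a grayscale image into per-frequency channels via block DCT.

    Coefficient (u, v) across all blocks forms one channel; keeping the
    first `keep` channels ordered by u + v is a simple heuristic.
    """
    H, W = img.shape
    h, w = H // block, W // block
    blocks = img[:h*block, :w*block].reshape(h, block, w, block).transpose(0, 2, 1, 3)
    coeffs = dctn(blocks, axes=(2, 3), norm='ortho')      # (h, w, 8, 8)
    channels = coeffs.reshape(h, w, block * block).transpose(2, 0, 1)
    order = np.argsort([u + v for u in range(block) for v in range(block)])
    return channels[order[:keep]]                         # (keep, H/8, W/8)

img = np.random.rand(224, 224).astype(np.float32)
x = to_frequency_channels(img)
print(x.shape)  # (24, 28, 28), fed to the CNN in place of the raw image
```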