The goal of the present paper is to understand the impact of the numerical schemes used to reconstruct data at cell faces in finite-volume methods, and to assess their interaction with the quadrature rule used to compute the average over the cell volume. Here, third-, fifth- and seventh-order WENO-Z schemes are investigated. On a problem with a smooth solution, the theoretical convergence rate of each method is recovered, and changing the order of the reconstruction at cell faces does not affect the results, whereas for a shock-driven problem all the methods collapse to first order. A study of the decay of compressible homogeneous isotropic turbulence reveals that using a high-order quadrature rule to compute the average over a finite-volume cell does not improve the spectral accuracy, and that all methods exhibit a second-order convergence rate. However, the choice of the numerical method used to reconstruct data at cell faces is found to be critical for correctly capturing turbulent spectra. In the context of finite-volume simulations of the practical flows encountered in engineering applications, an efficient strategy is therefore to perform the average integration with a low-order quadrature rule on a fine mesh, while high-order schemes should be used to reconstruct data at cell faces.
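As a hedged illustration of the lowest-order member of the family studied above, a minimal NumPy sketch of a third-order WENO-Z reconstruction of the left state at the face $x_{i+1/2}$ could look as follows (the stencil coefficients and the $\tau = |\beta_0 - \beta_1|$ indicator follow the standard WENO3-Z formulation; function and variable names are illustrative and not taken from the paper):

import numpy as np

def weno3z_left_state(f_im1, f_i, f_ip1, eps=1e-40):
    # Third-order WENO-Z reconstruction of the left state at face i+1/2
    # from the cell averages on cells i-1, i and i+1.
    # Candidate second-order reconstructions on the two sub-stencils
    q0 = -0.5 * f_im1 + 1.5 * f_i     # stencil {i-1, i}
    q1 =  0.5 * f_i   + 0.5 * f_ip1   # stencil {i, i+1}
    # Smoothness indicators
    b0 = (f_i - f_im1) ** 2
    b1 = (f_ip1 - f_i) ** 2
    # WENO-Z nonlinear weights built from the linear weights d0 = 1/3, d1 = 2/3
    tau = abs(b0 - b1)
    a0 = (1.0 / 3.0) * (1.0 + tau / (b0 + eps))
    a1 = (2.0 / 3.0) * (1.0 + tau / (b1 + eps))
    w0, w1 = a0 / (a0 + a1), a1 / (a0 + a1)
    return w0 * q0 + w1 * q1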
In recent years, the connections between deep residual networks and first-order Ordinary Differential Equations (ODEs) have been established. In this work, we further bridge deep neural architecture design with second-order ODEs and propose a novel reversible neural network, termed m-RevNet, characterized by inserting a momentum update into its residual blocks. The reversible property allows us to perform the backward pass without access to the activation values of the forward pass, greatly relieving the storage burden during training. Furthermore, the theoretical foundation based on second-order ODEs endows m-RevNet with stronger representational power than vanilla residual networks, which potentially explains its performance gains. For certain learning scenarios, we analytically and empirically show that m-RevNet succeeds where a standard ResNet fails. Comprehensive experiments on various image classification and semantic segmentation benchmarks demonstrate the superiority of m-RevNet over ResNet in terms of both memory efficiency and recognition performance.
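A minimal sketch of a momentum-augmented residual block and its inverse, in the spirit described above (the exact update rule and momentum coefficient used by m-RevNet may differ; residual_fn and mu are illustrative placeholders), is:

def momentum_block_forward(x, v, residual_fn, mu=0.9):
    # Velocity (momentum) update followed by a residual step.
    v_next = mu * v + residual_fn(x)
    x_next = x + v_next
    return x_next, v_next

def momentum_block_inverse(x_next, v_next, residual_fn, mu=0.9):
    # Recover (x, v) from (x_next, v_next) without stored activations.
    x = x_next - v_next
    v = (v_next - residual_fn(x)) / mu
    return x, v

Because the forward pass is algebraically invertible, the activations of each block can be recomputed during the backward pass instead of being stored, which is the source of the memory savings.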
This paper presents two simulation strategies for compressible two-phase or multicomponent flows. The first is a full non-equilibrium model in which the pressure and velocity are driven towards equilibrium at interfaces by numerical relaxation processes; the second is a four-equation model that assumes stiff mechanical and thermal equilibrium between the phases or components. In both approaches, the thermodynamic behaviour of each fluid is modelled independently with a stiffened-gas equation of state. The presented methods are used to simulate the depressurization of a pipe containing pure CO$_2$ liquid and vapour under the one-dimensional approximation.
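For reference, the stiffened-gas equation of state used for each fluid takes the form $p = (\gamma - 1)\rho e - \gamma p_\infty$, where $e$ is the specific internal energy; a minimal sketch (with purely illustrative, not CO$_2$-specific, parameter values) is:

def stiffened_gas_pressure(rho, e, gamma, p_inf):
    # Stiffened-gas EOS: p = (gamma - 1) * rho * e - gamma * p_inf
    return (gamma - 1.0) * rho * e - gamma * p_inf

# Illustrative call with liquid-like parameters (values are placeholders)
p = stiffened_gas_pressure(rho=1000.0, e=1.0e6, gamma=4.4, p_inf=6.0e8)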
In this paper, we propose and analyze a temporally second-order accurate, fully discrete finite element method for the magnetohydrodynamic (MHD) equations. A modified Crank--Nicolson method is used to discretize the model and appropriate semi-implicit treatments are applied to the fluid convection term and two coupling terms. These semi-implicit approximations result in a linear system with variable coefficients, for which the unique solvability can be proved theoretically. In addition, we use a decoupling projection method of the Van Kan type \cite{vankan1986} in the Stokes solver, which computes the intermediate velocity field based on the gradient of the pressure from the previous time level, and enforces the incompressibility constraint via the Helmholtz decomposition of the intermediate velocity field. The energy stability of the scheme is theoretically proved, in which the decoupled Stokes solver needs to be analyzed in detail. Optimal-order convergence of $\mathcal{O} (\tau^2+h^{r+1})$ in the discrete $L^\infty(0,T;L^2)$ norm is proved for the proposed decoupled projection finite element scheme, where $\tau$ and $h$ are the time stepsize and spatial mesh size, respectively, and $r$ is the degree of the finite elements. Existing error estimates of second-order projection methods of the Van Kan type \cite{vankan1986} were only established in the discrete $L^2(0,T;L^2)$ norm for the Navier--Stokes equations. Numerical examples are provided to illustrate the theoretical results.
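To make the decoupling explicit, a sketch of one incremental (Van Kan type) projection step for the velocity--pressure pair alone, omitting the convection and magnetic coupling terms as well as boundary conditions, reads
\[
\frac{\tilde{u}^{\,n+1}-u^{n}}{\tau}-\nu\,\Delta\Big(\tfrac{1}{2}\big(\tilde{u}^{\,n+1}+u^{n}\big)\Big)+\nabla p^{n}=f^{\,n+1/2},
\qquad
\Delta\phi^{\,n+1}=\frac{1}{\tau}\,\nabla\cdot\tilde{u}^{\,n+1},
\]
\[
u^{n+1}=\tilde{u}^{\,n+1}-\tau\,\nabla\phi^{\,n+1},
\qquad
p^{n+1}=p^{n}+\phi^{\,n+1},
\]
so that the end-of-step velocity is discretely divergence-free; the precise pressure update and the treatment of the coupling terms in the scheme analyzed above may differ from this simplified form.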
In this paper, we clarify reconstruction-based discretization schemes for unstructured grids and discuss their economically high-order versions, which can achieve high-order accuracy under certain conditions at little extra cost. The clarification leads to one of the most economical approaches: the flux-and-solution-reconstruction (FSR) approach, in which highly economical schemes can be constructed from an extended kappa-scheme combined with economical flux reconstruction formulas, achieving up to fifth-order accuracy (sixth-order with zero dissipation) when the grid is regular. Various economical FSR schemes are presented and their formal orders of accuracy are verified by numerical experiments.
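For context, the classical (unextended) kappa-scheme reconstructs the left state at the face $i+1/2$ from cell-averaged data as
\[
u^{L}_{i+1/2}=u_i+\frac{1}{4}\Big[(1-\kappa)\,(u_i-u_{i-1})+(1+\kappa)\,(u_{i+1}-u_i)\Big],
\]
with $\kappa=1/3$ giving the well-known third-order upwind-biased reconstruction; the extended kappa-scheme and the flux reconstruction formulas referred to above add further terms to raise the achievable order.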
Despite the many advances in the use of weakly-compressible smoothed particle hydrodynamics (SPH) for the simulation of incompressible fluid flow, it is still challenging to obtain second-order convergence numerically. In this paper we perform a systematic numerical study of the convergence and accuracy of kernel-based approximation, discretization operators, and weakly-compressible SPH (WCSPH) schemes. We explore the origins of the errors and the issues preventing second-order convergence. Based on this study, we propose several new variations of the basic WCSPH scheme, all of which are second-order accurate. Additionally, we investigate the linear and angular momentum conservation properties of the WCSPH schemes. Our results show that one may construct accurate WCSPH schemes that demonstrate second-order convergence through a judicious choice of the kernel, the smoothing length, and the operators used to discretize the governing equations.
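As one concrete example of the discretization operators examined, a minimal sketch of the standard SPH difference form of the gradient, $\nabla f_i \approx \sum_j (m_j/\rho_j)(f_j - f_i)\nabla W_{ij}$, is given below (the kernel gradient grad_W and the neighbour lists are assumed given; names are illustrative):

import numpy as np

def sph_gradient(i, x, f, m, rho, grad_W, neighbors):
    # Difference-form SPH approximation of the gradient of f at particle i.
    g = np.zeros_like(x[i])
    for j in neighbors[i]:
        g += (m[j] / rho[j]) * (f[j] - f[i]) * grad_W(x[i] - x[j])
    return g

The choice between this form, its symmetric counterpart, and corrected higher-order operators is exactly the kind of decision the study above identifies as decisive for reaching second-order convergence.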
We consider the problem of approximating a function in general nonlinear subsets of $L^2$ when only a weighted Monte Carlo estimate of the $L^2$-norm can be computed. Of particular interest in this setting is the sample complexity, the number of samples necessary to recover the best approximation. Bounds for this quantity were derived in a previous work; they depend primarily on the model class and do not benefit from the regularity of the sought function. This result, however, is only a worst-case bound and cannot explain the remarkable performance of iterative hard thresholding algorithms observed in practice. We reexamine the results of the previous paper and derive a new bound that exploits the regularity of the sought function. A critical analysis of our results allows us to derive a sample-efficient algorithm for the model set of low-rank tensors. The viability of this algorithm is demonstrated by recovering quantities of interest for a classical high-dimensional random partial differential equation.
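To fix ideas, a minimal sketch of iterative hard thresholding for the simplest low-rank model class, matrices rather than general low-rank tensor formats, is given below (the forward operator A, its adjoint A_adj, the rank r, and the step size are illustrative assumptions, not the algorithm proposed in the paper):

import numpy as np

def hard_threshold_rank(X, r):
    # Project X onto the set of rank-r matrices via a truncated SVD.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def iht_low_rank(A, A_adj, y, shape, r, step=1.0, n_iter=200):
    # Gradient step on the least-squares misfit, then rank-r projection.
    X = np.zeros(shape)
    for _ in range(n_iter):
        X = hard_threshold_rank(X + step * A_adj(y - A(X)), r)
    return X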
We develop a higher-order trace finite element method that performs the surface integration on an exact description of the geometry. The method restricts the finite element space defined on the volume mesh to the surface and approximates the Laplace-Beltrami operator by applying high-order numerical quadrature directly on the exact surface. We employ this method to solve the Laplace-Beltrami equation and the Laplace-Beltrami eigenvalue problem. Error analysis shows that the method attains optimal convergence order for both problems, and numerical experiments confirm the theoretical analysis. The algorithm is more accurate and easier to implement than existing high-order trace finite element methods.
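For reference, the two model problems solved on a surface $\Gamma$ with surface gradient $\nabla_\Gamma$ read, in weak form (up to the usual zero-mean compatibility condition on a closed surface),
\[
\int_\Gamma \nabla_\Gamma u\cdot\nabla_\Gamma v\,\mathrm{d}s=\int_\Gamma f\,v\,\mathrm{d}s
\quad\text{and}\quad
\int_\Gamma \nabla_\Gamma u\cdot\nabla_\Gamma v\,\mathrm{d}s=\lambda\int_\Gamma u\,v\,\mathrm{d}s
\qquad\text{for all test functions } v,
\]
and the trace method above evaluates these surface integrals with high-order quadrature directly on the exact surface.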
Gradient descent (GD) type optimization methods are the standard instrument to train artificial neural networks (ANNs) with rectified linear unit (ReLU) activation. Despite the great success of GD type optimization methods in numerical simulations for the training of ANNs with ReLU activation, it remains - even in the simplest situation of the plain vanilla GD optimization method with random initializations and ANNs with one hidden layer - an open problem to prove (or disprove) the conjecture that the risk of the GD optimization method converges in the training of such ANNs to zero as the width of the ANNs, the number of independent random initializations, and the number of GD steps increase to infinity. In this article we prove this conjecture in the situation where the probability distribution of the input data is equivalent to the continuous uniform distribution on a compact interval, where the probability distributions for the random initializations of the ANN parameters are standard normal distributions, and where the target function under consideration is continuous and piecewise affine linear. Roughly speaking, the key ingredients in our mathematical convergence analysis are (i) to prove that suitable sets of global minima of the risk functions are \emph{twice continuously differentiable submanifolds of the ANN parameter spaces}, (ii) to prove that the Hessians of the risk functions on these sets of global minima satisfy an appropriate \emph{maximal rank condition}, and, thereafter, (iii) to apply the machinery in [Fehrman, B., Gess, B., Jentzen, A., Convergence rates for the stochastic gradient descent method for non-convex objective functions. J. Mach. Learn. Res. 21(136): 1--48, 2020] to establish convergence of the GD optimization method with random initializations.
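A minimal empirical counterpart of the setting analyzed above (one hidden layer, ReLU activation, standard normal initialization of all parameters, uniformly distributed inputs, a continuous piecewise affine target, and plain vanilla GD; the width, learning rate, and target below are illustrative choices) can be sketched as:

import numpy as np

rng = np.random.default_rng(0)
width, lr, steps = 64, 1e-2, 5000

# Uniformly distributed inputs on [-1, 1] and a piecewise affine target
x = rng.uniform(-1.0, 1.0, size=(256, 1))
y = np.abs(x) - 0.5

# Standard normal initialization of all ANN parameters
W1, b1 = rng.standard_normal((1, width)), rng.standard_normal(width)
W2, b2 = rng.standard_normal((width, 1)), rng.standard_normal(1)

for _ in range(steps):
    h = np.maximum(x @ W1 + b1, 0.0)            # hidden ReLU layer
    err = h @ W2 + b2 - y                       # residual of the squared risk
    dW2 = h.T @ err / len(x); db2 = err.mean(0)
    dh = (err @ W2.T) * (h > 0)
    dW1 = x.T @ dh / len(x); db1 = dh.mean(0)
    W1 -= lr * dW1; b1 -= lr * db1; W2 -= lr * dW2; b2 -= lr * db2

The conjecture above concerns the limit in which the width, the number of independent random restarts of such a loop, and the number of GD steps all tend to infinity.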
Stochastic variance reduced gradient (SVRG) is a popular variance reduction technique for stochastic gradient descent (SGD). We provide a first analysis of the method for solving a class of linear inverse problems through the lens of classical regularization theory. We prove that, for a suitable constant step size schedule, the method achieves an optimal convergence rate in terms of the noise level (under a suitable regularity condition), and that the variance of the SVRG iterate error is smaller than that of SGD. These theoretical findings are corroborated by a set of numerical experiments.
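For concreteness, one outer SVRG iteration for a finite-sum least-squares problem $f(w)=\frac{1}{2n}\|Aw-y\|^2$ with a constant step size can be sketched as follows (the linear-inverse-problem setting of the paper replaces the exact data by noisy measurements; names are illustrative):

import numpy as np

def svrg_epoch(w, A, y, step, rng):
    # One outer iteration: full gradient at the snapshot, then an inner
    # loop of variance-reduced stochastic steps.
    n = A.shape[0]
    full_grad = A.T @ (A @ w - y) / n
    w_snap, w_k = w.copy(), w.copy()
    for _ in range(n):
        i = rng.integers(n)
        g_k = A[i] * (A[i] @ w_k - y[i])        # component gradient at w_k
        g_snap = A[i] * (A[i] @ w_snap - y[i])  # same component at the snapshot
        w_k = w_k - step * (g_k - g_snap + full_grad)
    return w_k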
Deep convolutional neural networks (CNNs) have recently achieved great success in many visual recognition tasks. However, existing deep neural network models are computationally expensive and memory intensive, hindering their deployment in devices with limited memory or in applications with strict latency requirements. A natural approach is therefore to perform model compression and acceleration without significantly degrading model performance, and tremendous progress has been made in this area over the past few years. In this paper, we survey recent advanced techniques for compacting and accelerating CNN models. These techniques are roughly categorized into four schemes: parameter pruning and sharing, low-rank factorization, transferred/compact convolutional filters, and knowledge distillation. Methods of parameter pruning and sharing are described first, followed by the other techniques. For each scheme, we provide insightful analysis regarding performance, related applications, advantages, and drawbacks. We then go through a few very recent successful methods, such as dynamic capacity networks and stochastic depth networks. After that, we survey the evaluation metrics, the main datasets used for evaluating model performance, and recent benchmarking efforts. Finally, we conclude the paper, discussing remaining challenges and possible future directions on this topic.
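As a minimal illustration of the first scheme (parameter pruning and sharing), global magnitude-based pruning of a weight tensor can be sketched as follows (the sparsity level and quantile-based threshold are illustrative choices, not drawn from any particular surveyed method):

import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    # Zero out the fraction `sparsity` of weights with the smallest magnitude.
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) >= threshold
    return weights * mask, mask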