For high-order accurate schemes such as discontinuous Galerkin (DG) methods for solving Fokker-Planck equations, it is desirable to enforce positivity efficiently without losing conservation and high-order accuracy, especially for implicit time discretizations. We consider an optimization-based positivity-preserving limiter that enforces positivity of the cell averages of DG solutions in a semi-implicit time discretization scheme, so that the point values can then be made positive by a simple scaling limiter applied to the DG polynomial in each cell. The optimization can be solved efficiently by a first-order splitting method with nearly optimal parameters, which has $\mathcal{O}(N)$ computational complexity and is well suited to parallel computation. Numerical tests on representative examples demonstrate the performance of the proposed method.
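The simple scaling limiter mentioned above can be sketched as follows (a minimal illustration in the spirit of the classical Zhang-Shu limiter, not the paper's optimization step; the function name, sample values, and the floor `eps` are assumptions for the example):

```python
import numpy as np

def scaling_limiter(point_values, cell_average, eps=1e-13):
    """Scale the DG point values toward the (positive) cell average so that
    every point value is >= eps; the cell average is preserved exactly."""
    m = point_values.min()
    if m >= eps:
        return point_values                      # already positive
    theta = min(1.0, (cell_average - eps) / (cell_average - m))
    return cell_average + theta * (point_values - cell_average)

# quadrature-point values of a DG polynomial with a small negative dip
vals = np.array([0.8, 0.3, -0.1, 0.6])
avg = vals.mean()                                # stand-in for the cell average
limited = scaling_limiter(vals, avg)
```

Because the limited polynomial is a convex combination of the original polynomial and its average, conservation of the cell average is automatic; this is why positivity of cell averages is the key property the optimization must deliver.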
Recently, the ParaOpt algorithm was proposed as an extension of the time-parallel Parareal method to optimal control. ParaOpt uses quasi-Newton steps, each of which requires iteratively solving a system of matching conditions. The state-of-the-art parallel preconditioner for linear problems leads to a set of independent smaller systems that are currently hard to solve. We generalize the preconditioner to the nonlinear case and propose a new, fast inversion method for these smaller systems, based on adjusted boundary conditions in the subproblems, which avoids the disadvantages of the current options.
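For readers unfamiliar with the underlying time-parallel idea, the Parareal correction that ParaOpt extends can be sketched on a scalar model ODE (the propagator choices and parameters below are illustrative assumptions, not ParaOpt itself):

```python
import numpy as np

lam, T, N = -1.0, 1.0, 10            # model ODE u' = lam * u on [0, T], N windows
dT = T / N
F = np.exp(lam * dT)                 # fine propagator (exact over one window)
G = 1.0 / (1.0 - lam * dT)           # coarse propagator: one backward-Euler step

u0 = 1.0
U = u0 * G ** np.arange(N + 1)       # initial guess from a coarse sweep

for _ in range(N):                   # Parareal corrections (all F-applications
    U_new = np.empty_like(U)         # within one iteration run in parallel)
    U_new[0] = u0
    for n in range(N):
        U_new[n + 1] = G * U_new[n] + F * U[n] - G * U[n]
    U = U_new

fine = u0 * F ** np.arange(N + 1)    # sequential fine solution for comparison
err = np.abs(U - fine).max()         # tiny: Parareal terminates finitely
```

After at most N iterations Parareal reproduces the sequential fine solution exactly (finite-termination property); ParaOpt replaces this state recursion with matching conditions coupling state and adjoint.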
Numerical resolution of high-dimensional nonlinear PDEs remains a formidable challenge due to the curse of dimensionality. Starting from the weak formulation of the Lawson-Euler scheme, this paper proposes a stochastic particle method (SPM) that tracks the deterministic motion, random jumps, resampling, and reweighting of particles. SPM adopts real-valued weighted particles to approximate the high-dimensional solution, automatically adjusting the particle distribution to capture the relevant features of the solution. A piecewise constant reconstruction on a virtual uniform grid is employed to evaluate the nonlinear terms, which fully exploits the intrinsic adaptivity of SPM. Combining the two, SPM achieves adaptive sampling in time. Numerical experiments on the 6-D Allen-Cahn equation and the 7-D Hamilton-Jacobi-Bellman equation demonstrate the potential of SPM for solving high-dimensional nonlinear PDEs efficiently while maintaining acceptable accuracy.
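A key ingredient for real-valued weighted particles is resampling that respects weight signs. A minimal sketch of one standard approach (sampling proportional to $|w_i|$ and keeping the sign in equal-magnitude output weights) is given below; this is a generic construction, not necessarily the paper's exact resampling step, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def resample_signed(positions, weights, n_out):
    """Stratified resampling for signed weights: draw indices with probability
    |w_i| / sum|w|, then give every copy the equal-magnitude weight
    sign(w_i) * sum|w| / n_out, so the total absolute mass is preserved."""
    mass = np.abs(weights)
    total = mass.sum()
    cdf = np.cumsum(mass / total)
    u = (rng.random() + np.arange(n_out)) / n_out    # stratified thresholds
    idx = np.minimum(np.searchsorted(cdf, u), len(weights) - 1)
    return positions[idx], np.sign(weights[idx]) * total / n_out

x = rng.normal(size=(1000, 6))       # particle positions in 6-D
w = rng.normal(size=1000)            # real-valued (signed) weights
x2, w2 = resample_signed(x, w, 1000)
```

Resampling concentrates particles where $|w|$ is large, which is what lets the point distribution follow the relevant features of the solution.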
In this paper, we propose a machine learning (ML)-based moment closure model for the linearized Boltzmann equation of semiconductor devices, addressing both the deterministic and stochastic settings. Our approach leverages neural networks to learn the spatial gradient of the unclosed highest-order moment, enabling effective training through natural output normalization. For the deterministic problem, to ensure global hyperbolicity and stability, we derive and apply constraints that enforce symmetrizable hyperbolicity of the system. For the stochastic problem, we adopt the generalized polynomial chaos (gPC)-based stochastic Galerkin method to discretize the random variables, resulting in a system to which the deterministic approach applies analogously. Several numerical experiments demonstrate the effectiveness and accuracy of our ML-based moment closure model for the linear semiconductor Boltzmann equation with and without uncertainties.
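The gPC stochastic Galerkin step can be illustrated on a toy problem rather than the semiconductor system: for $u' = -\xi u$ with $\xi \sim U(-1,1)$, expanding $u = \sum_k u_k(t) P_k(\xi)$ in Legendre polynomials and projecting yields a deterministic tridiagonal ODE system (all names and parameters below are illustrative assumptions):

```python
import numpy as np

N = 12                               # number of Legendre-chaos (gPC) modes
# Galerkin projection of u' = -xi * u, xi ~ U(-1, 1), u = sum_k u_k(t) P_k(xi).
# The recurrence xi*P_i = ((i+1) P_{i+1} + i P_{i-1}) / (2i+1) makes the
# coupling matrix tridiagonal: du/dt = -M u.
M = np.zeros((N, N))
for i in range(N):
    if i + 1 < N:
        M[i + 1, i] = (i + 1) / (2 * i + 1)
    if i >= 1:
        M[i - 1, i] = i / (2 * i + 1)

u = np.zeros(N)
u[0] = 1.0                           # deterministic initial datum u(0, xi) = 1
dt = 1e-3
for _ in range(1000):                # classical RK4 up to t = 1
    k1 = -M @ u
    k2 = -M @ (u + 0.5 * dt * k1)
    k3 = -M @ (u + 0.5 * dt * k2)
    k4 = -M @ (u + dt * k3)
    u = u + (dt / 6) * (k1 + 2 * k2 + 2 * k3 + k4)

mean = u[0]                          # E[u(1, xi)] is the zeroth gPC coefficient
exact = np.sinh(1.0)                 # E[e^{-xi}] = (e - 1/e) / 2
```

The spectral accuracy of the gPC expansion in the random variable is what makes the resulting deterministic system amenable to the same closure treatment as the deterministic problem.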
Many partial differential equations (PDEs) such as the Navier--Stokes equations in fluid mechanics, inelastic deformation in solids, and transient parabolic and hyperbolic equations do not have an exact, primal variational structure. Recently, a variational principle based on the dual (Lagrange multiplier) field was proposed. The essential idea in this approach is to treat the given PDE as a constraint, and to invoke an arbitrarily chosen auxiliary potential with strong convexity properties to be optimized. This leads to requiring a convex dual functional to be minimized subject to Dirichlet boundary conditions on the dual variables, with the guarantee that even PDEs that do not possess a variational structure in primal form can be solved via a variational principle. The vanishing of the first variation of the dual functional is, up to Dirichlet boundary conditions on dual fields, the weak form of the primal PDE problem with the dual-to-primal change of variables incorporated. We derive the dual weak form for the linear, one-dimensional, transient convection-diffusion equation. A Galerkin discretization is used to obtain the discrete equations, with the trial and test functions chosen as linear combinations of either RePU activation functions (shallow neural network) or B-spline basis functions; the corresponding stiffness matrix is symmetric. For transient problems, a space-time Galerkin implementation is used with tensor-product B-splines as approximating functions. Numerical results are presented for the steady-state and transient convection-diffusion equation, and for transient heat conduction. The proposed method delivers sound accuracy for ODEs and PDEs, and rates of convergence are established in the $L^2$ norm and $H^1$ seminorm for the steady-state convection-diffusion problem.
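A hedged sketch of the dual construction may help fix ideas (sign and scaling conventions here are illustrative and may differ from the paper's): for $u_t + a u_x - \nu u_{xx} = 0$ with the auxiliary potential chosen as $H(u) = \tfrac{1}{2}u^2$,

```latex
% Treat the PDE as a constraint with dual field \lambda and integrate by parts:
\mathcal{L}[u,\lambda] = \int_0^T\!\!\int_\Omega
  \Bigl[ -\,u\,(\lambda_t + a\,\lambda_x + \nu\,\lambda_{xx})
        + \tfrac{1}{2}u^2 \Bigr]\,dx\,dt .
% Stationarity in u yields the dual-to-primal change of variables
u = \lambda_t + a\,\lambda_x + \nu\,\lambda_{xx},
% and eliminating u gives a dual functional quadratic in \lambda,
S[\lambda] = -\tfrac{1}{2}\int_0^T\!\!\int_\Omega
  (\lambda_t + a\,\lambda_x + \nu\,\lambda_{xx})^2\,dx\,dt .
% The first variation of S (with Dirichlet data on \lambda) is the weak form
% of the primal PDE with u replaced by the change of variables above.
```

Note that the quadratic structure in the dual field is what makes the discrete stiffness matrix symmetric regardless of the nonsymmetry of the primal convection operator.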
The SE and DE formulas are known as efficient quadrature formulas for integrals with endpoint singularities. For integrals with an algebraic singularity in particular, explicit error bounds in computable form have been given, which are useful for computation with guaranteed accuracy. Such explicit error bounds have also been given for integrals with a logarithmic singularity. However, the existing bounds leave two points to be addressed. The first is an overestimation of the divergence speed of the logarithmic singularity. The second is the case where logarithmic and algebraic singularities coexist. To remedy these points, this study provides new error bounds for integrals with both logarithmic and algebraic singularities. Although the existing and new error bounds described above handle integrals over finite intervals, the SE and DE formulas may also be applied to integrals over semi-infinite intervals. On the basis of the new results, this study further provides new error bounds for integrals over the semi-infinite interval with logarithmic and algebraic singularities at the origin.
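The DE (tanh-sinh) formula handles exactly the kind of mixed singularity discussed above. A minimal sketch for $\int_0^1 x^{-1/2}\ln x\,dx = -4$, which has both a logarithmic and an algebraic singularity at the origin, follows; the step size and truncation length are illustrative choices, not the optimized parameters from any error bound:

```python
import numpy as np

def de_quad_01(f, h=0.05, t_max=4.0):
    """Tanh-sinh (DE) rule on (0, 1): substitute x = (1 + tanh((pi/2) sinh t))/2
    and apply the trapezoidal rule in t.  x is evaluated via the logistic form
    1/(1 + exp(-2u)) so it stays positive (no spurious log(0)) near t = -t_max."""
    t = np.arange(-t_max, t_max + h, h)
    u = 0.5 * np.pi * np.sinh(t)
    x = 1.0 / (1.0 + np.exp(-2.0 * u))
    w = 0.25 * np.pi * np.cosh(t) / np.cosh(u) ** 2    # dx/dt
    return h * np.sum(f(x) * w)

# both a logarithmic and an algebraic singularity at the origin:
# the integral of log(x)/sqrt(x) over (0, 1) equals -4 exactly
val = de_quad_01(lambda x: np.log(x) / np.sqrt(x))
```

The double-exponential decay of the transformed integrand is what the explicit error bounds quantify; the new bounds in this work cover precisely integrands like the one above.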
A numerical method, ADER-DG with a local DG predictor, is developed for solving DAE systems, based on the formulation of ADER-DG methods with a local DG predictor for ODE and PDE systems. The basis functions are chosen as Lagrange interpolation polynomials with nodal points at the roots of Radau polynomials, which differs from the classical formulation of the ADER-DG method, where the roots of Legendre polynomials are customary. It is shown that this basis yields A-stability and L1-stability when the DAE solver is applied as an ODE solver. The ADER-DG method produces a highly accurate numerical solution even on very coarse grids, with steps larger than the main characteristic scale of variation of the solution. The local discrete-time solution can be used as a numerical solution of the DAE system between grid nodes, thereby providing subgrid resolution even on very coarse grids. Classical test examples were solved with the developed ADER-DG method. With increasing index of the DAE system, a decrease in the empirical convergence orders p is observed. An unexpected result was obtained for stiff DAE systems: the empirical convergence orders of the numerical solution turned out to be significantly higher than the values expected for this method on stiff problems. Lagrange interpolation polynomials with nodal points at the roots of Radau polynomials thus appear much better suited to stiff problems. Estimates showed that the computational costs of the ADER-DG method are roughly comparable to those of implicit Runge-Kutta methods used to solve DAE systems, and methods were proposed to reduce these costs.
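Under one common convention, the right-Radau nodes used for this basis are the roots of $P_{s-1}(x) - P_s(x)$ on $[-1,1]$ (which include the endpoint $x = 1$), mapped to $[0,1]$. A minimal sketch of computing them (function name and normalization are illustrative assumptions):

```python
import numpy as np
from numpy.polynomial import legendre as leg

def radau_right_nodes(s):
    """Nodes of the s-stage right-Radau (Radau IIA) quadrature on [0, 1]:
    the roots of P_{s-1}(x) - P_s(x) on [-1, 1] (one of them is x = +1),
    mapped through x -> (1 + x) / 2."""
    c = np.zeros(s + 1)
    c[s - 1], c[s] = 1.0, -1.0           # Legendre-series coefficients
    return 0.5 * (1.0 + np.sort(leg.legroots(c)))

nodes2 = radau_right_nodes(2)            # expected: [1/3, 1]
nodes3 = radau_right_nodes(3)            # expected: [(4 - sqrt 6)/10, (4 + sqrt 6)/10, 1]
```

Including the right endpoint as a node is what connects this basis to the stiffly accurate Radau IIA family, consistent with the stiff-problem behavior reported above.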
Block majorization-minimization (BMM) is a simple iterative algorithm for constrained nonconvex optimization that sequentially minimizes majorizing surrogates of the objective function in each block while the others are held fixed. BMM encompasses a large class of optimization algorithms such as block coordinate descent and its proximal-point variant, expectation-maximization, and block projected gradient descent. We first establish that for general constrained nonsmooth nonconvex optimization, BMM with $\rho$-strongly convex and $L_g$-smooth surrogates can produce an $\epsilon$-approximate first-order optimal point within $\widetilde{O}((1+L_g+\rho^{-1})\epsilon^{-2})$ iterations and asymptotically converges to the set of first-order optimal points. Next, we show that BMM combined with trust-region methods with diminishing radius has an improved complexity of $\widetilde{O}((1+L_g) \epsilon^{-2})$, independent of the inverse strong convexity parameter $\rho^{-1}$, allowing improved theoretical and practical performance with `flat' surrogates. Our results hold robustly even when the convex sub-problems are solved inexactly, as long as the optimality gaps are summable. Central to our analysis is a novel continuous first-order optimality measure, by which we bound the worst-case sub-optimality in each iteration by the first-order improvement the algorithm makes. We apply our general framework to obtain new results on various algorithms such as the celebrated multiplicative update algorithm for nonnegative matrix factorization by Lee and Seung, regularized nonnegative tensor decomposition, and the classical block projected gradient descent algorithm. Lastly, we numerically demonstrate that the additional use of diminishing radius can improve the convergence rate of BMM in many instances.
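The Lee-Seung multiplicative updates cited above are a concrete BMM instance: each block update minimizes a majorizing surrogate of $\|V - WH\|_F^2$, so the objective is monotonically nonincreasing. A minimal sketch (initialization, iteration count, and the small `eps` safeguard are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def nmf_multiplicative(V, r, n_iter=1000, eps=1e-12):
    """Lee-Seung multiplicative updates for min ||V - W H||_F^2 over W, H >= 0.
    Each update minimizes a majorizing surrogate in one block while the other
    is held fixed -- an instance of BMM."""
    m, n = V.shape
    W = rng.random((m, r)) + 0.1
    H = rng.random((r, n)) + 0.1
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)    # block update for H
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)  # block update for W
    return W, H

V = rng.random((20, 2)) @ rng.random((2, 30))   # exact nonnegative rank-2 data
W, H = nmf_multiplicative(V, r=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

Because the updates rescale entries multiplicatively, nonnegativity is preserved automatically without projection; the surrogate underlying each update is exactly the kind of block surrogate the complexity analysis above covers.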
In this paper, in order to improve spatial accuracy, an exponential integrator Fourier Galerkin (EIFG) method is proposed for solving semilinear parabolic equations in rectangular domains. In the proposed method, the spatial discretization is first carried out by a Fourier-based Galerkin approximation, and the time integration of the resulting semi-discrete system is then approximated by an explicit exponential Runge-Kutta approach, which yields the fully discrete numerical solution. Under certain regularity assumptions on the model problem, an error estimate in the $H^2$-norm is derived explicitly for the EIFG method with two RK stages. Several two- and three-dimensional examples demonstrate the excellent performance of the EIFG method, consistent with the theoretical results.
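The first stage of any explicit exponential Runge-Kutta scheme is the exponential Euler step $u_{n+1} = e^{hA}u_n + h\,\varphi_1(hA)f(u_n)$, which, for a Fourier Galerkin discretization, acts componentwise on the (diagonal) Fourier symbols. A minimal sketch, with an exactness check on a scalar linear problem (all names are illustrative; the paper's scheme uses two stages):

```python
import numpy as np

def phi1(z):
    """phi_1(z) = (e^z - 1)/z, with the removable singularity at z = 0."""
    z = np.asarray(z, dtype=float)
    out = np.ones_like(z)
    nz = z != 0
    out[nz] = np.expm1(z[nz]) / z[nz]
    return out

def exp_euler_step(u, lam, fvals, h):
    """One explicit exponential Runge-Kutta stage for u' = lam*u + f(u),
    applied componentwise to a diagonalized system (e.g. Fourier symbols lam):
    u_{n+1} = e^{h lam} u_n + h phi1(h lam) f(u_n)."""
    return np.exp(h * lam) * u + h * phi1(h * lam) * fvals

# sanity check: for the linear problem u' = lam*u + c the step is exact
lam, c, h, u0 = -2.0, 0.5, 0.3, 1.0
num = exp_euler_step(np.array([u0]), np.array([lam]), np.array([c]), h)[0]
exact = np.exp(lam * h) * u0 + (np.exp(lam * h) - 1.0) / lam * c
```

Treating the stiff linear part exactly through the exponential is what frees the time step from the severe stability restriction of explicit integrators on the Fourier-discretized Laplacian.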
In this paper we introduce a multilevel Picard approximation algorithm for general semilinear parabolic PDEs with gradient-dependent nonlinearities whose coefficient functions do not need to be constant. We also provide a full convergence and complexity analysis of our algorithm. To obtain our main results, we consider a particular stochastic fixed-point equation (SFPE) motivated by the Feynman-Kac representation and the Bismut-Elworthy-Li formula. We show that the PDE under consideration has a unique viscosity solution which coincides with the first component of the unique solution of the SFPE. Moreover, the gradient of the unique viscosity solution of the PDE exists and coincides with the second component of the unique solution of the SFPE. Furthermore, we provide a numerical example in up to $300$ dimensions to demonstrate the practical applicability of our multilevel Picard algorithm.
The goal of explainable Artificial Intelligence (XAI) is to generate human-interpretable explanations, but there are no computationally precise theories of how humans interpret AI-generated explanations. The lack of theory means that validation of XAI must be done empirically, on a case-by-case basis, which prevents systematic theory-building in XAI. We propose a psychological theory of how humans draw conclusions from saliency maps, the most common form of XAI explanation, which for the first time allows for precise prediction of explainee inference conditioned on explanation. Our theory posits that, absent an explanation, humans expect the AI to make decisions similar to their own, and that they interpret an explanation by comparing it to the explanations they themselves would give. Comparison is formalized via Shepard's universal law of generalization in a similarity space, a classic theory from cognitive science. A pre-registered user study on AI image classifications with saliency-map explanations demonstrates that our theory quantitatively matches participants' predictions of the AI.
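The comparison mechanism can be sketched as a toy model (a hedged illustration of the general idea, not the study's fitted model; the feature vectors, distance metric, and normalization are assumptions): Shepard's law says generalization decays exponentially with distance in similarity space, so an explainee weights each of their own candidate explanations by its similarity to the AI's explanation.

```python
import numpy as np

def shepard_similarity(x, y):
    """Shepard's universal law of generalization: similarity decays
    exponentially with distance d(x, y) in psychological similarity space."""
    return np.exp(-np.linalg.norm(np.asarray(x) - np.asarray(y), ord=1))

def predict_inference(ai_explanation, candidate_own_explanations):
    """Toy explainee model: compare the AI's saliency-map explanation against
    the explanations the human would give themselves, weighting each candidate
    by Shepard similarity (normalized to a probability distribution)."""
    sims = np.array([shepard_similarity(ai_explanation, e)
                     for e in candidate_own_explanations])
    return sims / sims.sum()

ai = [0.9, 0.1, 0.0]                       # coarse saliency features (assumed)
own = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]]   # human's own candidate explanations
probs = predict_inference(ai, own)         # first candidate is the closer match
```

The exponential form makes predictions quantitative rather than ordinal, which is what permits the precise, falsifiable predictions tested in the user study.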