In this paper, we present an interior penalty discontinuous Galerkin finite element scheme for solving diffusion problems with strong anisotropy arising in magnetized plasmas for fusion applications. We demonstrate the accuracy of the high-order scheme and develop an efficient preconditioning technique to solve the corresponding linear system, which is robust with respect to the mesh size and the anisotropy of the problem. Several numerical tests are provided to validate the accuracy and efficiency of the proposed algorithm.
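For orientation, a standard model form for such strongly anisotropic diffusion (a hedged sketch in our notation, not necessarily the paper's exact setting) aligns the dominant conductivity with the unit magnetic field direction $\mathbf{b}$:
\[
-\nabla \cdot \Big[ \big( \kappa_\parallel \mathbf{b}\mathbf{b}^{\top} + \kappa_\perp (\mathbf{I} - \mathbf{b}\mathbf{b}^{\top}) \big) \nabla u \Big] = f, \qquad \kappa_\parallel / \kappa_\perp \gg 1,
\]
where the extreme conductivity ratio (which can reach $10^6$ or more in magnetized plasmas) is what degrades the conditioning of the discrete system and motivates an anisotropy-robust preconditioner.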
In this work, we investigate the recovery of the order of the time derivative in a class of diffusion-type equations, including both classical and time-fractional diffusion equations, from flux measurements observed at one point on the boundary. The mathematical model for time-fractional diffusion involves a Djrbashian-Caputo fractional derivative in time. We prove that the order of derivation is uniquely determined even in an unknown medium, i.e., when several pieces of information about the diffusion process (e.g., the diffusion coefficient, an obstacle, the initial condition, and the source) are unknown. The proof relies on the analyticity of the solution at large time, its asymptotic decay behavior, the strong maximum principle for the elliptic problem, and a suitable application of the Hopf lemma. Further, we provide an easy-to-implement reconstruction algorithm based on a nonlinear least-squares formulation, and several numerical experiments are presented to complement the theoretical analysis.
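As a minimal, hypothetical sketch of such a nonlinear least-squares reconstruction (the toy forward map below is purely illustrative; a real implementation would solve the time-fractional equation with a Djrbashian-Caputo derivative):

import numpy as np
from scipy.optimize import least_squares

def boundary_flux(alpha, t):
    # Toy stand-in with a subdiffusion-like power-law decay in time;
    # not the paper's forward model.
    return t ** (-alpha)

def recover_order(t, observed, alpha0=0.5):
    residual = lambda a: boundary_flux(a[0], t) - observed
    # Confine alpha to (0, 1], the admissible range for subdiffusion.
    return least_squares(residual, x0=[alpha0], bounds=(1e-3, 1.0)).x[0]

t = np.linspace(1.0, 50.0, 100)
data = boundary_flux(0.7, t)      # synthetic observations
print(recover_order(t, data))     # recovers approximately 0.7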
Numerical models of weather and climate critically depend on the long-term stability of integrators for systems of hyperbolic conservation laws. While such stability is often obtained from (physical or numerical) dissipation terms, the physical fidelity of such simulations also depends on properly preserving conserved quantities, such as energy, of the system. To address this apparent paradox, we develop a variational integrator for the shallow water equations that conserves energy but dissipates potential enstrophy. Our approach follows the continuous selective decay framework [F. Gay-Balmaz and D. Holm. Selective decay by Casimir dissipation in inviscid fluids. Nonlinearity, 26(2):495, 2013], which enables dissipating an otherwise conserved quantity while conserving the total energy. We use this in combination with the variational discretization method [D. Pavlov, P. Mullen, Y. Tong, E. Kanso, J. Marsden and M. Desbrun. Structure-preserving discretization of incompressible fluids. Physica D: Nonlinear Phenomena, 240(6):443-458, 2011] to obtain a discrete selective decay framework. This is applied to the shallow water equations, both in the plane and on the sphere, to dissipate the potential enstrophy. The resulting scheme significantly improves the quality of the approximate solutions, enabling long-term integrations to be carried out.
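Schematically, the selective decay construction modifies the dynamics so that, for the total energy $E$ and the Casimir $C$ (here the potential enstrophy) and a dissipation parameter $\theta \ge 0$,
\[
\frac{dE}{dt} = 0, \qquad \frac{dC}{dt} \le 0,
\]
with equality for $C$ when $\theta = 0$; the discrete scheme is designed to inherit both properties.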
Recently, the information-theoretic framework has been shown to yield non-vacuous generalization bounds for large models trained by Stochastic Gradient Langevin Dynamics (SGLD) with isotropic noise. In this paper, we optimize the information-theoretic generalization bound by manipulating the noise structure in SGLD. We prove that, under a constraint guaranteeing low empirical risk, the optimal noise covariance is the square root of the expected gradient covariance when both the prior and the posterior are jointly optimized. This shows that the optimal noise is quite close to the empirical gradient covariance. Technically, we develop a new information-theoretic bound that enables such an optimization analysis, and we apply matrix analysis to derive the form of the optimal noise covariance. The imposed constraint and the derived results are validated by empirical observations.
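To make the tuned object concrete, here is a minimal, hypothetical sketch of one SGLD-type update in which the injected noise covariance is a free design choice (names and defaults are ours, not the paper's):

import numpy as np

def sgld_step(theta, grad, sqrt_cov, lr=1e-3):
    # theta: parameter vector; grad: stochastic gradient at theta.
    # Isotropic SGLD corresponds to sqrt_cov = sigma * np.eye(len(theta));
    # the analysis above instead optimizes the covariance, driving it
    # toward the square root of the expected gradient covariance.
    noise = sqrt_cov @ np.random.randn(len(theta))
    return theta - lr * grad + np.sqrt(2.0 * lr) * noise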
When approximating the expectation of a functional of a stochastic process, the efficiency and performance of deterministic quadrature methods, such as sparse grid quadrature and quasi-Monte Carlo (QMC) methods, may critically depend on the regularity of the integrand. To overcome this issue and reveal the available regularity, we consider cases in which analytic smoothing cannot be performed and introduce a novel numerical smoothing approach that combines a root-finding algorithm with one-dimensional integration with respect to a single well-selected variable. We prove that, under appropriate conditions, the resulting function of the remaining variables is highly smooth, potentially affording the improved efficiency of adaptive sparse grid quadrature (ASGQ) and QMC methods, particularly when combined with hierarchical transformations (i.e., the Brownian bridge and Richardson extrapolation on the weak error). This approach facilitates the effective treatment of high dimensionality. Our study is motivated by option pricing problems, and our focus is on dynamics for which discretization of the asset price is necessary. Based on our analysis and numerical experiments, we show the advantages of combining numerical smoothing with the ASGQ and QMC methods over the ASGQ and QMC methods without smoothing and over the Monte Carlo approach.
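The following minimal sketch illustrates the numerical smoothing idea on a toy integrand (assumptions ours: $h(y, z)$ is continuous and strictly increasing in $y$, so $\max(h, 0)$ has a single kink at the root $y^*(z)$; all names are illustrative):

import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def smoothed(z, h, y_lo=-10.0, y_hi=10.0, n_quad=32):
    # Locate the kink of max(h, 0) in the selected variable y by root
    # finding, then integrate y ~ N(0, 1) out with Gauss-Legendre on the
    # smooth piece [y*, y_hi]; the piece below y* contributes zero.
    y_star = brentq(lambda y: h(y, z), y_lo, y_hi)
    nodes, weights = np.polynomial.legendre.leggauss(n_quad)
    y = 0.5 * (y_hi - y_star) * nodes + 0.5 * (y_hi + y_star)
    w = 0.5 * (y_hi - y_star) * weights
    return np.sum(w * h(y, z) * norm.pdf(y))

h = lambda y, z: y + z        # kink of max(h, 0) at y* = -z
print(smoothed(0.3, h))       # ~ norm.pdf(0.3) + 0.3 * norm.cdf(0.3)

The returned function of $z$ is smooth even though the original integrand has a kink, which is what restores the regularity exploited by ASGQ and QMC.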
We present augmented Lagrangian Schur complement preconditioners and robust multigrid methods for incompressible Stokes problems with extreme viscosity variations. Such Stokes systems arise, for instance, upon linearization of nonlinear viscous flow problems, and they can have severely inhomogeneous and anisotropic coefficients. Using an augmented Lagrangian formulation for the incompressibility constraint makes the Schur complement easier to approximate but results in a nearly singular (1,1)-block in the Stokes system. We present eigenvalue estimates for the quality of the Schur complement approximation. To cope with the near-singularity of the (1,1)-block, we extend a multigrid scheme with a discretization-dependent smoother and transfer operators from triangular/tetrahedral to quadrilateral/hexahedral finite element discretizations $[\mathbb{Q}_k]^d\times \mathbb{P}_{k-1}^{\text{disc}}$, $k\geq 2$, $d=2,3$. Using numerical examples with scalar viscosity and with anisotropic fourth-order tensor viscosity arising from the linearization of a viscoplastic constitutive relation, we confirm the robustness of the multigrid scheme and the overall efficiency of the solver. We present scalability results using up to 28,672 parallel tasks for problems with up to 1.6 billion unknowns and a viscosity contrast of up to ten orders of magnitude.
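To fix notation (ours, a hedged sketch rather than a quotation from the paper), the augmented Lagrangian modification of the discrete Stokes saddle-point system reads
\[
\begin{pmatrix} A + \gamma B^{\top} M_p^{-1} B & B^{\top} \\ B & 0 \end{pmatrix}
\begin{pmatrix} u \\ p \end{pmatrix}
=
\begin{pmatrix} f \\ 0 \end{pmatrix},
\]
where $M_p$ is the pressure mass matrix and $\gamma > 0$ the augmentation parameter: for large $\gamma$ the Schur complement is well approximated by $-\gamma^{-1} M_p$, which is cheap to apply, while the $(1,1)$-block becomes nearly singular (its action on discretely divergence-free velocities is $O(1)$ against the $O(\gamma)$ augmentation term), which is what the specialized smoother and transfer operators must handle.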
Certified robustness is a desirable property for deep neural networks in safety-critical applications, and popular training algorithms can certify the robustness of a neural network by computing a global bound on its Lipschitz constant. However, such a bound is often loose: it tends to over-regularize the neural network and degrade its natural accuracy. A tighter Lipschitz bound may provide a better tradeoff between natural and certified accuracy, but is generally hard to compute exactly due to the non-convexity of the network. In this work, we propose an efficient and trainable \emph{local} Lipschitz upper bound that exploits the interactions between activation functions (e.g., ReLU) and weight matrices. Specifically, when computing the induced norm of a weight matrix, we eliminate the rows and columns corresponding to units whose activation function is guaranteed to be constant in the neighborhood of each given data point, which yields a provably tighter bound than the global Lipschitz constant of the neural network. Our method can be used as a plug-in module to tighten the Lipschitz bound in many certifiable training algorithms. Furthermore, we propose to clip activation functions (e.g., ReLU and MaxMin) with a learnable upper threshold and a sparsity loss to help the network achieve an even tighter local Lipschitz bound. Experimentally, we show that our method consistently outperforms state-of-the-art methods in both clean and certified accuracy on the MNIST, CIFAR-10, and TinyImageNet datasets with various network architectures.
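As a concrete illustration of the elimination step, here is a minimal numpy sketch for a two-layer slice of a ReLU network; the pre-activation upper bounds stand in for whatever bound propagation is actually used, and all names are ours:

import numpy as np

def local_lipschitz_bound(W1, W2, upper):
    # upper: elementwise upper bounds on the pre-activations W1 @ x + b1
    # over the neighborhood of a data point (e.g., from interval bounds).
    # Units with upper <= 0 have ReLU provably constant at 0 there, so the
    # corresponding rows of W1 and columns of W2 can be dropped.
    active = upper > 0
    W1_r, W2_r = W1[active, :], W2[:, active]
    # Spectral norms of the reduced matrices; never larger than the global
    # bound ||W2||_2 * ||W1||_2, since submatrices cannot increase the norm.
    return np.linalg.norm(W2_r, 2) * np.linalg.norm(W1_r, 2)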
This paper presents a spectral scheme for the numerical solution of nonlinear conservation laws in non-periodic domains under arbitrary boundary conditions. The approach relies on the Fourier Continuation (FC) method for the spectral representation of non-periodic functions, in conjunction with smooth, localized artificial-viscosity assignments produced by a Shock-Detecting Neural Network (SDNN). Like previous shock-capturing schemes and artificial-viscosity techniques, the combined FC-SDNN strategy effectively controls spurious oscillations in the proximity of discontinuities. Thanks to its use of a localized but smooth artificial-viscosity term, whose support is restricted to a vicinity of flow-discontinuity points, the algorithm enjoys spectral accuracy and low dissipation away from flow discontinuities, and, in such regions, it produces smooth numerical solutions, as evidenced by an essential absence of spurious oscillations in level-set lines. The FC-SDNN viscosity assignment, which does not require the use of problem-dependent algorithmic parameters, induces significantly lower overall dissipation than other methods, including Fourier-spectral versions of a previously proposed entropy-viscosity method. The character of the proposed algorithm is illustrated with a variety of numerical results for the linear advection, Burgers, and Euler equations in one- and two-dimensional non-periodic spatial domains.
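Schematically (in our notation, as a hedged sketch), the scheme evolves the viscous regularization
\[
\mathbf{u}_t + \nabla \cdot \mathbf{f}(\mathbf{u}) = \nabla \cdot \big( \mu[\mathbf{u}](x, t)\, \nabla \mathbf{u} \big),
\]
where the artificial viscosity $\mu[\mathbf{u}] \ge 0$ is smooth, vanishes away from detected discontinuities, and is assigned from the SDNN shock-location estimates.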
The function-on-function linear regression model, in which the response and predictors consist of random curves, has become a general framework for investigating the relationship between a functional response and functional predictors. Existing methods for estimating the model parameters may be sensitive to outlying observations, which are common in empirical applications, and such observations can severely degrade estimation and prediction. We introduce a robust estimation method, based on iteratively reweighted simple partial least squares, to improve the prediction accuracy of the function-on-function linear regression model in the presence of outliers. The performance of the proposed method depends on the number of partial least squares components used to estimate the model; thus, the optimum number of components is determined via a data-driven error criterion. The finite-sample performance of the proposed method is investigated via several Monte Carlo experiments and an empirical data analysis. In addition, a nonparametric bootstrap method is applied to construct pointwise prediction intervals for the response function. The results are compared with those of several existing methods to illustrate the improvement potentially gained by the proposed method.
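A minimal, hypothetical sketch of the reweighting loop (the fitting routine `pls_fit` is a placeholder for a functional simple partial least squares fit; the weight function and scale estimate below are illustrative choices, not the paper's):

import numpy as np

def irpls(X, Y, pls_fit, n_iter=20, c=1.345):
    # Rows of Y are discretized response curves. Downweight observations
    # with large residual norms and refit.
    w = np.ones(len(Y))
    for _ in range(n_iter):
        model = pls_fit(X, Y, sample_weight=w)
        resid = np.linalg.norm(Y - model.predict(X), axis=1)
        scale = np.median(resid) / 0.6745 + 1e-12   # crude robust scale
        r = resid / scale
        w = np.where(r <= c, 1.0, c / r)            # Huber-type weights
    return model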
In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of the local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm, called multi-step primal-dual (MSPD), together with its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limited communication resources decreases at a fast rate even for non-strongly convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm, called distributed randomized smoothing (DRS), based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.
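For intuition, the smoothing underlying DRS replaces each local objective by a Gaussian-smoothed surrogate $f_\gamma(x) = \mathbb{E}[f(x + \gamma Z)]$, $Z \sim \mathcal{N}(0, I)$, whose gradient can be estimated from function values alone; a minimal sketch (names and constants are ours):

import numpy as np

def smoothed_grad(f, x, gamma=0.1, n_samples=64, rng=None):
    # Unbiased estimator of the gradient of the smoothed surrogate
    # f_gamma(x) = E[f(x + gamma * Z)], Z ~ N(0, I).
    rng = np.random.default_rng() if rng is None else rng
    g = np.zeros(len(x))
    for _ in range(n_samples):
        z = rng.standard_normal(len(x))
        g += (f(x + gamma * z) - f(x)) / gamma * z
    return g / n_samples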
In this paper, we study the optimal convergence rate for distributed convex optimization problems over networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely: the function $F(\mathbf{x}) \triangleq \sum_{i=1}^{m} f_i(\mathbf{x})$ is strongly convex and smooth, strongly convex only, smooth only, or just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions of the proposed setup, such as proximal-friendly functions, time-varying graphs, and improvement of the condition numbers.
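Concretely (our notation, a hedged sketch of the setup), the network can be encoded through the affine constraint
\[
\min_{x_1, \dots, x_m} \; \sum_{i=1}^{m} f_i(x_i) \quad \text{subject to} \quad \sqrt{W}\, X = 0, \qquad X \triangleq (x_1, \dots, x_m),
\]
where $W$ is the interaction (gossip) matrix whose kernel is the consensus subspace $x_1 = \dots = x_m$; Nesterov's accelerated gradient applied to the dual of this problem yields the stated optimal rates, with the spectral gap of $W$ controlling the additional communication cost.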