We introduce a new regularization model for incompressible fluid flow, a regularization of the EMAC formulation of the Navier-Stokes equations (NSE) that we call EMAC-Reg. The EMAC (energy, momentum, and angular momentum conserving) formulation has proved useful because it conserves energy, momentum, and angular momentum even when the divergence constraint is only weakly enforced. However, it is still an NSE formulation and so cannot resolve higher Reynolds number flows without very fine meshes. By carefully introducing regularization into the EMAC formulation, we create a model that is more suitable for coarser mesh computations but still conserves the same quantities as EMAC, i.e., energy, momentum, and angular momentum. We show that EMAC-Reg, when semi-discretized with a finite element spatial discretization, is well-posed and optimally accurate. Numerical results are provided that show EMAC-Reg is a robust coarse mesh model.
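For orientation, the unregularized EMAC form of the NSE is usually written as below. This is recalled from the general EMAC literature rather than taken from the abstract, and the symbols ($u$ velocity, $p$ pressure, $\nu$ viscosity, $f$ forcing) are assumptions. The regularization terms that define EMAC-Reg are not given in the abstract and are not reproduced here.

```latex
\begin{aligned}
u_t + 2D(u)\,u + (\nabla\cdot u)\,u + \nabla P - \nu\Delta u &= f, \\
\nabla\cdot u &= 0,
\end{aligned}
\qquad
D(u) = \tfrac{1}{2}\bigl(\nabla u + (\nabla u)^{T}\bigr),
\quad
P = p - \tfrac{1}{2}|u|^{2}.
```

With the convention $(\nabla u)_{ij} = \partial_j u_i$, the identity $(u\cdot\nabla)u = 2D(u)\,u - \tfrac{1}{2}\nabla|u|^{2}$ shows that this agrees with the convective form whenever $\nabla\cdot u = 0$ holds exactly.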
The principle of least action is one of the most fundamental physical principles. It says that among all possible motions connecting two points in a phase space, the system will exhibit those motions which extremise an action functional. Many qualitative features of dynamical systems, such as the presence of conservation laws and energy balance equations, are related to the existence of an action functional. Incorporating variational structure into learning algorithms for dynamical systems is, therefore, crucial in order to make sure that the learned model shares important features with the exact physical system. In this paper we show how to incorporate variational principles into trajectory predictions of learned dynamical systems. The novelty of this work is threefold. (1) Our technique relies only on discrete position data of observed trajectories. Velocities or conjugate momenta do {\em not} need to be observed or approximated and {\em no} prior knowledge about the form of the variational principle is assumed. Instead, they are recovered using backward error analysis. (2) Our technique compensates discretisation errors when trajectories are computed from the learned system. This is important when moderate to large step-sizes are used and high accuracy is required. For this, we introduce and rigorously analyse the concept of inverse modified Lagrangians by developing an inverse version of variational backward error analysis. (3) We introduce a method to perform system identification from position observations only, based on variational backward error analysis.
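To illustrate how a discrete variational principle produces a recursion in positions only, here is a minimal sketch. It is not the learning method of the paper: it assumes a known Lagrangian L = q'^2/2 - V(q) and the simple discrete Lagrangian L_d(q0, q1) = h * (((q1 - q0)/h)^2 / 2 - V(q0)). The names `del_step` and `trajectory` are hypothetical.

```python
import math

def del_step(q_prev, q_curr, h, grad_V):
    """Solve the discrete Euler-Lagrange equation
    D2 L_d(q_prev, q_curr) + D1 L_d(q_curr, q_next) = 0 for q_next,
    assuming L_d(q0, q1) = h * (0.5 * ((q1 - q0) / h)**2 - V(q0))."""
    return 2.0 * q_curr - q_prev - h * h * grad_V(q_curr)

def trajectory(q0, q1, h, grad_V, n_steps):
    """Generate positions from two initial positions; no velocities needed."""
    qs = [q0, q1]
    for _ in range(n_steps):
        qs.append(del_step(qs[-2], qs[-1], h, grad_V))
    return qs

# Harmonic oscillator V(q) = q^2 / 2, so grad_V(q) = q.
h = 0.1
theta = math.acos(1.0 - 0.5 * h * h)  # phase advance per step of the recursion
qs = trajectory(1.0, math.cos(theta), h, lambda q: q, 1000)

# For this linear problem the recursion is solved exactly by cos(k * theta).
exact = [math.cos(k * theta) for k in range(len(qs))]
max_err = max(abs(a - b) for a, b in zip(qs, exact))
```

The computed positions stay bounded for all steps, but the discrete frequency `theta / h` differs slightly from the exact frequency 1. This is the kind of discretisation error the abstract proposes to compensate.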
We show that a specific skew-symmetric form of hyperbolic problems leads to energy conservation and an energy bound. Next, the compressible Euler equations are transformed to this skew-symmetric form and it is explained how to obtain an energy estimate. Finally, we show that the new formulation leads to energy stable and energy conserving discrete approximations if the scheme is formulated in summation-by-parts form.
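The mechanism can be sketched on a scalar model problem. This is an assumption made for illustration, not the Euler formulation of the paper: u_t + (a u)_x / 2 + a u_x / 2 = 0 on a periodic domain, discretized with the periodic central difference D. With the norm P = h I this operator satisfies P D + (P D)^T = 0, the periodic analogue of the summation-by-parts property. All function names are hypothetical.

```python
import math

def periodic_central_diff(u, h):
    """Second-order central difference on a periodic grid (skew-symmetric)."""
    n = len(u)
    return [(u[(i + 1) % n] - u[i - 1]) / (2.0 * h) for i in range(n)]

def skew_symmetric_rhs(u, a, h):
    """du/dt for the split form u_t + 0.5 * (a u)_x + 0.5 * a * u_x = 0."""
    au = [ai * ui for ai, ui in zip(a, u)]
    d_au = periodic_central_diff(au, h)
    d_u = periodic_central_diff(u, h)
    return [-0.5 * (d_au[i] + a[i] * d_u[i]) for i in range(len(u))]

def nonconservative_rhs(u, a, h):
    """du/dt for the unsplit form u_t + a * u_x = 0, for comparison."""
    d_u = periodic_central_diff(u, h)
    return [-a[i] * d_u[i] for i in range(len(u))]

def energy_rate(u, rhs, h):
    """d/dt of the discrete energy h * sum(u_i^2), i.e. 2 * h * u^T (du/dt)."""
    return 2.0 * h * sum(ui * ri for ui, ri in zip(u, rhs))

n = 64
h = 2.0 * math.pi / n
x = [i * h for i in range(n)]
u = [math.sin(xi) + 0.3 * math.cos(3.0 * xi) for xi in x]
a = [1.0 + 0.5 * math.sin(2.0 * xi) for xi in x]

rate = energy_rate(u, skew_symmetric_rhs(u, a, h), h)         # zero up to rounding
naive_rate = energy_rate(u, nonconservative_rhs(u, a, h), h)  # not zero
```

In the split form the two terms cancel exactly in the discrete energy rate because D is skew-symmetric. The unsplit form has no such cancellation.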
The study of the implicit regularization induced by gradient-based optimization is a longstanding pursuit. In the present paper, we characterize the implicit regularization of momentum gradient descent (MGD) with early stopping by comparing it with explicit $\ell_2$-regularization (ridge). In detail, we study MGD in the continuous-time view, so-called momentum gradient flow (MGF), and show that its tendency is closer to ridge than that of gradient descent (GD) [Ali et al., 2019] for least squares regression. Moreover, we prove that, under the calibration $t=\sqrt{2/\lambda}$, where $t$ is the time parameter in MGF and $\lambda$ is the tuning parameter in ridge regression, the risk of MGF is no more than 1.54 times that of ridge. In particular, the relative Bayes risk of MGF to ridge is between 1 and 1.035 under the optimal tuning. The numerical experiments strongly support our theoretical results.
High-order implicit shock tracking is a new class of numerical methods to approximate solutions of conservation laws with non-smooth features. These methods align elements of the computational mesh with non-smooth features to represent them perfectly, allowing high-order basis functions to approximate smooth regions of the solution without the need for nonlinear stabilization, which leads to accurate approximations on traditionally coarse meshes. The hallmark of these methods is the underlying optimization formulation whose solution is a feature-aligned mesh and the corresponding high-order approximation to the flow; the key challenge is robustly solving the central optimization problem. In this work, we develop a robust optimization solver for high-order implicit shock tracking methods so they can be reliably used to simulate complex, high-speed, compressible flows in multiple dimensions. The proposed method integrates practical robustness measures into a sequential quadratic programming method, including dimension- and order-independent simplex element collapses, mesh smoothing, and element-wise solution re-initialization, which prove to be necessary to reliably track complex discontinuity surfaces, such as curved and reflecting shocks, shock formation, and shock-shock interaction. A series of nine numerical experiments -- including two- and three-dimensional compressible flows with complex discontinuity surfaces -- are used to demonstrate that: 1) the solver is robust, 2) the meshes produced are high-quality and track continuous, non-smooth features in addition to discontinuities, 3) the method achieves the optimal convergence rate of the underlying discretization even for flows containing discontinuities, and 4) the method produces highly accurate solutions on extremely coarse meshes relative to approaches based on shock capturing.
In this article we develop the Constraint Energy Minimizing Generalized Multiscale Finite Element Method (CEM-GMsFEM) for elliptic partial differential equations with inhomogeneous Dirichlet, Neumann, and Robin boundary conditions, where the high contrast property emerges from the coefficients of the elliptic operators and the Robin boundary conditions. By careful construction of the multiscale bases of the CEM-GMsFEM, we introduce two operators $\mathcal{D}^m$ and $\mathcal{N}^m$, which are used to handle inhomogeneous Dirichlet and Neumann boundary values and are proved to converge independently of contrast ratios as the oversampling regions are enlarged. We provide an a priori error estimate and show that oversampling layers are the key factor in controlling numerical errors. A series of experiments is conducted, and the results reflect the reliability of our methods even with high contrast ratios.
In this paper we propose an accurate and computationally efficient method for incorporating adaptive spatial resolution into weakly-compressible Smoothed Particle Hydrodynamics (SPH) schemes. Particles are adaptively split and merged in an accurate manner. Critically, the method ensures that the number of neighbors of each particle is optimal, leading to an efficient algorithm. A set of background particles is used to specify either geometry-based spatial resolution, where the resolution is a function of distance to a solid body, or solution-based adaptive resolution, where the resolution is a function of the computed solution. This allows us to simulate problems using particles having length variations of the order of 1:250 with far fewer particles than currently reported with other techniques. The method is designed to automatically adapt when any solid bodies move. The algorithms employed are fully parallel. We consider a suite of benchmark problems to demonstrate the accuracy of the approach. We then consider the classic problem of the flow past a circular cylinder at a range of Reynolds numbers and show that the proposed method produces accurate results with a significantly reduced number of particles. We provide an open source implementation and a fully reproducible manuscript.
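The abstract does not spell out the splitting and merging rules, so the following is only a generic sketch of the bookkeeping such operations typically have to respect: conservation of mass and linear momentum. The functions `merge_particles` and `split_particle`, the equal-mass children, and the offset pattern are all assumptions. A real scheme would also update smoothing lengths and other particle properties.

```python
def merge_particles(m1, x1, v1, m2, x2, v2):
    """Merge two particles into one at their center of mass.
    Mass and linear momentum are conserved; kinetic energy in general is not."""
    m = m1 + m2
    x = tuple((m1 * a + m2 * b) / m for a, b in zip(x1, x2))
    v = tuple((m1 * a + m2 * b) / m for a, b in zip(v1, v2))
    return m, x, v

def split_particle(m, x, v, offsets):
    """Split one particle into len(offsets) equal-mass children at x + offset.
    If the offsets sum to zero, the center of mass is unchanged."""
    child_mass = m / len(offsets)
    return [
        (child_mass, tuple(xi + oi for xi, oi in zip(x, off)), tuple(v))
        for off in offsets
    ]

# Two particles in 2D: masses 1 and 3.
merged = merge_particles(1.0, (0.0, 0.0), (2.0, 0.0), 3.0, (4.0, 0.0), (0.0, 4.0))

# Split the merged particle into four children on a cross-shaped stencil.
offsets = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
children = split_particle(merged[0], merged[1], merged[2], offsets)

total_mass = sum(c[0] for c in children)
momentum = tuple(sum(c[0] * c[2][k] for c in children) for k in range(2))
```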
We are interested in building schemes for the compressible Euler equations that also locally conserve angular momentum. We present a general framework, describe a few examples of schemes, and show results. These schemes can be of arbitrary order.
In many important graph data processing applications the acquired information includes both node features and observations of the graph topology. Graph neural networks (GNNs) are designed to exploit both sources of evidence but they do not optimally trade off their utility and integrate them in a manner that is also universal. Here, universality refers to independence of homophily or heterophily graph assumptions. We address these issues by introducing a new Generalized PageRank (GPR) GNN architecture that adaptively learns the GPR weights so as to jointly optimize node feature and topological information extraction, regardless of the extent to which the node labels are homophilic or heterophilic. Learned GPR weights automatically adjust to the node label pattern, regardless of the type of initialization, and thereby guarantee excellent learning performance for label patterns that are usually hard to handle. Furthermore, they allow one to avoid feature over-smoothing, a process which renders feature information nondiscriminative, without requiring the network to be shallow. Our accompanying theoretical analysis of the GPR-GNN method is facilitated by novel synthetic benchmark datasets generated by the so-called contextual stochastic block model. We also compare the performance of our GNN architecture with that of several state-of-the-art GNNs on the problem of node classification, using well-known benchmark homophilic and heterophilic datasets. The results demonstrate that GPR-GNN offers significant performance improvement compared to existing techniques on both synthetic and benchmark data.
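A Generalized PageRank propagation step can be sketched as a weighted sum of repeatedly smoothed features, Z = sum_k gamma_k * A_hat^k * H0. The code assumes that A_hat is the symmetrically normalized adjacency matrix with self-loops and that H0 holds per-node hidden features. In the actual architecture H0 would come from a neural network and the weights gamma_k would be trained; here both are fixed by hand. All function names are hypothetical.

```python
import math

def normalized_adjacency(edges, n):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected graph on n nodes."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = 1.0
        a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Dense matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def gpr_propagate(a_hat, h0, gammas):
    """Compute sum_k gammas[k] * a_hat^k @ h0 without forming matrix powers."""
    z = [[gammas[0] * v for v in row] for row in h0]
    hk = h0
    for g in gammas[1:]:
        hk = matmul(a_hat, hk)  # one more propagation step
        z = [[z[i][j] + g * hk[i][j] for j in range(len(hk[0]))]
             for i in range(len(hk))]
    return z

# Path graph 0 - 1 - 2 with a single feature concentrated on node 0.
a_hat = normalized_adjacency([(0, 1), (1, 2)], 3)
h0 = [[1.0], [0.0], [0.0]]

z_identity = gpr_propagate(a_hat, h0, [1.0, 0.0, 0.0])  # features only
z_one_hop = gpr_propagate(a_hat, h0, [0.0, 1.0])        # one smoothing step
```

Setting all weight on gamma_0 ignores the graph entirely, while weight on higher k mixes in information from more distant neighbors.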
Since hardware resources are limited, the objective of training deep learning models is typically to maximize accuracy subject to the time and memory constraints of training and inference. We study the impact of model size in this setting, focusing on Transformer models for NLP tasks that are limited by compute: self-supervised pretraining and high-resource machine translation. We first show that even though smaller Transformer models execute faster per iteration, wider and deeper models converge in significantly fewer steps. Moreover, this acceleration in convergence typically outpaces the additional computational overhead of using larger models. Therefore, the most compute-efficient training strategy is, counterintuitively, to train extremely large models but stop after a small number of iterations. This leads to an apparent trade-off between the training efficiency of large Transformer models and the inference efficiency of small Transformer models. However, we show that large models are more robust to compression techniques such as quantization and pruning than small models. Consequently, one can get the best of both worlds: heavily compressed, large models achieve higher accuracy than lightly compressed, small models.
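The abstract does not say which quantization and pruning schemes are used. As a generic illustration only, the sketch below shows unstructured magnitude pruning and symmetric uniform quantization applied to a flat list of weights. The function names and the choice of schemes are assumptions.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

def quantize_uniform(weights, bits):
    """Round weights to a symmetric uniform grid with 2**(bits-1) - 1 positive
    levels, then map back to floats (simulated quantization).
    Returns the quantized weights and the grid spacing."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights), 0.0
    levels = 2 ** (bits - 1) - 1
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights], scale

weights = [0.5, -0.1, 0.05, -1.0, 0.33, 0.7]

pruned = magnitude_prune(weights, 0.5)           # keeps the 3 largest magnitudes
quantized, scale = quantize_uniform(weights, 8)  # 8-bit grid
max_quant_error = max(abs(q - w) for q, w in zip(quantized, weights))
```

The quantization error per weight is at most half the grid spacing, so it shrinks as the bit width grows, whereas pruning removes the smallest weights outright.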
Although recent neural conversation models have shown great potential, they often generate bland and generic responses. While various approaches have been explored to diversify the output of the conversation model, the improvement often comes at the cost of decreased relevance. In this paper, we propose a method to jointly optimize diversity and relevance that essentially fuses the latent space of a sequence-to-sequence model and that of an autoencoder model by leveraging novel regularization terms. As a result, our approach induces a latent space in which the distance and direction from the predicted response vector roughly match the relevance and diversity, respectively. This property also lends itself well to an intuitive visualization of the latent space. Both automatic and human evaluation results demonstrate that the proposed approach brings significant improvement compared to strong baselines in both diversity and relevance.