In this work we consider a non-conservative balance law formulation that encompasses the rotating, compressible Euler equations for dry atmospheric flows. We develop a semi-discretely entropy stable discontinuous Galerkin method on curvilinear meshes using a generalization of flux differencing for numerical fluxes in fluctuation form. The method uses the skew-hybridized formulation of the element operators to ensure that, even in the presence of under-integration on curvilinear meshes, the resulting discretization is entropy stable. Several atmospheric flow test cases in one, two, and three dimensions confirm the theoretical entropy stability results and demonstrate the high-order accuracy and robustness of the method.
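As background, the standard entropy-conservative flux-differencing semi-discretization that the fluctuation-form numerical fluxes above generalize can be written, schematically for a one-dimensional conservation law $u_t + f(u)_x = 0$ with a diagonal-norm summation-by-parts derivative matrix $D$, as
\[
\frac{\mathrm{d}u_i}{\mathrm{d}t} + 2\sum_{j} D_{ij}\, f_S(u_i,u_j) = 0,
\]
where $f_S$ is a symmetric two-point flux satisfying the consistency condition $f_S(u,u)=f(u)$ and Tadmor's entropy-conservation condition $\big(w(u_i)-w(u_j)\big)^{\top} f_S(u_i,u_j) = \psi(u_i)-\psi(u_j)$, with $w$ the entropy variables and $\psi$ the entropy flux potential; interface and dissipation terms are omitted in this sketch.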
Rate of convergence results are presented for a new class of explicit Euler schemes that approximate stochastic differential equations (SDEs) with superlinearly growing drift coefficients satisfying a particular form of strong monotonicity. The distinct feature of this class of explicit schemes is that the new, suitably controlled drift coefficients preserve the monotonicity condition, which guarantees the finiteness of moments of the numerical solutions up to a desired order.
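As a rough illustration of an explicit Euler scheme with a controlled drift, here is a minimal sketch using the classical taming of the drift as a stand-in for the particular control described above; all functions and parameter values are illustrative.

```python
import numpy as np

def tamed_euler_maruyama(b, sigma, x0, T, n_steps, rng=None):
    """Explicit Euler scheme with a tamed (controlled) drift for dX = b(X) dt + sigma(X) dW.

    The taming b(x) / (1 + dt*|b(x)|) keeps the effective drift bounded per step,
    one classical way to retain moment bounds for superlinearly growing drifts.
    """
    rng = np.random.default_rng() if rng is None else rng
    dt = T / n_steps
    x = np.asarray(x0, dtype=float)
    path = [x]
    for _ in range(n_steps):
        drift = b(x)
        tamed_drift = drift / (1.0 + dt * np.abs(drift))  # controlled drift
        dW = rng.normal(0.0, np.sqrt(dt), size=np.shape(x))
        x = x + tamed_drift * dt + sigma(x) * dW
        path.append(x)
    return np.array(path)

# Example: dX = (X - X^3) dt + dW, a superlinearly growing, monotone drift.
path = tamed_euler_maruyama(lambda x: x - x**3, lambda x: 1.0, x0=0.0, T=1.0, n_steps=1000)
```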
In this paper, discrete linear quadratic regulator (DLQR) and iterative linear quadratic regulator (ILQR) methods based on high-order Runge-Kutta (RK) discretization are proposed for solving linear and nonlinear quadratic optimal control problems, respectively. As discovered in [W. Hager, Runge-Kutta methods in optimal control and the transformed adjoint system, Numer. Math., 2000, pp. 247-282], the direct approach with RK discretization is equivalent to the indirect approach based on symplectic partitioned Runge-Kutta (SPRK) integration. In this paper, we reconstruct this equivalence through the analogy between continuous and discrete dynamic programming. Based on this equivalence, we then discuss the issue that the internal-stage controls produced by the direct approach may have lower-order accuracy than the RK method used. We propose order conditions for internal-stage controls and demonstrate that third- or fourth-order explicit RK discretizations cannot avoid this order-reduction phenomenon. To overcome this obstacle, we compute node controls instead of internal-stage controls in the DLQR and ILQR methods. Numerical examples illustrate the validity of our methods. Another advantage of our methods is their high computational efficiency, which comes from the use of feedback. In this paper, we also demonstrate that ILQR is essentially a quasi-Newton method with a linear convergence rate.
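For reference, a minimal sketch of the standard finite-horizon DLQR backward Riccati recursion with feedback gains, independent of any particular RK discretization; the system matrices below are illustrative placeholders.

```python
import numpy as np

def dlqr_backward(A, B, Q, R, QN, N):
    """Backward Riccati recursion for the finite-horizon discrete LQR problem
    x_{k+1} = A x_k + B u_k with cost sum_k (x'Qx + u'Ru) + x_N' QN x_N.
    Returns the time-varying feedback gains K_k such that u_k = -K_k x_k."""
    P = QN
    gains = []
    for _ in range(N):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # gain at this stage
        P = Q + A.T @ P @ (A - B @ K)                      # cost-to-go update
        gains.append(K)
    return list(reversed(gains))

# Toy double-integrator example with step size h (all values illustrative).
h = 0.1
A = np.array([[1.0, h], [0.0, 1.0]])
B = np.array([[0.5 * h**2], [h]])
K = dlqr_backward(A, B, Q=np.eye(2), R=np.array([[1.0]]), QN=np.eye(2), N=50)
```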
We study a fourth-order div problem and its approximation by the discontinuous Petrov-Galerkin method with optimal test functions. We present two variants, based on first- and second-order systems. In both cases we prove well-posedness of the formulation and quasi-optimal convergence of the approximation. Our analysis includes the fully discrete schemes with approximated test functions, for general dimension and polynomial degree in the first-order case, and for two dimensions and lowest-order approximation in the second-order case. Numerical results illustrate the performance for quasi-uniform and adaptively refined meshes.
The recent notion of graded modal types provides a framework for extending type theories with fine-grained data-flow reasoning. The Granule language explores this idea in the context of linear types. In this practical setting, we observe that the presence of graded modal types can introduce an additional impediment when programming: when composing programs, it is often necessary to 'distribute' data types over graded modalities, and vice versa. In this paper, we show how to automatically derive these distributive laws as combinators for programming. We discuss the implementation and use of this automated deriving procedure in Granule, providing easy access to these distributive combinators. This work is also applicable to Linear Haskell (which retrofits Haskell with linear types via grading) and we apply our technique there to provide the same automatically derived combinators. Along the way, we discuss interesting considerations for pattern matching analysis via graded linear types. Lastly, we show how other useful structural combinators can also be automatically derived.
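Schematically, and in generic linear-logic notation rather than Granule's concrete syntax, the derived distributive combinators for a product type have shapes along the lines of
\[
\mathsf{push} : \Box_r (A \otimes B) \multimap \Box_r A \otimes \Box_r B,
\qquad
\mathsf{pull} : \Box_r A \otimes \Box_r B \multimap \Box_r (A \otimes B),
\]
where $\Box_r$ denotes a graded modality with grade $r$; the precise grades, data types, and side conditions are those produced by the deriving procedure described above.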
In this paper, we study deep neural networks (DNNs) for solving high-dimensional evolution equations with oscillatory solutions. In contrast to deep least-squares methods that treat the time and space variables simultaneously, we propose a deep adaptive basis Galerkin (DABG) method that employs the spectral-Galerkin method with a tensor-product basis, suited to oscillatory solutions, for the time variable, and deep neural networks for the high-dimensional space variables. The proposed method leads to a linear system of differential equations whose unknowns are DNNs that can be trained via the loss function. We establish a posteriori estimates of the solution error, which is bounded by the minimal value of the loss function and a term of order $O(N^{-m})$, where $N$ is the number of basis functions and $m$ characterizes the regularity of the equation, and we show that if the true solution is a Barron-type function, the error bound converges to zero as $M=O(N^p)$ approaches infinity, where $M$ is the width of the networks used and $p$ is a positive constant. Numerical examples, including high-dimensional linear parabolic and hyperbolic equations and a nonlinear Allen-Cahn equation, are presented to demonstrate that the proposed DABG method performs better than existing DNN-based methods.
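In schematic form, the a posteriori estimate stated above reads
\[
\| u - u_{\theta} \| \;\lesssim\; \mathcal{L}(\theta) + O(N^{-m}),
\]
where $u_\theta$ denotes the DABG approximation and $\mathcal{L}$ the training loss; the precise norm, constants, and dependence on the loss are those established in the paper.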
We give an exact characterization of admissibility in statistical decision problems in terms of Bayes optimality in a so-called nonstandard extension of the original decision problem, as introduced by Duanmu and Roy. Unlike the consideration of improper priors or other generalized notions of Bayes optimality, the nonstandard extension is distinguished, in part, by having priors that can assign "infinitesimal" mass in a sense that can be made rigorous using results from nonstandard analysis. With these additional priors, we find that, informally speaking, a decision procedure $\delta_0$ is admissible in the original statistical decision problem if and only if, in the nonstandard extension of the problem, the nonstandard extension of $\delta_0$ is Bayes optimal among the extensions of standard decision procedures with respect to a nonstandard prior that assigns at least infinitesimal mass to every standard parameter value. We use this theorem to give further characterizations of admissibility: one related to Blyth's method, one related to a condition due to Stein that characterizes admissibility under some regularity assumptions, and finally a characterization using finitely additive priors in decision problems meeting certain regularity requirements. Our results imply that Blyth's method is a sound and complete method for establishing admissibility. Buoyed by this result, we revisit the univariate two-sample common-mean problem and show that the Graybill-Deal estimator is admissible among a certain class of unbiased decision procedures.
The Ensemble Kalman Filter (EnKF) belongs to the class of iterative particle filtering methods and can be used for solving control-to-observable inverse problems. In this context, the EnKF is known as Ensemble Kalman Inversion (EKI). In recent years, several continuous limits in the number of iterations and particles have been derived in order to study properties of the method. In particular, a one-dimensional linear stability analysis reveals possible drawbacks in the phase space of moments arising from the continuous limits of the EKI, which are also observed in the multi-dimensional setting. In this work we address this issue by introducing a stabilization of the dynamics that leads to a method with globally asymptotically stable solutions. We illustrate the performance of the stabilized version on test inverse problems from the literature, comparing it with the classical continuous-limit formulation of the method.
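For context, a minimal sketch of one basic (unstabilized) EKI iteration as it commonly appears in the literature, applied to a toy linear forward map; this is background only and not the stabilized dynamics introduced above, and all problem data are illustrative.

```python
import numpy as np

def eki_step(U, G, y, Gamma, rng=None, perturb_obs=True):
    """One iteration of basic Ensemble Kalman Inversion (EKI).

    U     : (J, d) ensemble of parameter particles
    G     : forward map, G(u) -> (m,) predicted observations
    y     : (m,) observed data
    Gamma : (m, m) observation noise covariance
    """
    rng = np.random.default_rng() if rng is None else rng
    J = U.shape[0]
    P = np.array([G(u) for u in U])                 # (J, m) predictions
    u_mean, p_mean = U.mean(axis=0), P.mean(axis=0)
    Cup = (U - u_mean).T @ (P - p_mean) / J         # cross-covariance (d, m)
    Cpp = (P - p_mean).T @ (P - p_mean) / J         # prediction covariance (m, m)
    K = Cup @ np.linalg.inv(Cpp + Gamma)            # Kalman-type gain (d, m)
    if perturb_obs:
        Y = y + rng.multivariate_normal(np.zeros(len(y)), Gamma, size=J)
    else:
        Y = np.tile(y, (J, 1))
    return U + (Y - P) @ K.T                        # updated ensemble

# Toy linear inverse problem y = A u + noise (all values illustrative).
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))
u_true = np.array([1.0, -0.5, 2.0])
Gamma = 0.01 * np.eye(5)
y = A @ u_true + rng.multivariate_normal(np.zeros(5), Gamma)
U = rng.normal(size=(50, 3))
for _ in range(20):
    U = eki_step(U, lambda u: A @ u, y, Gamma, rng)
```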
Markov categories, having tensors with copying and discarding, provide a setting for categorical probability. This paper uses finite colimits and what we call uniform states in such Markov categories to define a (fixed-size) multiset functor, with basic operations for sums and zips of multisets, and a graded monad structure. Multisets can be used to represent both urns filled with coloured balls and draws of multiple balls from such urns. The main contribution of this paper is the abstract definition of multinomial and hypergeometric distributions on multisets, as draws. It is shown that these operations interact appropriately with various operations on multisets.
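As a concrete, non-categorical illustration of the two kinds of draws: sampling $K$ balls from an urn with replacement gives a multinomial distribution over multisets of colours, while sampling without replacement gives a multivariate hypergeometric one. A small numpy sketch (urn contents illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# An urn as a multiset of coloured balls: 3 red, 2 green, 5 blue.
urn = np.array([3, 2, 5])
K = 4  # number of balls drawn

# Draw with replacement: multinomial distribution over multisets of size K.
probs = urn / urn.sum()
multinomial_draw = rng.multinomial(K, probs)

# Draw without replacement: multivariate hypergeometric over sub-multisets of the urn.
hypergeometric_draw = rng.multivariate_hypergeometric(urn, K)

print("multinomial draw   :", multinomial_draw)     # a multiset of size K
print("hypergeometric draw:", hypergeometric_draw)  # always a sub-multiset of the urn
```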
Exploration-exploitation is a powerful and practical tool in multi-agent learning (MAL); however, its effects are far from understood. To make progress in this direction, we study a smooth analogue of Q-learning. We start by showing that our learning model has strong theoretical justification as an optimal model for studying exploration-exploitation. Specifically, we prove that smooth Q-learning has bounded regret in arbitrary games for a cost model that explicitly captures the balance between game and exploration costs, and that it always converges to the set of quantal-response equilibria (QRE), the standard solution concept for games under bounded rationality, in weighted potential games with heterogeneous learning agents. We then turn to our main task and measure the effect of exploration on collective system performance. We characterize the geometry of the QRE surface in low-dimensional MAL systems and link our findings with catastrophe (bifurcation) theory. In particular, as the exploration hyperparameter evolves over time, the system undergoes phase transitions where the number and stability of equilibria can change radically given an infinitesimal change to the exploration parameter. Based on this, we provide a formal theoretical treatment of how tuning the exploration parameter can provably lead to equilibrium selection with both positive and negative (and potentially unbounded) effects on system performance.
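For concreteness, a minimal sketch of smooth (softmax/Boltzmann) Q-learning for a single agent in a repeated matrix game with an explicit exploration temperature; this is a generic sketch rather than the exact learning model analyzed above, and the game and parameter values are illustrative.

```python
import numpy as np

def softmax(q, temperature):
    """Logit (quantal-response) policy over estimated action values."""
    z = q / temperature
    z -= z.max()
    p = np.exp(z)
    return p / p.sum()

def smooth_q_learning(payoff, opponent_policy, temperature, lr=0.1, steps=5000, rng=None):
    """Smooth Q-learning for one agent: play a softmax policy over Q-values and
    update Q towards the realized payoff. The temperature is the exploration
    hyperparameter balancing exploration and exploitation."""
    rng = np.random.default_rng() if rng is None else rng
    n_actions = payoff.shape[0]
    Q = np.zeros(n_actions)
    for _ in range(steps):
        policy = softmax(Q, temperature)
        a = rng.choice(n_actions, p=policy)
        b = rng.choice(payoff.shape[1], p=opponent_policy)
        Q[a] += lr * (payoff[a, b] - Q[a])
    return Q, softmax(Q, temperature)

# Matching-pennies payoffs for the row player against a uniform opponent (illustrative).
payoff = np.array([[1.0, -1.0], [-1.0, 1.0]])
Q, policy = smooth_q_learning(payoff, opponent_policy=np.array([0.5, 0.5]), temperature=0.5)
```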
Sampling methods (e.g., node-wise, layer-wise, or subgraph sampling) have become an indispensable strategy for speeding up the training of large-scale Graph Neural Networks (GNNs). However, existing sampling methods are mostly based on graph structural information and ignore the dynamics of optimization, which leads to high variance in estimating the stochastic gradients. The high-variance issue can be very pronounced in extremely large graphs, where it results in slow convergence and poor generalization. In this paper, we theoretically analyze the variance of sampling methods and show that, due to the composite structure of the empirical risk, the variance of any sampling method can be decomposed into \textit{embedding approximation variance} in the forward stage and \textit{stochastic gradient variance} in the backward stage, and that mitigating both types of variance is necessary to obtain a faster convergence rate. We propose a decoupled variance reduction strategy that employs (approximate) gradient information to adaptively sample nodes with minimal variance and explicitly reduces the variance introduced by the embedding approximation. We show theoretically and empirically that the proposed method, even with smaller mini-batch sizes, enjoys a faster convergence rate and better generalization than existing methods.
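To make the forward/backward decomposition concrete, the following purely synthetic sketch (not the proposed estimator) compares the gradient variability induced by sampling the outer mini-batch (backward stage) with that induced by sampling the inner neighbourhood aggregation (forward stage) for a toy composite risk; all data and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, deg = 200, 30
X = rng.normal(size=(n, deg))   # synthetic features of each node's neighbours
y = rng.normal(size=n)          # synthetic targets
w = 0.5                         # current scalar parameter

def grad(batch, nbr_sample=None):
    """Gradient of (1/|batch|) sum_i (w * mean_j x_ij - y_i)^2 w.r.t. w,
    optionally using only a random subset of each node's neighbours."""
    g = 0.0
    for i in batch:
        xs = X[i] if nbr_sample is None else rng.choice(X[i], size=nbr_sample)
        z = w * xs.mean()                    # (approximate) embedding, forward stage
        g += 2.0 * (z - y[i]) * xs.mean()    # backward stage
    return g / len(batch)

full = grad(np.arange(n))  # exact gradient

# Monte Carlo estimates of the two variance contributions.
sg = [grad(rng.choice(n, 32, replace=False)) for _ in range(500)]  # mini-batch sampling only
ea = [grad(np.arange(n), nbr_sample=5) for _ in range(500)]        # neighbour sampling only
print("stochastic gradient variance       ~", np.var(sg))
print("embedding approximation variance   ~", np.var(ea))
```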