In this work we develop the first polytopal complexes of differential forms. These complexes, inspired by the Discrete De Rham and the Virtual Element approaches, are discrete versions of the de Rham complex of differential forms built on meshes made of general polytopal elements. Both complexes benefit from the high-level approach of polytopal methods, which, on certain meshes, leads to leaner constructions than the finite element method. We establish commutation properties between the interpolators and the discrete and continuous exterior derivatives, prove key polynomial consistency results for the complexes, and show that their cohomologies are isomorphic to the cohomology of the continuous de Rham complex.
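Schematically, writing $I_h^k$ for the interpolator onto the discrete space of $k$-forms and $\mathrm{d}_h^k$ for the discrete exterior derivative (notation chosen here for illustration only), the commutation property takes the form
\[
\mathrm{d}_h^k \circ I_h^k = I_h^{k+1} \circ \mathrm{d}^k, \qquad 0 \le k \le n-1,
\]
i.e., interpolating and then applying the discrete exterior derivative agrees with differentiating at the continuous level and then interpolating.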
We consider the problem of learning functions in the $\mathcal{F}_{p,\pi}$ and Barron spaces, which are natural function spaces that arise in the high-dimensional analysis of random feature models (RFMs) and two-layer neural networks. Through a duality analysis, we reveal that the approximation and estimation of these spaces can be regarded as equivalent in a certain sense. This equivalence enables us to focus on the easier of the two problems, approximation or estimation, when studying the generalization of both models. The dual equivalence is established by defining an information-based complexity that can effectively control estimation errors. Additionally, we demonstrate the flexibility of our duality framework through comprehensive analyses of two concrete applications. The first application is to study learning functions in $\mathcal{F}_{p,\pi}$ with RFMs. We prove that the learning does not suffer from the curse of dimensionality as long as $p>1$, implying that RFMs can work beyond the kernel regime. Our analysis extends existing results [CMM21] to the noisy case and removes the requirement of overparameterization. The second application is to investigate the learnability of the reproducing kernel Hilbert space (RKHS) under the $L^\infty$ metric. We derive both lower and upper bounds on the minimax estimation error using the spectrum of the associated kernel. We then apply these bounds to dot-product kernels and analyze how they scale with the input dimension. Our results suggest that learning with ReLU (random) features is generally intractable in terms of reaching high uniform accuracy.
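As a concrete illustration of the RFM setting, here is a minimal random feature ridge regression in Python (the sizes, the ReLU feature map, and the ridge penalty are assumptions of this sketch, not the paper's exact setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (all sizes illustrative): n samples in dimension d, and a
# random feature model with N frozen random ReLU features.
n, d, N = 200, 10, 500
X = rng.standard_normal((n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)   # noisy scalar target

W = rng.standard_normal((d, N)) / np.sqrt(d)   # random weights, never trained
Phi = np.maximum(X @ W, 0.0)                   # features phi(x) = relu(W^T x)

# Ridge estimator on the random features:
#   a = argmin ||Phi a - y||^2 + lam ||a||^2 = (Phi^T Phi + lam I)^{-1} Phi^T y
lam = 1e-3
a = np.linalg.solve(Phi.T @ Phi + lam * np.eye(N), Phi.T @ y)
print("train MSE:", np.mean((Phi @ a - y) ** 2))
```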
Stochastic dual dynamic programming is a cutting-plane-type algorithm for multi-stage stochastic optimization that originated about 30 years ago. In spite of its popularity in practice, no analysis of its convergence rate has been available. In this paper, we first establish the number of iterations, i.e., the iteration complexity, required by a basic dynamic cutting plane method for solving relatively simple multi-stage optimization problems, by introducing novel mathematical tools including the saturation of search points. We then refine these basic tools and establish the iteration complexity of both deterministic and stochastic dual dynamic programming methods for solving more general multi-stage stochastic optimization problems under the standard stage-wise independence assumption. Our results indicate that the complexity of some deterministic variants of these methods increases only mildly with the number of stages $T$, in fact linearly in $T$ for discounted problems. They are therefore efficient for strategic decision making involving a large number of stages but a relatively small number of decision variables in each stage. Without explicitly discretizing the state and action spaces, these methods might also be pertinent to the related areas of reinforcement learning and stochastic control.
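For intuition, a toy single-stage instance of the dynamic cutting plane idea (a Kelley-style loop; the objective, subgradient, and grid-based model minimization are illustrative assumptions of this sketch):

```python
import numpy as np

# Minimize a 1-D convex function on [lo, hi] by accumulating cutting planes.
f  = lambda x: abs(x - 0.3) + 0.5 * x**2   # convex objective (illustrative)
df = lambda x: np.sign(x - 0.3) + x        # one valid subgradient of f

lo, hi = -1.0, 1.0
grid = np.linspace(lo, hi, 2001)
cuts = []                                   # cuts (f(x_i), g_i, x_i)
x = lo                                      # initial search point
for it in range(25):
    cuts.append((f(x), df(x), x))
    # Lower model m(z) = max_i f(x_i) + g_i (z - x_i); minimize it on the grid.
    model = np.max([v + g * (grid - xi) for v, g, xi in cuts], axis=0)
    x = grid[np.argmin(model)]
    gap = f(x) - model.min()                # upper bound minus lower bound
    if gap < 1e-6:
        break
print(f"iterations: {it + 1}, minimizer ~ {x:.4f}, gap ~ {gap:.2e}")
```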
Quantitative notions of bisimulation are well-known tools for the minimization of dynamical models such as Markov chains and ordinary differential equations (ODEs). In \emph{forward bisimulations}, each state of the quotient model represents an equivalence class, and its dynamical evolution gives the overall sum of the evolutions of its members in the original model. Here we introduce generalized forward bisimulation (GFB) for dynamical systems over commutative monoids and develop a partition refinement algorithm to compute the coarsest one. When the monoid is $(\mathbb{R}, +)$, we recover probabilistic bisimulation for Markov chains and more recent forward bisimulations for nonlinear ODEs. Using $(\mathbb{R}, \cdot)$, we get nonlinear reductions for discrete-time dynamical systems and ODEs where each variable in the quotient model represents the product of the original variables in its equivalence class. When the domain is a finite set such as the Booleans $\mathbb{B}$, we can apply GFB to Boolean networks (BNs), a widely used dynamical model in computational biology. Using a prototype implementation of our minimization algorithm for GFB, we find disjunction- and conjunction-preserving reductions on 60 BNs from two well-known repositories, and demonstrate the resulting analysis speed-ups. We also provide the biological interpretation of the reductions obtained for two selected BNs, and we show how GFB enables the analysis of a large BN that could not be analyzed otherwise. Using a randomized version of our algorithm, we find product-preserving (hence nonlinear) reductions on 21 dynamical weighted networks from the literature that could not be handled by the exact algorithm.
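A minimal partition-refinement sketch in the spirit of GFB (names and the stopping test are illustrative; the paper's coarsest-reduction algorithm is more sophisticated): states in a block are split whenever their monoid-aggregated weights into the current blocks differ. With monoid $(\mathbb{R},+)$ and transition probabilities as weights, this performs probabilistic-bisimulation-style lumping:

```python
def refine(states, weight, op, unit, partition):
    """Split blocks until every block is uniform w.r.t. the signatures."""
    while True:
        def signature(s):
            # Monoid-aggregate of the weights from s into each current block.
            sig = []
            for block in partition:
                acc = unit
                for t in block:
                    acc = op(acc, weight(s, t))
                sig.append(acc)
            return tuple(sig)
        new_partition = []
        for block in partition:
            groups = {}
            for s in block:
                groups.setdefault(signature(s), []).append(s)
            new_partition.extend(groups.values())
        if len(new_partition) == len(partition):   # no block was split
            return new_partition
        partition = new_partition

# Toy Markov chain; the absorbing state 3 is pre-separated (e.g., by label).
P = {0: {1: 0.3, 2: 0.7}, 1: {3: 1.0}, 2: {3: 1.0}, 3: {3: 1.0}}
w = lambda s, t: P.get(s, {}).get(t, 0.0)
print(refine([0, 1, 2, 3], w, lambda a, b: a + b, 0.0, [[0, 1, 2], [3]]))
# -> [[0], [1, 2], [3]]: states 1 and 2 are lumped.
```

Swapping the sum for multiplication (with unit 1.0) yields, in the same spirit, the product-preserving variant mentioned above.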
We introduce a priori Sobolev-space error estimates for the solution of nonlinear, and possibly parametric, PDEs using Gaussian process and kernel based methods. The primary assumptions are: (1) a continuous embedding of the reproducing kernel Hilbert space of the kernel into a Sobolev space of sufficient regularity; and (2) the stability of the differential operator and the solution map of the PDE between corresponding Sobolev spaces. The proof hinges on Sobolev-norm error estimates for kernel interpolants and on the minimal-norm property of the solution. The error estimates demonstrate dimension-benign convergence rates if the solution space of the PDE is smooth enough. We illustrate these points with applications to high-dimensional nonlinear elliptic PDEs and parametric PDEs. Although some recent machine learning methods have been presented as breaking the curse of dimensionality in solving high-dimensional PDEs, our analysis suggests a more nuanced picture: there is a trade-off between the regularity of the solution and the presence of the curse of dimensionality. Therefore, our results are in line with the understanding that the curse is absent when the solution is regular enough.
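The backbone of the analysis, minimal-norm kernel interpolation, fits in a few lines (the Gaussian kernel, point counts, and jitter term are illustrative assumptions of this sketch):

```python
import numpy as np

def gaussian_kernel(A, B, ell=0.5):
    """Gaussian (RBF) kernel matrix between point sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * ell**2))

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(40, 2))        # collocation points
y = np.sin(np.pi * X[:, 0]) * X[:, 1]       # sampled target values

K = gaussian_kernel(X, X)
alpha = np.linalg.solve(K + 1e-10 * np.eye(len(X)), y)   # tiny jitter

# u(x) = sum_i alpha_i k(x, x_i) is the RKHS-norm minimizer among all
# functions in the RKHS matching the data -- the property the proof uses.
X_test = rng.uniform(-1, 1, size=(5, 2))
print(gaussian_kernel(X_test, X) @ alpha)
```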
Accurate time series forecasting is a fundamental challenge in data science. Time series are often affected by external covariates such as weather or human intervention, which, in many applications, can be predicted with reasonable accuracy. We refer to them as predicted future covariates. However, existing methods that attempt to predict time series iteratively with autoregressive models suffer from exponential error accumulation. Other strategies, which consider the past and future in the encoder and decoder respectively, are limited by treating the historical and future data separately. To address these limitations, a novel feature representation strategy -- shifting -- is proposed to fuse the past data and future covariates so that their interactions can be considered. To extract complex dynamics in time series, we develop a parallel deep learning framework composed of an RNN and a CNN, both applied hierarchically. We also employ skip connections to improve the model's performance. Extensive experiments on three datasets demonstrate the effectiveness of our method. Finally, we illustrate the model's interpretability using the Grad-CAM algorithm.
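One plausible reading of the shifting fusion, sketched under that assumption (the paper's exact construction may differ): shift the covariate series forward by the forecast horizon so that each input timestep pairs past observations of the target with the covariate values the same number of steps ahead:

```python
import numpy as np

L, H, T = 48, 24, 200             # history length, horizon, series length
y = np.random.randn(T)            # target series
c = np.random.randn(T + H)        # covariates, known/predicted H steps ahead

t = 150                           # forecast origin
past_y = y[t - L:t]               # past targets, shape (L,)
shifted_c = c[t - L + H:t + H]    # covariates shifted forward by H, shape (L,)
x = np.stack([past_y, shifted_c], axis=-1)   # fused input, shape (L, 2)
print(x.shape)
```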
We propose a novel spectral method for the Allen--Cahn equation on spheres that does not necessarily require quadrature exactness assumptions. Instead of requiring certain exactness degrees, we employ a restricted isometry relation based on the Marcinkiewicz--Zygmund system of quadrature rules to quantify the quadrature error of polynomial integrands. The new method imposes conditions only on the polynomial degree of numerical solutions to derive the maximum principle and energy stability, and thus differs substantially from existing methods in the literature, which often rely on strict conditions on the time-stepping size, the Lipschitz property of the nonlinear term, or the $L^{\infty}$ boundedness of numerical solutions. Moreover, the new method is suitable for long-time simulations because the time-stepping size is independent of the diffusion coefficient in the equation. Inspired by the effective maximum principle recently proposed by Li (Ann. Appl. Math., 37(2): 131--290, 2021), we develop an almost sharp maximum principle that allows a controllable deviation of numerical solutions from the sharp bound. Further, we show that the new method is energy stable and equivalent to the Galerkin method if the quadrature rule exhibits sufficient exactness degrees. In addition, we propose an energy-stable mixed-quadrature scheme that works well even with randomly sampled initial condition data. We validate the theoretical results on energy stability and the almost sharp maximum principle by numerical experiments on the 2-sphere $\mathbb{S}^2$.
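For concreteness, in one common scaling (assumed here for illustration) the equation and the quantities in question read
\[
\partial_t u = \nu\,\Delta_{\mathbb{S}^2} u - f(u), \qquad f(u) = u^3 - u,
\]
for which $\|u_0\|_{L^\infty}\le 1$ implies $\|u(t)\|_{L^\infty}\le 1$ for all $t\ge 0$ (the sharp maximum principle), and the energy $E(u) = \int_{\mathbb{S}^2} \big( \tfrac{\nu}{2}|\nabla u|^2 + \tfrac{1}{4}(u^2-1)^2 \big)$ is non-increasing along solutions; an almost sharp maximum principle, in the above sense, instead bounds numerical solutions by $1+\delta$ for a controllable $\delta>0$.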
We consider a finite-dimensional vector space $W\subset K^E$ over an arbitrary field $K$ and an arbitrary set $E$. We show that the set $C(W)\subset 2^E$ consisting of the minimal supports of $W$ forms the set of circuits of a matroid on $E$. In particular, we show that this matroid is cofinitary (hence tame). When the cardinality of $K$ is large enough (with respect to the cardinality of $E$), the set $trop(W)\subset 2^E$ consisting of all the supports of $W$ is itself a matroid. We then apply these results to tropical differential algebraic geometry and study the set of supports $trop(Sol(\Sigma))\subset (2^{\mathbb{N}^{m}})^n$ of spaces of formal power series solutions $\text{Sol}(\Sigma)$ of systems of linear differential equations $\Sigma$ in differential variables $x_1,\ldots,x_n$ having coefficients in the ring ${K}[\![t_1,\ldots,t_m]\!]$. If $\Sigma$ is of differential type zero, then the set $C(Sol(\Sigma))\subset (2^{\mathbb{N}^{m}})^n$ of minimal supports defines a matroid on $E=\mathbb{N}^{mn}$, and if the cardinality of $K$ is large enough, then the set of supports $trop(Sol(\Sigma))$ is itself a matroid on $E$ as well. By applying the fundamental theorem of tropical differential algebraic geometry (fttdag), we give a necessary condition for the set of solutions $Sol(U)$ of a system $U$ of tropical linear differential equations to be a matroid. We also give a counterexample to the fttdag for systems $\Sigma$ of linear differential equations over countable fields; in this case, the set $trop(Sol(\Sigma))$ may not form a matroid.
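A small worked example of the first result: for $E=\{1,2,3\}$ and $W=\operatorname{span}\{(1,1,0),(0,1,1)\}\subset K^E$, the nonzero elements $(a,\,a+b,\,b)$ of $W$ have supports $\{1,2\}$ (when $b=0$), $\{2,3\}$ (when $a=0$), $\{1,3\}$ (when $a+b=0$), and $\{1,2,3\}$ otherwise; the minimal supports $C(W)=\{\{1,2\},\{2,3\},\{1,3\}\}$ are exactly the circuits of the uniform matroid $U_{1,3}$.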
We introduce to computer graphics the walk-on-boundary (WoB) method for solving boundary value problems. WoB is a grid-free Monte Carlo solver for certain classes of second-order partial differential equations. A similar Monte Carlo solver, the walk-on-spheres (WoS) method, has recently been popularized in computer graphics due to its advantages over traditional spatial discretization-based alternatives. We show that WoB's intrinsic properties yield further advantages beyond those of WoS. Unlike WoS, WoB naturally supports various boundary conditions (Dirichlet, Neumann, Robin, and mixed) for both interior and exterior domains. WoB builds upon boundary integral formulations and is mathematically more similar to light transport simulation in rendering than to the random walk formulation of WoS. This similarity allows us to implement WoB on top of Monte Carlo ray tracing and to incorporate advanced rendering techniques (e.g., bidirectional estimators with multiple importance sampling, the virtual point lights method, and Markov chain Monte Carlo) into WoB. WoB does not suffer from the intrinsic bias of WoS near the boundary and can estimate solutions precisely on the boundary. Our numerical results highlight the advantages of WoB over WoS, making it an attractive Monte Carlo alternative for solving boundary value problems.
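For reference, the WoS baseline that WoB is compared against fits in a few lines for the Laplace Dirichlet problem on the unit disk (domain, boundary data, and tolerances are illustrative assumptions); note the $\varepsilon$-shell stopping rule, the source of the near-boundary bias that WoB avoids:

```python
import numpy as np

rng = np.random.default_rng(0)

def g(p):                            # Dirichlet data on the unit circle;
    return p[0] ** 2 - p[1] ** 2     # harmonic, so u(x, y) = x^2 - y^2 exactly

def wos_estimate(x0, eps=1e-4, n_walks=10000):
    total = 0.0
    for _ in range(n_walks):
        x = np.array(x0, dtype=float)
        while True:
            r = 1.0 - np.linalg.norm(x)      # distance to the boundary circle
            if r < eps:                      # stop inside the eps-shell
                break
            theta = rng.uniform(0.0, 2.0 * np.pi)
            x = x + r * np.array([np.cos(theta), np.sin(theta)])
        total += g(x / np.linalg.norm(x))    # project to the nearest boundary point
    return total / n_walks

print(wos_estimate([0.3, 0.2]))   # exact value: 0.3^2 - 0.2^2 = 0.05
```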
For terminal value problems of fractional differential equations of order $\alpha \in (0,1)$ that use Caputo derivatives, shooting methods are a well-developed and thoroughly investigated approach. Based on recently established analytic properties of such problems, we develop a new technique for selecting the required initial values that solves such shooting problems quickly and accurately. Numerical experiments indicate that this new proportional secting technique converges very rapidly to the solution, and run time measurements indicate a speedup factor of between 4 and 10 compared to the standard bisection method.
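A sketch of the bisection baseline that proportional secting is measured against (the explicit product-rectangle scheme for the Caputo initial value problem is standard; the right-hand side, terminal value, and step counts are illustrative assumptions):

```python
import numpy as np
from math import gamma

alpha, T, N = 0.6, 1.0, 400
h = T / N
f = lambda t, y: -y                 # toy right-hand side: D^alpha y = -y

def y_at_T(y0):
    """Solve the Caputo IVP with an explicit product-rectangle scheme."""
    y = np.empty(N + 1)
    y[0] = y0
    c = h**alpha / gamma(alpha + 1.0)
    for n in range(1, N + 1):
        j = np.arange(n)
        w = (n - j) ** alpha - (n - j - 1) ** alpha   # rectangle weights
        y[n] = y0 + c * np.sum(w * f(j * h, y[:n]))
    return y[-1]

# Terminal condition y(T) = 0.5: bisect on the unknown initial value y(0),
# using that y(T) increases with y(0) for this linear problem.
target, lo, hi = 0.5, 0.0, 5.0
while hi - lo > 1e-10:
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if y_at_T(mid) < target else (lo, mid)
print("y(0) ~", 0.5 * (lo + hi))
```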
Graph Neural Networks (GNNs) have received considerable attention for learning on graph-structured data across a wide variety of tasks. The propagation mechanism, whose effectiveness has been widely demonstrated, is the most fundamental component of GNNs. Although most GNNs follow a message-passing scheme, little effort has been made to discover and analyze their essential relations. In this paper, we establish a surprising connection between different propagation mechanisms and a unified optimization problem, showing that despite the proliferation of GNN variants, their propagation mechanisms are in fact optimal solutions of a feature-fitting function, over a wide class of graph kernels, with a graph regularization term. Our unified optimization framework, which summarizes the commonalities between several of the most representative GNNs, not only provides a macroscopic view of the relations between different GNNs but also opens up new opportunities for flexibly designing new ones. Within this framework, we observe that existing works usually utilize naive graph convolutional kernels for the feature-fitting function, and we further develop two novel objective functions with adjustable graph kernels exhibiting low-pass or high-pass filtering capabilities, respectively. Moreover, we provide convergence proofs and expressive-power comparisons for the proposed models. Extensive experiments on benchmark datasets show that the proposed GNNs not only outperform state-of-the-art methods but also alleviate over-smoothing, further verifying the feasibility of designing GNNs within our unified optimization framework.
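A toy instance of the unified view (sizes, the normalized-adjacency kernel, and the penalty weight are illustrative assumptions): the propagation output is the closed-form minimizer of a feature-fitting term plus graph regularization, $\min_Z \|Z-X\|_F^2 + \lambda\,\mathrm{tr}(Z^\top(I-\hat{A})Z)$, and an APPNP-style iteration converges to the same point:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 6, 3, 4.0
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.triu(A, 1); A = A + A.T                     # random undirected graph
deg = np.maximum(A.sum(1), 1e-12)
A_hat = A / np.sqrt(deg[:, None] * deg[None, :])   # D^{-1/2} A D^{-1/2}
X = rng.standard_normal((n, d))                    # input node features

# Closed-form minimizer: Z* = ((1 + lam) I - lam A_hat)^{-1} X.
Z_star = np.linalg.solve((1 + lam) * np.eye(n) - lam * A_hat, X)

# Same point via propagation: Z <- (lam/(1+lam)) A_hat Z + (1/(1+lam)) X,
# a personalized-PageRank/APPNP-style scheme with teleport 1/(1+lam).
Z = X.copy()
for _ in range(200):
    Z = (lam / (1 + lam)) * A_hat @ Z + (1 / (1 + lam)) * X
print(np.allclose(Z, Z_star, atol=1e-8))           # True
```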