This work is devoted to finding numerical solutions of several one-dimensional second-order ordinary differential equations. In a heuristic way, the quadratic logistic map, regarded as a local function, is inserted into the nonlinear coefficient of the equation as well as into the independent term. We apply the Numerov algorithm to solve these equations and discuss the role of the initial conditions of the logistic maps in the solutions.
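For reference, the Numerov recurrence advances y'' = f(x) y on a uniform grid with fourth-order accuracy. A minimal NumPy sketch follows; in the setting above, f would additionally be modulated by logistic-map iterates, which is omitted here, and the harmonic-oscillator check (f = -1) is purely illustrative:

```python
import numpy as np

def numerov(f, y0, y1, x):
    """Numerov integration of y'' = f(x) * y on the uniform grid x,
    given the first two values y0 = y(x[0]) and y1 = y(x[1])."""
    h = x[1] - x[0]
    y = np.empty_like(x)
    y[0], y[1] = y0, y1
    w = 1.0 - (h**2 / 12.0) * f(x)          # Numerov weights at every node
    for n in range(1, len(x) - 1):
        y[n + 1] = ((12.0 - 10.0 * w[n]) * y[n] - w[n - 1] * y[n - 1]) / w[n + 1]
    return y

# Check: y'' = -y with y(0) = 0, y'(0) = 1 should reproduce sin(x).
x = np.linspace(0.0, np.pi, 2001)
y = numerov(lambda x: -np.ones_like(x), 0.0, np.sin(x[1]), x)
err = np.max(np.abs(y - np.sin(x)))
```

The global error of the scheme is O(h^4), so on this grid the computed solution matches sin(x) to roughly ten significant digits.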
We present substantially generalized and improved quantum algorithms over prior work for inhomogeneous linear and nonlinear ordinary differential equations (ODEs). In Berry et al. (2017), a quantum algorithm for a certain class of linear ODEs is given, where the matrix involved needs to be diagonalizable. The quantum algorithm for linear ODEs presented here extends to many classes of non-diagonalizable matrices. The algorithm here can also be exponentially faster for certain classes of diagonalizable matrices. Our linear ODE algorithm is then applied to nonlinear differential equations using Carleman linearization (an approach taken recently by us in Liu et al. (2021)). The improvement over that result is two-fold. First, we obtain an exponentially better dependence on error. This kind of logarithmic dependence on error has also been achieved by Xue et al. (2021), but only for homogeneous nonlinear equations. Second, the present algorithm can handle any sparse, invertible matrix (that models dissipation) if it has a negative log-norm (including non-diagonalizable matrices), whereas Liu et al. (2021) and Xue et al. (2021) additionally require normality.
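The Carleman linearization step can be illustrated classically: the scalar ODE u' = au + bu^2 becomes a linear system in the monomials y_k = u^k, truncated at order N. The sketch below is a classical simulation of that embedding only (not the quantum algorithm); the dissipative parameter choices are illustrative:

```python
import numpy as np

def carleman_matrix(a, b, N):
    """Carleman matrix of u' = a*u + b*u**2 in the monomials y_k = u**k,
    truncated at order N: y_k' = k*a*y_k + k*b*y_{k+1}."""
    A = np.zeros((N, N))
    for k in range(1, N + 1):
        A[k - 1, k - 1] = k * a
        if k < N:
            A[k - 1, k] = k * b
    return A

def rk4(rhs, y0, t_end, steps):
    """Classical fixed-step RK4 integrator."""
    y, h = np.array(y0, dtype=float), t_end / steps
    for _ in range(steps):
        k1 = rhs(y); k2 = rhs(y + 0.5*h*k1)
        k3 = rhs(y + 0.5*h*k2); k4 = rhs(y + h*k3)
        y = y + (h / 6.0) * (k1 + 2*k2 + 2*k3 + k4)
    return y

a, b, u0, T, N = -1.0, 0.2, 0.5, 1.0, 8      # dissipative linear part, weak nonlinearity
A = carleman_matrix(a, b, N)
y_lin = rk4(lambda y: A @ y, [u0**k for k in range(1, N + 1)], T, 200)
u_ref = rk4(lambda u: a*u + b*u**2, [u0], T, 200)[0]
gap = abs(y_lin[0] - u_ref)                  # truncation error of the linearization
```

With b*u0/|a| = 0.1, the truncation error of the order-8 embedding is far below the integration error, so the first Carleman variable tracks the nonlinear solution closely.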
In deep learning with differential privacy (DP), a neural network usually achieves privacy at the cost of slower convergence (and thus lower performance) than its non-private counterpart. This work gives the first convergence analysis of DP deep learning through the lens of training dynamics and the neural tangent kernel (NTK). Our convergence theory successfully characterizes the effects of two key components of DP training: per-sample clipping and noise addition. Our analysis not only initiates a general principled framework for understanding DP deep learning with any network architecture and loss function, but also motivates a new clipping method -- global clipping -- that significantly improves convergence while preserving the same DP guarantee and computational efficiency as the existing method, which we term local clipping. Theoretically, we precisely characterize the effect of per-sample clipping on the NTK matrix and show that the noise level of DP optimizers does not affect convergence in the gradient flow regime. In particular, local clipping almost certainly breaks the positive semi-definiteness of the NTK, which is preserved by our global clipping. Consequently, DP gradient descent (GD) with global clipping converges monotonically to zero loss, a property often violated by existing DP-GD. Notably, our analysis framework easily extends to other optimizers, e.g., DP-Adam. We demonstrate through numerous experiments that DP optimizers equipped with global clipping perform strongly on classification and regression tasks. In addition, global clipping is surprisingly effective at learning calibrated classifiers, in contrast to existing DP classifiers, which are oftentimes over-confident and unreliable. Implementation-wise, the new clipping can be realized by inserting one line of code into the PyTorch Opacus library.
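For context, the standard per-sample ("local") clipping step that DP optimizers perform can be sketched as follows. The helper name and constants are illustrative, not from the paper, and the noise multiplier is set to zero in the check so the output is deterministic:

```python
import numpy as np

def dp_sgd_step(per_sample_grads, clip_norm, noise_mult, rng):
    """One private gradient aggregation: clip each per-sample gradient to norm
    <= clip_norm, sum, then add Gaussian noise scaled to the clipping threshold."""
    clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
               for g in per_sample_grads]
    total = np.sum(clipped, axis=0)
    total = total + rng.normal(0.0, noise_mult * clip_norm, size=total.shape)
    return total / len(per_sample_grads)

rng = np.random.default_rng(0)
grads = [np.array([3.0, 4.0]), np.array([0.0, 0.0])]
g_priv = dp_sgd_step(grads, clip_norm=1.0, noise_mult=0.0, rng=rng)  # noise off for the check
```

The first gradient (norm 5) is rescaled to norm 1, which is exactly the per-sample distortion of the gradient geometry that the paper's NTK analysis quantifies.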
In distributional reinforcement learning, not only the expected returns but the complete return distributions of a policy are taken into account. The return distribution for a fixed policy is given as the fixed point of an associated distributional Bellman operator. In this note we consider general distributional Bellman operators and study the existence and uniqueness of their fixed points as well as their tail properties. We give necessary and sufficient conditions for the existence and uniqueness of return distributions and identify cases of regular variation. We link distributional Bellman equations to multivariate distributional equations of the form $\textbf{X} =_d \textbf{A}\textbf{X} + \textbf{B}$, where $\textbf{X}$ and $\textbf{B}$ are $d$-dimensional random vectors, $\textbf{A}$ is a random $d\times d$ matrix, and $\textbf{X}$ and $(\textbf{A},\textbf{B})$ are independent. We show that any fixed point of a distributional Bellman operator can be obtained as the vector of marginal laws of a solution to such a multivariate distributional equation. This makes the general theory of such equations applicable to the distributional reinforcement learning setting.
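The fixed point of X =_d AX + B can be approximated by forward iteration with fresh independent pairs (A, B), shown here in the scalar case with illustrative "discount-like" and "reward-like" distributions:

```python
import numpy as np

def sample_fixed_point(draw_AB, n_paths=100000, n_iter=100, seed=0):
    """Monte Carlo sampling of the law solving X =_d A*X + B (scalar case):
    iterate X <- A*X + B with fresh independent (A, B); the iteration
    converges in distribution when E[log|A|] < 0."""
    rng = np.random.default_rng(seed)
    X = np.zeros(n_paths)
    for _ in range(n_iter):
        A, B = draw_AB(rng, n_paths)
        X = A * X + B
    return X

# Illustrative choice: discount-like A ~ U(0, 0.9), reward-like B ~ N(1, 1);
# independence of (A, B) from X gives E[X] = E[B] / (1 - E[A]) = 1 / 0.55.
X = sample_fixed_point(lambda rng, n: (rng.uniform(0.0, 0.9, n), rng.normal(1.0, 1.0, n)))
```

The empirical mean of the sampled fixed point matches the closed-form expectation, while the full sample also exposes tail behavior that the expected return alone would hide.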
Partial differential equations (PDEs) with spatially varying coefficients arise throughout science and engineering, modeling rich heterogeneous material behavior. Yet conventional PDE solvers struggle with the immense complexity found in nature, since they must first discretize the problem -- leading to spatial aliasing, and global meshing/sampling that is costly and error-prone. We describe a method that approximates neither the domain geometry, the problem data, nor the solution space, providing the exact solution (in expectation) even for problems with extremely detailed geometry and intricate coefficients. Our main contribution is to extend the walk on spheres (WoS) algorithm from constant- to variable-coefficient problems, by drawing on techniques from volumetric rendering. In particular, an approach inspired by null-scattering yields unbiased Monte Carlo estimators for a large class of second-order elliptic PDEs, which share many attractive features with Monte Carlo rendering: no meshing, trivial parallelism, and the ability to evaluate the solution at any point without solving a global system of equations.
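The constant-coefficient base case that this work generalizes -- walk on spheres for the Laplace equation -- can be sketched as a minimal 2D estimator; the null-scattering extension for variable coefficients is not shown, and the disk domain and boundary data are illustrative:

```python
import numpy as np

def walk_on_spheres(x0, dist, g, eps=1e-3, n_walks=10000, seed=0):
    """Estimate u(x0) for Laplace's equation with Dirichlet data g via walk on
    spheres: repeatedly jump to a uniform point on the largest circle inside
    the domain until within eps of the boundary, then record the boundary value."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_walks):
        x = np.array(x0, dtype=float)
        r = dist(x)                       # distance to the domain boundary
        while r >= eps:
            theta = rng.uniform(0.0, 2.0 * np.pi)
            x = x + r * np.array([np.cos(theta), np.sin(theta)])
            r = dist(x)
        total += g(x)
    return total / n_walks

# Unit disk with harmonic boundary data g(x, y) = x, so u(x, y) = x everywhere.
u = walk_on_spheres([0.3, 0.2], lambda x: 1.0 - np.hypot(x[0], x[1]), lambda x: x[0])
```

Note the features the abstract highlights: no mesh is built, each walk is independent (trivially parallel), and the solution is estimated at a single point without a global solve.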
Deep neural networks are powerful tools for approximating functions, and they have been successfully applied to solve various problems in many fields. In this paper, we propose a neural network-based numerical method to solve partial differential equations. In this new framework, the method is designed on weak formulations: the unknown functions are approximated by deep neural networks, while the test functions can be chosen in different ways, for instance, as basis functions of finite element methods, as neural networks, and so on. Because the trial and test function spaces differ, we name this new approach the Deep Petrov-Galerkin Method (DPGM). The resulting linear system is not necessarily symmetric or square, so the discretized problem is solved by a least-squares method. Taking the Poisson problem as an example, mixed DPGMs based on several mixed formulations are proposed and studied as well. In addition, we apply the DPGM to solve two classical time-dependent problems using a space-time approach, that is, the unknown function is approximated by a neural network in which the temporal and spatial variables are treated equally, and the initial conditions are regarded as boundary conditions for the space-time domain. Finally, several numerical examples are presented to show the performance of the DPGMs, and we observe that this new method outperforms traditional numerical methods in several aspects.
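The Petrov-Galerkin least-squares idea can be illustrated with a classical stand-in: for -u'' = f on (0, 1) with homogeneous Dirichlet conditions, take a sine trial basis (playing the role of the neural network) against more hat-function test functions than trial functions, and solve the resulting rectangular system in the least-squares sense. All choices below are illustrative, not the paper's:

```python
import numpy as np

# Trial space: sin(k*pi*x), k = 1..K; test space: M > K hat functions.
K, M = 6, 20
h = 1.0 / (M + 1)
xm = np.linspace(h, 1.0 - h, M)              # interior nodes carrying the hats
k = np.arange(1, K + 1)

# Weak form a(phi_k, hat_m) = integral of phi_k' * hat_m', in closed form.
A = (2.0 * np.sin(np.pi * np.outer(xm, k))
     - np.sin(np.pi * np.outer(xm - h, k))
     - np.sin(np.pi * np.outer(xm + h, k))) / h

# Load vector (f, hat_m) for f = pi^2 * sin(pi*x), by midpoint quadrature.
nq = 4000
xq = (np.arange(nq) + 0.5) / nq
hats = np.clip(1.0 - np.abs(xq[:, None] - xm[None, :]) / h, 0.0, None)
b = (np.pi**2 * np.sin(np.pi * xq)[:, None] * hats).sum(axis=0) / nq

# Rectangular (20 x 6) system -> least squares; exact solution sin(pi*x)
# means the coefficient vector should be close to the first unit vector.
c, *_ = np.linalg.lstsq(A, b, rcond=None)
```

The non-square system is exactly the situation the abstract describes: trial and test spaces differ, so a least-squares solve replaces the usual square Galerkin system.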
Overdetermined systems of integral equations of the first kind appear in many applications. When the right-hand side is discretized, the resulting finite-data problem is ill-posed and admits infinitely many solutions. We propose a numerical method to compute the minimal-norm solution in the presence of boundary constraints. The algorithm stems from the Riesz representation theorem and operates in a reproducing kernel Hilbert space. Since the resulting linear system is strongly ill-conditioned, we construct a regularization method depending on a discrete parameter. It is based on the expansion of the minimal-norm solution in terms of the singular functions of the integral operator defining the problem. Two estimation techniques are tested for the automatic determination of the regularization parameter, namely, the discrepancy principle and the L-curve method. Numerical results concerning two artificial test problems demonstrate the excellent performance of the proposed method. Finally, a particular model typical of geophysical applications, which reproduces the readings of a frequency-domain electromagnetic induction device, is investigated. The results show that the new method is extremely effective when the sought solution is smooth, but it also gives significant information on the solution even when the solution is non-smooth.
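A truncated singular-value expansion combined with the discrepancy principle, in the spirit of the regularization described above, can be sketched generically. The test problem below is an illustrative Gaussian first-kind kernel, not the paper's geophysical model, and the noise level is assumed known:

```python
import numpy as np

def tsvd_solve(A, b, k):
    """Regularized solution keeping only the k largest singular triplets."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt[:k].T @ ((U[:, :k].T @ b) / s[:k])

def discrepancy_tsvd(A, b, delta):
    """Discrepancy principle: smallest truncation level whose residual is <= delta."""
    for k in range(1, min(A.shape) + 1):
        x = tsvd_solve(A, b, k)
        if np.linalg.norm(A @ x - b) <= delta:
            return k, x
    return min(A.shape), x

# Illustrative first-kind problem: a severely ill-conditioned Gaussian kernel.
n = 60
t = np.linspace(0.0, 1.0, n)
A = np.exp(-(t[:, None] - t[None, :])**2 / (2 * 0.1**2)) / n
x_true = np.sin(np.pi * t)
rng = np.random.default_rng(0)
noise = rng.normal(0.0, 1e-4, n)
b = A @ x_true + noise
delta = 1.1 * np.linalg.norm(noise)
k_opt, x_reg = discrepancy_tsvd(A, b, delta)
x_naive = np.linalg.lstsq(A, b, rcond=None)[0]   # unregularized: noise blows up
```

Cutting the expansion at the discrepancy level filters the noise-amplifying small singular values, which is why the truncated solution is vastly more accurate than the naive solve.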
Learning mappings between two function spaces has attracted considerable research attention. However, learning the solution operator of partial differential equations (PDEs) remains a challenge in scientific computing. Therefore, in this study, we propose a novel pseudo-differential integral operator (PDIO) inspired by pseudo-differential operators, which generalize differential operators and are characterized by a certain symbol. We parameterize the symbol using a neural network and show that the neural-network-based symbol is contained in a smooth symbol class. Subsequently, we prove that the PDIO is a bounded linear operator, and thus is continuous on Sobolev spaces. We combine the PDIO with a neural operator to develop a pseudo-differential neural operator (PDNO) that learns the nonlinear solution operator of PDEs. We experimentally validate the effectiveness of the proposed model using Burgers' equation, Darcy flow, and the Navier-Stokes equation. The results reveal that the proposed PDNO outperforms the existing neural operator approaches in most experiments.
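In the x-independent special case, a pseudo-differential operator reduces to a Fourier multiplier, Op(a)u = F^{-1}[a(xi) F[u]]. A minimal sketch in which the symbol is a fixed closed form standing in for the learned network:

```python
import numpy as np

def apply_symbol(u, symbol, L=2.0 * np.pi):
    """Apply the Fourier-multiplier operator Op(a): u -> F^{-1}[ a(xi) F[u] ],
    the x-independent special case of a pseudo-differential operator.
    In a learned setting, `symbol` would be parameterized by a neural network."""
    n = len(u)
    xi = 2.0 * np.pi * np.fft.fftfreq(n, d=L / n)   # integer frequencies for period L
    return np.real(np.fft.ifft(symbol(xi) * np.fft.fft(u)))

# The symbol a(xi) = i*xi realizes d/dx, so applying it to sin gives cos.
x = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
du = apply_symbol(np.sin(x), lambda xi: 1j * xi)
```

Because sin is band-limited, the spectral differentiation is exact to machine precision; a general pseudo-differential operator additionally lets the symbol depend on x.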
The Cauchy problem for a second-order nonlinear evolution equation is considered. This equation represents an abstract generalization of the Ball integro-differential equation. The general nonlinear case is considered, in which the terms of the equation include the square of the norm of a gradient. A three-layer semi-discrete scheme is proposed for numerical computations. In this scheme, the nonlinear terms depending on the gradient are approximated using integral averaging. It is proved that the solution of the nonlinear discrete problem and its corresponding first-order difference are uniformly bounded. For the solution of the corresponding linearized discrete problem, a high-order a priori estimate is obtained using two-variable Chebyshev polynomials. Based on this estimate, the stability of the nonlinear discrete problem is shown. For smooth solutions, an error estimate for the approximate solution is obtained. To find an approximate solution at each time step, we apply an iteration method, whose convergence is proved.
We describe the new field of mathematical analysis of deep learning. This field emerged around a list of research questions that were not answered within the classical framework of learning theory. These questions concern: the outstanding generalization power of overparametrized neural networks, the role of depth in deep architectures, the apparent absence of the curse of dimensionality, the surprisingly successful optimization performance despite the non-convexity of the problem, understanding what features are learned, why deep architectures perform exceptionally well in physical problems, and which fine aspects of an architecture affect the behavior of a learning task in which way. We present an overview of modern approaches that yield partial answers to these questions. For selected approaches, we describe the main ideas in more detail.
We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a black-box differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can be trained by maximum likelihood without partitioning or ordering the data dimensions. For training, we show how to scalably backpropagate through any ODE solver, without access to its internal operations. This allows end-to-end training of ODEs within larger models.
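The forward pass of such a continuous-depth model can be sketched with a toy dynamics network and a fixed-step solver standing in for the black-box ODE solver; the weights, sizes, and integration settings are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(0.0, 0.5, (16, 2)), np.zeros(16)   # illustrative toy weights
W2, b2 = rng.normal(0.0, 0.5, (2, 16)), np.zeros(2)

def f(h, t):
    """The parameterized derivative of the hidden state: dh/dt = f(h, t; theta)."""
    return W2 @ np.tanh(W1 @ h + b1) + b2

def odeint_rk4(f, h0, t0, t1, steps=100):
    """Fixed-step RK4 integrator, standing in for the black-box ODE solver."""
    h, t = np.array(h0, dtype=float), t0
    dt = (t1 - t0) / steps
    for _ in range(steps):
        k1 = f(h, t); k2 = f(h + 0.5*dt*k1, t + 0.5*dt)
        k3 = f(h + 0.5*dt*k2, t + 0.5*dt); k4 = f(h + dt*k3, t + dt)
        h = h + (dt / 6.0) * (k1 + 2*k2 + 2*k3 + k4)
        t += dt
    return h

h1 = odeint_rk4(f, [1.0, 0.0], 0.0, 1.0)   # the continuous-depth "forward pass"
```

Refining `steps` trades compute for precision, the property the abstract highlights; in practice an adaptive solver makes that trade-off automatically per input.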