We propose an accurate data-driven numerical scheme to solve Stochastic Differential Equations (SDEs) by taking large time steps. The SDE discretization is built up by means of a polynomial chaos expansion method, on the basis of accurately determined stochastic collocation (SC) points. By employing an artificial neural network to learn these SC points, we can perform Monte Carlo simulations with large time steps. Error analysis confirms that this data-driven scheme yields accurate SDE solutions in the sense of strong convergence, provided the learning methodology is robust and accurate. With a method variant called the compression-decompression collocation and interpolation technique, we can drastically reduce the number of neural network functions that have to be learned, thereby enhancing computational speed. Numerical experiments confirm a high-quality strong convergence error when using large time steps, and the novel scheme outperforms some classical numerical SDE discretizations. Applications to financial option valuation are also presented.
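To make the sampling idea concrete, here is a minimal sketch of one large-time-step Monte Carlo update under stated assumptions: `sc_net`, the collocation abscissae, and the geometric-Brownian-motion stand-in are illustrative placeholders for the trained network and the paper's actual interpolation, not the authors' implementation.

```python
import numpy as np

# Minimal sketch of large-time-step Monte Carlo sampling with learned
# stochastic collocation (SC) points. `sc_net` is a hypothetical stand-in for
# the trained network: given the current state x and step size dt, it returns
# the values of X_{t+dt} at M fixed collocation abscissae. Here it returns
# the exact collocation values for geometric Brownian motion, which a trained
# network would approximate for a general SDE.

M = 5
z_colloc = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])    # fixed SC abscissae

def sc_net(x, dt, mu=0.05, sigma=0.2):
    # exact GBM collocation values; the learned network replaces this
    return x * np.exp((mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z_colloc)

def sample_next(x, dt, rng):
    y = sc_net(x, dt)                                # SC values of X_{t+dt}
    z = rng.standard_normal()                        # one normal draw per big step
    coeffs = np.polyfit(z_colloc, y, M - 1)          # interpolating polynomial
    return np.polyval(coeffs, z)                     # evaluate at the sample

rng = np.random.default_rng(0)
x = 1.0
for _ in range(4):                                   # four large steps, dt = 0.25
    x = sample_next(x, 0.25, rng)
print(x)
```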
Optimization of parameterized quantum circuits is indispensable for applying near-term quantum devices to computational tasks with variational quantum algorithms (VQAs). However, existing optimization algorithms for VQAs require an excessive number of quantum-measurement shots to estimate expectation values of observables or to iterate updates of circuit parameters, and this cost has been a crucial obstacle to practical use. To address this problem, we develop an efficient framework, \textit{stochastic gradient line Bayesian optimization} (SGLBO), for circuit optimization with fewer measurement shots. The SGLBO reduces the cost of measurement shots by estimating an appropriate direction for updating the parameters based on stochastic gradient descent (SGD) and, further, by utilizing Bayesian optimization (BO) to estimate the optimal step size in each iteration of the SGD. We formulate an adaptive measurement-shot strategy that makes the optimization feasible without relying on precise expectation-value estimation and many iterations; moreover, we show that a technique of suffix averaging can significantly reduce the effect of statistical and hardware noise in the optimization for VQAs. Our numerical simulations demonstrate that the SGLBO augmented with these techniques can drastically reduce the required number of measurement shots, improve the accuracy of the optimization, and enhance the robustness against noise compared with other state-of-the-art optimizers in representative tasks for VQAs. These results establish a framework of quantum-circuit optimizers that integrates two different optimization approaches, SGD and BO, to significantly reduce the cost of measurement shots.
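The core loop is easy to illustrate on a classical toy objective: a stochastic gradient fixes the search direction, and a one-dimensional Gaussian-process surrogate chooses how far to move along it. In the sketch below, `noisy_f` stands in for a shot-based expectation-value estimate (on hardware the gradient would come from, e.g., parameter-shift rules), and the GP line search uses only the posterior mean rather than a full acquisition function; all names and hyperparameters are illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)

def noisy_f(x, shots=100):
    # Stand-in for an expectation value estimated from finitely many shots.
    return np.sum(np.sin(x) ** 2) + rng.normal(0, 1 / np.sqrt(shots))

def stochastic_grad(x, eps=1e-2):
    # Central-difference gradient from noisy evaluations (illustrative only).
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x); e[i] = eps
        g[i] = (noisy_f(x + e) - noisy_f(x - e)) / (2 * eps)
    return g

def gp_posterior_mean(s_train, y_train, s_test, ell=0.3, noise=0.05):
    # 1-D GP regression with an RBF kernel; returns the posterior mean.
    k = lambda a, b: np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)
    K = k(s_train, s_train) + noise * np.eye(len(s_train))
    return k(s_test, s_train) @ np.linalg.solve(K, y_train)

x = rng.uniform(-1, 1, size=4)
for it in range(20):
    d = -stochastic_grad(x)                      # SGD direction
    s_tr = np.linspace(0.0, 0.5, 6)              # step sizes probed on the line
    y_tr = np.array([noisy_f(x + s * d) for s in s_tr])
    s_te = np.linspace(0.0, 0.5, 101)
    best = s_te[np.argmin(gp_posterior_mean(s_tr, y_tr, s_te))]
    x = x + best * d                             # move by the GP-suggested step
print(noisy_f(x, shots=10**6))                   # should be near the minimum, 0
```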
A second-order accurate, linear numerical method is analyzed for the Landau-Lifshitz equation with large damping parameters. This equation describes the dynamics of magnetization, with a non-convex constraint that the magnetization has unit length. The numerical method is based on the second-order backward differentiation formula in time, combined with an implicit treatment of the linear diffusion term and explicit extrapolation of the nonlinear terms. Afterward, a projection step is applied to normalize the numerical solution pointwise. This scheme has proved highly advantageous in practical computations for the physical model with large damping parameters: only a linear system with constant coefficients (independent of both time and the updated magnetization) needs to be solved at each time step, which greatly improves numerical efficiency. However, a theoretical analysis of this linear scheme has not been available. In this paper, we provide a rigorous error estimate for the numerical scheme, in the discrete $\ell^{\infty}(0,T; \ell^2) \cap \ell^2(0,T; H_h^1)$ norm, under suitable regularity assumptions and a reasonable ratio between the time step size and the spatial mesh size. In particular, the projection operation is nonlinear, and a stability estimate for the projection step turns out to be highly challenging. Such a stability estimate is derived in detail; it plays an essential role in the convergence analysis for the numerical scheme, provided the damping parameter is greater than 3.
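The structure of the scheme is simple to sketch. Below is a schematic of the time-stepping skeleton on a 1-D periodic grid: BDF2 in time, implicit constant-coefficient diffusion, explicit extrapolation of the nonlinear terms, then pointwise projection back to unit length. The Landau-Lifshitz right-hand side is abbreviated to a precession term and all coefficients are illustrative; the paper's scheme and analysis concern the full model.

```python
import numpy as np

N, dt, kappa = 64, 1e-4, 1.0                     # grid size, time step, damping
h = 1.0 / N
x = np.arange(N) * h

def lap(m):                                      # periodic discrete Laplacian
    return (np.roll(m, -1, axis=0) - 2 * m + np.roll(m, 1, axis=0)) / h**2

def explicit_terms(m):                           # abbreviated nonlinear part
    return -np.cross(m, lap(m))

# constant-coefficient matrix (3/2) I - dt * kappa * Lap_h, assembled once:
# it is independent of time and of the updated magnetization.
L = -2 * np.eye(N) + np.eye(N, k=1) + np.eye(N, k=-1)
L[0, -1] = L[-1, 0] = 1.0
A = 1.5 * np.eye(N) - dt * kappa * L / h**2

theta = 2 * np.pi * x                            # smooth unit-length initial data
m_prev = m_curr = np.stack([np.cos(theta), np.sin(theta), np.zeros(N)], axis=1)

for _ in range(10):
    m_hat = 2 * m_curr - m_prev                  # explicit extrapolation
    rhs = 2 * m_curr - 0.5 * m_prev + dt * explicit_terms(m_hat)
    m_star = np.linalg.solve(A, rhs)             # same matrix every time step
    m_prev, m_curr = m_curr, m_star / np.linalg.norm(m_star, axis=1, keepdims=True)

print(np.linalg.norm(m_curr, axis=1)[:3])        # exactly 1 after projection
```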
We introduce a new numerical method, based on Bernoulli polynomials, for solving multiterm variable-order fractional differential equations. The variable-order fractional derivative is considered in the Caputo sense, while the Riemann--Liouville integral operator is used to approximate the unknown function and its variable-order derivatives. An operational matrix of variable-order fractional integration is introduced for the Bernoulli functions. Assuming that the solution of the problem is sufficiently smooth, we approximate a given order of its derivative using Bernoulli polynomials and then use the operational matrix to obtain approximations of the unknown function and its derivatives. Using these approximations and a set of collocation points, the problem is reduced to a system of nonlinear algebraic equations. An error estimate is given for the approximate solution obtained by the proposed method. Finally, five illustrative examples demonstrate the applicability and high accuracy of the proposed technique; we compare our results with those obtained by existing methods in the literature, making the novelty of the work clear. The numerical results show that the new method is efficient, giving high-accuracy approximate solutions even with a small number of basis functions and even when the solution of the problem is not infinitely differentiable, providing better results with fewer basis functions than state-of-the-art methods.
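To illustrate the basic collocation mechanics, here is a minimal sketch of Bernoulli-polynomial collocation on a toy integer-order problem, $y' = -y$ on $[0,1]$ with $y(0)=1$. The paper's method replaces the plain differentiation step below with an operational matrix of variable-order Riemann--Liouville integration; that machinery is omitted here, and the number of basis functions is arbitrary.

```python
import numpy as np
from scipy.special import bernoulli, comb

M = 8                                            # number of basis functions
b = bernoulli(M)                                 # Bernoulli numbers B_0..B_M

def B(n, x):                                     # Bernoulli polynomial B_n(x)
    return sum(comb(n, k) * b[n - k] * x**k for k in range(n + 1))

def dB(n, x):                                    # B_n'(x) = n * B_{n-1}(x)
    return n * B(n - 1, x) if n > 0 else np.zeros_like(x)

# Collocate the residual y' + y = 0 at interior points, plus the initial
# condition, giving an M x M algebraic system for the coefficients.
xc = (np.arange(M - 1) + 0.5) / (M - 1)
rows = [[dB(n, xi) + B(n, xi) for n in range(M)] for xi in xc]
rows.append([B(n, np.array(0.0)) for n in range(M)])   # y(0) = 1
rhs = np.zeros(M); rhs[-1] = 1.0
c = np.linalg.solve(np.array(rows, float), rhs)

xs = np.linspace(0, 1, 5)
approx = sum(c[n] * B(n, xs) for n in range(M))
print(np.max(np.abs(approx - np.exp(-xs))))      # small residual expected
```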
Kinetic theory provides a good basis for developing numerical methods for multiscale gas flows covering a wide range of flow regimes. A particular challenge for kinetic schemes is whether they can capture the correct hydrodynamic behaviors of the system in the continuum regime (i.e., as the Knudsen number $\epsilon\ll 1$) without enforcing kinetic-scale resolution. At the current stage, the main approach to analyzing this property is the asymptotic preserving (AP) concept, which aims to show whether the kinetic scheme reduces to a solver for the hydrodynamic equations as $\epsilon \to 0$. However, the AP framework cannot distinguish the detailed asymptotic properties of kinetic schemes when $\epsilon$ is small but finite. In order to distinguish different characteristics of kinetic schemes, in this paper we introduce the concept of unified preserving (UP), which assesses the asymptotic orders (in terms of $\epsilon$) of a kinetic scheme by employing the modified equation approach and Chapman-Enskog analysis. It is shown that the UP properties of a kinetic scheme generally depend on the spatial/temporal accuracy and, closely, on the interconnections among the three scales (kinetic, numerical, and hydrodynamic). Specifically, the numerical resolution and the specific discretization determine the flow behaviors of the scheme in different regimes, especially in the near-continuum limit. As two examples, the UP analysis is applied to the discrete unified gas-kinetic scheme (DUGKS) and a second-order implicit-explicit Runge-Kutta (IMEX-RK) scheme to evaluate their asymptotic behaviors in the continuum limit.
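For reference, the Chapman-Enskog analysis invoked here expands the distribution function in powers of the Knudsen number,
\[
f = f^{(0)} + \epsilon f^{(1)} + \epsilon^{2} f^{(2)} + \cdots,
\]
where $f^{(0)}$ is the local equilibrium; truncating at zeroth order recovers the Euler equations and at first order the Navier-Stokes equations. Loosely speaking, a UP analysis asks at which order in $\epsilon$ the modified equation of the scheme, at a given numerical resolution, matches this hierarchy.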
Soft robots are made of compliant and deformable materials and can perform tasks that are challenging for conventional rigid robots. The inherent compliance of soft robots makes them more suitable and adaptable for interactions with humans and the environment. However, this advantage comes at a cost: their continuum nature makes it challenging to develop robust model-based control strategies. Specifically, an adaptive control approach addressing this challenge has not yet been applied to physical soft robotic arms. This work presents a reformulation of the dynamics of a soft continuum manipulator using the Euler-Lagrange method. The proposed model eliminates a simplifying assumption made in previous works and provides a more accurate description of the robot's inertia. Based on our model, we introduce a task-space adaptive control scheme. This controller is robust against model parameter uncertainties and unknown input disturbances. The controller is implemented on a physical soft continuum arm, and a series of experiments validates its effectiveness in task-space trajectory tracking under different payloads. The controller outperforms the state-of-the-art method in terms of both accuracy and robustness. Moreover, the proposed model-based control design is flexible and can be generalized to any continuum robotic arm with an arbitrary number of continuum segments.
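Euler-Lagrange based adaptive controllers of this kind typically combine a regressor that is linear in the unknown parameters with a gradient update law (Slotine-Li style). The sketch below shows that structure on a 1-DoF toy plant with unknown inertia and damping; the gains, the plant, and the regressor are illustrative stand-ins, not the paper's task-space controller for a multi-segment continuum arm.

```python
import numpy as np

m_true, c_true = 2.0, 0.5            # plant parameters, unknown to the controller
theta_hat = np.array([1.0, 0.0])     # estimates of [m, c]
Gamma = np.diag([2.0, 2.0])          # adaptation gains
lam, kd, dt = 5.0, 10.0, 1e-3

q = qd = 0.0
for k in range(20000):
    t = k * dt
    qdes, qddes, qdddes = np.sin(t), np.cos(t), -np.sin(t)   # reference trajectory
    e, ed = q - qdes, qd - qddes
    s = ed + lam * e                                # sliding variable
    qdr, qddr = qddes - lam * e, qdddes - lam * ed  # reference velocity/acceleration
    Y = np.array([qddr, qdr])                       # regressor: m*qddr + c*qdr
    tau = Y @ theta_hat - kd * s                    # adaptive control law
    theta_hat = theta_hat - dt * (Gamma @ Y) * s    # parameter update law
    qdd = (tau - c_true * qd) / m_true              # simulate the true plant
    qd += dt * qdd; q += dt * qd

print(abs(q - np.sin(20000 * dt)), theta_hat)       # small tracking error
```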
We introduce a new regularization model for incompressible fluid flow, a regularization of the EMAC formulation of the Navier-Stokes equations (NSE) that we call EMAC-Reg. The EMAC (energy, momentum, and angular momentum conserving) formulation has proved useful because it conserves energy, momentum, and angular momentum even when the divergence constraint is only weakly enforced. However, it is still an NSE formulation and so cannot resolve higher-Reynolds-number flows without very fine meshes. By carefully introducing regularization into the EMAC formulation, we create a model more suitable for coarser-mesh computations that still conserves the same quantities as EMAC, i.e., energy, momentum, and angular momentum. We show that EMAC-Reg, when semi-discretized with a finite element spatial discretization, is well-posed and optimally accurate. Numerical results show that EMAC-Reg is a robust coarse-mesh model.
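For context, the EMAC formulation rewrites the convective term using the symmetric gradient $D(u) = (\nabla u + \nabla u^{T})/2$ and a modified pressure $P = p - \frac{1}{2}|u|^{2}$ (a sketch of the commonly used form, not necessarily the paper's exact notation):
\[
u_t + 2D(u)u + (\nabla \cdot u)\,u + \nabla P - \nu \Delta u = f, \qquad \nabla \cdot u = 0.
\]
Because the nonlinearity involves only $D(u)$ and $\nabla \cdot u$, the conservation of energy, momentum, and angular momentum survives weakly enforced incompressibility; EMAC-Reg introduces its regularization into this nonlinearity while preserving the same balances.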
Interpreting the training of Deep Neural Networks (DNNs) as an optimal control problem with nonlinear dynamical systems has received considerable attention recently, yet the algorithmic development remains relatively limited. In this work, we take a step in this direction by reformulating the training procedure from the trajectory-optimization perspective. We first show that most widely used algorithms for training DNNs can be linked to Differential Dynamic Programming (DDP), a celebrated second-order trajectory optimization algorithm rooted in Approximate Dynamic Programming. In this vein, we propose a new variant of DDP that accepts batch optimization for training feedforward networks, while integrating naturally with recent progress in curvature approximation. The resulting algorithm features layer-wise feedback policies which improve the convergence rate and reduce sensitivity to hyper-parameters compared with existing methods. We show that the algorithm is competitive against state-of-the-art first- and second-order methods. Our work opens up new avenues for principled algorithmic design built upon optimal control theory.
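The distinguishing ingredient is the layer-wise feedback policy $\delta u = k + K\,\delta x$: unlike an open-loop gradient step, each layer's update reacts to the deviation of its actual input from the nominal one. The sketch below is a minimal scalar iLQR/DDP loop in the "layers as time steps" view (state = activation, control = layer parameter, dynamics $x_{l+1} = \tanh(x_l + u_l)$); everything is illustrative of the structure, not the paper's batch algorithm or curvature approximations.

```python
import numpy as np

L, target, r, iters = 5, 0.8, 1e-3, 50
u = np.zeros(L)                                  # "parameters", one per layer

def rollout(x0, u, k=None, K=None, x_nom=None):
    xs, un = [x0], u.copy()
    for l in range(L):
        if k is not None:                        # closed-loop (feedback) rollout
            un[l] = u[l] + k[l] + K[l] * (xs[l] - x_nom[l])
        xs.append(np.tanh(xs[l] + un[l]))
    return np.array(xs), un

x0 = 0.0
for _ in range(iters):
    xs, _ = rollout(x0, u)                       # nominal trajectory
    Vx, Vxx = 2 * (xs[-1] - target), 2.0         # terminal value expansion
    k, K = np.zeros(L), np.zeros(L)
    for l in reversed(range(L)):                 # backward pass
        fx = fu = 1 - np.tanh(xs[l] + u[l])**2   # dynamics derivatives
        Qx, Qu = fx * Vx, 2 * r * u[l] + fu * Vx
        Qxx, Quu, Qux = fx * Vxx * fx, 2 * r + fu * Vxx * fu, fu * Vxx * fx
        k[l], K[l] = -Qu / Quu, -Qux / Quu       # feedback policy for layer l
        Vx, Vxx = Qx + K[l] * Quu * k[l] + K[l] * Qu + Qux * k[l], \
                  Qxx + K[l] * Quu * K[l] + 2 * K[l] * Qux
    xs, u = rollout(x0, u, k, K, xs)             # forward pass with feedback

print("final activation:", xs[-1], "target:", target)
```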
A core capability of intelligent systems is the ability to quickly learn new tasks by drawing on prior experience. Gradient-based (or optimization-based) meta-learning has recently emerged as an effective approach for few-shot learning. In this formulation, meta-parameters are learned in the outer loop, while task-specific models are learned in the inner loop using only a small amount of data from the current task. A key challenge in scaling these approaches is the need to differentiate through the inner-loop learning process, which can impose considerable computational and memory burdens. By drawing upon implicit differentiation, we develop the implicit MAML algorithm, which depends only on the solution to the inner-level optimization and not on the path taken by the inner-loop optimizer. This effectively decouples the meta-gradient computation from the choice of inner-loop optimizer. As a result, our approach is agnostic to the choice of inner-loop optimizer and can gracefully handle many gradient steps without vanishing gradients or memory constraints. Theoretically, we prove that implicit MAML can compute accurate meta-gradients with a memory footprint that is, up to small constant factors, no more than that required to compute a single inner-loop gradient, and with no overall increase in total computational cost. Experimentally, we show that these benefits of implicit MAML translate into empirical gains on few-shot image recognition benchmarks.
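The key identity behind implicit MAML comes from the implicit function theorem: with an inner problem $\phi^* = \arg\min_\phi L_{\mathrm{tr}}(\phi) + \frac{\lambda}{2}\|\phi - \theta\|^2$, the meta-gradient is $(I + \frac{1}{\lambda}\nabla^2 L_{\mathrm{tr}}(\phi^*))^{-1}\nabla L_{\mathrm{test}}(\phi^*)$, computable with only Hessian-vector products. The toy below verifies this on quadratic losses, where everything is exact; the conjugate-gradient solver and the closed-form inner solve are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, lam = 5, 2.0
A_tr, b_tr = np.diag(rng.uniform(1, 3, d)), rng.normal(size=d)  # L_tr = 1/2 p'Ap - b'p
A_te, b_te = np.diag(rng.uniform(1, 3, d)), rng.normal(size=d)

theta = rng.normal(size=d)
# inner solution in closed form here (a few gradient steps in practice)
phi = np.linalg.solve(A_tr + lam * np.eye(d), b_tr + lam * theta)

g_test = A_te @ phi - b_te                        # grad of L_test at phi*
hvp = lambda v: v + (A_tr @ v) / lam              # (I + H_tr / lam) v

def conjugate_gradient(hvp, g, iters=50, tol=1e-10):
    # Solve (I + H/lam) x = g using only Hessian-vector products.
    x, r = np.zeros_like(g), g.copy()
    p, rs = r.copy(), r @ r
    for _ in range(iters):
        Ap = hvp(p)
        alpha = rs / (p @ Ap)
        x += alpha * p; r -= alpha * Ap
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs) * p; rs = rs_new
    return x

meta_grad = conjugate_gradient(hvp, g_test)
exact = np.linalg.solve(np.eye(d) + A_tr / lam, g_test)   # dense check
print(np.max(np.abs(meta_grad - exact)))                  # ~1e-10
```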
We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a black-box differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can be trained by maximum likelihood without partitioning or ordering the data dimensions. For training, we show how to scalably backpropagate through any ODE solver, without access to its internal operations. This allows end-to-end training of ODEs within larger models.
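A minimal forward pass makes the idea concrete: the hidden state $h(t)$ follows $dh/dt = f(h, t; W)$ for a small network $f$, and the "output layer" is just $h(1)$. In the sketch below, a fixed-step RK4 integrator stands in for the black-box adaptive solver, the weights are random, and backpropagation through the solver (the adjoint method) is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4
W1, W2 = rng.normal(0, 0.5, (D, D + 1)), rng.normal(0, 0.5, (D, D))

def f(h, t):                                     # dynamics network f(h, t)
    z = np.concatenate([h, [t]])                 # simple time conditioning
    return W2 @ np.tanh(W1 @ z)

def odeint_rk4(f, h0, t0=0.0, t1=1.0, steps=20):
    # Fixed-step RK4 stand-in for an adaptive black-box ODE solver.
    h, t, dt = h0, t0, (t1 - t0) / steps
    for _ in range(steps):
        k1 = f(h, t)
        k2 = f(h + 0.5 * dt * k1, t + 0.5 * dt)
        k3 = f(h + 0.5 * dt * k2, t + 0.5 * dt)
        k4 = f(h + dt * k3, t + dt)
        h = h + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
    return h

h0 = rng.normal(size=D)                          # "input layer" state
print(odeint_rk4(f, h0))                         # "output layer" state h(1)
```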
Asynchronous momentum stochastic gradient descent (Async-MSGD) is one of the most popular algorithms in distributed machine learning. However, its convergence properties for complicated nonconvex problems are still largely unknown, owing to current technical limitations. Therefore, in this paper, we propose to analyze the algorithm through a simpler but nontrivial nonconvex problem - streaming PCA - which helps us to understand Async-MSGD better, even for more general problems. Specifically, we establish the asymptotic rate of convergence of Async-MSGD for streaming PCA by diffusion approximation. Our results indicate a fundamental tradeoff between asynchrony and momentum: to ensure convergence and acceleration through asynchrony, we have to reduce the momentum (compared with Sync-MSGD). To the best of our knowledge, this is the first theoretical attempt at understanding Async-MSGD for distributed nonconvex stochastic optimization. Numerical experiments on both streaming PCA and training deep neural networks are provided to support our findings for Async-MSGD.
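The asynchrony/momentum tradeoff can be probed with a small simulation: asynchrony is modeled by applying gradients computed at a stale iterate with delay tau, on an Oja-style streaming PCA update with heavy-ball momentum. All constants below are illustrative, and the delay model is a simplification of a real distributed system; varying `mu` and `tau` exposes the tradeoff described above.

```python
import numpy as np

rng = np.random.default_rng(0)
d, eta, mu, tau, T = 10, 0.01, 0.5, 8, 20000
Sigma = np.diag(np.linspace(1.0, 2.0, d))        # top eigenvector = e_d
top = np.eye(d)[-1]

w = rng.normal(size=d); w /= np.linalg.norm(w)
v = np.zeros(d)
history = [w.copy()] * (tau + 1)                 # stale copies of the iterate

for t in range(T):
    x = rng.multivariate_normal(np.zeros(d), Sigma)
    w_stale = history[0]                         # gradient uses a delayed iterate
    g = (x @ w_stale) * x - (x @ w_stale)**2 * w_stale   # streaming Oja direction
    v = mu * v + g                               # heavy-ball momentum
    w = w + eta * v
    w /= np.linalg.norm(w)                       # keep the iterate on the sphere
    history = history[1:] + [w.copy()]

print(abs(w @ top))                              # alignment with the true top PC
```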