We propose a novel framework for learning linear time-invariant (LTI) models for a class of continuous-time non-autonomous nonlinear dynamics based on a representation of Koopman operators. In general, the operator is infinite-dimensional but, crucially, linear. To utilize it for efficient LTI control, we learn a finite representation of the Koopman operator that is linear in controls while concurrently learning meaningful lifting coordinates. For the latter, we rely on KoopmanizingFlows - a diffeomorphism-based representation of Koopman operators. With such a learned model, we can replace the nonlinear infinite-horizon optimal control problem with quadratic costs to that of a linear quadratic regulator (LQR), facilitating efficacious optimal control for nonlinear systems. The prediction and control efficacy of the proposed method is verified on simulation examples.
We introduce and analyze various Regularized Combined Field Integral Equations (CFIER) formulations of time-harmonic Navier equations in media with piece-wise constant material properties. These formulations can be derived systematically starting from suitable coercive approximations of Dirichlet-to-Neumann operators (DtN), and we present a periodic pseudodifferential calculus framework within which the well posedness of CIER formulations can be established. We also use the DtN approximations to derive and analyze Optimized Schwarz (OS) methods for the solution of elastodynamics transmission problems. The pseudodifferential calculus we develop in this paper relies on careful singularity splittings of the kernels of Navier boundary integral operators which is also the basis of high-order Nystr\"om quadratures for their discretizations. Based on these high-order discretizations we investigate the rate of convergence of iterative solvers applied to CFIER and OS formulations of scattering and transmission problems. We present a variety of numerical results that illustrate that the CFIER methodology leads to important computational savings over the classical CFIE one, whenever iterative solvers are used for the solution of the ensuing discretized boundary integral equations. Finally, we show that the OS methods are competitive in the high-frequency high-contrast regime.
This paper presents a control framework on Lie groups by designing the control objective in its Lie algebra. Control on Lie groups is challenging due to its nonlinear nature and difficulties in system parameterization. Existing methods to design the control objective on a Lie group and then derive the gradient for controller design are non-trivial and can result in slow convergence in tracking control. We show that with a proper left-invariant metric, setting the gradient of the cost function as the tracking error in the Lie algebra leads to a quadratic Lyapunov function that enables globally exponential convergence. In the PD control case, we show that our controller can maintain an exponential convergence rate even when the initial error is approaching $\pi$ in SO(3). We also show the merit of this proposed framework in trajectory optimization. The proposed cost function enables the iterative Linear Quadratic Regulator (iLQR) to converge much faster than the Differential Dynamic Programming (DDP) with a well-adopted cost function when the initial trajectory is poorly initialized on SO(3).
Safety is critical in autonomous robotic systems. A safe control law ensures forward invariance of a safe set (a subset in the state space). It has been extensively studied regarding how to derive a safe control law with a control-affine analytical dynamic model. However, in complex environments and tasks, it is challenging and time-consuming to obtain a principled analytical model of the system. In these situations, data-driven learning is extensively used and the learned models are encoded in neural networks. How to formally derive a safe control law with Neural Network Dynamic Models (NNDM) remains unclear due to the lack of computationally tractable methods to deal with these black-box functions. In fact, even finding the control that minimizes an objective for NNDM without any safety constraint is still challenging. In this work, we propose MIND-SIS (Mixed Integer for Neural network Dynamic model with Safety Index Synthesis), the first method to derive safe control laws for NNDM. The method includes two parts: 1) SIS: an algorithm for the offline synthesis of the safety index (also called as barrier function), which uses evolutionary methods and 2) MIND: an algorithm for online computation of the optimal and safe control signal, which solves a constrained optimization using a computationally efficient encoding of neural networks. It has been theoretically proved that MIND-SIS guarantees forward invariance and finite convergence. And it has been numerically validated that MIND-SIS achieves safe and optimal control of NNDM. From our experiments, the optimality gap is less than $10^{-8}$, and the safety constraint violation is $0$.
The problem of continuous inverse optimal control (over finite time horizon) is to learn the unknown cost function over the sequence of continuous control variables from expert demonstrations. In this article, we study this fundamental problem in the framework of energy-based model, where the observed expert trajectories are assumed to be random samples from a probability density function defined as the exponential of the negative cost function up to a normalizing constant. The parameters of the cost function are learned by maximum likelihood via an "analysis by synthesis" scheme, which iterates (1) synthesis step: sample the synthesized trajectories from the current probability density using the Langevin dynamics via back-propagation through time, and (2) analysis step: update the model parameters based on the statistical difference between the synthesized trajectories and the observed trajectories. Given the fact that an efficient optimization algorithm is usually available for an optimal control problem, we also consider a convenient approximation of the above learning method, where we replace the sampling in the synthesis step by optimization. Moreover, to make the sampling or optimization more efficient, we propose to train the energy-based model simultaneously with a top-down trajectory generator via cooperative learning, where the trajectory generator is used to fast initialize the synthesis step of the energy-based model. We demonstrate the proposed methods on autonomous driving tasks, and show that they can learn suitable cost functions for optimal control.
On-demand delivery has become increasingly popular around the world. Motivated by a large grocery chain store who offers fast on-demand delivery services, we model and solve a stochastic dynamic driver dispatching and routing problem for last-mile delivery systems where on-time performance is the main target. The system operator needs to dispatch a set of drivers and specify their delivery routes facing random demand that arrives over a fixed number of periods. The resulting stochastic dynamic program is challenging to solve due to the curse of dimensionality. We propose a novel structured approximation framework to approximate the value function via a parametrized dispatching and routing policy. We analyze the structural properties of the approximation framework and establish its performance guarantee under large-demand scenarios. We then develop efficient exact algorithms for the approximation problem based on Benders decomposition and column generation, which deliver verifiably optimal solutions within minutes. The evaluation results on a real-world data set show that our framework outperforms the current policy of the company by 36.53% on average in terms of delivery time. We also perform several policy experiments to understand the value of dynamic dispatching and routing with varying fleet sizes and dispatch frequencies.
Collision avoidance is a widely investigated topic in robotic applications. When applying collision avoidance techniques to a mobile robot, how to deal with the spatial structure of the robot still remains a challenge. In this paper, we design a configuration-aware safe control law by solving a Quadratic Programming (QP) with designed Control Barrier Functions (CBFs) constraints, which can safely navigate a mobile robotic arm to a desired region while avoiding collision with environmental obstacles. The advantage of our approach is that it correctly and in an elegant way incorporates the spatial structure of the mobile robotic arm. This is achieved by merging geometric restrictions among mobile robotic arm links into CBFs constraints. Simulations on a rigid rod and the modeled mobile robotic arm are performed to verify the feasibility and time-efficiency of proposed method. Numerical results about the time consuming for different degrees of freedom illustrate that our method scales well with dimension.
The Koopman operator is beneficial for analyzing nonlinear and stochastic dynamics; it is linear but infinite-dimensional, and it governs the evolution of observables. The extended dynamic mode decomposition (EDMD) is one of the famous methods in the Koopman operator approach. The EDMD employs a data set of snapshot pairs and a specific dictionary to evaluate an approximation for the Koopman operator, i.e., the Koopman matrix. In this study, we focus on stochastic differential equations, and a method to obtain the Koopman matrix is proposed. The proposed method does not need any data set, which employs the original system equations to evaluate some of the targeted elements of the Koopman matrix. The proposed method comprises combinatorics, an approximation of the resolvent, and extrapolations. Comparisons with the EDMD are performed for a noisy van der Pol system. The proposed method yields reasonable results even in cases wherein the EDMD exhibits a slow convergence behavior.
While deep neural networks (DNNs) have strengthened the performance of cooperative multi-agent reinforcement learning (c-MARL), the agent policy can be easily perturbed by adversarial examples. Considering the safety critical applications of c-MARL, such as traffic management, power management and unmanned aerial vehicle control, it is crucial to test the robustness of c-MARL algorithm before it was deployed in reality. Existing adversarial attacks for MARL could be used for testing, but is limited to one robustness aspects (e.g., reward, state, action), while c-MARL model could be attacked from any aspect. To overcome the challenge, we propose MARLSafe, the first robustness testing framework for c-MARL algorithms. First, motivated by Markov Decision Process (MDP), MARLSafe consider the robustness of c-MARL algorithms comprehensively from three aspects, namely state robustness, action robustness and reward robustness. Any c-MARL algorithm must simultaneously satisfy these robustness aspects to be considered secure. Second, due to the scarceness of c-MARL attack, we propose c-MARL attacks as robustness testing algorithms from multiple aspects. Experiments on \textit{SMAC} environment reveals that many state-of-the-art c-MARL algorithms are of low robustness in all aspect, pointing out the urgent need to test and enhance robustness of c-MARL algorithms.
We introduce a novel methodology for particle filtering in dynamical systems where the evolution of the signal of interest is described by a SDE and observations are collected instantaneously at prescribed time instants. The new approach includes the discretisation of the SDE and the design of efficient particle filters for the resulting discrete-time state-space model. The discretisation scheme converges with weak order 1 and it is devised to create a sequential dependence structure along the coordinates of the discrete-time state vector. We introduce a class of space-sequential particle filters that exploits this structure to improve performance when the system dimension is large. This is numerically illustrated by a set of computer simulations for a stochastic Lorenz 96 system with additive noise. The new space-sequential particle filters attain approximately constant estimation errors as the dimension of the Lorenz 96 system is increased, with a computational cost that increases polynomially, rather than exponentially, with the system dimension. Besides the new numerical scheme and particle filters, we provide in this paper a general framework for discrete-time filtering in continuous-time dynamical systems described by a SDE and instantaneous observations. Provided that the SDE is discretised using a weakly-convergent scheme, we prove that the marginal posterior laws of the resulting discrete-time state-space model converge to the posterior marginal posterior laws of the original continuous-time state-space model under a suitably defined metric. This result is general and not restricted to the numerical scheme or particle filters specifically studied in this manuscript.
While the theoretical analysis of evolutionary algorithms (EAs) has made significant progress for pseudo-Boolean optimization problems in the last 25 years, only sporadic theoretical results exist on how EAs solve permutation-based problems. To overcome the lack of permutation-based benchmark problems, we propose a general way to transfer the classic pseudo-Boolean benchmarks into benchmarks defined on sets of permutations. We then conduct a rigorous runtime analysis of the permutation-based $(1+1)$ EA proposed by Scharnow, Tinnefeld, and Wegener (2004) on the analogues of the \textsc{LeadingOnes} and \textsc{Jump} benchmarks. The latter shows that, different from bit-strings, it is not only the Hamming distance that determines how difficult it is to mutate a permutation $\sigma$ into another one $\tau$, but also the precise cycle structure of $\sigma \tau^{-1}$. For this reason, we also regard the more symmetric scramble mutation operator. We observe that it not only leads to simpler proofs, but also reduces the runtime on jump functions with odd jump size by a factor of $\Theta(n)$. Finally, we show that a heavy-tailed version of the scramble operator, as in the bit-string case, leads to a speed-up of order $m^{\Theta(m)}$ on jump functions with jump size~$m$.%