Learning optimal control policies directly on physical systems is challenging since even a single failure can lead to costly hardware damage. Most existing learning methods that guarantee safety, i.e., no failures, during exploration are limited to local optima. A notable exception is the GoSafe algorithm, which, unfortunately, cannot handle high-dimensional systems and hence cannot be applied to most real-world dynamical systems. This work proposes GoSafeOpt as the first algorithm that can safely discover globally optimal policies for complex systems while giving safety and optimality guarantees. Our experiments on a robot arm that would be prohibitive for GoSafe demonstrate that GoSafeOpt safely finds remarkably better policies than competing safe learning methods for high-dimensional domains.
This paper presents a control framework on Lie groups by designing the control objective in its Lie algebra. Control on Lie groups is challenging due to its nonlinear nature and difficulties in system parameterization. Existing methods to design the control objective on a Lie group and then derive the gradient for controller design are non-trivial and can result in slow convergence in tracking control. We show that with a proper left-invariant metric, setting the gradient of the cost function as the tracking error in the Lie algebra leads to a quadratic Lyapunov function that enables globally exponential convergence. In the PD control case, we show that our controller can maintain an exponential convergence rate even when the initial error is approaching $\pi$ in SO(3). We also show the merit of this proposed framework in trajectory optimization. The proposed cost function enables the iterative Linear Quadratic Regulator (iLQR) to converge much faster than the Differential Dynamic Programming (DDP) with a well-adopted cost function when the initial trajectory is poorly initialized on SO(3).
Safety is critical in autonomous robotic systems. A safe control law ensures forward invariance of a safe set (a subset in the state space). It has been extensively studied regarding how to derive a safe control law with a control-affine analytical dynamic model. However, in complex environments and tasks, it is challenging and time-consuming to obtain a principled analytical model of the system. In these situations, data-driven learning is extensively used and the learned models are encoded in neural networks. How to formally derive a safe control law with Neural Network Dynamic Models (NNDM) remains unclear due to the lack of computationally tractable methods to deal with these black-box functions. In fact, even finding the control that minimizes an objective for NNDM without any safety constraint is still challenging. In this work, we propose MIND-SIS (Mixed Integer for Neural network Dynamic model with Safety Index Synthesis), the first method to derive safe control laws for NNDM. The method includes two parts: 1) SIS: an algorithm for the offline synthesis of the safety index (also called as barrier function), which uses evolutionary methods and 2) MIND: an algorithm for online computation of the optimal and safe control signal, which solves a constrained optimization using a computationally efficient encoding of neural networks. It has been theoretically proved that MIND-SIS guarantees forward invariance and finite convergence. And it has been numerically validated that MIND-SIS achieves safe and optimal control of NNDM. From our experiments, the optimality gap is less than $10^{-8}$, and the safety constraint violation is $0$.
In this paper we get error bounds for fully discrete approximations of infinite horizon problems via the dynamic programming approach. It is well known that considering a time discretization with a positive step size $h$ an error bound of size $h$ can be proved for the difference between the value function (viscosity solution of the Hamilton-Jacobi-Bellman equation corresponding to the infinite horizon) and the value function of the discrete time problem. However, including also a spatial discretization based on elements of size $k$ an error bound of size $O(k/h)$ can be found in the literature for the error between the value functions of the continuous problem and the fully discrete problem. In this paper we revise the error bound of the fully discrete method and prove, under similar assumptions to those of the time discrete case, that the error of the fully discrete case is in fact $O(h+k)$ which gives first order in time and space for the method. This error bound matches the numerical experiments of many papers in the literature in which the behaviour $1/h$ from the bound $O(k/h)$ have not been observed.
We propose a novel framework for learning a low-dimensional representation of data based on nonlinear dynamical systems, which we call dynamical dimension reduction (DDR). In the DDR model, each point is evolved via a nonlinear flow towards a lower-dimensional subspace; the projection onto the subspace gives the low-dimensional embedding. Training the model involves identifying the nonlinear flow and the subspace. Following the equation discovery method, we represent the vector field that defines the flow using a linear combination of dictionary elements, where each element is a pre-specified linear/nonlinear candidate function. A regularization term for the average total kinetic energy is also introduced and motivated by optimal transport theory. We prove that the resulting optimization problem is well-posed and establish several properties of the DDR method. We also show how the DDR method can be trained using a gradient-based optimization method, where the gradients are computed using the adjoint method from optimal control theory. The DDR method is implemented and compared on synthetic and example datasets to other dimension reductions methods, including PCA, t-SNE, and Umap.
The metriplectic formalism is useful for describing complete dynamical systems which conserve energy and produce entropy. This creates challenges for model reduction, as the elimination of high-frequency information will generally not preserve the metriplectic structure which governs long-term stability of the system. Based on proper orthogonal decomposition, a provably convergent metriplectic reduced-order model is formulated which is guaranteed to maintain the algebraic structure necessary for energy conservation and entropy formation. Numerical results on benchmark problems show that the proposed method is remarkably stable, leading to improved accuracy over long time scales at a moderate increase in cost over naive methods.
Stability certification and identification of the stabilizable operating region of a dynamical system are two important concerns to ensure its operational safety/security and robustness. With the advent of machine-learning tools, these issues are especially important for systems with machine-learned components in the feedback loop. Here, in presence of unknown discrete variation (DV) of its parameters within a bounded range, a system controlled by a static feedback controller in which the closed-loop (CL) equilibria are subject to variation-induced drift is equivalently represented using a class of time-invariant systems, each with the same control policy. To develop a general theory for stability and stabilizability of such a class of neural-network (NN) controlled nonlinear systems, a Lyapunov-based convex stability certificate is proposed and is further used to devise an estimate of a local Lipschitz upper bound for the NN and a corresponding operating domain in the state space containing an initialization set, starting from where the CL local asymptotic stability of each system in the class is guaranteed, while the trajectory of the original system remains confined to the domain if the DV of the parameters satisfies a certain quasi-stationarity condition. To compute such a robustly stabilizing NN controller, a stability-guaranteed training (SGT) algorithm is also proposed. The effectiveness of the proposed framework is demonstrated using illustrative examples.
Federated Learning has promised a new approach to resolve the challenges in machine learning by bringing computation to the data. The popularity of the approach has led to rapid progress in the algorithmic aspects and the emergence of systems capable of simulating Federated Learning. State of art systems in Federated Learning support a single node aggregator that is insufficient to train a large corpus of devices or train larger-sized models. As the model size or the number of devices increase the single node aggregator incurs memory and computation burden while performing fusion tasks. It also faces communication bottlenecks when a large number of model updates are sent to a single node. We classify the workload for the aggregator into categories and propose a new aggregation service for handling each load. Our aggregation service is based on a holistic approach that chooses the best solution depending on the model update size and the number of clients. Our system provides a fault-tolerant, robust and efficient aggregation solution utilizing existing parallel and distributed frameworks. Through evaluation, we show the shortcomings of the state of art approaches and how a single solution is not suitable for all aggregation requirements. We also provide a comparison of current frameworks with our system through extensive experiments.
While the theoretical analysis of evolutionary algorithms (EAs) has made significant progress for pseudo-Boolean optimization problems in the last 25 years, only sporadic theoretical results exist on how EAs solve permutation-based problems. To overcome the lack of permutation-based benchmark problems, we propose a general way to transfer the classic pseudo-Boolean benchmarks into benchmarks defined on sets of permutations. We then conduct a rigorous runtime analysis of the permutation-based $(1+1)$ EA proposed by Scharnow, Tinnefeld, and Wegener (2004) on the analogues of the \textsc{LeadingOnes} and \textsc{Jump} benchmarks. The latter shows that, different from bit-strings, it is not only the Hamming distance that determines how difficult it is to mutate a permutation $\sigma$ into another one $\tau$, but also the precise cycle structure of $\sigma \tau^{-1}$. For this reason, we also regard the more symmetric scramble mutation operator. We observe that it not only leads to simpler proofs, but also reduces the runtime on jump functions with odd jump size by a factor of $\Theta(n)$. Finally, we show that a heavy-tailed version of the scramble operator, as in the bit-string case, leads to a speed-up of order $m^{\Theta(m)}$ on jump functions with jump size~$m$.%
Bayesian model selection provides a powerful framework for objectively comparing models directly from observed data, without reference to ground truth data. However, Bayesian model selection requires the computation of the marginal likelihood (model evidence), which is computationally challenging, prohibiting its use in many high-dimensional Bayesian inverse problems. With Bayesian imaging applications in mind, in this work we present the proximal nested sampling methodology to objectively compare alternative Bayesian imaging models for applications that use images to inform decisions under uncertainty. The methodology is based on nested sampling, a Monte Carlo approach specialised for model comparison, and exploits proximal Markov chain Monte Carlo techniques to scale efficiently to large problems and to tackle models that are log-concave and not necessarily smooth (e.g., involving l_1 or total-variation priors). The proposed approach can be applied computationally to problems of dimension O(10^6) and beyond, making it suitable for high-dimensional inverse imaging problems. It is validated on large Gaussian models, for which the likelihood is available analytically, and subsequently illustrated on a range of imaging problems where it is used to analyse different choices of dictionary and measurement model.
Since deep neural networks were developed, they have made huge contributions to everyday lives. Machine learning provides more rational advice than humans are capable of in almost every aspect of daily life. However, despite this achievement, the design and training of neural networks are still challenging and unpredictable procedures. To lower the technical thresholds for common users, automated hyper-parameter optimization (HPO) has become a popular topic in both academic and industrial areas. This paper provides a review of the most essential topics on HPO. The first section introduces the key hyper-parameters related to model training and structure, and discusses their importance and methods to define the value range. Then, the research focuses on major optimization algorithms and their applicability, covering their efficiency and accuracy especially for deep learning networks. This study next reviews major services and toolkits for HPO, comparing their support for state-of-the-art searching algorithms, feasibility with major deep learning frameworks, and extensibility for new modules designed by users. The paper concludes with problems that exist when HPO is applied to deep learning, a comparison between optimization algorithms, and prominent approaches for model evaluation with limited computational resources.