This paper proposes a method for calibrating control parameters. Examples of such control parameters are gains of PID controllers, weights of a cost function for optimal control, filter coefficients, the sliding surface of a sliding-mode controller, or weights of a neural network. Hence, the proposed method can be applied to a wide range of controllers. The method uses a Kalman filter that estimates control parameters, rather than the system's state, using data from closed-loop operation. The calibration is driven by a training objective that encodes specifications on the performance of the dynamical system. The method tunes the parameters online and robustly, is computationally efficient, has low data-storage requirements, and is easy to implement, making it appealing for many real-time applications. Simulation results show that the method learns control parameters quickly (approximately 24% average decay factor of the closed-loop cost), tunes the parameters to compensate for disturbances (approximately 29% improvement in tracking precision), and is robust to noise. Further, a simulation study with the high-fidelity vehicle simulator CarSim shows that the method can calibrate controllers of a complex dynamical system online, which indicates its applicability to a real-world system.
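As a minimal illustration of the core idea (a generic sketch, not the paper's algorithm): the control parameters are treated as the state of a Kalman filter with random-walk dynamics, and each piece of closed-loop data yields a scalar measurement that refines the estimate. The linear measurement model and all numerical values below are assumptions made for this toy example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: the controller parameters theta (e.g. PID gains)
# are the Kalman filter state with random-walk dynamics
#   theta_k = theta_{k-1} + w_k,
# and each closed-loop measurement is modelled as y_k = H_k @ theta_k + v_k.
theta_true = np.array([2.0, 0.5])   # "ideal" gains, known only to the simulator
n = theta_true.size

theta = np.zeros(n)                 # parameter estimate
P = np.eye(n)                       # estimate covariance
Q = 1e-6 * np.eye(n)                # random-walk (process) noise
R = 0.05                            # measurement noise variance

for k in range(500):
    H = rng.normal(size=(1, n))     # regressor extracted from closed-loop data
    y = H @ theta_true + rng.normal(scale=np.sqrt(R))   # noisy measurement

    # standard Kalman filter update, with the parameters as the state
    P = P + Q                       # predict step (identity dynamics)
    S = H @ P @ H.T + R             # innovation covariance
    K = P @ H.T / S                 # Kalman gain
    theta = theta + K @ (y - H @ theta)
    P = (np.eye(n) - K @ H) @ P

print(theta)   # ~ theta_true
```

The random-walk noise `Q` keeps the covariance from collapsing, which is what lets the filter keep adapting the parameters online when conditions change.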
Brain-computer interfaces (BCIs) still face many challenges before they can step out of the laboratory into real-life applications. A key challenge remains the high-performance control of diverse effectors for complex tasks, using chronic and safe recorders. This control must be robust over time and maintain high decoding performance without continuous recalibration of the decoders. In this article, we demonstrate asynchronous control of an exoskeleton by a tetraplegic patient using a chronically implanted epidural electrocorticography (EpiCoG) implant. For this purpose, we developed an adaptive online tensor-based decoder: the Recursive Exponentially Weighted Markov-Switching multi-Linear Model (REW-MSLM). Using REW-MSLM, we demonstrated stable 8-dimensional alternative bimanual control of the exoskeleton and its virtual avatar over a period of six months without recalibration of the decoder.
Millimeter-wave self-backhauled small cells are a key component of next-generation wireless networks. Their dense deployment will increase data rates, reduce latency, and enable efficient data transport between the access and backhaul networks, providing a flexibility not possible with optical fiber. Despite their high potential, operating dense self-backhauled networks optimally is an open challenge, particularly for radio resource management (RRM). This paper presents RadiOrchestra, a holistic RRM framework that models and optimizes beamforming, rate selection, user association, and admission control for self-backhauled networks. The framework is designed to account for practical challenges such as hardware limitations of base stations (e.g., computational capacity, discrete rates), the need for adaptability of backhaul links, and the presence of interference. The problem is formulated as a nonconvex mixed-integer nonlinear program, which is challenging to solve. To approach it, we propose three algorithms that trade off complexity against optimality. Furthermore, we derive upper and lower bounds that characterize the performance limits of the system. We evaluate the developed strategies in various scenarios, showing the feasibility of deploying practical self-backhauling in future networks.
Although the theory of constrained least squares (CLS) estimation is well known, it is usually applied with the view that the constraints to be imposed are unavoidable. However, there are cases in which constraints are optional. For example, in camera color calibration, one of several possible color processing systems is obtained if a constraint on the row sums of a desired color correction matrix is imposed; in this example, it is not clear a priori whether imposing the constraint leads to better system performance. In this paper, we derive an exact expression connecting the constraint to the increase in fitting error obtained from imposing it. As another contribution, we show how to determine projection matrices that separate the measured data into two components: the first component drives up the fitting error due to imposing a constraint, and the second component is unaffected by the constraint. We demonstrate the use of these results in the color calibration problem.
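As a small numerical illustration of the setting (a sketch with made-up data, not the paper's derivation), the snippet below solves a least-squares problem with and without a row-sum-style linear constraint via the KKT system and reports the exact increase in fitting error caused by imposing the constraint.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical instance: fit x to A x ≈ b, optionally under the linear
# constraint C x = d (analogous to fixing the row sums of a color
# correction matrix in camera calibration).
A = rng.normal(size=(20, 3))
b = rng.normal(size=20)
C = np.ones((1, 3))       # constraint operator: sum of the coefficients
d = np.array([1.0])       # required value of that sum

# unconstrained least squares
x_u, *_ = np.linalg.lstsq(A, b, rcond=None)

# constrained least squares via the KKT system
#   [A^T A  C^T] [x]   [A^T b]
#   [C      0  ] [l] = [d    ]
AtA, Atb = A.T @ A, A.T @ b
K = np.block([[AtA, C.T], [C, np.zeros((1, 1))]])
rhs = np.concatenate([Atb, d])
x_c = np.linalg.solve(K, rhs)[:3]

err_u = np.sum((A @ x_u - b) ** 2)
err_c = np.sum((A @ x_c - b) ** 2)
print(err_c - err_u)      # >= 0: the fitting-error increase due to the constraint
```

The quantity `err_c - err_u` is exactly the penalty the abstract refers to: whether imposing the optional constraint is worthwhile amounts to weighing this increase against the benefits of the constrained system.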
Owing to their superior modeling capabilities, gated recurrent neural networks, such as Gated Recurrent Units (GRUs) and Long Short-Term Memory networks (LSTMs), have become popular tools for learning dynamical systems. This paper discusses how these networks can be adopted for the synthesis of Internal Model Control (IMC) architectures. To this end, a gated recurrent network is first used to learn a model of the unknown input-output stable plant. A controller gated recurrent network is then trained to approximate the model inverse. The stability of these networks, ensured by means of a suitable training procedure, makes it possible to guarantee input-output closed-loop stability. The proposed scheme copes with the saturation of the control variables and can be deployed on low-power embedded controllers, as it requires limited online computation. The approach is tested on the Quadruple Tank benchmark system and compared to alternative control laws, resulting in remarkable closed-loop performance.
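The IMC wiring can be sketched in a few lines. Here a first-order linear model stands in for both the learned gated recurrent network and its trained inverse (both assumptions; the paper uses GRU/LSTM models), to show how the internal-model feedback rejects an unmeasured output disturbance under control saturation.

```python
import numpy as np

# Minimal IMC sketch with a linear stand-in for the learned model.
a, b = 0.9, 0.1          # stable plant: y+ = a*y + b*u
r = 1.0                  # constant reference
dist = 0.2               # unmeasured output disturbance

y = ym = 0.0             # plant state and internal-model state
for k in range(300):
    mismatch = (y + dist) - ym      # measured output minus model output
    e = r - mismatch                # IMC feedback signal
    u = (e - a * ym) / b            # model inverse: steer the model output to e
    u = np.clip(u, -5.0, 5.0)       # control saturation, tolerated by the scheme
    ym = a * ym + b * u             # internal model
    y = a * y + b * u               # true plant

print(y + dist)          # measured output: tracks r despite the disturbance
```

Because the controller only sees the model mismatch, a perfect model plus an exact inverse yields offset-free tracking of the reference; the paper's contribution is obtaining (approximations of) both blocks from data with stability guarantees.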
We demonstrate the effectiveness of an adaptive explicit Euler method for the approximate solution of the Cox-Ingersoll-Ross model. This relies on a class of path-bounded timestepping strategies which work by reducing the stepsize as solutions approach a neighbourhood of zero. The method is hybrid in the sense that a convergent backstop method is invoked if the timestep becomes too small, or to prevent solutions from overshooting zero and becoming negative. Under parameter constraints that imply Feller's condition, we prove that such a scheme is strongly convergent, of order at least 1/2. Control of the strong error is important for multi-level Monte Carlo techniques. Under Feller's condition we also prove that the probability of ever needing the backstop method to prevent a negative value can be made arbitrarily small. Numerically, we compare this adaptive method to fixed step implicit and explicit schemes, and a novel semi-implicit adaptive variant. We observe that the adaptive approach leads to methods that are competitive in a domain that extends beyond Feller's condition, indicating suitability for the modelling of stochastic volatility in Heston-type asset models.
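The timestepping strategy can be sketched as follows. This is a simplified toy version: the specific stepsize rule and the projected-positive backstop are assumptions standing in for the paper's path-bounded strategies and convergent backstop method.

```python
import numpy as np

rng = np.random.default_rng(2)

# Cox-Ingersoll-Ross model: dX = kappa*(theta - X) dt + sigma*sqrt(X) dW
kappa, theta, sigma = 1.0, 0.06, 0.2   # 2*kappa*theta > sigma^2: Feller holds
T, hmax, hmin = 1.0, 1e-2, 1e-5

def simulate(x0):
    """One adaptive explicit Euler path of the CIR model on [0, T]."""
    t, x = 0.0, x0
    while t < T - 1e-12:
        # path-bounded strategy: shrink the step as x approaches zero
        h = min(hmax, max(hmin, (x / theta) ** 2 * hmax), T - t)
        dW = rng.normal(scale=np.sqrt(h))
        x_new = x + kappa * (theta - x) * h + sigma * np.sqrt(max(x, 0.0)) * dW
        if x_new <= 0.0 or h <= hmin:
            # backstop (an assumption here): project onto the positive
            # half-line instead of invoking a convergent backstop scheme
            x_new = max(x_new, hmin)
        t, x = t + h, x_new
    return x

paths = [simulate(0.06) for _ in range(200)]
print(np.mean(paths))   # close to the mean-reversion level theta = 0.06
```

Under Feller's condition the solution rarely approaches zero, so the backstop branch is triggered only occasionally, which matches the abstract's claim that its activation probability can be made arbitrarily small.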
The paper provides a novel framework to study the accuracy and stability of numerical integration schemes when employed for the time-domain simulation of power systems. A matrix-pencil-based approach is adopted to evaluate the error between the dynamic modes of the power system and the modes of the approximated discrete-time system arising from the application of the numerical method. The proposed approach provides meaningful insight into how different methods compare to each other when applied to a power system, while being general enough to be systematically utilized for, in principle, any numerical method. The framework is illustrated for a handful of well-known explicit and implicit methods, with simulation results based on the WSCC 9-bus system as well as on a 1,479-bus dynamic model of the All-Island Irish Transmission System.
In model extraction attacks, adversaries can steal a machine learning model exposed via a public API by repeatedly querying it and adjusting their own model based on the obtained predictions. To prevent model stealing, existing defenses focus on detecting malicious queries or on truncating or distorting outputs, thus necessarily introducing a tradeoff between robustness and model utility for legitimate users. Instead, we propose to impede model extraction by requiring users to complete a proof-of-work before they can read the model's predictions. This deters attackers by greatly increasing (even up to 100x) the computational effort needed to leverage query access for model extraction. Since we calibrate the effort required to complete the proof-of-work to each query, this introduces only a slight overhead for regular users (up to 2x). To achieve this, our calibration applies tools from differential privacy to measure the information revealed by a query. Our method requires no modification of the victim model and can be applied by machine learning practitioners to guard their publicly exposed models against being easily stolen.
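A hash-based proof-of-work of this kind can be sketched as follows. This is an assumed generic construction: the linear mapping from a per-query leakage estimate to puzzle difficulty is a placeholder for the paper's differential-privacy-based calibration.

```python
import hashlib
import secrets

def issue_challenge(leakage_bits: float, base_difficulty: int = 8):
    """Server: more revealing queries get harder puzzles (more leading zero bits).
    The leakage-to-difficulty mapping here is a toy assumption."""
    difficulty = base_difficulty + int(leakage_bits)
    return secrets.token_bytes(16), difficulty

def solve(challenge: bytes, difficulty: int) -> int:
    """Client: find a nonce whose SHA-256 hash has `difficulty` leading zero bits."""
    nonce = 0
    while True:
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") >> (256 - difficulty) == 0:
            return nonce
        nonce += 1

def verify(challenge: bytes, difficulty: int, nonce: int) -> bool:
    """Server: checking a solution costs a single hash."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") >> (256 - difficulty) == 0

chal, diff = issue_challenge(leakage_bits=4.0)   # difficulty 12: ~4096 hashes expected
nonce = solve(chal, diff)
print(verify(chal, diff, nonce))   # True
```

The asymmetry is the point: solving costs roughly 2^difficulty hash evaluations, while verifying costs one, so the server can price each prediction read in attacker compute.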
Many real-world problems require one to estimate parameters of interest, in a Bayesian framework, from data that are collected sequentially in time. Conventional methods for sampling from posterior distributions, such as Markov chain Monte Carlo, cannot efficiently address such problems, as they do not take advantage of the data's sequential structure. To this end, sequential methods that update the posterior distribution whenever a new collection of data becomes available are often used. Two popular choices of sequential method are the ensemble Kalman filter (EnKF) and the sequential Monte Carlo sampler (SMCS). While the EnKF only computes a Gaussian approximation of the posterior distribution, the SMCS can draw samples directly from the posterior; its performance, however, depends critically on the kernels that are used. In this work, we present a method that constructs the kernels of the SMCS using an EnKF formulation, and we demonstrate the performance of the method with numerical examples.
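A single EnKF analysis step, the building block referred to above, can be sketched on a linear-Gaussian toy problem (all numerical values below are assumed):

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy parameter-estimation problem: observe y = H @ theta + noise.
theta_true = np.array([1.0, -0.5])
H = rng.normal(size=(50, 2))                 # observation operator
R = 0.1                                      # observation noise variance
y = H @ theta_true + rng.normal(scale=np.sqrt(R), size=50)

N = 500                                      # ensemble size
ens = rng.normal(size=(N, 2))                # prior ensemble ~ N(0, I)

# Kalman gain built from ensemble statistics (the Gaussian approximation)
C = np.cov(ens, rowvar=False)                # empirical covariance
K = C @ H.T @ np.linalg.inv(H @ C @ H.T + R * np.eye(50))

# update each member against a perturbed observation
y_pert = y + rng.normal(scale=np.sqrt(R), size=(N, 50))
ens = ens + (y_pert - ens @ H.T) @ K.T

print(ens.mean(axis=0))   # approximates the posterior mean
```

Because the gain is computed from ensemble statistics, the update is Gaussian in spirit even for nonlinear problems; the paper's idea is to reuse such an update to build the transition kernels inside an SMCS, which then corrects toward the exact posterior.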
In recent years, domain randomization has gained a lot of traction as a method for sim-to-real transfer of reinforcement learning policies in robotic manipulation; however, finding optimal randomization distributions can be difficult. In this paper, we introduce DROPO, a novel method for estimating domain randomization distributions for safe sim-to-real transfer. Unlike prior work, DROPO only requires a limited, precollected offline dataset of trajectories, and explicitly models parameter uncertainty to match real data. We demonstrate that DROPO is capable of recovering dynamic parameter distributions in simulation and finding a distribution capable of compensating for an unmodelled phenomenon. We also evaluate the method in two zero-shot sim-to-real transfer scenarios, showing successful domain transfer and improved performance over prior methods.
How to obtain a good value estimate is one of the key problems in reinforcement learning (RL). Current value estimation methods, such as DDPG and TD3, suffer from unnecessary over- or underestimation bias. In this paper, we explore the potential of double actors, which have been neglected for a long time, for better value function estimation in continuous-control settings. First, we uncover and demonstrate the bias-alleviation property of double actors by building double actors upon the single critic of DDPG and the double critics of TD3, to handle overestimation bias and underestimation bias, respectively. Next, we find, interestingly, that double actors help improve the exploration ability of the agent. Finally, to mitigate the uncertainty of value estimates from double critics, we propose to regularize the critic networks under the double-actors architecture, which gives rise to the Double Actors Regularized Critics (DARC) algorithm. Extensive experimental results on challenging continuous control tasks show that DARC significantly outperforms state-of-the-art methods with higher sample efficiency.
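The over- and underestimation biases mentioned above can be illustrated with a small numerical experiment (a toy demonstration in the spirit of double estimators, not the DARC algorithm itself): with independent noisy critics over actions whose true values are all zero, evaluating one's own greedy action overestimates, a clipped double-critic min at that action underestimates, and cross-evaluation with two actors is nearly unbiased.

```python
import numpy as np

rng = np.random.default_rng(4)

n_actions, trials = 4, 20000
single, clipped, cross = [], [], []
for _ in range(trials):
    q1 = rng.normal(size=n_actions)        # critic 1 estimates (true values: 0)
    q2 = rng.normal(size=n_actions)        # critic 2 estimates, independent noise
    a1, a2 = q1.argmax(), q2.argmax()      # the two actors' greedy actions
    single.append(q1[a1])                  # DDPG-style self-evaluation: biased up
    clipped.append(min(q1[a1], q2[a1]))    # TD3-style clipped min: biased down
    cross.append(0.5 * (q2[a1] + q1[a2]))  # double-actor cross-evaluation: ~0

print(np.mean(single), np.mean(clipped), np.mean(cross))
```

The cross term is unbiased because each critic's noise is independent of the other actor's action choice; exploiting this structure with learned actors and critics is what the abstract's bias-alleviation claim refers to.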