The prediction of stochastic dynamical systems and the capture of dynamical behaviors are profound problems. In this article, we propose a data-driven framework combining Reservoir Computing and Normalizing Flow to study this issue, which mimics error modeling to improve the traditional Reservoir Computing performance and takes advantage of both approaches. This model-free method successfully predicts the long-term evolution of stochastic dynamical systems and replicates dynamical behaviors. With few assumptions about the underlying stochastic dynamical systems, we deal with Markov/non-Markov and stationary/non-stationary stochastic processes defined by linear/nonlinear stochastic differential equations or stochastic delay differential equations. We verify the effectiveness of the proposed framework in five experiments, including the Ornstein-Uhlenbeck process, Double-Well system, El Ni\~no Southern Oscillation simplified model, and stochastic Lorenz system. Additionally, we explore the noise-induced tipping phenomena and the replication of the strange attractor.
Suitable representations of dynamical systems can simplify their analysis and control. On this line of thought, this paper considers the input decoupling problem for input-affine Lagrangian dynamics, namely the problem of finding a transformation of the generalized coordinates that decouples the input channels. We identify a class of systems for which this problem is solvable. Such systems are called collocated because the decoupling variables correspond to the coordinates on which the actuators directly perform work. Under mild conditions on the input matrix, a simple test is presented to verify whether a system is collocated or not. By exploiting power invariance, it is proven that a change of coordinates decouples the input channels if and only if the dynamics is collocated. We illustrate the theoretical results by considering several Lagrangian systems, focusing on underactuated mechanical systems, for which novel controllers that exploit input decoupling are designed.
In this paper, we investigate the two-dimensional extension of a recently introduced set of shallow water models based on a regularized moment expansion of the incompressible Navier-Stokes equations \cite{kowalski2017moment,koellermeier2020analysis}. We show the rotational invariance of the proposed moment models with two different approaches. The first proof splits the coefficient matrix into its conservative and non-conservative parts and proves rotational invariance for each part, while the second relies on the special block structure of the coefficient matrices. With the aid of rotational invariance, the analysis of the hyperbolicity of the moment model in 2D is reduced to the real diagonalizability of the coefficient matrix in 1D. We then prove the real diagonalizability by deriving the analytical form of the characteristic polynomial. Furthermore, we extend the model to include a more general class of closure relations than the original model and establish that this set of general closure relations retains both rotational invariance and hyperbolicity.
Hybrid dynamical systems, i.e. systems that have both continuous and discrete states, are ubiquitous in engineering, but are difficult to work with due to their discontinuous transitions. For example, a robot leg is able to exert very little control effort while it is in the air compared to when it is on the ground. When the leg hits the ground, the penetrating velocity instantaneously collapses to zero. These instantaneous changes in dynamics and discontinuities (or jumps) in state make standard smooth tools for planning, estimation, control, and learning difficult for hybrid systems. One of the key tools for accounting for these jumps is called the saltation matrix. The saltation matrix is the sensitivity update when a hybrid jump occurs and has been used in a variety of fields including robotics, power circuits, and computational neuroscience. This paper presents an intuitive derivation of the saltation matrix and discusses what it captures, where it has been used in the past, how it is used for linear and quadratic forms, how it is computed for rigid body systems with unilateral constraints, and some of the structural properties of the saltation matrix in these cases.
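As a point of reference for the sensitivity update described above, the saltation matrix is commonly written in the following first-order form. The notation is an assumption introduced here for illustration: $F_I$ and $F_J$ denote the vector fields before and after the jump, $g(t,x)$ the guard function whose zero crossing triggers the jump, and $R(t,x)$ the reset map, with $F_I$, $g$ and $R$ evaluated at the impact time and pre-impact state and $F_J$ evaluated at the post-reset state.

```latex
\[
  \Xi \;=\; D_x R \;+\;
  \frac{\left(F_J - D_x R\, F_I - D_t R\right) D_x g}
       {D_t g + D_x g\, F_I},
  \qquad
  \delta x^{+} \approx \Xi\, \delta x^{-},
  \qquad
  \Sigma^{+} \approx \Xi\, \Sigma^{-}\, \Xi^{\top}.
\]
```

The expression requires the denominator to be nonzero, i.e. the trajectory must cross the guard transversally. The last relation is the quadratic-form (covariance) update that follows from the linear perturbation map.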
We develop a principled approach to end-to-end learning in stochastic optimization. First, we show that the standard end-to-end learning algorithm admits a Bayesian interpretation and trains a posterior Bayes action map. Building on the insights of this analysis, we then propose new end-to-end learning algorithms for training decision maps that output solutions of empirical risk minimization and distributionally robust optimization problems, two dominant modeling paradigms in optimization under uncertainty. Numerical results for a synthetic newsvendor problem illustrate the key differences between alternative training schemes. We also investigate an economic dispatch problem based on real data to showcase the impact of the neural network architecture of the decision maps on their test performance.
This paper introduces an approach to decoupling singularly perturbed boundary value problems for fourth-order ordinary differential equations that feature a small positive parameter $\epsilon$ multiplying the highest derivative. We specifically examine Lidstone boundary conditions and demonstrate how to break down fourth-order differential equations into a system of second-order problems, with one lacking the parameter and the other featuring $\epsilon$ multiplying the highest derivative. To solve this system, we propose a mixed finite element algorithm and incorporate the Shishkin mesh scheme to capture the solution near boundary layers. Our solver is both direct and of high accuracy, with computation time that scales linearly with the number of grid points. We present numerical results to validate the theoretical results and the accuracy of our method.
In this paper, we study the estimation of the derivative of a regression function in a standard univariate regression model. The estimators are defined either by differentiating nonparametric least-squares estimators of the regression function or by estimating the projection of the derivative. We prove two simple risk bounds that allow us to compare our estimators. More elaborate bounds under a stability assumption are then provided. The bases and spaces on which we illustrate our assumptions and first results are of both compact and non-compact type, and we discuss the rates reached by our estimators. They turn out to be optimal in the compact case. Lastly, we propose a model selection procedure and prove the associated risk bound. Considering bases with non-compact support makes the problem difficult.
Capturing stochastic behaviors in business and work processes is essential to quantitatively understand how nondeterminism is resolved when taking decisions within the process. This is of special interest in process mining, where event data tracking the actual execution of the process are related to process models, and can then provide insights on frequencies and probabilities. Variants of stochastic Petri nets provide a natural formal basis for this. However, when capturing processes, such nets need to be labelled with (possibly duplicated) activities, and equipped with silent transitions that model internal, non-logged steps related to the orchestration of the process. At the same time, they have to be analyzed in a finite-trace semantics, matching the fact that each process execution consists of finitely many steps. These two aspects impede the direct application of existing techniques for stochastic Petri nets, calling for a novel characterization that incorporates labels and silent transitions in a finite-trace semantics. In this article, we provide such a characterization starting from generalized stochastic Petri nets and obtaining the framework of labelled stochastic processes (LSPs). On top of this framework, we introduce different key analysis tasks on the traces of LSPs and their probabilities. We show that all such analysis tasks can be solved analytically, in particular reducing them to a single method that combines automata-based techniques to single out the behaviors of interest within an LSP, with techniques based on absorbing Markov chains to reason on their probabilities. Finally, we demonstrate the significance of our approach in the context of stochastic conformance checking, illustrating practical feasibility through a proof-of-concept implementation and its application to different datasets.
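The absorbing-Markov-chain step mentioned above can be illustrated with a minimal sketch. It is not the article's implementation: the function name `absorption_probabilities`, the toy chain, and the choice of exact rational arithmetic are all assumptions made here. Given the transient-to-transient block `Q` and the transient-to-absorbing block `R` of a transition matrix, the probability of ending in each absorbing state is the solution `B` of `(I - Q) B = R`.

```python
from fractions import Fraction


def absorption_probabilities(Q, R):
    """Solve (I - Q) B = R exactly by Gauss-Jordan elimination.

    Q[i][j]: probability of moving from transient state i to transient state j.
    R[i][c]: probability of moving from transient state i to absorbing state c.
    Returns B, where B[i][c] is the probability of eventually being absorbed
    in state c when starting from transient state i.
    Assumes every transient state can reach an absorbing state, so that
    I - Q is invertible (otherwise no pivot is found and the call fails).
    """
    m, k = len(Q), len(R[0])
    # Augmented matrix [I - Q | R] with exact rational entries.
    A = [
        [Fraction(int(i == j)) - Fraction(Q[i][j]) for j in range(m)]
        + [Fraction(R[i][c]) for c in range(k)]
        for i in range(m)
    ]
    for col in range(m):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, m) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [v / pv for v in A[col]]
        # Eliminate this column from every other row.
        for r in range(m):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [row[m:] for row in A]


# Hypothetical toy chain: three transient states in a line, each moving left
# or right with probability 1/2; the two ends are absorbing ("reject", "accept").
half = Fraction(1, 2)
Q = [[0, half, 0], [half, 0, half], [0, half, 0]]
R = [[half, 0], [0, 0], [0, half]]
B = absorption_probabilities(Q, R)
```

Here `B[0]` gives the probabilities of ending in "reject" and "accept" when starting from the first transient state. In the setting described above, the behaviors of interest would first be singled out by an automaton, so that the absorbing state of interest corresponds to the accepted traces.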
Detecting change-points in data is challenging because of the range of possible types of change and types of behaviour of data when there is no change. Statistically efficient methods for detecting a change will depend on both of these features, and it can be difficult for a practitioner to develop an appropriate detection method for their application of interest. We show how to automatically generate new offline detection methods based on training a neural network. Our approach is motivated by many existing tests for the presence of a change-point being representable by a simple neural network, and thus a neural network trained with sufficient data should have performance at least as good as these methods. We present theory that quantifies the error rate for such an approach, and how it depends on the amount of training data. Empirical results show that, even with limited training data, its performance is competitive with the standard CUSUM-based classifier for detecting a change in mean when the noise is independent and Gaussian, and can substantially outperform it in the presence of auto-correlated or heavy-tailed noise. Our method also shows strong results in detecting and localising changes in activity based on accelerometer data.
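For context on the baseline named above, the following is a minimal sketch of a CUSUM test for a single change in mean. It is not the neural-network method of the abstract. The function names and the user-supplied `threshold` are assumptions introduced here, and the statistic is left unscaled by the noise standard deviation, which amounts to assuming unit-variance noise.

```python
import math


def cusum_statistics(x):
    """Return the CUSUM statistic for every candidate split t = 1, ..., n-1.

    For a split after the first t observations the statistic is
    sqrt(t * (n - t) / n) * |mean(x[:t]) - mean(x[t:])|.
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    total = sum(x)
    stats = []
    left_sum = 0.0
    for t in range(1, n):
        left_sum += x[t - 1]
        mean_left = left_sum / t
        mean_right = (total - left_sum) / (n - t)
        stats.append(math.sqrt(t * (n - t) / n) * abs(mean_left - mean_right))
    return stats


def detect_change_in_mean(x, threshold):
    """Flag a change if the largest CUSUM statistic exceeds `threshold`.

    Returns (changed, location, statistic), where `location` is the number of
    observations before the estimated change-point.
    """
    stats = cusum_statistics(x)
    best = max(range(len(stats)), key=stats.__getitem__)
    return stats[best] > threshold, best + 1, stats[best]


# Noise-free illustration: the mean jumps from 0 to 5 after 10 observations.
series = [0.0] * 10 + [5.0] * 10
changed, location, statistic = detect_change_in_mean(series, threshold=3.0)
```

The threshold controls the trade-off between false alarms and missed changes; the value 3.0 above is arbitrary and would in practice be calibrated to the series length and noise level.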
We propose two novel unbiased estimators of the integral $\int_{[0,1]^{s}}f(u) du$ for a function $f$, which depend on a smoothness parameter $r\in\mathbb{N}$. The first estimator integrates exactly the polynomials of degrees $p<r$ and achieves the optimal error $n^{-1/2-r/s}$ (where $n$ is the number of evaluations of $f$) when $f$ is $r$ times continuously differentiable. The second estimator is computationally cheaper but is restricted to functions that vanish on the boundary of $[0,1]^s$. The construction of the two estimators relies on a combination of cubic stratification and control variates based on numerical derivatives. We provide numerical evidence that they show good performance even for moderate values of $n$.
With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
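The effective-number formula above translates directly into per-class weights. The sketch below is illustrative only: the function names are hypothetical, and the rescaling of the weights so that they sum to the number of classes is an assumption made here, since the abstract does not state a normalization.

```python
def effective_number(n, beta):
    """Effective number of samples (1 - beta**n) / (1 - beta) for n >= 1."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    if n < 1:
        raise ValueError("n must be at least 1")
    return (1.0 - beta ** n) / (1.0 - beta)


def class_balanced_weights(counts, beta):
    """Per-class loss weights proportional to 1 / effective_number(n, beta).

    Assumption: weights are rescaled to sum to the number of classes, so that
    a perfectly balanced dataset gives every class a weight of 1.
    """
    inverse = [1.0 / effective_number(n, beta) for n in counts]
    scale = len(counts) / sum(inverse)
    return [w * scale for w in inverse]


# Hypothetical long-tailed class counts: one frequent class, two rare ones.
counts = [1000, 50, 10]
weights = class_balanced_weights(counts, beta=0.999)
```

With $\beta = 0$ the effective number is 1 for every class, so no re-weighting takes place; as $\beta$ approaches 1 the effective number approaches $n$ and the weights approach inverse class frequency. Each weight would multiply the loss of samples from the corresponding class.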