The ever-growing use of wind energy makes it necessary to optimize turbine operation through pitch angle controllers and to support maintenance with early fault detection. Accurate and robust models that imitate the behavior of wind turbines are crucial, especially for predicting the generated power as a function of the wind speed. Existing empirical and physics-based models have limitations in capturing the complex relations between the input variables and the power, aggravated by wind variability. Data-driven methods offer new opportunities to enhance wind turbine modeling from large datasets by improving accuracy and efficiency. In this study, we used physics-informed neural networks to reproduce historical data from four turbines in a wind farm while imposing physical constraints on the model. The resulting regression models for the power, torque, and power coefficient showed high accuracy with respect to both the measured data and the physical equations governing the system. Finally, an efficient evidential layer provided uncertainty estimates for the predictions, proved consistent with the absolute error, and made it possible to define a confidence interval around the power curve.
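As a minimal sketch of how such a physics constraint can enter the loss (not the authors' code; the network layout, air density RHO, and rotor area AREA are assumptions), the predicted power and power coefficient can be tied together through $P = \tfrac{1}{2}\rho A C_p v^3$:

```python
# Minimal sketch (not the authors' code): physics-informed regression tying the
# predicted power P and power coefficient Cp through P = 0.5 * rho * A * Cp * v^3.
# RHO, AREA, and the network layout are assumed values for illustration.
import torch
import torch.nn as nn

RHO, AREA = 1.225, 5027.0  # assumed air density [kg/m^3] and rotor swept area [m^2]

class TurbineNet(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(1, hidden), nn.Tanh(),
                                  nn.Linear(hidden, hidden), nn.Tanh())
        self.power = nn.Linear(hidden, 1)                            # predicted power [W]
        self.cp = nn.Sequential(nn.Linear(hidden, 1), nn.Sigmoid())  # Cp kept in (0, 1)

    def forward(self, v):
        h = self.body(v)
        return self.power(h), self.cp(h)

def loss_fn(model, v, p_obs, lam=1e-2):
    p_hat, cp_hat = model(v)
    data = ((p_hat - p_obs) ** 2).mean()
    # physics residual: predictions should satisfy P = 0.5 * rho * A * Cp * v^3
    physics = ((p_hat - 0.5 * RHO * AREA * cp_hat * v ** 3) ** 2).mean()
    return data + lam * physics
```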
Many modern discontinuous Galerkin (DG) methods for conservation laws make use of summation-by-parts operators and flux differencing to achieve kinetic energy preservation or entropy stability. While these techniques increase the robustness of DG methods significantly, they are also computationally more demanding than standard weak-form nodal DG methods. We present several implementation techniques to improve the efficiency of flux differencing DG methods on tensor product elements: quadrilaterals in 2D and hexahedra in 3D. The focus is mostly on CPUs and DG methods for the compressible Euler equations, although these techniques are generally also useful for other physical systems, including the compressible Navier-Stokes and magnetohydrodynamics equations. We present results using two open source codes, Trixi.jl written in Julia and FLUXO written in Fortran, to demonstrate that our proposed implementation techniques are applicable to different code bases and programming languages.
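To illustrate the kernel being optimized (a 1D sketch, not the Trixi.jl or FLUXO kernels; the entropy-conservative two-point flux for Burgers' equation stands in for the compressible Euler fluxes), the flux differencing volume term couples every pair of nodes on an element:

```python
# 1D sketch of the flux differencing volume term: r_i = -2 * sum_j D_ij * f_vol(u_i, u_j)
# with a symmetric two-point volume flux. Burgers' entropy-conservative flux is
# used here for illustration in place of the compressible Euler fluxes.
import numpy as np

def fvol_burgers(ui, uj):
    # entropy-conservative flux for Burgers' equation; f_vol(u, u) = u^2 / 2
    return (ui * ui + ui * uj + uj * uj) / 6.0

def flux_differencing_rhs(u, D):
    """u: nodal values on one element, D: SBP derivative matrix."""
    n = u.size
    r = np.zeros(n)
    for i in range(n):
        for j in range(n):
            # symmetry of f_vol means each pair flux need only be computed once;
            # production codes restrict to j > i and vectorize over node tuples
            r[i] -= 2.0 * D[i, j] * fvol_burgers(u[i], u[j])
    return r
```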
Ordinary differential equations (ODEs) are widely used to model complex dynamics arising in biology, chemistry, engineering, finance, physics, and beyond. Calibrating a complicated ODE system from noisy data is generally very difficult. In this work, we propose a two-stage nonparametric approach to address this problem. We first extract de-noised data and their higher-order derivatives using a boundary kernel method, and then feed them into a sparsely connected deep neural network with ReLU activation function. Our method recovers the ODE system without being subject to the curse of dimensionality or to a complicated ODE structure. When the ODE possesses a general modular structure, with each modular component involving only a few input variables, and the network architecture is properly chosen, our method is proven to be consistent. These theoretical properties are corroborated by an extensive simulation study that demonstrates the validity and effectiveness of the proposed method. Finally, we use our method to simultaneously characterize the growth rate of COVID-19 infection cases across the 50 states of the USA.
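A minimal sketch of the two-stage pipeline (the bandwidth, network size, and a plain Gaussian kernel in place of the boundary kernel are our assumptions): stage 1 smooths the noisy trajectory and differentiates it, stage 2 fits a ReLU network $f$ with $x'(t) \approx f(x(t))$:

```python
# Two-stage sketch: kernel smoothing + derivative estimation, then a ReLU
# network for the vector field. A Gaussian kernel stands in for the boundary
# kernel, and hyperparameters are illustrative choices.
import numpy as np
from sklearn.neural_network import MLPRegressor

def kernel_smooth(t, y, h):
    """Nadaraya-Watson smoother evaluated at the observation times."""
    W = np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)
    W /= W.sum(axis=1, keepdims=True)
    return W @ y

# stage 1: de-noise, then estimate derivatives from the smooth fit
t = np.linspace(0, 10, 200)
y = np.sin(t) + 0.1 * np.random.randn(t.size)   # noisy observations of x(t)
x_hat = kernel_smooth(t, y, h=0.3)
dx_hat = np.gradient(x_hat, t)

# stage 2: learn the vector field x' = f(x) with a sparse ReLU network
f = MLPRegressor(hidden_layer_sizes=(64, 64), activation="relu",
                 max_iter=5000).fit(x_hat.reshape(-1, 1), dx_hat)
```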
Neuro-evolutionary methods have proven effective for a wide range of tasks. However, the robustness and generalisability of evolved artificial neural networks (ANNs) have received little study. This has major implications in fields like robotics, where such controllers are used in control tasks: unexpected morphological or environmental changes during operation risk failure if the ANN controllers cannot handle them. This paper proposes an algorithm that enhances the robustness and generalisability of the controllers by introducing morphological variations during the evolutionary process. As a result, it is possible to discover generalist controllers that handle a wide range of morphological variations sufficiently well, without requiring information about the specific morphology or adaptation of the controller parameters. We perform an extensive experimental analysis in simulation that demonstrates the trade-off between specialist and generalist controllers. The results show that generalists can control a range of morphological variations at the cost of underperforming on any specific morphology relative to a specialist. This research addresses the limited understanding of robustness and generalisability in neuro-evolutionary methods and proposes a method by which to improve these properties.
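The core of such an algorithm can be sketched as fitness averaging over sampled morphological variations (the controller encoding, simulator interface, and perturbation range below are placeholders, not the paper's implementation):

```python
# Sketch: evaluate a candidate controller on several randomly perturbed
# morphologies and average the rewards, so evolution favours generalists.
# `simulate` is an assumed callable returning the task reward.
import random

def generalist_fitness(controller, base_morphology, simulate, n_variants=5):
    """Average task performance over randomly perturbed morphologies."""
    total = 0.0
    for _ in range(n_variants):
        morph = {k: v * random.uniform(0.8, 1.2)      # e.g. limb lengths / masses
                 for k, v in base_morphology.items()}
        total += simulate(controller, morph)          # task reward on this variant
    return total / n_variants
```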
Anomalous diffusion in the presence or absence of an external force field is often modelled by fractional evolution equations, which can involve a hyper-singular source term. In this case, conventional time-stepping methods may exhibit severe order reduction. Although a second-order numerical algorithm was provided for the subdiffusion model with the simple hyper-singular source term $t^{\mu}$, $-2<\mu<-1$, in [arXiv:2207.08447], its convergence analysis remains open. To fill this gap, we present a simple and robust smoothing method for the hyper-singular source term, in which the Hadamard finite-part integral is introduced. The method builds on the smoothing/ID$m$-BDF$k$ method proposed by the authors [Shi and Chen, SIAM J. Numer. Anal., to appear] for the subdiffusion equation with a weakly singular source term. We prove that the $k$th-order convergence rate can be restored for the diffusion-wave case $\gamma \in (1,2)$ and sketch the proof for the subdiffusion case $\gamma \in (0,1)$, even if the source term is hyper-singular and the initial data are incompatible. Numerical experiments confirm the theoretical results.
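To make the smoothing idea concrete, consider (in our notation, as a sketch rather than the paper's precise formulation) the model problem $\partial_t^{\gamma} u - \Delta u = f(t)$ with $f(t) = t^{\mu}$, $-2 < \mu < -1$. The hyper-singular source is replaced by its $m$-fold integral, interpreted as a Hadamard finite-part integral,
\[
(\partial_t^{-m} f)(t) = \frac{1}{(m-1)!}\,\mathrm{f.p.}\!\int_0^t (t-s)^{m-1} s^{\mu}\,\mathrm{d}s = \frac{\Gamma(\mu+1)}{\Gamma(\mu+m+1)}\,t^{\mu+m},
\]
which is locally integrable as soon as $\mu + m > -1$; the ID$m$-BDF$k$ scheme then discretises the integrated equation and recovers $u$ by differentiating $m$ times.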
We study the machine learning task for models with operators mapping between the Wasserstein space of probability measures and a space of functions, as arise, e.g., in mean-field games and control problems. Two classes of neural networks, based on bin densities and on cylindrical approximation, are proposed to learn these so-called mean-field functions, and are theoretically supported by universal approximation theorems. We perform several numerical experiments training these two mean-field neural networks, and show their accuracy and efficiency in terms of generalization error across various test distributions. Finally, we present different algorithms relying on mean-field neural networks for solving time-dependent mean-field problems, and illustrate our results with numerical tests for the example of a semi-linear partial differential equation in the Wasserstein space of probability measures.
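A minimal sketch of the cylindrical construction (architecture details are our assumptions): a measure $\mu$, represented by samples, enters the network only through empirical averages $\langle \mu, \varphi_k \rangle$ of learned feature maps, which a second network combines:

```python
# Sketch of a cylindrical mean-field network: the input measure is reduced to
# a finite vector of generalized moments E_{X~mu}[phi_k(X)] before a second
# network psi maps those moments to the output. Layer sizes are illustrative.
import torch
import torch.nn as nn

class CylindricalNet(nn.Module):
    def __init__(self, n_features=16, hidden=64):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(1, hidden), nn.ReLU(),
                                 nn.Linear(hidden, n_features))
        self.psi = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, samples):
        """samples: (N, 1) tensor of draws from the input measure mu."""
        moments = self.phi(samples).mean(dim=0)   # empirical <mu, phi_k> per feature
        return self.psi(moments)
```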
A change point detection (CPD) framework assisted by a predictive machine learning model, called "Predict and Compare", is introduced and characterised in relation to other state-of-the-art online CPD routines, which it outperforms in terms of false positive rate and out-of-control average run length. The method focuses on improving standard sequential-analysis methods such as the CUSUM rule with respect to these quality measures. This is achieved by replacing the trend estimation functionals typically used, such as the running mean, with more sophisticated predictive models (Predict step) and comparing their prognoses with the actual data (Compare step). The two models used in the Predict step are the ARIMA model and the LSTM recurrent neural network. However, the framework is formulated in general terms, so as to allow the use of prediction or comparison methods other than those tested here. The power of the method is demonstrated in a tribological case study in which change points separating the run-in, steady-state, and divergent wear phases are detected with very few false positives.
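A minimal sketch of the resulting loop (the window length, CUSUM constants, and ARIMA order below are placeholder choices, not the tuned values from the study): a CUSUM statistic is run on the residuals between the model's one-step forecast and the observed value:

```python
# Predict-and-Compare sketch: refit a one-step ARIMA predictor on a sliding
# window (Predict), form the forecast residual (Compare), and feed its
# standardised value into a two-sided CUSUM rule. Constants are illustrative.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

def predict_and_compare(series, window=50, k=0.5, h=5.0):
    s_pos, s_neg, alarms = 0.0, 0.0, []
    for t in range(window, len(series)):
        fit = ARIMA(series[t - window:t], order=(1, 0, 1)).fit()  # Predict step
        resid = series[t] - fit.forecast(1)[0]                    # Compare step
        z = resid / (np.std(fit.resid) + 1e-12)                   # standardised residual
        s_pos = max(0.0, s_pos + z - k)                           # two-sided CUSUM
        s_neg = max(0.0, s_neg - z - k)
        if s_pos > h or s_neg > h:
            alarms.append(t)                                      # change point flagged
            s_pos = s_neg = 0.0                                   # restart after alarm
    return alarms
```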
Grid maps, especially occupancy grid maps, are ubiquitous in many mobile robot applications. To simplify the process of learning the map, grid maps subdivide the world into a grid of cells whose occupancies are independently estimated using only the measurements in the perceptual field of each particular cell. However, the world consists of objects that span multiple cells, so measurements falling onto one cell provide evidence about the occupancy of other cells belonging to the same object. Current models do not capture this correlation. In this work, we present a way to generalize the grid map update that relaxes the independence assumption: we model the relationship between the measurements and the occupancy of each cell as a set of latent variables and jointly estimate those variables and the posterior of the map. Additionally, we propose estimating the latent variables by clustering based on semantic labels, and an extension to the Normal Distributions Transform Occupancy Map (NDT-OM) that facilitates the proposed map update. We perform comprehensive map creation and localization experiments with real-world datasets and show that the proposed method creates better maps in highly dynamic environments than state-of-the-art methods. Finally, we demonstrate the ability of the proposed method to remove occluded objects from the map in a lifelong map update scenario.
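As a toy illustration of the underlying idea, not the paper's NDT-OM extension: if cells sharing a semantic label are treated as one latent object, a measurement can update the log-odds of every cell in that cluster:

```python
# Toy sketch of evidence sharing across correlated cells. The log-odds
# increments l_hit and the sharing factor are made-up values; the paper
# instead estimates latent variables jointly with the map posterior.
import numpy as np

def update_with_clusters(log_odds, labels, hit_cell, l_hit=0.85, share=0.3):
    """log_odds: flat grid of per-cell log-odds; labels: semantic label per cell."""
    log_odds[hit_cell] += l_hit                          # direct evidence
    same_object = np.flatnonzero(labels == labels[hit_cell])
    for c in same_object:
        if c != hit_cell:
            log_odds[c] += share * l_hit                 # correlated evidence
    return log_odds
```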
Atmospheric systems incorporating thermal dynamics must be stable with respect to both energy and entropy. While energy conservation can be enforced via the preservation of the skew-symmetric structure of the Hamiltonian form of the equations of motion, entropy conservation is typically derived as an additional invariant of the Hamiltonian system and satisfied via the exact preservation of the chain rule. This is particularly challenging because the function spaces used to represent the thermodynamic variables in compatible finite element discretisations are typically discontinuous at element boundaries. In the present work we circumvent this problem by constructing our equations of motion via weighted averages of skew-symmetric formulations that use both flux-form and material-form advection of the thermodynamic variables, which allow for the cancellations required to conserve entropy without the chain rule. We show that such formulations allow for stable simulations of both the thermal shallow water and 3D compressible Euler equations on the sphere using mixed compatible finite elements, without entropy damping.
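As a sketch of the construction (our notation; $s$ a thermodynamic variable, $\rho$ the density, $\mathbf{u}$ the velocity, and $\alpha$ an assumed weighting parameter), the flux and material forms of advection read
\[
\partial_t(\rho s) + \nabla\cdot(\rho s\,\mathbf{u}) = 0, \qquad \partial_t s + \mathbf{u}\cdot\nabla s = 0,
\]
and the equations of motion are built from a weighted average such as
\[
\alpha\,\bigl[\partial_t(\rho s) + \nabla\cdot(\rho s\,\mathbf{u})\bigr] + (1-\alpha)\,\rho\,\bigl[\partial_t s + \mathbf{u}\cdot\nabla s\bigr] = 0,
\]
so that the discrete entropy budget closes through term-by-term cancellations rather than through the chain rule across discontinuous element boundaries.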
This work uses the entropy-regularised relaxed stochastic control perspective as a principled framework for designing reinforcement learning (RL) algorithms. Here the agent interacts with the environment by generating noisy controls distributed according to the optimal relaxed policy. The noisy policies, on the one hand, explore the space and hence facilitate learning, but, on the other hand, introduce bias by assigning positive probability to non-optimal actions. This exploration-exploitation trade-off is determined by the strength of the entropy regularisation. We study algorithms resulting from two entropy regularisation formulations: the exploratory control approach, where entropy is added to the cost objective, and the proximal policy update approach, where entropy penalises the divergence between policies of consecutive episodes. We focus on the finite-horizon continuous-time linear-quadratic (LQ) RL problem, in which linear dynamics with unknown drift coefficients are controlled subject to quadratic costs. In this setting, both algorithms yield a Gaussian relaxed policy. We quantify the precise difference between the value functions of a Gaussian policy and its noisy evaluation, and show that the execution noise must be independent across time. By tuning the frequency of sampling from the relaxed policies and the parameter governing the strength of the entropy regularisation, we prove that the regret of both learning algorithms is of order $\mathcal{O}(\sqrt{N})$ (up to a logarithmic factor) over $N$ episodes, matching the best known result from the literature.
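In this LQ setting the relaxed policy admits a simple sampling form; the following is a sketch in our notation for a 1D state, where the variance formula assumes an entropy weight $\tau$ and a quadratic control cost coefficient $R$:

```python
# Sketch (not the paper's algorithm): sampling a control from a Gaussian
# relaxed policy whose mean is the usual linear feedback and whose variance
# scales with the entropy regularisation strength tau. K, R, tau are assumed.
import numpy as np

def gaussian_relaxed_policy(x, K, tau, R):
    """Sample a control a ~ N(-K x, tau / (2 R)); K is the LQ feedback gain."""
    mean = -K * x
    var = tau / (2.0 * R)     # exploration variance set by entropy regularisation
    return np.random.normal(mean, np.sqrt(var))
```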
We hypothesize that, due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the others. Such behavior is counter-intuitive and, as we observe empirically, hurts the models' generalization. To estimate a model's dependence on each modality, we compute the gain in accuracy when the model has access to that modality in addition to another one; we refer to this gain as the conditional utilization rate. In our experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since the conditional utilization rate cannot be computed efficiently during training, we introduce a proxy based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm that balances the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.
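In code, the quantity is a one-line difference of validation accuracies (helper and variable names below are ours):

```python
# Sketch of the conditional utilization rate: the gain in accuracy from adding
# modality i on top of modality j alone. The accuracy values are illustrative.
def conditional_utilization_rate(acc_both, acc_without_i):
    """u_i = Acc(m_i, m_j) - Acc(m_j): accuracy gain attributable to m_i."""
    return acc_both - acc_without_i

# example: a model scoring 0.92 with both modalities but 0.90 with audio alone
# barely utilizes the visual stream: u_visual = 0.02
u_visual = conditional_utilization_rate(0.92, 0.90)
```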