Trending topics often result from information spreading between users of social networks, and such topics can be regarded as rumors. The spreading of a rumor is often studied with the same techniques as epidemic spreading. Because many datasets do not contain enough measured variables, we propose a method for studying the general behavior of spreaders by selecting estimated variables given by a deterministic model. To provide a good approximation, we implement an adaptive neuro-fuzzy inference system (ANFIS). In our numerical experiments, a deterministic approach using SIR and SIRS models (with delay) is applied to two different topics. Thus, the ANFIS model is applied directly, with the deterministic model serving as the preprocessing stage that supplies the input variables.
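The deterministic preprocessing stage can be sketched as a classical SIR system, where S are users who have not seen the rumor, I are active spreaders, and R are users who have stopped spreading. This is a minimal illustration, not the authors' code; the parameter values are hypothetical.

```python
# Minimal SIR sketch for rumor spreading (illustrative only).
import numpy as np
from scipy.integrate import solve_ivp

def sir_rhs(t, y, beta, gamma):
    s, i, r = y
    ds = -beta * s * i              # contact between spreaders and susceptibles
    di = beta * s * i - gamma * i   # spreaders eventually lose interest
    dr = gamma * i
    return [ds, di, dr]

# Hypothetical rates: transmission beta, stifling gamma.
beta, gamma = 0.5, 0.1
sol = solve_ivp(sir_rhs, (0, 60), [0.99, 0.01, 0.0],
                args=(beta, gamma), dense_output=True)
s, i, r = sol.y
# Trajectories such as I(t) would then serve as the estimated
# input variables fed to the ANFIS model.
```

A delayed SIRS variant would additionally route a fraction of R back to S after a fixed delay; the structure of the preprocessing step remains the same.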
Executive Order (EO) 14028, "Improving the Nation's Cybersecurity" (12 May 2021), directs the National Institute of Standards and Technology (NIST) to recommend minimum standards for software testing within 60 days. This document describes eleven recommended software verification techniques, along with supplemental information about the techniques and references for further reading. The recommended techniques are:

- Threat modeling to look for design-level security issues
- Automated testing for consistency and to minimize human effort
- Static code scanning to look for top bugs
- Heuristic tools to look for possible hardcoded secrets
- Use of built-in checks and protections
- "Black box" test cases
- Code-based structural test cases
- Historical test cases
- Fuzzing
- Web app scanners, if applicable
- Addressing included code (libraries, packages, services)

The document does not address the totality of software verification; instead, it recommends techniques that are broadly applicable and form the minimum standards. The document was developed by NIST in consultation with the National Security Agency (NSA). Additionally, we received input from numerous outside organizations through papers submitted to a NIST workshop on the Executive Order held in early June 2021, discussion at the workshop, and follow-up with several of the submitters.
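To make one of the recommendations concrete, the "fuzzing" technique amounts to feeding a target function many randomized inputs and checking that it never crashes. The sketch below is only an illustration; `parse_key_value` is a hypothetical stand-in for the code under test, and real fuzzers (e.g., coverage-guided ones) are far more sophisticated.

```python
# Toy fuzzing loop: random inputs, any uncaught exception is a finding.
import random

def parse_key_value(line: str) -> dict:
    """Hypothetical target: parse 'key=value' pairs separated by ';'."""
    result = {}
    for pair in line.split(";"):
        if "=" in pair:
            key, _, value = pair.partition("=")
            result[key.strip()] = value.strip()
    return result

def fuzz(target, iterations=1000, seed=0):
    rng = random.Random(seed)
    alphabet = "abc=;\x00 \t\n"        # deliberately includes hostile characters
    failures = []
    for _ in range(iterations):
        s = "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 40)))
        try:
            target(s)
        except Exception as exc:
            failures.append((s, exc))
    return failures

failures = fuzz(parse_key_value)
```

An empty `failures` list means no crashing input was found within the budget, not that none exists.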
To systematically validate the safe behavior of automated vehicles (AVs), scenario-based testing aims to cluster the infinite situations an AV might encounter into a finite set of functional scenarios. Every functional scenario, however, can still manifest itself in a vast number of variations. Thus, metamodels are often used to perform analyses or to select specific variations for examination. However, despite the safety-critical nature of AV testing, metamodels are usually treated as just one part of an overall approach, and their predictions are not examined further. In this paper, we analyze the predictive performance of Gaussian processes (GPs), deep Gaussian processes, extra-trees (ETs), and Bayesian neural networks (BNNs), considering four scenarios with 5 to 20 inputs. Building on this, we introduce and evaluate an iterative approach for efficiently selecting test cases. Our results show that, with regard to predictive performance, the appropriate selection of test cases is more important than the choice of metamodel. While their great flexibility allows BNNs to benefit from large amounts of data and to model even the most complex scenarios, less flexible models such as GPs offer higher reliability. This implies that relevant test cases should first be explored using scalable virtual environments and flexible models, so that more realistic test environments and more trustworthy models can then be used for targeted testing and validation.
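One common way to select test cases iteratively with a GP metamodel is to query the scenario variation where the model is most uncertain. The sketch below assumes a toy criticality function standing in for an expensive simulation, and is only an illustration of the general idea, not the paper's specific selection criterion.

```python
# Uncertainty-driven selection of scenario variations with a GP metamodel.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def criticality(x):
    # Hypothetical scenario metric (e.g., a time-to-collision surrogate).
    return np.sin(3 * x[:, 0]) + 0.5 * x[:, 1] ** 2

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(10, 2))            # initially evaluated variations
y = criticality(X)
candidates = rng.uniform(-1, 1, size=(200, 2))  # pool of possible test cases

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), normalize_y=True)
for _ in range(5):                               # iterative test-case selection
    gp.fit(X, y)
    _, std = gp.predict(candidates, return_std=True)
    nxt = candidates[np.argmax(std)]             # most uncertain variation
    X = np.vstack([X, nxt])
    y = np.append(y, criticality(nxt[None, :]))  # "simulate" the new case
```

In practice, the acquisition rule might also weight uncertainty by predicted criticality so that sampling concentrates on safety-relevant regions.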
In this chapter, we show how to efficiently model high-dimensional extreme peaks-over-threshold events over space in complex non-stationary settings, using extended latent Gaussian models (LGMs), and how to exploit the fitted model in practice for the computation of long-term return levels. The extended LGM framework assumes that the data follow a specific parametric distribution whose unknown parameters are transformed using a multivariate link function and then further modeled at the latent level in terms of fixed and random effects that have a joint Gaussian distribution. In the extremal context, we assume here that the data-level distribution is described by a Poisson point process likelihood, which is motivated by asymptotic extreme-value theory and conveniently exploits information from all threshold exceedances. This contrasts with the more common, data-wasteful approach based on block maxima, which are typically modeled with the generalized extreme-value (GEV) distribution. When conditional independence can be assumed at the data level and the latent random effects have a sparse probabilistic structure, fast approximate Bayesian inference becomes possible in very high dimensions; here we present the recently proposed inference approach called "Max-and-Smooth", which provides an exceptional speed-up compared to alternative methods. The proposed methodology is illustrated by application to satellite-derived precipitation data over Saudi Arabia, obtained from the Tropical Rainfall Measuring Mission, with 2738 grid cells and about 20 million spatio-temporal observations in total. Our fitted model captures the spatial variability of extreme precipitation satisfactorily, and our results show that the most intense precipitation events are expected near the south-western part of Saudi Arabia, along the Red Sea coastline.
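For readers less familiar with the point process formulation, its standard form from extreme-value theory (the chapter's exact parameterization may differ) is the following. With observations $y_1,\dots,y_n$ over $n_y$ blocks (e.g., years) and a high threshold $u$, the likelihood for the GEV-linked parameters $(\mu,\sigma,\xi)$ uses every exceedance:

```latex
L(\mu,\sigma,\xi)
\;\propto\;
\exp\!\left\{-\,n_y\left[1+\xi\,\frac{u-\mu}{\sigma}\right]_+^{-1/\xi}\right\}
\prod_{i:\,y_i>u}
\frac{1}{\sigma}\left[1+\xi\,\frac{y_i-\mu}{\sigma}\right]_+^{-1/\xi-1},
```

where $[\,\cdot\,]_+ = \max(\cdot,0)$. The same $(\mu,\sigma,\xi)$ describe the GEV distribution of block maxima, which is why this approach subsumes the block-maxima one while using far more of the data.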
This paper introduces, for the first time to our knowledge, physics-informed neural networks to accurately estimate the AC-OPF result, and delivers rigorous guarantees about their performance. Power system operators, along with several other actors, increasingly use Optimal Power Flow (OPF) algorithms for a wide range of applications, including planning and real-time operations. However, in its original form, the AC Optimal Power Flow problem is often challenging to solve, as it is non-linear and non-convex. Besides the large number of approximations and relaxations, recent efforts have also focused on machine learning approaches, especially neural networks. So far, however, these approaches have only partially exploited the physical models available during training. More importantly, they have offered no guarantees about potential constraint violations of their output. Our approach (i) introduces the AC power flow equations inside neural network training and (ii) integrates methods that rigorously determine and reduce the worst-case constraint violations across the entire input domain, while maintaining the optimality of the prediction. We demonstrate how physics-informed neural networks achieve higher accuracy and lower constraint violations than standard neural networks, and show how we can further reduce the worst-case violations for all neural networks.
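A generic sketch of step (i): the supervised loss is augmented with the residual of the AC power flow equations evaluated on the network's own output. The code below uses the textbook polar-form power flow equations; the loss weights and the simple mean-squared form are illustrative placeholders, not the paper's implementation.

```python
# Physics-informed loss: data mismatch plus AC power flow residual.
import numpy as np

def power_flow_residual(V, theta, P, Q, G, B):
    """Mismatch of the AC power flow equations at each bus.
    V, theta: voltage magnitudes/angles; P, Q: net injections;
    G, B: real/imaginary parts of the bus admittance matrix."""
    n = len(V)
    dP = np.empty(n)
    dQ = np.empty(n)
    for i in range(n):
        dth = theta[i] - theta
        dP[i] = P[i] - V[i] * np.sum(V * (G[i] * np.cos(dth) + B[i] * np.sin(dth)))
        dQ[i] = Q[i] - V[i] * np.sum(V * (G[i] * np.sin(dth) - B[i] * np.cos(dth)))
    return np.concatenate([dP, dQ])

def pinn_loss(pred, target, residual, w_data=1.0, w_physics=1.0):
    # Illustrative composite loss; gradients of both terms flow into training.
    return (w_data * np.mean((pred - target) ** 2)
            + w_physics * np.mean(residual ** 2))
```

Step (ii), the worst-case guarantee, is a separate verification problem (typically a mixed-integer program over the trained network) and is not captured by this sketch.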
Non-Gaussian spatial and spatio-temporal data are becoming increasingly prevalent, and their analysis is needed in a variety of disciplines, such as those involving small-area demographics or global satellite remote sensing. FRK is an R package for spatial/spatio-temporal modelling and prediction with very large data sets that, to date, has only supported linear process models and Gaussian data models. In this paper, we describe a major upgrade to FRK that allows non-Gaussian data to be analysed in a generalised linear mixed model framework. The existing functionality of FRK is retained with this advance into non-linear, non-Gaussian models; in particular, it still provides automatic basis-function construction, it can handle both point-referenced and areal data simultaneously, and it can predict process values at any spatial support from these data. These vastly more general spatial and spatio-temporal models are fitted using the Laplace approximation via the software TMB. The new version of FRK also allows a large number of basis functions to be used when modelling the spatial process, and is thus often able to achieve more accurate predictions than previous versions of the package in a Gaussian setting. We demonstrate the innovative features of this new version of FRK, highlight its ease of use, and compare it to alternative packages using both simulated and real data sets.
We present Probabilistic Decision Model and Notation (pDMN), a probabilistic extension of Decision Model and Notation (DMN). DMN is a modeling notation for deterministic decision logic that aims to be user-friendly and low in complexity. pDMN extends DMN with probabilistic reasoning, predicates, functions, quantification, and a new hit policy. At the same time, it aims to retain DMN's user-friendliness so that domain experts can use it without the help of IT staff. pDMN models can be unambiguously translated into ProbLog programs to answer user queries. ProbLog is a probabilistic extension of Prolog flexible enough to model and reason over any pDMN model.
Deep learning has been widely used within learning algorithms for robotics. One disadvantage of deep networks is that they are black-box representations, so the learned approximations ignore existing knowledge of physics or robotics. Especially for learning dynamics models, such black-box models are undesirable, as the underlying principles are well understood and standard deep networks can learn dynamics that violate them. To learn dynamics models with deep networks that guarantee physically plausible dynamics, we introduce physics-inspired deep networks that combine first principles from physics with deep learning. We incorporate Lagrangian mechanics within the model learning such that all approximated models adhere to the laws of physics and conserve energy. Deep Lagrangian Networks (DeLaN) parametrize the system energy using two networks. The parameters are obtained by minimizing the squared residual of the Euler-Lagrange differential equation. Therefore, the resulting model does not require specific knowledge of the individual system, is interpretable, and can be used as a forward, inverse, and energy model. Previously, these properties were only obtained with system identification techniques that require knowledge of the kinematic structure. We apply DeLaN to learning dynamics models and use these models to control simulated and physical rigid-body systems. The results show that the proposed approach yields dynamics models that can be applied to physical systems for real-time control. Compared to standard deep networks, the physics-inspired networks learn more accurate models and capture the underlying structure of the dynamics.
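The construction above can be summarized in its standard form (notation may differ slightly from the paper). DeLaN parametrizes the Lagrangian with a positive-definite mass matrix network $\mathbf{H}(\mathbf{q})$ and a potential energy network $V(\mathbf{q})$:

```latex
\mathcal{L}(\mathbf{q}, \dot{\mathbf{q}})
  = \tfrac{1}{2}\,\dot{\mathbf{q}}^\top \mathbf{H}(\mathbf{q})\,\dot{\mathbf{q}} - V(\mathbf{q}),
\qquad
\frac{d}{dt}\frac{\partial \mathcal{L}}{\partial \dot{\mathbf{q}}}
  - \frac{\partial \mathcal{L}}{\partial \mathbf{q}} = \boldsymbol{\tau}.
```

Expanding the Euler-Lagrange equation gives the predicted torque

```latex
\hat{\boldsymbol{\tau}}
 = \mathbf{H}(\mathbf{q})\,\ddot{\mathbf{q}}
 + \dot{\mathbf{H}}(\mathbf{q})\,\dot{\mathbf{q}}
 - \tfrac{1}{2}\left(\frac{\partial}{\partial \mathbf{q}}
     \bigl(\dot{\mathbf{q}}^\top \mathbf{H}(\mathbf{q})\,\dot{\mathbf{q}}\bigr)\right)^{\!\top}
 + \frac{\partial V}{\partial \mathbf{q}},
```

and the network parameters $\theta$ are trained by minimizing the squared residual $\sum_i \lVert \boldsymbol{\tau}_i - \hat{\boldsymbol{\tau}}_\theta(\mathbf{q}_i, \dot{\mathbf{q}}_i, \ddot{\mathbf{q}}_i) \rVert^2$, with positive-definiteness of $\mathbf{H}$ typically enforced through a Cholesky parameterization.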
The tendency of a jet to stay attached to a flat or convex surface is called the Coandă effect and has many potential technical applications. The aim of this thesis is to assess how well Computational Fluid Dynamics can capture it. A Reynolds-Averaged Navier-Stokes (RANS) approach with a 2-dimensional domain was first used to simulate an offset jet on a flat plane. Whether the $k$-$\omega$ SST or the $k$-$\epsilon$ turbulence model was used, a good prediction of the flow was obtained. Since streamline curvature is known to have an important impact on numerical results, a jet blown tangentially over a cylinder was then considered. Using the same approach as for the flat plane, with the $k$-$\omega$ SST turbulence model, some flow features such as the separation location and the velocity profiles near the jet exit were accurately predicted. However, the jet development was overall poorly captured. A curvature correction was then introduced into the turbulence model; while it slightly improved the jet development, its negative impact on other quantities makes its benefits questionable. Due to the known presence of longitudinal and spanwise vortices in the flow, a RANS approach with a 3-dimensional domain was attempted, but it was only able to reproduce the 2-dimensional results. However, if the longitudinal vortices are artificially generated at the inlet, the simulation does sustain their development. Finally, since the shortcomings of the numerical results obtained might be due to limitations of the RANS approach, a Large Eddy Simulation was attempted. Unfortunately, due to time and computational restrictions, a fully developed flow could not be obtained; the methodology and preliminary results are nevertheless presented.
We propose a new method of estimation in topic models that is not a variation on the existing simplex-finding algorithms and that estimates the number of topics K from the observed data. We derive new finite-sample minimax lower bounds for the estimation of A, as well as new upper bounds for our proposed estimator. We describe the scenarios in which our estimator is minimax adaptive. Our finite-sample analysis is valid for any number of documents (n), individual document length (N_i), dictionary size (p), and number of topics (K); both p and K are allowed to increase with n, a situation not handled well by previous analyses. We complement our theoretical results with a detailed simulation study. We illustrate that the new algorithm is faster and more accurate than current ones, even though it starts out with a computational and theoretical disadvantage: it does not know the correct number of topics K, whereas we provide the competing methods with the correct value in our simulations.
Clinical Named Entity Recognition (CNER) aims to identify and classify clinical terms such as diseases, symptoms, treatments, exams, and body parts in electronic health records, and is a fundamental and crucial task for clinical and translational research. In recent years, deep neural networks have achieved significant success in named entity recognition and many other Natural Language Processing (NLP) tasks. Most of these algorithms are trained end-to-end and can automatically learn features from large-scale labeled datasets. However, these data-driven methods typically lack the capability to process rare or unseen entities. Previous statistical methods and feature-engineering practice have demonstrated that human knowledge can provide valuable information for handling rare and unseen cases. In this paper, we address the problem by incorporating dictionaries into deep neural networks for the Chinese CNER task. Two different architectures that extend the Bi-directional Long Short-Term Memory (Bi-LSTM) neural network and five different feature representation schemes are proposed to handle the task. Computational results on the CCKS-2017 Task 2 benchmark dataset show that the proposed method achieves highly competitive performance compared with state-of-the-art deep learning methods.
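To make the dictionary idea concrete, one simple feature representation scheme marks each character that falls inside a dictionary entry found by greedy longest-match n-gram search. The paper proposes five such schemes; the variant below, including the example dictionary, is only a hypothetical illustration.

```python
# Per-character dictionary flags via greedy longest-match n-gram lookup.
def dictionary_features(sentence, dictionary, max_len=6):
    """Return a 0/1 flag per character: 1 if the character lies inside
    any dictionary entry (longest match wins at each start position)."""
    flags = [0] * len(sentence)
    for start in range(len(sentence)):
        for length in range(min(max_len, len(sentence) - start), 0, -1):
            if sentence[start:start + length] in dictionary:
                for k in range(start, start + length):
                    flags[k] = 1
                break
    return flags

# Hypothetical example: the symptom terms '头痛' (headache) and
# '发热' (fever) are in the clinical dictionary.
dictionary = {"头痛", "发热"}
feats = dictionary_features("患者头痛三天", dictionary)
```

In a Bi-LSTM tagger, such flags (or richer BIES-style position markers) would typically be embedded and concatenated with the character embeddings before the recurrent layer.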