The Loewner framework is an interpolatory approach designed for approximating linear and nonlinear systems. The goal here is to extend this framework to linear parametric systems with an arbitrary number n of parameters. One main innovation established here is the construction of data-based realizations for any number of parameters. Equally importantly, we show how to alleviate the computational burden by avoiding the explicit construction of large-scale n-dimensional Loewner matrices of size $N \times N$. This reduces the complexity from $O(N^3)$ to about $O(N^{1.4})$, thus taming the curse of dimensionality and making the solution scalable to very large data sets. To achieve this, a new generalized multivariate rational function realization is defined. We then introduce the n-dimensional multivariate Loewner matrices and show that they can be computed by solving a coupled set of Sylvester equations. The null space of these Loewner matrices then allows the construction of the multivariate barycentric transfer function. The principal result of this work is to show how the null space of the n-dimensional Loewner matrix can be computed using a sequence of 1-dimensional Loewner matrices, leading to a drastic reduction of the computational burden. Finally, we suggest two algorithms (one direct and one iterative) to construct, directly from data, multivariate (or parametric) realizations ensuring (approximate) interpolation. Numerical examples highlight the effectiveness and scalability of the method.
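For context, the univariate building block underlying this construction can be written down explicitly; the notation below (right data $(\lambda_i, w_i)$, left data $(\mu_j, v_j)$, weights $c_i$) is a standard 1-D illustration, not the paper's multivariate formulation:
\[
\mathbb{L}_{j,i} = \frac{v_j - w_i}{\mu_j - \lambda_i}, \qquad
M\,\mathbb{L} - \mathbb{L}\,\Lambda = V\,\mathbf{1}^{\top} - \mathbf{1}\,W,
\]
with $M = \mathrm{diag}(\mu_j)$, $\Lambda = \mathrm{diag}(\lambda_i)$, $V$ the column vector of left values $v_j$, and $W$ the row vector of right values $w_i$; that is, the 1-D Loewner matrix solves a Sylvester equation built purely from data. A weight vector $c = [c_1,\dots,c_k]^{\top}$ in the null space of $\mathbb{L}$ then yields the barycentric interpolant
\[
H(s) = \left. \sum_{i=1}^{k} \frac{c_i\, w_i}{s - \lambda_i} \right/ \sum_{i=1}^{k} \frac{c_i}{s - \lambda_i},
\]
which matches the data; the multivariate case described above extends this with one variable per parameter.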
A visualized graph is a powerful tool for data analysis and synthesis tasks. Here, the task of visualization consists not only in displaying vertices and edges according to the graph representation, but also in ensuring that the result is visually simple and comprehensible for a human. Thus, the visualization process involves solving several problems, one of which is the construction of a topological drawing of a planar part of a non-planar graph with a minimum number of removed edges. In this manuscript, we consider a mathematical model for representing the topological drawing of a graph, based on the theory of vertex rotations and the induced simple cycles that satisfy Mac Lane's planarity criterion. It is shown that the topological drawing of a non-planar graph can be constructed on the basis of a selected planar part of the graph. The topological model of a graph drawing allows us to reduce the brute-force enumeration problem of identifying a plane graph to a discrete optimization problem: searching for a subset of the set of isometric cycles of the graph for which Mac Lane's functional vanishes. To isolate the planar part of the graph, a new computational method has been developed based on linear algebra and the algebra of structural numbers. The proposed method has polynomial computational complexity.
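As a reminder, and only as an illustration of the kind of objective involved (the manuscript's exact functional may be defined differently), Mac Lane's criterion states that a graph $G=(V,E)$ is planar if and only if its cycle space admits a basis in which every edge belongs to at most two basis cycles. For a candidate set of cycles $C_1,\dots,C_m$ one can therefore penalize over-covered edges, e.g.
\[
F(C_1,\dots,C_m) \;=\; \sum_{e \in E} \max\bigl(0,\; \kappa(e) - 2\bigr), \qquad \kappa(e) = \#\{\, i : e \in C_i \,\},
\]
so that $F = 0$ over a basis of the cycle space corresponds to Mac Lane's planarity condition for the selected part.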
Consistent and holistic expression of software requirements is important for the success of software projects. In this study, we aim to enhance the efficiency of software development processes by automatically identifying conflicting and duplicate software requirement specifications. We formulate conflict and duplicate detection as a requirement pair classification task. We design a novel transformer-based architecture, SR-BERT, which incorporates Sentence-BERT and bi-encoders for the conflict and duplicate identification task. Furthermore, we apply supervised multi-stage fine-tuning to the pre-trained transformer models. We test the performance of different transformer models using four different datasets. We find that sequentially trained and fine-tuned transformer models perform well across the datasets, with SR-BERT achieving the best performance on the larger datasets. We also explore the cross-domain performance of conflict detection models and adopt a rule-based filtering approach to validate the model classifications. Our analysis indicates that the sentence pair classification approach and the proposed transformer-based natural language processing strategies can contribute significantly to achieving automation in conflict and duplicate detection.
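To make the bi-encoder pair-classification idea concrete, here is a minimal sketch (not SR-BERT itself): each requirement in a pair is encoded with a pre-trained sentence encoder, the usual $(u, v, |u-v|)$ pair feature is built, and a simple classifier is fitted. The encoder name, labels, and toy pairs are placeholders.

```python
# Illustrative bi-encoder requirement-pair classifier; SR-BERT's exact
# architecture, labels, and multi-stage fine-tuning are not reproduced here.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

# Hypothetical labelled requirement pairs: 0 = neutral, 1 = conflict, 2 = duplicate.
pairs = [
    ("The system shall lock after 3 failed logins.",
     "The system shall allow unlimited login attempts.", 1),
    ("Users must reset passwords every 90 days.",
     "Passwords shall be changed every 90 days.", 2),
    ("Reports are exported as PDF.",
     "The UI uses a dark theme by default.", 0),
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder works here

def pair_features(a, b):
    """Encode both requirements and build the (u, v, |u - v|) pair feature."""
    u, v = encoder.encode([a, b])
    return np.concatenate([u, v, np.abs(u - v)])

X = np.stack([pair_features(a, b) for a, b, _ in pairs])
y = np.array([label for _, _, label in pairs])

clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict(X))  # sanity check on the toy training pairs
```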
LiDAR-based SLAM is a core technology for autonomous vehicles and robots. Despite the intense research activity in this field, each proposed system uses a particular sensor post-processing pipeline and a single map representation format. The present work introduces a new point of view for 3D LiDAR SLAM and localization: (1) using view-based maps ("simple-maps") as the fundamental map representation, from which arbitrary metric maps optimized for particular tasks can then be generated; and (2) introducing a framework in which mapping pipelines can be defined without coding, by connecting a network of reusable blocks, much as deep-learning networks are designed by connecting layers of standardized elements. Moreover, we introduce the idea of including the current linear and angular velocity vectors as variables to be optimized within the ICP loop, leading to superior robustness against aggressive motion profiles without an IMU. The presented open-source ecosystem, released for ROS 2, includes tools and prebuilt pipelines covering everything from data acquisition to map editing and visualization, real-time localization, loop-closure detection, and map georeferencing from consumer-grade GNSS receivers. Extensive experimental validation reveals that the proposal compares well with, or improves on, former state-of-the-art (SOTA) LiDAR odometry systems, while also successfully mapping some hard sequences where others diverge. A proposed self-adaptive configuration has been used, without parameter changes, for all 3D LiDAR datasets with sensors between 16 and 128 rings, extensively tested on 83 sequences over more than 250 km of automotive, hand-held, airborne, and quadruped LiDAR datasets, both indoors and outdoors. The open-source implementation is available online at //github.com/MOLAorg/mola
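One way to write such a velocity-aware ICP objective (purely illustrative, not necessarily the exact formulation implemented in the framework) is to de-skew every raw point with a constant-velocity model using its per-point timestamp and to optimize the velocities jointly with the pose:
\[
\min_{T,\,\mathbf v,\,\boldsymbol\omega}\;\sum_{k}\rho\!\left(\bigl\| T\,\bigl(\exp(t_k[\boldsymbol\omega]_\times)\,\mathbf p_k + t_k \mathbf v\bigr) - \mathbf q_k \bigr\|\right),
\]
where $\mathbf p_k$ is a raw LiDAR point with intra-scan timestamp $t_k$, $\mathbf q_k$ its matched map point, $T$ the scan pose, $(\mathbf v, \boldsymbol\omega)$ the linear and angular velocities, $[\cdot]_\times$ the skew-symmetric matrix, and $\rho$ a robust loss.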
Approximate Bayesian computation (ABC) is the most popular approach to inferring parameters when the data model is specified in the form of a simulator. Standard Monte Carlo methods for inference cannot be implemented directly in such a model, because the likelihood is not available to evaluate pointwise. The main idea of ABC is to perform inference on an alternative model with an approximate likelihood (the ABC likelihood), estimated at each iteration from points simulated from the data model. The central challenge of ABC is then to trade off the bias introduced by approximating the model against the variance introduced by estimating the ABC likelihood. Stabilising the variance of the ABC likelihood requires a computational cost that is exponential in the dimension of the data, so the most common approach to reducing the variance is to perform inference conditional on summary statistics. In this paper we introduce a new approach to estimating the ABC likelihood: using iterative ensemble Kalman inversion (IEnKI) (Iglesias, 2016; Iglesias et al., 2018). We first introduce new estimators of the marginal likelihood in the case of a Gaussian data model using the IEnKI output, then show how this may be used in ABC. Performance is illustrated on the Lotka-Volterra model, where we observe substantial improvements over standard ABC and other commonly used approaches.
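As a point of reference for the quantity being estimated, here is a minimal sketch of plain kernel-ABC on a toy Gaussian model (not the IEnKI-based estimator proposed in the paper); the simulator, prior, and bandwidth are placeholders.

```python
# Minimal kernel-ABC sketch: the ABC likelihood at a parameter value is estimated
# from simulator draws with a Gaussian kernel of bandwidth eps.
import numpy as np

rng = np.random.default_rng(0)

def simulator(theta, n=20):
    """Toy data model: n iid N(theta, 1) observations, summarized by their mean."""
    return rng.normal(theta, 1.0, size=n).mean()

y_obs = 1.3          # observed summary statistic
eps = 0.2            # ABC kernel bandwidth
M = 50               # simulations per parameter value

def abc_likelihood(theta):
    """Monte Carlo estimate of the ABC likelihood p_eps(y_obs | theta)."""
    sims = np.array([simulator(theta) for _ in range(M)])
    kern = np.exp(-0.5 * ((sims - y_obs) / eps) ** 2) / (eps * np.sqrt(2 * np.pi))
    return kern.mean()

# Self-normalized importance sampling over a prior sample (prior: N(0, 2^2)).
thetas = rng.normal(0.0, 2.0, size=2000)
weights = np.array([abc_likelihood(t) for t in thetas])
weights /= weights.sum()
print("ABC posterior mean ~", np.sum(weights * thetas))
```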
The most common approach to causal modelling is the potential outcomes framework due to Neyman and Rubin. In this framework, outcomes of counterfactual treatments are assumed to be well-defined. This metaphysical assumption is often thought to be problematic yet indispensable. The conventional approach relies not only on counterfactuals, but also on abstract notions of distributions and assumptions of independence that are not directly testable. In this paper, we construe causal inference as treatment-wise predictions for finite populations where all assumptions are testable; this means that one can not only test predictions themselves (without any fundamental problem), but also investigate sources of error when they fail. The new framework highlights the model-dependence of causal claims as well as the difference between statistical and scientific inference.
When building machine learning models, feature selection (FS) stands out as an essential preprocessing step used to handle uncertainty and vagueness in the data. Recently, the minimum Redundancy and Maximum Relevance (mRMR) approach has proven to be effective in obtaining non-redundant feature subsets. Owing to the generation of voluminous datasets, it is essential to design scalable solutions using distributed/parallel paradigms. MapReduce solutions have proven to be among the best approaches for designing fault-tolerant and scalable solutions. This work analyses the existing MapReduce approaches for mRMR feature selection and identifies their limitations. We propose VMR_mRMR, an efficient vertical-partitioning-based approach that uses memoization, thereby overcoming the limitations of the extant approaches. The experimental analysis shows that VMR_mRMR significantly outperforms the extant approaches and achieves a better computational gain (C.G.). In addition, we conduct a comparative analysis with the horizontal partitioning approach HMR_mRMR [1] to assess the strengths and limitations of the proposed approach.
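For readers unfamiliar with the mRMR criterion itself, here is a single-machine sketch (relevance minus average redundancy, with memoized pairwise mutual information); the distributed vertical-partitioning machinery of VMR_mRMR is not reproduced, and the dataset, subset size, and names are illustrative.

```python
# Greedy mRMR sketch: pick features maximizing relevance to the class while
# penalizing mean redundancy with already selected features.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

X, y = make_classification(n_samples=300, n_features=15, n_informative=5, random_state=0)

relevance = mutual_info_classif(X, y, random_state=0)   # I(feature; class)

_mi_cache = {}
def pairwise_mi(i, j):
    """Memoize feature-feature mutual information so each pair is computed once."""
    key = (min(i, j), max(i, j))
    if key not in _mi_cache:
        _mi_cache[key] = mutual_info_regression(X[:, [i]], X[:, j], random_state=0)[0]
    return _mi_cache[key]

k = 5                                                   # target subset size (illustrative)
selected = [int(np.argmax(relevance))]                  # start from the most relevant feature
remaining = set(range(X.shape[1])) - set(selected)

while len(selected) < k:
    # mRMR score: relevance minus mean redundancy w.r.t. the selected subset
    scores = {j: relevance[j] - np.mean([pairwise_mi(j, c) for c in selected])
              for j in remaining}
    best = max(scores, key=scores.get)
    selected.append(best)
    remaining.discard(best)

print("mRMR-selected features:", selected)
```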
A spatiotemporal deep learning framework is proposed that is capable of 2D full-field prediction of fracture in concrete mesostructures. The framework not only predicts fractures but also captures the entire history of the fracture process, from crack initiation in the interfacial transition zone to the subsequent propagation of cracks into the mortar matrix. In addition, a convolutional neural network is developed that predicts the averaged stress-strain curve of the mesostructures. The UNet architecture, an encoder-decoder network with skip connections, is used as the deep learning surrogate model. Training and test data are generated from high-fidelity fracture simulations of randomly generated concrete mesostructures. These mesostructures include geometric variabilities such as different aggregate particle geometries, spatial distributions, and total aggregate volume fractions. The fracture simulations are carried out in Abaqus, using the cohesive phase-field technique for fracture modeling. In this work, to reduce the number of training datasets required, the spatial distribution of three sets of material properties for the three-phase concrete mesostructures, together with the spatial phase-field damage index, is fed to the UNet to predict the corresponding stress and spatial damage index at the subsequent step. It is shown that, after training on 470 datasets with this methodology, the UNet model accurately predicts damage on the unseen test dataset. Moreover, another novel aspect of this work is the conversion of irregular finite element data into regular grids using a developed pipeline. This approach allows a less complex UNet architecture to be used and facilitates the integration of phase-field fracture equations into surrogate models in future developments.
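The encoder-decoder-with-skip-connections idea can be sketched in a few lines of PyTorch; the channel counts and the 4-channel input (three material-property maps plus the current damage field) loosely follow the abstract and are illustrative, not the authors' exact architecture.

```python
# Tiny UNet-style surrogate sketch: regular-grid fields in, next-step damage field out.
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU())

class TinyUNet(nn.Module):
    def __init__(self, in_ch=4, out_ch=1):
        super().__init__()
        self.enc1, self.enc2 = block(in_ch, 16), block(16, 32)
        self.pool = nn.MaxPool2d(2)
        self.bott = block(32, 64)
        self.up2 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec2 = block(64, 32)          # 32 (skip) + 32 (upsampled)
        self.up1 = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec1 = block(32, 16)          # 16 (skip) + 16 (upsampled)
        self.head = nn.Conv2d(16, out_ch, 1)

    def forward(self, x):
        e1 = self.enc1(x)                  # full resolution
        e2 = self.enc2(self.pool(e1))      # 1/2 resolution
        b = self.bott(self.pool(e2))       # 1/4 resolution
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)               # next-step damage field

# Shape check on a dummy 64x64 regular-grid sample (batch of 1, 4 input channels).
print(TinyUNet()(torch.zeros(1, 4, 64, 64)).shape)  # torch.Size([1, 1, 64, 64])
```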
We propose and study a one-dimensional model consisting of two cross-diffusion systems coupled via a moving interface. The motivation stems from the modelling of complex diffusion processes in the context of the vapor deposition of thin films. In our model, cross-diffusion of the various chemical species is modelled by a size-exclusion system in the solid phase and by the Stefan-Maxwell system in the gaseous phase. The coupling between the two phases is modelled by linear phase-transition laws of Butler-Volmer type, resulting in an interface evolution. The properties of the continuous model are investigated, in particular its entropy variational structure and its stationary states. We introduce a two-point flux approximation finite volume scheme. The moving interface is handled with a moving-mesh approach, in which the mesh is locally deformed around the interface. The resulting discrete nonlinear system is shown to admit a solution that preserves the main properties of the continuous system, namely mass conservation, nonnegativity, volume-filling constraints, decay of the free energy, and asymptotics. In particular, the moving-mesh approach is compatible with the entropy structure of the continuous model. Numerical results illustrate these properties and the dynamics of the model.
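For readers unfamiliar with the discretization building block, the generic two-point flux approximation on a 1-D mesh reads, schematically (the actual scheme involves nonlinear cross-diffusion mobilities, the entropy structure, and the locally deformed mesh),
\[
\frac{|K|}{\Delta t}\bigl(u_K^{n+1} - u_K^{n}\bigr) + \sum_{L \sim K} F_{K,L}^{n+1} = 0, \qquad
F_{K,L} = \tau_{K,L}\,\bigl(u_K - u_L\bigr), \quad \tau_{K,L} = \frac{D}{|x_L - x_K|},
\]
where $|K|$ is the cell measure, $L \sim K$ runs over the neighbours of cell $K$, and $\tau_{K,L}$ is the transmissibility between the cell centers $x_K$ and $x_L$.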
The Bicameral Cache is a cache organization proposal for a vector architecture that segregates data according to their access type, distinguishing scalar from vector references. Its aim is to prevent the two types of references from interfering with each other's data locality, with a special focus on prioritizing the performance of vector references. The proposed system incorporates an additional, non-polluting prefetching mechanism that helps populate the long vector cache lines in advance, increasing the hit rate by further exploiting the spatial locality of vector data. The evaluation was conducted on the Cavatools simulator, comparing performance against a conventional cache over typical vector benchmarks for several vector lengths. The results show that the proposed cache speeds up stride-1 vector benchmarks while hardly impacting non-stride-1 ones. In addition, the prefetching feature consistently provided additional benefit.
The paper presents a framework for online learning of the Koopman operator from streaming data. Many complex systems for which data-driven modeling and control are sought provide streaming sensor data, whose abundance can present computational challenges but cannot be ignored. Streaming data can intermittently sample dynamically different regimes or rare events that could be critical to model and control. Using ideas from subspace identification, we present a method in which the Grassmannian distance between the subspace of an extended observability matrix and that of the streaming segment of data is used to assess the `novelty' of the data. If this distance is above a threshold, the segment is added to an archive and the Koopman operator is updated; otherwise it is discarded. In this way, our method identifies data from segments of trajectories of a dynamical system that belong to different dynamical regimes, prioritizes minimizing the amount of data needed to update the Koopman model, and further reduces the number of basis functions by learning them adaptively. By dynamically adjusting the amount of data used and learning the basis functions, our method optimizes the model's accuracy and the system order.
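A minimal sketch of the subspace-based novelty test is given below (illustrative only; the observables, threshold, and Koopman update rule used in the paper are not reproduced). The Grassmannian distance is taken here as the 2-norm of the principal angles between the two subspaces.

```python
# Novelty test sketch: compare the subspace of an archived (observability-like)
# matrix with the subspace spanned by a new streaming segment.
import numpy as np
from scipy.linalg import subspace_angles, orth

rng = np.random.default_rng(1)

def grassmann_distance(A, B):
    """Distance between span(A) and span(B) from their principal angles."""
    return np.linalg.norm(subspace_angles(orth(A), orth(B)))

# Archived subspace and two new streaming segments: one from the same regime
# (same span plus small noise), one from a different regime (random span).
archive = rng.standard_normal((50, 3))
same_regime = archive @ rng.standard_normal((3, 3)) + 1e-3 * rng.standard_normal((50, 3))
new_regime = rng.standard_normal((50, 3))

threshold = 0.5  # illustrative novelty threshold (radians)
for name, seg in [("same regime", same_regime), ("new regime", new_regime)]:
    d = grassmann_distance(archive, seg)
    action = "archive segment + update Koopman model" if d > threshold else "discard"
    print(f"{name}: distance = {d:.2f} -> {action}")
```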