Many computer vision applications require robust and efficient estimation of camera geometry from a minimal number of input data measurements, i.e., solving minimal problems in a RANSAC framework. Minimal problems are usually formulated as complex systems of sparse polynomials. The systems usually are overdetermined and consist of polynomials with algebraically constrained coefficients. Most state-of-the-art efficient polynomial solvers are based on the action matrix method that has been automated and highly optimized in recent years. On the other hand, the alternative theory of sparse resultants and Newton polytopes has been less successful for generating efficient solvers, primarily because the polytopes do not respect the constraints on the coefficients. Therefore, in this paper, we propose a simple iterative scheme to test various subsets of the Newton polytopes and search for the most efficient solver. Moreover, we propose to use an extra polynomial with a special form to further improve the solver efficiency via a Schur complement computation. We show that for some camera geometry problems our extra polynomial-based method leads to smaller and more stable solvers than the state-of-the-art Grobner basis-based solvers. The proposed method can be fully automated and incorporated into existing tools for automatic generation of efficient polynomial solvers. It provides a competitive alternative to popular Grobner basis-based methods for minimal problems in computer vision. We also study the conditions under which the minimal solvers generated by the state-of-the-art action matrix-based methods and the proposed extra polynomial resultant-based method, are equivalent. Specifically we consider a step-by-step comparison between the approaches based on the action matrix and the sparse resultant, followed by a set of substitutions, which would lead to equivalent minimal solvers.
We present a new framework for modelling multivariate extremes, based on an angular-radial representation of the probability density function. Under this representation, the problem of modelling multivariate extremes is transformed to that of modelling an angular density and the tail of the radial variable, conditional on angle. Motivated by univariate theory, we assume that the tail of the conditional radial distribution converges to a generalised Pareto (GP) distribution. To simplify inference, we also assume that the angular density is continuous and finite and the GP parameter functions are continuous with angle. We refer to the resulting model as the semi-parametric angular-radial (SPAR) model for multivariate extremes. We consider the effect of the choice of polar coordinate system and introduce generalised concepts of angular-radial coordinate systems and generalised scalar angles in two dimensions. We show that under certain conditions, the choice of polar coordinate system does not affect the validity of the SPAR assumptions. However, some choices of coordinate system lead to simpler representations. In contrast, we show that the choice of margin does affect whether the model assumptions are satisfied. In particular, the use of Laplace margins results in a form of the density function for which the SPAR assumptions are satisfied for many common families of copula, with various dependence classes. We show that the SPAR model provides a more versatile framework for characterising multivariate extremes than provided by existing approaches, and that several commonly-used approaches are special cases of the SPAR model. Moreover, the SPAR framework provides a means of characterising all `extreme regions' of a joint distribution using a single inference. Applications in which this is useful are discussed.
Rational function approximations provide a simple but flexible alternative to polynomial approximation, allowing one to capture complex non-linearities without oscillatory artifacts. However, there have been few attempts to use rational functions on noisy data due to the likelihood of creating spurious singularities. To avoid the creation of singularities, we use Bernstein polynomials and appropriate conditions on their coefficients to force the denominator to be strictly positive. While this reduces the range of rational polynomials that can be expressed, it keeps all the benefits of rational functions while maintaining the robustness of polynomial approximation in noisy data scenarios. Our numerical experiments on noisy data show that existing rational approximation methods continually produce spurious poles inside the approximation domain. This contrasts our method, which cannot create poles in the approximation domain and provides better fits than a polynomial approximation and even penalized splines on functions with multiple variables. Moreover, guaranteeing pole-free in an interval is critical for estimating non-constant coefficients when numerically solving differential equations using spectral methods. This provides a compact representation of the original differential equation, allowing numeric solvers to achieve high accuracy quickly, as seen in our experiments.
This paper presents an application of artificial intelligence on mass spectrometry data for detecting habitability potential of ancient Mars. Although data was collected for planet Mars the same approach can be replicated for any terrestrial object of our solar system. Furthermore, proposed methodology can be adapted to any domain that uses mass spectrometry. This research is focused in data analysis of two mass spectrometry techniques, evolved gas analysis (EGA-MS) and gas chromatography (GC-MS), which are used to identify specific chemical compounds in geological material samples. The study demonstrates the applicability of EGA-MS and GC-MS data to extra-terrestrial material analysis. Most important features of proposed methodology includes square root transformation of mass spectrometry values, conversion of raw data to 2D sprectrograms and utilization of specific machine learning models and techniques to avoid overfitting on relative small datasets. Both EGA-MS and GC-MS datasets come from NASA and two machine learning competitions that the author participated and exploited. Complete running code for the GC-MS dataset/competition is available at GitHub.1 Raw training mass spectrometry data include [0, 1] labels of specific chemical compounds, selected to provide valuable insights and contribute to our understanding of the potential past habitability of Mars.
Friction drag from a turbulent fluid moving past or inside an object plays a crucial role in domains as diverse as transportation, public utility infrastructure, energy technology, and human health. As a direct measure of the shear-induced friction forces, an accurate prediction of the wall-shear stress can contribute to sustainability, conservation of resources, and carbon neutrality in civil aviation as well as enhanced medical treatment of vascular diseases and cancer. Despite such importance for our modern society, we still lack adequate experimental methods to capture the instantaneous wall-shear stress dynamics. In this contribution, we present a holistic approach that derives velocity and wall-shear stress fields with impressive spatial and temporal resolution from flow measurements using a deep optical flow estimator with physical knowledge. The validity and physical correctness of the derived flow quantities is demonstrated with synthetic and real-world experimental data covering a range of relevant fluid flows.
HyperNetX (HNX) is an open source Python library for the analysis and visualization of complex network data modeled as hypergraphs. Initially released in 2019, HNX facilitates exploratory data analysis of complex networks using algebraic topology, combinatorics, and generalized hypergraph and graph theoretical methods on structured data inputs. With its 2023 release, the library supports attaching metadata, numerical and categorical, to nodes (vertices) and hyperedges, as well as to node-hyperedge pairings (incidences). HNX has a customizable Matplotlib-based visualization module as well as HypernetX-Widget, its JavaScript addon for interactive exploration and visualization of hypergraphs within Jupyter Notebooks. Both packages are available on GitHub and PyPI. With a growing community of users and collaborators, HNX has become a preeminent tool for hypergraph analysis.
We study signals that are sparse in graph spectral domain and develop explicit algorithms to reconstruct the support set as well as partial components from samples on few vertices of the graph. The number of required samples is independent of the total size of the graph and takes only local properties of the graph into account. Our results rely on an operator based framework for subspace methods and become effective when the spectral eigenfunctions are zero-free or linear independent on small sets of the vertices. The latter has recently been adressed using algebraic methods by the first author.
Context: The combination of distributed stream processing with microservice architectures is an emerging pattern for building data-intensive software systems. In such systems, stream processing frameworks such as Apache Flink, Apache Kafka Streams, Apache Samza, Hazelcast Jet, or the Apache Beam SDK are used inside microservices to continuously process massive amounts of data in a distributed fashion. While all of these frameworks promote scalability as a core feature, there is only little empirical research evaluating and comparing their scalability. Objective: The goal of this study to obtain evidence about the scalability of state-of-the-art stream processing framework in different execution environments and regarding different scalability dimensions. Method: We benchmark five modern stream processing frameworks regarding their scalability using a systematic method. We conduct over 740 hours of experiments on Kubernetes clusters in the Google cloud and in a private cloud, where we deploy up to 110 simultaneously running microservice instances, which process up to one million messages per second. Results: All benchmarked frameworks exhibit approximately linear scalability as long as sufficient cloud resources are provisioned. However, the frameworks show considerable differences in the rate at which resources have to be added to cope with increasing load. There is no clear superior framework, but the ranking of the frameworks depends on the use case. Using Apache Beam as an abstraction layer still comes at the cost of significantly higher resource requirements regardless of the use case. We observe our results regardless of scaling load on a microservice, scaling the computational work performed inside the microservice, and the selected cloud environment. Moreover, vertical scaling can be a complementary measure to achieve scalability of stream processing frameworks.
Finding the optimal design of experiments in the Bayesian setting typically requires estimation and optimization of the expected information gain functional. This functional consists of one outer and one inner integral, separated by the logarithm function applied to the inner integral. When the mathematical model of the experiment contains uncertainty about the parameters of interest and nuisance uncertainty, (i.e., uncertainty about parameters that affect the model but are not themselves of interest to the experimenter), two inner integrals must be estimated. Thus, the already considerable computational effort required to determine good approximations of the expected information gain is increased further. The Laplace approximation has been applied successfully in the context of experimental design in various ways, and we propose two novel estimators featuring the Laplace approximation to alleviate the computational burden of both inner integrals considerably. The first estimator applies Laplace's method followed by a Laplace approximation, introducing a bias. The second estimator uses two Laplace approximations as importance sampling measures for Monte Carlo approximations of the inner integrals. Both estimators use Monte Carlo approximation for the remaining outer integral estimation. We provide three numerical examples demonstrating the applicability and effectiveness of our proposed estimators.
We describe the implementation of the Giudici-Green Metropolis sampling method for decomposable graphs using a variety of structures to represent the graph. These comprise the graph itself, the Junction tree, the Almond tree and the Ibarra clique-separator graph. For each structure, we describe the process for ascertaining whether adding or deleting a specific edge results in a new graph that is also decomposable, and the updates that need to be made to the structure if the edge perturbation is made. For the Almond tree and Ibarra graph these procedures are novel. We find that using the graph itself is generally at least competitive in terms of computational efficiency for a variety of graph distributions, but note that the other structures may allow and suggest samplers using different perturbations with lower rejection rates and/or better mixing properties. The sampler has applications in estimating graphical models for systems of multivariate Gaussian or Multinomial variables.
In large-scale systems there are fundamental challenges when centralised techniques are used for task allocation. The number of interactions is limited by resource constraints such as on computation, storage, and network communication. We can increase scalability by implementing the system as a distributed task-allocation system, sharing tasks across many agents. However, this also increases the resource cost of communications and synchronisation, and is difficult to scale. In this paper we present four algorithms to solve these problems. The combination of these algorithms enable each agent to improve their task allocation strategy through reinforcement learning, while changing how much they explore the system in response to how optimal they believe their current strategy is, given their past experience. We focus on distributed agent systems where the agents' behaviours are constrained by resource usage limits, limiting agents to local rather than system-wide knowledge. We evaluate these algorithms in a simulated environment where agents are given a task composed of multiple subtasks that must be allocated to other agents with differing capabilities, to then carry out those tasks. We also simulate real-life system effects such as networking instability. Our solution is shown to solve the task allocation problem to 6.7% of the theoretical optimal within the system configurations considered. It provides 5x better performance recovery over no-knowledge retention approaches when system connectivity is impacted, and is tested against systems up to 100 agents with less than a 9% impact on the algorithms' performance.