State-space models (SSMs) are a powerful statistical tool for modelling time-varying systems via a latent state. In these models, the latent state is never directly observed; instead, a sequence of data points related to the state is obtained. The linear-Gaussian state-space model is widely used since it allows for exact inference when all model parameters are known; however, this is rarely the case. Estimating these parameters is a challenging but essential task for inference and prediction. In the linear-Gaussian model, the state dynamics are described by a state transition matrix. This model parameter is known to be hard to estimate, since it encodes the relationships between the state elements, which are never observed. In many applications, the transition matrix is sparse because not all state components directly affect all other state components, yet most parameter estimation methods do not exploit this feature. In this work we propose SpaRJ, a fully probabilistic Bayesian approach that obtains sparse samples from the posterior distribution of the transition matrix. Our method explores sparsity by traversing a set of models that exhibit differing sparsity patterns in the transition matrix. Moreover, we design new, effective rules to explore transition matrices within the same level of sparsity. The methodology has strong theoretical guarantees and unveils the latent structure of the data generating process, thereby enhancing interpretability. The performance of SpaRJ is showcased in an example with a 144-dimensional parameter space and in a numerical example with real data.
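As background for readers unfamiliar with the model class, the following minimal NumPy sketch (illustrative dimensions, noise levels, and sparsity pattern; it simulates data only and is not the SpaRJ sampler) shows a linear-Gaussian state-space model whose transition matrix contains many exact zeros:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 4-dimensional linear-Gaussian state-space model
# with a sparse state transition matrix A (many exact zeros).
A = np.array([[0.9, 0.0, 0.0, 0.0],
              [0.2, 0.7, 0.0, 0.0],
              [0.0, 0.0, 0.8, 0.0],
              [0.0, 0.0, 0.1, 0.6]])
H = np.eye(4)            # observation matrix
Q = 0.1 * np.eye(4)      # state noise covariance
R = 0.5 * np.eye(4)      # observation noise covariance

T = 200
x = np.zeros(4)
states, obs = [], []
for _ in range(T):
    x = A @ x + rng.multivariate_normal(np.zeros(4), Q)   # latent state update
    y = H @ x + rng.multivariate_normal(np.zeros(4), R)   # noisy observation
    states.append(x)
    obs.append(y)

obs = np.asarray(obs)    # only `obs` would be available to an estimator of A
```

An estimator such as SpaRJ would be given only `obs` and would have to recover both the values and the zero pattern of `A`.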
Within the statistical literature, there is a lack of methods that allow for asymmetric multivariate spatial effects to model the relations underlying complex spatial phenomena. Intercropping is one such phenomenon. In this ancient agricultural practice, multiple crop species or varieties are cultivated together in close proximity and are subject to mutual competition. To properly analyse such a system, it is necessary to account for both within- and between-plot effects, where the between-plot effects are asymmetric. Building on the multivariate spatial autoregressive model and the Gaussian graphical model, the proposed method takes asymmetric spatial relations into account, thereby removing some of the limiting factors of spatial analyses and giving researchers a better indication of the existence and extent of spatial relationships. Using a Bayesian estimation framework, the model shows promising results in a simulation study. The model is applied to intercropping data consisting of Belgian endive and beetroot, illustrating the usage of the proposed methodology. An R package containing the proposed methodology can be found at //CRAN.R-project.org/package=SAGM.
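For context, the classical single-response spatial autoregressive (SAR) model on which such approaches build can be written in its textbook form (given here only as background, not as the authors' asymmetric multivariate extension) as

$$ y = \rho W y + X\beta + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2 I), $$

where $W$ is a known spatial weight matrix and $\rho$ governs the strength of spatial dependence. The method described above instead models multiple responses jointly and allows the between-plot effects to be asymmetric, with estimation carried out in a Bayesian framework.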
Time-space diagrams are essential tools for analyzing traffic patterns and optimizing transportation infrastructure and traffic management strategies. Traditional data collection methods for these diagrams have limitations in terms of temporal and spatial coverage. Recent advancements in camera technology have overcome these limitations and provided extensive urban data. In this study, we propose an innovative approach to constructing time-space diagrams by utilizing street-view video sequences captured by cameras mounted on moving vehicles. Using the state-of-the-art YOLOv5 detector, the StrongSORT tracker, and photogrammetry techniques for distance calculation, we infer vehicle trajectories from the video data and generate time-space diagrams. To evaluate the effectiveness of our proposed method, we utilized datasets from the KITTI computer vision benchmark suite. The evaluation results demonstrate that our approach can generate trajectories from video data, although there are some errors that can be mitigated by improving the performance of the detector, tracker, and distance calculation components. In conclusion, the utilization of street-view video sequences captured by cameras mounted on moving vehicles, combined with state-of-the-art computer vision techniques, has immense potential for constructing comprehensive time-space diagrams. These diagrams offer valuable insights into traffic patterns and contribute to the design of transportation infrastructure and traffic management strategies.
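As an aside, monocular distance estimation from a calibrated camera is often based on the pinhole model; the sketch below (a hypothetical function, an assumed known vehicle height, and a roughly KITTI-like focal length) illustrates that kind of calculation rather than the exact photogrammetry pipeline used in the study:

```python
def estimate_distance_m(focal_length_px: float,
                        real_height_m: float,
                        bbox_height_px: float) -> float:
    """Pinhole-camera range estimate: distance = f * H_real / h_pixels.

    focal_length_px: camera focal length expressed in pixels
    real_height_m:   assumed physical height of the detected vehicle
    bbox_height_px:  height of the detection bounding box in pixels
    """
    return focal_length_px * real_height_m / bbox_height_px

# Illustrative values (focal length of ~721 px, assumed 1.5 m vehicle height):
print(estimate_distance_m(721.5, 1.5, 48.0))  # ~22.5 m to the detected vehicle
```

Combining such per-frame range estimates with tracked vehicle identities over time is what yields the trajectories plotted in a time-space diagram.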
Bayesian inference paradigms are regarded as powerful tools for the solution of inverse problems. However, when applied to inverse problems in the physical sciences, Bayesian formulations suffer from a number of inconsistencies that are often overlooked. A well-known, but mostly neglected, difficulty is connected to the notion of conditional probability densities. Borel, and later Kolmogorov (1933/1956), found that the traditional definition of conditional densities is incomplete: in different parameterizations it leads to different results. We show an example where two apparently correct procedures applied to the same problem lead to two widely different results. Another type of inconsistency involves violation of causality. This problem is found in model selection strategies in Bayesian inversion, such as Hierarchical Bayes and Trans-Dimensional Inversion, where so-called hyperparameters are included as variables to control either the number (or type) of unknowns or the prior uncertainties on data or model parameters. For Hierarchical Bayes we demonstrate that the calculated 'prior' distributions of data or model parameters are not prior, but posterior, information. In fact, the calculated 'standard deviations' of the data are a measure of the inability of the forward function to model the data, rather than of uncertainties in the data. For trans-dimensional inverse problems we show that the so-called evidence is, in fact, not a measure of the success of fitting the data for the given choice (or number) of parameters, as often claimed. We also find that the notion of Natural Parsimony is ill-defined, because of its dependence on the parameter prior. Based on this study, we find that careful rethinking of Bayesian inversion practices is required, with special emphasis on ways of avoiding the Borel-Kolmogorov inconsistency and on the way we interpret model selection results.
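For readers unfamiliar with the inconsistency being referred to, the textbook Borel-Kolmogorov illustration (given here only as background; it is not necessarily the example worked out in this study) considers a uniform distribution on the unit sphere, parameterized by longitude $\lambda$ and latitude $\phi$. Conditioning on the equator versus conditioning on a meridian gives

$$ p(\lambda \mid \phi = 0) = \frac{1}{2\pi}, \qquad\text{but}\qquad p(\phi \mid \lambda = 0) = \frac{\cos\phi}{2}, \quad \phi \in \left[-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right]. $$

Both conditioning sets are great circles of the same sphere, yet the first conditional is uniform and the second is not: the limiting conditional density depends on how the measure-zero conditioning event is parameterized.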
Critical points mark locations in the domain where the level-set topology of a scalar function undergoes fundamental changes and thus indicate potentially interesting features in the data. Established methods exist to locate and relate such points in a deterministic setting, but it is less well understood how the concept of critical points can be extended to the analysis of uncertain data. Most methods for this task aim at finding likely locations of critical points or at estimating the probability of their occurrence locally, but do not indicate if critical points at potentially different locations in different realizations of a stochastic process are manifestations of the same feature, which is required to characterize the spatial uncertainty of critical points. Previous work on relating critical points across different realizations reported challenges for interpreting the resulting spatial distribution of critical points but did not investigate the causes. In this work, we provide a mathematical formulation of the problem of finding critical points with spatial uncertainty and computing their spatial distribution, which leads us to the notion of uncertain critical points. We analyze the theoretical properties of these structures and highlight connections to existing works for special classes of uncertain fields. We derive conditions under which well-interpretable results can be obtained and discuss the implications of those restrictions for the field of visualization. We demonstrate that the discussed limitations are not purely academic but also arise in real-world data.
Foundation models, such as OpenAI's GPT-3 and GPT-4, Meta's LLaMA, and Google's PaLM2, have revolutionized the field of artificial intelligence. A notable paradigm shift has been the advent of the Segment Anything Model (SAM), which, trained on 1 billion masks and 11 million images, has exhibited a remarkable capability to segment real-world objects. Although SAM excels in general object segmentation, it lacks the intrinsic ability to detect salient objects, resulting in suboptimal performance in this domain. To address this challenge, we present the Segment Salient Object Model (SSOM), an innovative approach that adaptively fine-tunes SAM for salient object detection by harnessing the low-rank structure inherent in deep learning. Comprehensive qualitative and quantitative evaluations across five challenging RGB benchmark datasets demonstrate the superior performance of our approach, surpassing state-of-the-art methods.
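For illustration, low-rank adaptation of a frozen pretrained layer is commonly implemented along the following lines; this PyTorch sketch is a generic LoRA-style wrapper with illustrative rank and scaling, not the authors' SSOM implementation:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear with a trainable low-rank update W + (alpha/r) * B @ A."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():      # keep the pretrained weights frozen
            p.requires_grad = False
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

# Illustrative usage: adapt a single projection layer while training only A and B
layer = LoRALinear(nn.Linear(256, 256), rank=8)
out = layer(torch.randn(4, 256))
```

Because only the small matrices A and B receive gradients, such adapters allow a large frozen model to be specialized to a new task with a tiny fraction of its parameters.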
Novel view synthesis and 3D modeling using implicit neural field representations have been shown to be very effective for calibrated multi-view cameras. Such representations are known to benefit from additional geometric and semantic supervision. Most existing methods that exploit additional supervision require dense pixel-wise labels or localized scene priors. These methods cannot benefit from high-level, vague scene priors provided in terms of scene descriptions. In this work, we aim to leverage the geometric prior of Manhattan scenes to improve implicit neural radiance field representations. More precisely, we assume only that the indoor scene under investigation is a Manhattan scene -- with no additional information whatsoever -- and that its Manhattan coordinate frame is unknown. This high-level prior is used to self-supervise the surface normals derived explicitly from the implicit neural fields. Our modeling allows us to cluster the derived normals and exploit their orthogonality constraints for self-supervision. Our exhaustive experiments on datasets of diverse indoor scenes demonstrate the significant benefit of the proposed method over the established baselines. The source code will be available at //github.com/nikola3794/normal-clustering-nerf.
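One hypothetical way to turn such a Manhattan assumption into a self-supervision signal is an alignment loss that pushes each derived normal toward one of three estimated orthogonal axes; the sketch below only conveys the idea and does not reproduce the clustering objective used in the paper (the actual code is in the linked repository):

```python
import torch

def manhattan_alignment_loss(normals: torch.Tensor, axes: torch.Tensor) -> torch.Tensor:
    """Encourage each unit normal to align with one of three (estimated) orthogonal axes.

    normals: (N, 3) unit surface normals derived from the neural field
    axes:    (3, 3) current estimate of the Manhattan frame (rows are unit axes)
    """
    cos = torch.abs(normals @ axes.T)                # (N, 3) |cosine| to each axis
    return (1.0 - cos.max(dim=1).values).mean()      # 0 when every normal matches some axis

# Illustrative call with random normals and the canonical frame as the axis estimate
normals = torch.nn.functional.normalize(torch.randn(100, 3), dim=1)
print(manhattan_alignment_loss(normals, torch.eye(3)))
```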
Even though the analysis of unsteady 2D flow fields is challenging, fluid mechanics experts generally have an intuition about where in the simulation domain specific features are expected. Using this intuition, showing similar regions enables the user to discover flow patterns within the simulation data. When the focus is on similarity, a solid mathematical framework for a specific flow pattern is not required. We propose a technique that visualizes similar and dissimilar regions with respect to a region selected by the user. Using infinitesimal strain theory, we capture the strain and rotation progression and therefore the dynamics of fluid parcels along pathlines, which we encode as distributions. We then apply the Jensen-Shannon divergence to compute the (dis)similarity between pathline dynamics originating in a user-defined flow region and the pathline dynamics of the flow field. We validate our method by applying it to two simulation datasets of two-dimensional unsteady flows. Our results show that our approach is suitable for analyzing the similarity of time-dependent flow fields.
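For concreteness, the Jensen-Shannon divergence between two discrete distributions (here standing in for histograms of strain and rotation values accumulated along pathlines) can be computed as in the following NumPy sketch; the binning and example values are illustrative:

```python
import numpy as np

def jensen_shannon_divergence(p: np.ndarray, q: np.ndarray, eps: float = 1e-12) -> float:
    """JSD between two discrete distributions (e.g. histograms of pathline dynamics)."""
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log((a + eps) / (b + eps)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Illustrative histograms over, e.g., binned strain/rotation values along pathlines
p = np.array([0.1, 0.4, 0.3, 0.2])
q = np.array([0.3, 0.3, 0.2, 0.2])
print(jensen_shannon_divergence(p, q))   # 0 for identical distributions, ln(2) for disjoint ones
```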
Generative models, an important family of statistical models, aim to learn the observed data distribution by generating new instances. Along with the rise of neural networks, deep generative models, such as variational autoencoders (VAEs) and generative adversarial networks (GANs), have made tremendous progress in 2D image synthesis. Recently, researchers have shifted their attention from the 2D space to the 3D space, considering that 3D data aligns more closely with our physical world and hence holds great potential in practice. However, unlike a 2D image, which by nature has an efficient representation (i.e., a pixel grid), representing 3D data poses far greater challenges. Concretely, an ideal 3D representation should be capable enough to model shapes and appearances in detail, and highly efficient so as to model high-resolution data with fast speed and low memory cost. However, existing 3D representations, such as point clouds, meshes, and recent neural fields, usually fail to meet these requirements simultaneously. In this survey, we make a thorough review of the development of 3D generation, including 3D shape generation and 3D-aware image synthesis, from the perspective of both algorithms and, more importantly, representations. We hope that our discussion can help the community track the evolution of this field and spark innovative ideas to advance this challenging task.
Graphs are used widely to model complex systems, and detecting anomalies in a graph is an important task in the analysis of complex systems. Graph anomalies are patterns in a graph that do not conform to normal patterns expected of the attributes and/or structures of the graph. In recent years, graph neural networks (GNNs) have been studied extensively and have successfully performed difficult machine learning tasks in node classification, link prediction, and graph classification, thanks to their highly expressive capability to learn graph representations effectively via message passing. To solve the graph anomaly detection problem, GNN-based methods leverage information about the graph attributes (or features) and/or structures to learn to score anomalies appropriately. In this survey, we review the recent advances made in detecting graph anomalies using GNN models. Specifically, we summarize GNN-based methods according to the graph type (i.e., static and dynamic), the anomaly type (i.e., node, edge, subgraph, and whole graph), and the network architecture (e.g., graph autoencoder, graph convolutional network). To the best of our knowledge, this survey is the first comprehensive review of graph anomaly detection methods based on GNNs.
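As a toy illustration of the graph-autoencoder family of methods mentioned above, the NumPy sketch below scores nodes by how poorly their neighborhood structure is reconstructed from a low-dimensional embedding; it uses untrained random weights purely for illustration, whereas actual methods learn the encoder by minimizing the reconstruction error:

```python
import numpy as np

def structure_reconstruction_scores(A: np.ndarray, X: np.ndarray, hidden: int = 8,
                                    seed: int = 0) -> np.ndarray:
    """Toy graph-autoencoder-style anomaly scores: nodes whose neighborhood structure
    is poorly reconstructed from a low-dimensional embedding score higher.
    (Random, untrained weights; real methods learn W by minimizing this error.)"""
    rng = np.random.default_rng(seed)
    A_hat = A + np.eye(len(A))                          # add self-loops
    d = A_hat.sum(1)
    A_norm = A_hat / np.sqrt(np.outer(d, d))            # symmetric normalization
    W = rng.normal(size=(X.shape[1], hidden))
    H = np.maximum(A_norm @ X @ W, 0.0)                 # one GCN-style layer
    A_rec = 1.0 / (1.0 + np.exp(-H @ H.T))              # inner-product decoder
    return ((A - A_rec) ** 2).sum(axis=1)               # per-node reconstruction error

# Illustrative use on a tiny graph with one-hot node features
A = np.array([[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 0, 0]], dtype=float)
print(structure_reconstruction_scores(A, np.eye(4)))
```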
Object detection is a fundamental task in computer vision and image processing. Current deep learning based object detectors have been highly successful with abundant labeled data, but in real life it is not guaranteed that each object category has enough labeled samples for training, and these large object detectors are prone to overfitting when the training data is limited. Therefore, it is necessary to introduce few-shot learning and zero-shot learning into object detection, which together can be termed low-shot object detection. Low-Shot Object Detection (LSOD) aims to detect objects from a few or even zero labeled samples, and can be categorized into few-shot object detection (FSOD) and zero-shot object detection (ZSD), respectively. This paper conducts a comprehensive survey of deep learning based FSOD and ZSD. First, this survey classifies methods for FSOD and ZSD into different categories and discusses their pros and cons. Second, this survey reviews dataset settings and evaluation metrics for FSOD and ZSD, then analyzes the performance of different methods on these benchmarks. Finally, this survey discusses future challenges and promising directions for FSOD and ZSD.