Closure spaces, a generalisation of topological spaces, have proven to be a convenient theoretical framework for spatial model checking. The closure operator of closure spaces and quasi-discrete closure spaces induces a notion of neighborhood akin to the one in topological spaces, which is based on open sets. In this paper we present three notions of bisimilarity for closure models and quasi-discrete closure models, each logically characterised by a corresponding modal logic with spatial modalities: (i) CM-bisimilarity for closure models (CMs) is shown to generalise Topo-bisimilarity for topological models. CM-bisimilarity corresponds to equivalence with respect to the infinitary modal logic IML, which includes the modality ${\cal N}$ for ``being near''. (ii) CMC-bisimilarity, with `CMC' standing for CM-bisimilarity with converse, refines CM-bisimilarity for quasi-discrete closure spaces, the carriers of quasi-discrete closure models. Quasi-discrete closure models come equipped with two closure operators, Direct ${\cal C}$ and Converse ${\cal C}$, stemming from the binary relation underlying closure and its converse. CMC-bisimilarity is captured by the infinitary modal logic IMLC, which includes two modalities, Direct ${\cal N}$ and Converse ${\cal N}$, corresponding to the two closure operators. (iii) CoPa-bisimilarity on quasi-discrete closure models, which is weaker than CMC-bisimilarity, is based on the notion of compatible paths. The logical counterpart of CoPa-bisimilarity is the infinitary modal logic ICRL with modalities Direct $\zeta$ and Converse $\zeta$, whose semantics relies on forward and backward paths, respectively. It is shown that CoPa-bisimilarity for quasi-discrete closure models relates to divergence-blind stuttering equivalence for Kripke structures.
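For reference, a closure operator $\mathcal{C}$ in the sense of Čech satisfies
\[
\mathcal{C}(\emptyset)=\emptyset, \qquad A \subseteq \mathcal{C}(A), \qquad \mathcal{C}(A \cup B) = \mathcal{C}(A) \cup \mathcal{C}(B),
\]
and a closure space is quasi-discrete when $\mathcal{C}(A) = \bigcup_{a \in A} \mathcal{C}(\{a\})$ for every $A$, equivalently when $\mathcal{C}$ is generated by a binary relation $R$ via
\[
\mathcal{C}_R(A) \;=\; A \cup \{\, x \mid \exists a \in A .\ a \mathrel{R} x \,\},
\]
with the converse closure obtained from the converse relation (the orientation of $R$ in this definition is a matter of convention).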
Answering complex logical queries on incomplete knowledge graphs (KGs) is a fundamental and challenging task in multi-hop reasoning. Recent work casts this task as an end-to-end optimization problem, which significantly reduces training cost and enhances model generalization by reusing pretrained link predictors for query answering. However, most existing proposals ignore the critical semantic knowledge inherently available in KGs, such as type information, which could help answer complex logical queries. To this end, we propose the TypE-based Neural Link Prediction Adapter (TENLPA), a novel model that constructs type-based entity-relation graphs to discover the latent relationships between entities and relations by leveraging type information in KGs. In addition, to effectively combine type information with complex logical queries, we introduce an adaptive learning mechanism that is trained by back-propagation during complex query answering to adaptively adjust the neural link predictors. Experiments on three standard datasets show that TENLPA achieves state-of-the-art performance on complex query answering with good generalization and robustness.
In the field of causal modeling, potential outcomes (PO) and structural causal models (SCMs) stand as the predominant frameworks. However, these frameworks face notable challenges in practically modeling counterfactuals, formalized as parameters of the joint distribution of potential outcomes. Counterfactual reasoning is of paramount importance in contemporary decision-making processes, especially in scenarios that demand personalized incentives based on the joint values of $(Y(0), Y(1))$. This paper begins with an investigation of the PO and SCM frameworks for modeling counterfactuals. Through this analysis, we identify an inherent model capacity limitation, termed the ``degenerative counterfactual problem'', emerging from the consistency rule that is the cornerstone of both frameworks. To address this limitation, we introduce a novel \textit{distribution-consistency} assumption and, in alignment with it, propose Distribution-consistency Structural Causal Models (DiscoSCMs), which offer enhanced capabilities for modeling counterfactuals. To concretely demonstrate the enhanced model capacity, we introduce a new identifiable causal parameter, \textit{the probability of consistency}, which holds practical significance within DiscoSCM alone, as showcased with a personalized incentive example. Furthermore, we provide a comprehensive set of theoretical results about the ``Ladder of Causation'' within the DiscoSCM framework. We hope this work opens new avenues for future research on counterfactual modeling, ultimately enhancing our understanding of causality and its real-world applications.
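For context, the consistency rule referred to above is the standard assumption that the observed outcome coincides with the potential outcome under the treatment actually received; for a binary treatment $A$ it reads
\[
Y \;=\; A\,Y(1) + (1-A)\,Y(0), \qquad \text{equivalently} \qquad A = a \;\Rightarrow\; Y = Y(a),
\]
whereas counterfactual parameters concern the joint distribution of $(Y(0), Y(1))$, of which only one coordinate is observed for any given unit.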
Refreshable tactile displays (RTDs) are predicted to soon become a viable option for the provision of accessible graphics for people who are blind or have low vision (BLV). This new technology for the tactile display of braille and graphics, usually using raised pins, makes it easier to generate and access a large number of graphics. However, it differs from existing tactile graphics in terms of scale, height and fidelity. Here, we share the perspectives of four key stakeholder groups -- blind touch readers, vision specialist teachers, accessible format producers and assistive technology providers -- to explore the potential uses, advantages and needs relating to the introduction of RTDs. We also provide advice on what role the data visualisation community can play to help ensure that people who are BLV are best able to benefit from the introduction of affordable RTDs.
The chain graph model admits both undirected and directed edges in one graph, where symmetric conditional dependencies are encoded via undirected edges and asymmetric causal relations are encoded via directed edges. Though frequently encountered in practice, the chain graph model has remained largely under-investigated in the literature, possibly due to the lack of identifiability conditions between undirected and directed edges. In this paper, we first establish a set of novel identifiability conditions for the Gaussian chain graph model, exploiting a low rank plus sparse decomposition of the precision matrix. We then build an efficient learning algorithm upon these identifiability conditions to fully recover the chain graph structure. Theoretical analysis of the proposed method establishes its asymptotic consistency in recovering the exact chain graph structure. The advantage of the proposed method is further supported by numerical experiments on both simulated examples and a real application to the Standard & Poor's 500 index data.
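As a purely schematic illustration of the kind of structure exploited (the paper's exact parameterisation and conditions may differ), a low rank plus sparse decomposition of a precision matrix $\Omega$ takes the form
\[
\Omega \;=\; S + L, \qquad \|S\|_0 \le s, \qquad \operatorname{rank}(L) \le r,
\]
where the sparse component is naturally associated with the undirected part of the graph and the low-rank component with the contribution of the directed part; identifiability then hinges on conditions ensuring that the two components cannot be confused with one another.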
The localization of objects is a crucial task in various applications such as robotics, virtual and augmented reality, and the transportation of goods in warehouses. Recent advances in deep learning have enabled localization using monocular cameras. While structure from motion (SfM) predicts the absolute pose from a point cloud, absolute pose regression (APR) methods learn a semantic understanding of the environment through neural networks. However, both approaches face challenges caused by the environment, such as motion blur, lighting changes, repetitive patterns, and feature-less structures. This study aims to address these challenges by incorporating additional information and regularizing the absolute pose using relative pose regression (RPR) methods. RPR methods, in turn, suffer from a different set of challenges, e.g., motion blur. The optical flow between consecutive images is computed using the Lucas-Kanade algorithm, and the relative pose is predicted using an auxiliary small recurrent convolutional network. The fusion of absolute and relative poses is a complex task due to the mismatch between the global and local coordinate systems. State-of-the-art methods for fusing absolute and relative poses use pose graph optimization (PGO) to regularize the absolute pose predictions using relative poses. In this work, we propose recurrent fusion networks that optimally align absolute and relative pose predictions to improve the absolute pose prediction. We evaluate eight different recurrent units and construct a simulation environment to pre-train the APR and RPR networks for better generalization. Additionally, we record a large database of different scenarios in a challenging, large-scale indoor environment that mimics a warehouse with transportation robots. We conduct hyperparameter searches and experiments to show the effectiveness of our recurrent fusion method compared to PGO.
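As a minimal sketch of the optical-flow step only, sparse Lucas-Kanade flow between consecutive frames can be computed with OpenCV as shown below; the detector and tracker parameters are illustrative rather than those used in this work, and the auxiliary recurrent RPR network that consumes the flow is omitted.

```python
import cv2
import numpy as np

def lucas_kanade_flow(prev_gray: np.ndarray, next_gray: np.ndarray):
    """Track sparse corner features from prev_gray to next_gray (illustrative parameters)."""
    # Detect corners to track in the previous frame.
    p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                 qualityLevel=0.01, minDistance=7)
    if p0 is None:
        return np.empty((0, 2)), np.empty((0, 2))
    # Pyramidal Lucas-Kanade tracking into the next frame.
    p1, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, p0, None,
                                                winSize=(21, 21), maxLevel=3)
    good = status.ravel() == 1
    # Return matched point pairs; their displacements form the sparse flow field
    # that a relative pose network could consume.
    return p0[good].reshape(-1, 2), p1[good].reshape(-1, 2)
```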
The family of log-concave density functions contains many common probability distributions. Due to the shape restriction, a nonparametric estimate of the density can be obtained, for example the nonparametric maximum likelihood estimate (NPMLE). However, the associated uncertainty quantification for the NPMLE is less well developed. Current techniques for uncertainty quantification are Bayesian, using a Dirichlet process prior combined with Markov chain Monte Carlo (MCMC) sampling from the posterior. In this paper, we start with the NPMLE and use a version of the martingale posterior distribution to quantify uncertainty about the NPMLE. The algorithm can be implemented in parallel and hence is fast. We prove the convergence of the algorithm by constructing suitable submartingales. We also illustrate the results with different models, settings, and some real data, and compare our method with those in the literature.
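For reference, a density $f$ on $\mathbb{R}^d$ is log-concave if $f = e^{\varphi}$ for some concave function $\varphi$, and given observations $X_1, \dots, X_n$ the NPMLE is
\[
\hat f_n \;=\; \operatorname*{arg\,max}_{\substack{f = e^{\varphi},\ \varphi \text{ concave} \\ \int f = 1}} \ \frac{1}{n} \sum_{i=1}^{n} \log f(X_i);
\]
the contribution here is to quantify the uncertainty around $\hat f_n$ via a martingale posterior rather than an MCMC-based Bayesian posterior.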
Humans perceive the world by concurrently processing and fusing high-dimensional inputs from multiple modalities such as vision and audio. Machine perception models, in stark contrast, are typically modality-specific and optimised for unimodal benchmarks, and hence late-stage fusion of final representations or predictions from each modality (`late fusion') is still a dominant paradigm for multimodal video classification. Instead, we introduce a novel transformer-based architecture that uses `fusion bottlenecks' for modality fusion at multiple layers. Compared to traditional pairwise self-attention, our model forces information between different modalities to pass through a small number of bottleneck latents, requiring the model to collate and condense the most relevant information in each modality and only share what is necessary. We find that such a strategy improves fusion performance while at the same time reducing computational cost. We conduct thorough ablation studies and achieve state-of-the-art results on multiple audio-visual classification benchmarks including AudioSet, Epic-Kitchens and VGGSound. All code and models will be released.
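To make the bottleneck idea concrete, the sketch below shows a single, heavily simplified fusion layer in PyTorch; the actual architecture uses full transformer blocks (MLPs, layer normalisation, and a more careful bottleneck update), so this is only an illustration, under stated assumptions, of how cross-modal exchange is restricted to a few shared latents.

```python
import torch
import torch.nn as nn

class BottleneckFusionLayer(nn.Module):
    """One fusion layer in which audio and video tokens exchange information
    only through a small set of shared bottleneck tokens (illustrative sketch)."""
    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.audio_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.video_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, audio, video, bottleneck):
        # Audio tokens (plus a copy of the bottleneck) attend within [audio; bottleneck].
        a_in = torch.cat([audio, bottleneck], dim=1)
        a_out, _ = self.audio_attn(a_in, a_in, a_in)
        audio, b_audio = a_out[:, :audio.size(1)], a_out[:, audio.size(1):]

        # Video tokens do the same within [video; bottleneck].
        v_in = torch.cat([video, bottleneck], dim=1)
        v_out, _ = self.video_attn(v_in, v_in, v_in)
        video, b_video = v_out[:, :video.size(1)], v_out[:, video.size(1):]

        # The updated bottleneck (here simply the average of the two views) is the
        # only channel through which the modalities communicate.
        bottleneck = 0.5 * (b_audio + b_video)
        return audio, video, bottleneck
```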
Federated Learning (FL) is a decentralized machine-learning paradigm in which a global server iteratively averages the model parameters of local users without accessing their data. User heterogeneity has imposed significant challenges on FL, as it can incur drifted global models that are slow to converge. Knowledge distillation has recently emerged to tackle this issue by refining the server model using aggregated knowledge from heterogeneous users, rather than directly averaging their model parameters. This approach, however, depends on a proxy dataset, making it impractical unless such a prerequisite is satisfied. Moreover, the ensemble knowledge is not fully utilized to guide local model learning, which may in turn affect the quality of the aggregated model. Inspired by the prior art, we propose a data-free knowledge distillation approach to address heterogeneous FL, where the server learns a lightweight generator to ensemble user information in a data-free manner, which is then broadcast to users, regulating local training using the learned knowledge as an inductive bias. Empirical studies supported by theoretical implications show that our approach facilitates FL with better generalization performance using fewer communication rounds, compared with the state of the art.
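A highly simplified sketch of the kind of conditional generator involved is given below; the module names, dimensions, and regulariser are illustrative assumptions, not the paper's exact design. The idea is that the server learns such a generator from aggregated user knowledge and broadcasts it, and each user then regularises local training by requiring its classifier to recover the conditioning label on generated features.

```python
import torch
import torch.nn as nn

class FeatureGenerator(nn.Module):
    """Lightweight conditional generator mapping (noise, label) to a feature vector
    in the users' shared latent space (illustrative sketch only)."""
    def __init__(self, noise_dim: int, num_classes: int, feature_dim: int, hidden: int = 256):
        super().__init__()
        self.label_embed = nn.Embedding(num_classes, noise_dim)
        self.net = nn.Sequential(
            nn.Linear(2 * noise_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, feature_dim),
        )

    def forward(self, z: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # Condition the noise on the target label and map it to a synthetic feature.
        return self.net(torch.cat([z, self.label_embed(y)], dim=-1))

# Hypothetical local regulariser: the user's classifier head should assign the
# conditioning label to generated features, acting as an inductive bias.
def generator_regulariser(classifier_head: nn.Module, generator: FeatureGenerator,
                          y: torch.Tensor, noise_dim: int) -> torch.Tensor:
    z = torch.randn(y.size(0), noise_dim)
    logits = classifier_head(generator(z, y))
    return nn.functional.cross_entropy(logits, y)
```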
Embedding entities and relations into a continuous multi-dimensional vector space has become the dominant method for knowledge graph embedding in representation learning. However, most existing models fail to represent hierarchical knowledge, such as the similarities and dissimilarities of entities within one domain. We propose to learn Domain Representations on top of existing knowledge graph embedding models, such that entities with similar attributes are organized into the same domain. Such hierarchical knowledge of domains can provide further evidence for link prediction. Experimental results show that domain embeddings yield a significant improvement over the most recent state-of-the-art baseline knowledge graph embedding models.
Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows us to relax both the optimal value and the optimal solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.
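Concretely, one standard way to smooth the max operator over $d$ scores with a strongly convex regularizer $\Omega$ on the simplex $\triangle^{d}$ is
\[
\operatorname{max}_{\Omega}(\mathbf{x}) \;=\; \max_{\mathbf{q} \in \triangle^{d}} \ \langle \mathbf{q}, \mathbf{x} \rangle - \Omega(\mathbf{q}),
\qquad
\nabla \operatorname{max}_{\Omega}(\mathbf{x}) \;=\; \operatorname*{arg\,max}_{\mathbf{q} \in \triangle^{d}} \ \langle \mathbf{q}, \mathbf{x} \rangle - \Omega(\mathbf{q});
\]
for the negative-entropy regularizer $\Omega(\mathbf{q}) = \gamma \sum_i q_i \log q_i$, this yields the log-sum-exp $\gamma \log \sum_i e^{x_i/\gamma}$ with softmax gradient, which is what makes the resulting DP recursions differentiable end to end.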