Modern industrial fault diagnosis tasks often face the combined challenge of distribution discrepancy and bi-imbalance. Existing domain adaptation approaches pay little attention to the prevailing bi-imbalance, leading to poor domain adaptation performance or even negative transfer. In this work, we propose a self-degraded contrastive domain adaptation (Sd-CDA) diagnosis framework to handle domain discrepancy under bi-imbalanced data. It first pre-trains the feature extractor via imbalance-aware contrastive learning based on model pruning, learning the feature representation efficiently in a self-supervised manner. It then pushes samples away from the domain boundary via supervised contrastive domain adversarial learning (SupCon-DA), ensuring that the features generated by the feature extractor are sufficiently discriminative. Furthermore, we propose pruned supervised contrastive domain adversarial learning (PSupCon-DA), which automatically re-weights attention toward minority classes to improve performance on bi-imbalanced data. We demonstrate the superiority of the proposed method through two experiments.
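For concreteness, here is a minimal sketch of a supervised contrastive (SupCon) loss of the kind SupCon-DA builds on, written in PyTorch; the function name and temperature value are illustrative, not taken from the paper, and the adversarial and pruning components are not shown.

```python
import torch
import torch.nn.functional as F

def supcon_loss(features: torch.Tensor, labels: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """Supervised contrastive (SupCon) loss over a batch of embeddings.

    features: (N, d) embeddings; labels: (N,) integer class labels.
    """
    features = F.normalize(features, dim=1)                # cosine-similarity space
    n = features.size(0)
    sim = features @ features.T / tau                      # pairwise similarities / temperature
    self_mask = torch.eye(n, dtype=torch.bool, device=features.device)
    sim.masked_fill_(self_mask, float('-inf'))             # exclude self-comparisons
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    # average log-probability over positives, for anchors that have positives
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    masked_log_prob = log_prob.masked_fill(~pos_mask, 0.0)
    loss_per_anchor = -masked_log_prob.sum(dim=1) / pos_counts
    return loss_per_anchor[pos_mask.any(dim=1)].mean()
```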
Causal networks are useful in a wide variety of applications, from medical diagnosis to root-cause analysis in manufacturing. In practice, however, causal networks are often incomplete, with missing causal relations. This paper presents a novel approach, called CausalLP, that formulates the problem of incomplete causal networks as a knowledge graph completion problem. More specifically, the task of finding new causal relations in an incomplete causal network is mapped to the task of knowledge graph link prediction. Representing causal relations in a knowledge graph enables the integration of external domain knowledge; as an added complexity, the causal relations carry weights representing the strength of the causal association between entities. Two primary tasks are supported by CausalLP: causal explanation and causal prediction. The approach is evaluated on CLEVRER-Humans, a benchmark dataset of simulated videos for causal reasoning, comparing the performance of multiple knowledge graph embedding algorithms. Two distinct dataset splitting approaches are used for evaluation: (1) random-based split, the method typically employed to evaluate link prediction algorithms, and (2) Markov-based split, a novel data split technique that exploits the Markovian property of causal relations. Results show that using weighted causal relations improves causal link prediction over the baseline without weighted relations.
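As an illustration of how causal-strength weights could enter a link-prediction objective (the abstract does not specify CausalLP's embedding model or loss, so the TransE-style scoring below is an assumption), a weighted margin ranking loss might look like this:

```python
import torch

def transe_score(h, r, t):
    """TransE plausibility score: smaller distance = more plausible triple."""
    return torch.norm(h + r - t, p=1, dim=-1)

def weighted_margin_loss(h, r, t, h_neg, t_neg, weights, margin=1.0):
    """Margin ranking loss where each positive triple's contribution is
    scaled by the strength of its causal relation (weights in [0, 1])."""
    pos = transe_score(h, r, t)          # scores of observed causal triples
    neg = transe_score(h_neg, r, t_neg)  # scores of corrupted (negative) triples
    return (weights * torch.clamp(margin + pos - neg, min=0.0)).mean()
```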
Variance reduction for causal inference in the presence of network interference is often achieved through either outcome modeling, which is typically analyzed under unit-randomized Bernoulli designs, or clustered experimental designs, which are typically analyzed without strong parametric assumptions. In this work, we study the intersection of these two approaches and consider the problem of estimation in low-order outcome models using data from a general experimental design. Our contributions are threefold. First, we present an estimator of the total treatment effect (also called the global average treatment effect) in a low-degree outcome model when the data are collected under general experimental designs, generalizing previous results for Bernoulli designs. We refer to this estimator as the pseudoinverse estimator and give bounds on its bias and variance in terms of properties of the experimental design. Second, we evaluate these bounds for the case of cluster randomized designs with both Bernoulli and complete randomization. For clustered Bernoulli randomization, we find that our estimator is always unbiased and that its variance scales like the smaller of the variance obtained from a low-order assumption and the variance obtained from cluster randomization, showing that combining these variance reduction strategies is preferable to using either individually. For clustered complete randomization, we find a notable bias-variance trade-off mediated by specific features of the clustering. Third, when choosing a clustered experimental design, our bounds can be used to select a clustering from a set of candidate clusterings. Across a range of graphs and clustering algorithms, we show that our method consistently selects clusterings that perform well under a variety of response models, suggesting that our bounds are useful to practitioners.
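For reference, the estimand and the low-order (degree-β) outcome model referred to above can be written as follows; the notation (treatment vector z, neighborhoods N_i, coefficients c_{i,S}) is the standard one for low-degree interference models and is assumed here rather than quoted from the paper.

```latex
% Estimand: the total treatment effect (global average treatment effect),
% where z \in \{0,1\}^n is the treatment assignment vector and Y_i(z) is
% the potential outcome of unit i:
\[
  \tau \;=\; \frac{1}{n} \sum_{i=1}^{n} \bigl( Y_i(\mathbf{1}) - Y_i(\mathbf{0}) \bigr).
\]
% A degree-\beta (low-order) outcome model posits
\[
  Y_i(z) \;=\; \sum_{\substack{S \subseteq \mathcal{N}_i \\ |S| \le \beta}} c_{i,S} \, \prod_{j \in S} z_j ,
\]
% where \mathcal{N}_i is the interference neighborhood of unit i and the
% c_{i,S} are unknown coefficients.
```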
The broad class of multivariate unified skew-normal (SUN) distributions has recently been shown to possess important conjugacy properties. When used as priors for the vector of parameters in general probit, tobit, and multinomial probit models, these distributions yield posteriors that still belong to the SUN family. Although this core result has led to important advancements in Bayesian inference and computation, its applicability beyond likelihoods associated with fully-observed, discretized, or censored realizations from multivariate Gaussian models remains unexplored. This article fills this important gap by proving that the wider family of multivariate unified skew-elliptical (SUE) distributions, which extends SUNs to more general perturbations of elliptical densities, guarantees conjugacy for broader classes of models, beyond those relying on fully-observed, discretized, or censored Gaussians. This result leverages the closure of SUEs under linear combinations, conditioning, and marginalization to prove that this family is conjugate to the likelihood induced by general multivariate regression models for fully-observed, censored, or dichotomized realizations from skew-elliptical distributions. This advancement enlarges the set of models that enable conjugate Bayesian inference to general formulations arising from elliptical and skew-elliptical families, including the multivariate Student's t and skew-t, among others.
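As a concrete instance of the conjugacy result being generalized, the known SUN-probit case (Durante, 2019, Biometrika) can be stated as follows, up to notational conventions:

```latex
% With prior \beta \sim N_p(\xi, \Omega) and probit likelihood
% y_i \mid \beta \sim \mathrm{Bern}\{\Phi(x_i^\top \beta)\}, writing
% D = \mathrm{diag}(2y_1 - 1, \ldots, 2y_n - 1)\,X, \;
% \Omega = \omega \bar{\Omega} \omega \; (correlation/scale decomposition), and
% s = [\mathrm{diag}(D \Omega D^\top + I_n)]^{1/2}, the posterior is again SUN:
\[
  \beta \mid y \;\sim\; \mathrm{SUN}_{p,n}\!\bigl(
    \xi,\; \Omega,\; \bar{\Omega}\,\omega D^\top s^{-1},\;
    s^{-1} D \xi,\; s^{-1}\bigl(D \Omega D^\top + I_n\bigr) s^{-1}
  \bigr).
\]
```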
Desire line maps are widely deployed for traffic flow analysis by virtue of their ease of interpretation and computation. They can be considered simplified traffic flow maps; in contrast, the computational challenges of aggregating small-scale traffic flows have prevented the wider dissemination of high-resolution flow maps. Vehicle trajectories are a promising data source for solving this challenging problem. The solution begins with the alignment (or map matching) of the trajectories to the road network. However, even state-of-the-art map matching implementations produce sub-optimal results with small misalignments. While these misalignments are negligible for large-scale flow aggregation in desire line maps, they pose substantial obstacles for small-scale flow aggregation in high-resolution maps. To remove these remaining misalignments, we introduce novel local alignment algorithms, in which we infer road segments to serve as local reference segments and align nearby road segments to them. With each local alignment iteration, the misalignments of the trajectories with each other and with the road network are reduced, so the result converges toward a minimal flow map. By analysing a set of empirical trajectories collected in Hannover, Germany, we confirm that our minimal flow map has high levels of spatial resolution, accuracy, and coverage.
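The abstract does not spell out the local alignment algorithms; as a sketch of the basic operation involved, one iteration of snapping trajectory points onto a local reference segment could look like the following (function names and the radius threshold are illustrative):

```python
import numpy as np

def project_onto_segment(p, a, b):
    """Orthogonal projection of point p onto segment [a, b] (clamped to endpoints)."""
    ab = b - a
    t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return a + t * ab

def align_to_reference(points, a, b, radius=5.0):
    """One local-alignment step: points within `radius` of the reference
    segment [a, b] are snapped onto it; others are left unchanged."""
    aligned = points.copy()
    for i, p in enumerate(points):
        q = project_onto_segment(p, a, b)
        if np.linalg.norm(p - q) <= radius:
            aligned[i] = q
    return aligned
```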
A common method for estimating the Hessian operator from random samples on a low-dimensional manifold involves locally fitting a quadratic polynomial. Although this estimator is widely used, it has been unclear whether it introduces bias, especially on complex manifolds with boundaries and nonuniform sampling, and rigorous theoretical guarantees of its asymptotic behavior have been lacking. We show that, under mild conditions, this estimator asymptotically converges to the Hessian operator, with nonuniform sampling and curvature effects proving negligible, even near boundaries. Our analysis framework simplifies the intensive computations required for direct analysis.
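A minimal sketch of the estimator under discussion, assuming the samples have already been expressed in local (e.g., tangent-space) coordinates; variable names are illustrative:

```python
import numpy as np

def local_quadratic_hessian(x0, neighbors, values, f0):
    """Estimate the Hessian of f at x0 by fitting
    f(x) ~ f(x0) + g.(x - x0) + 0.5 (x - x0)^T H (x - x0) via least squares.

    neighbors: (k, d) sample coordinates near x0, values: (k,) samples of f,
    f0 = f(x0).
    """
    dx = neighbors - x0                      # (k, d) displacements
    k, d = dx.shape
    # design matrix: d linear terms, then the d(d+1)/2 distinct quadratic terms
    quad_cols = [dx[:, i] * dx[:, j] * (0.5 if i == j else 1.0)
                 for i in range(d) for j in range(i, d)]
    A = np.column_stack([dx] + quad_cols)
    coef, *_ = np.linalg.lstsq(A, values - f0, rcond=None)
    # unpack the quadratic coefficients into a symmetric Hessian matrix
    H = np.zeros((d, d))
    idx = d
    for i in range(d):
        for j in range(i, d):
            H[i, j] = H[j, i] = coef[idx]
            idx += 1
    return H
```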
Error control by means of a posteriori error estimators or indicators and adaptive discretizations, such as adaptive mesh refinement, emerged in the late 1970s. Since then, numerous theoretical developments and improvements have been made, along with the first attempts to introduce these techniques into real-life industrial applications. The present introductory chapter provides an overview of the subject, highlights some of the achievements to date, and discusses possible perspectives.
We propose a topological mapping and localization system able to operate on real human colonoscopies, despite significant shape and illumination changes. The map is a graph where each node codes a colon location by a set of real images, while edges represent traversability between nodes. For close-in-time images, where scene changes are minor, place recognition can be successfully managed with recent transformer-based local feature matching algorithms. However, under long-term changes -- such as different colonoscopies of the same patient -- feature-based matching fails. To address this, we train a deep global descriptor on real colonoscopies that achieves high recall under significant scene changes. The addition of a Bayesian filter boosts the accuracy of long-term place recognition, enabling relocalization in a previously built map. Our experiments show that ColonMapper is able to autonomously build a map and localize against it in two important use cases: localization within the same colonoscopy or within different colonoscopies of the same patient. Code: https://github.com/jmorlana/ColonMapper.
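The abstract does not detail the Bayesian filter; a generic discrete Bayes filter over map nodes, of the kind commonly used for topological place recognition, might look like the following sketch (all names are illustrative):

```python
import numpy as np

def bayes_filter_step(belief, transition, likelihood):
    """One step of a discrete Bayes filter over map nodes.

    belief:     (n,)   prior probability of being at each node
    transition: (n, n) transition[i, j] = P(next node j | current node i)
    likelihood: (n,)   P(current image descriptor | node j)
    """
    predicted = belief @ transition          # motion/prediction update
    posterior = predicted * likelihood       # measurement update
    return posterior / posterior.sum()       # normalize to a distribution
```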
The brains of all bilaterally symmetric animals on Earth are divided into left and right hemispheres. The anatomy and functionality of the hemispheres have a large degree of overlap, but there are asymmetries, and the hemispheres specialise in different attributes. Other authors have used computational models to mimic hemispheric asymmetries, focusing on reproducing human data on semantic and visual processing tasks. We took a different approach and aimed to understand how dual hemispheres in a bilateral architecture interact to perform well in a given task. We propose a bilateral artificial neural network that imitates lateralisation observed in nature: the left hemisphere specialises in local features and the right in global features. We used different training objectives to achieve the desired specialisation and tested the network on an image classification task with two different CNN backbones: ResNet and VGG. Our analysis found that the hemispheres represent complementary features that are exploited by a network head implementing a form of weighted attention. The bilateral architecture outperformed a range of baselines of similar representational capacity that do not exploit differential specialisation, with the exception of a conventional ensemble of unilateral networks trained on dual training objectives for local and global features. The results demonstrate the efficacy of bilateralism and contribute to the discussion of bilateralism in biological brains, and the principle may serve as an inductive bias for new AI systems.
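As a sketch of how a head with weighted attention over two hemisphere feature vectors might be structured (this is illustrative, not the paper's exact architecture):

```python
import torch
import torch.nn as nn

class BilateralHead(nn.Module):
    """Combine a 'local' and a 'global' hemisphere via learned weighted
    attention over their feature vectors, then classify."""
    def __init__(self, feat_dim, num_classes):
        super().__init__()
        self.gate = nn.Linear(2 * feat_dim, 2)        # one weight per hemisphere
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, f_left, f_right):
        # f_left, f_right: (batch, feat_dim) features from the two backbones
        w = torch.softmax(self.gate(torch.cat([f_left, f_right], dim=1)), dim=1)
        fused = w[:, :1] * f_left + w[:, 1:] * f_right   # weighted combination
        return self.classifier(fused)
```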
Artificial Intelligence (AI) is revolutionizing biodiversity research by enabling advanced data analysis, species identification, and habitat monitoring, thereby enhancing conservation efforts. Ensuring reproducibility in AI-driven biodiversity research is crucial for fostering transparency, verifying results, and promoting the credibility of ecological findings. This study investigates the reproducibility of deep learning (DL) methods within the biodiversity domain. We design a methodology for evaluating the reproducibility of biodiversity-related publications that employ DL techniques across three stages. We define ten variables essential for method reproducibility, divided into four categories: resource requirements, methodological information, uncontrolled randomness, and statistical considerations. These categories subsequently serve as the basis for defining different levels of reproducibility. We manually extract the availability of these variables from a curated dataset comprising 61 publications identified using keywords provided by biodiversity experts. Our study shows that datasets are shared in 47% of the publications; however, a significant number of publications lack comprehensive information on the deep learning methods used, including details regarding randomness.
Count outcomes are frequent in longitudinal clinical and engineering studies. In frequentist and Bayesian statistical analysis, methods such as mixed linear models allow the variability or correlation within individuals to be taken into account. However, in more straightforward scenarios, where only two stages of an experiment are observed (pre-treatment vs. post-treatment), only a few tools are available, mainly for continuous outcomes. This work therefore introduces a Bayesian statistical methodology for comparing paired samples in binary pretest-posttest scenarios. We establish a Bayesian probabilistic model for the inferential analysis of the unknown quantities, validate and refine it through simulation analyses, and present an application to a dataset taken from the Television School and Family Smoking Prevention and Cessation Project (TVSFP) (Flay et al., 1995). We also apply the Full Bayesian Significance Test (FBST) for precise hypothesis testing and implement adaptive significance levels in the decision-making process.
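As a sketch of how an FBST could be computed for a paired binary design, the following Monte Carlo example tests marginal homogeneity (no pre/post change) in a 2x2 table under a Dirichlet posterior; the data, prior, and parametrization are illustrative assumptions, not taken from the paper:

```python
import numpy as np
from scipy.stats import dirichlet

# Hypothetical paired-binary data: counts of (pre, post) outcome pairs,
# ordered as (0,0), (0,1), (1,0), (1,1); values are purely illustrative.
counts = np.array([30, 12, 5, 23])
alpha = counts + 1.0          # flat Dirichlet(1,...,1) prior -> Dirichlet posterior

# Sharp hypothesis H: theta_01 = theta_10 (marginal homogeneity, the
# Bayesian analogue of McNemar's null of no pre/post change).

# 1) Maximize the posterior density over H (grid search on the constraint set).
best_logpdf = -np.inf
for t01 in np.linspace(0.001, 0.499, 200):
    for t00 in np.linspace(0.001, 1 - 2 * t01 - 0.001, 100):
        theta = np.array([t00, t01, t01, 1 - t00 - 2 * t01])
        if theta[3] <= 0:
            continue
        best_logpdf = max(best_logpdf, dirichlet.logpdf(theta, alpha))

# 2) e-value = posterior mass of points with density below the max on H.
samples = dirichlet.rvs(alpha, size=20_000)
logpdf = np.array([dirichlet.logpdf(s, alpha) for s in samples])
ev = np.mean(logpdf <= best_logpdf)      # FBST evidence in favor of H
print(f"FBST e-value for H: {ev:.3f}")
```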