Robotic assistance for experimental manipulation in the life sciences is expected to enable favorable outcomes, regardless of the skill of the scientist. Experimental specimens in the life sciences are subject to individual variability hence require intricate algorithms for successful autonomous robotic control. As a use case, we are studying the creation of cranial windows in mice. This operation requires the removal of an 8-mm-circular patch of the skull, which is approximately 300 um thick, but the shape and thickness of the mouse skull significantly varies depending on the strain of mouse, sex, and age. In this work, we propose an autonomous robotic drilling method with no offline planning, consisting of a trajectory planning block with execution-time feedback with completion level recognition based on image and force information. The force information allows for completion-level resolution to increase 10 fold. We evaluate the proposed method in two ways. First, in an eggshell drilling task and achieved a success rate of 95% and average drilling time of 7.1 min out of 20 trials. Second, in postmortem mice and with a success rate of 70% and average drilling time of 9.3 min out of 20 trials.
Test smells are coding issues that typically arise from inadequate practices, a lack of knowledge about effective testing, or deadline pressures to complete projects. The presence of test smells can negatively impact the maintainability and reliability of software. While there are tools that use advanced static analysis or machine learning techniques to detect test smells, these tools often require effort to be used. This study aims to evaluate the capability of Large Language Models (LLMs) in automatically detecting test smells. We evaluated ChatGPT-4, Mistral Large, and Gemini Advanced using 30 types of test smells across codebases in seven different programming languages collected from the literature. ChatGPT-4 identified 21 types of test smells. Gemini Advanced identified 17 types, while Mistral Large detected 15 types of test smells. Conclusion: The LLMs demonstrated potential as a valuable tool in identifying test smells.
The Kalman filter (KF) is a state estimation algorithm that optimally combines system knowledge and measurements to minimize the mean squared error of the estimated states. While KF was initially designed for linear systems, numerous extensions of it, such as extended Kalman filter (EKF), unscented Kalman filter (UKF), cubature Kalman filter (CKF), etc., have been proposed for nonlinear systems. Although different types of nonlinear KFs have different pros and cons, they all use the same framework of linear KF, which, according to what we found in this paper, tends to give overconfident and less accurate state estimations when the measurement functions are nonlinear. Therefore, in this study, we designed a new framework for nonlinear KFs and showed theoretically and empirically that the new framework estimates the states and covariance matrix more accurately than the old one. The new framework was tested on four different nonlinear KFs and five different tasks, showcasing its ability to reduce the estimation errors by several orders of magnitude in low-measurement-noise conditions, with only about a 10 to 90% increase in computational time. All types of nonlinear KFs can benefit from the new framework, and the benefit will increase as the sensors become more and more accurate in the future. As an example, EKF, the simplest nonlinear KF that was previously believed to work poorly for strongly nonlinear systems, can now provide fast and fairly accurate state estimations with the help of the new framework. The codes are available at //github.com/Shida-Jiang/A-new-framework-for-nonlinear-Kalman-filters.
Image segmentation relies heavily on neural networks which are known to be overconfident, especially when making predictions on out-of-distribution (OOD) images. This is a common scenario in the medical domain due to variations in equipment, acquisition sites, or image corruptions. This work addresses the challenge of OOD detection by proposing Laplacian Segmentation Networks (LSN): methods which jointly model epistemic (model) and aleatoric (data) uncertainty for OOD detection. In doing so, we propose the first Laplace approximation of the weight posterior that scales to large neural networks with skip connections that have high-dimensional outputs. We demonstrate on three datasets that the LSN-modeled parameter distributions, in combination with suitable uncertainty measures, gives superior OOD detection.
Knowledge Graph Embedding (KGE) is a popular approach, which aims to represent entities and relations of a knowledge graph in latent spaces. Their representations are known as embeddings. To measure the plausibility of triplets, score functions are defined over embedding spaces. Despite wide dissemination of KGE in various tasks, KGE methods have limitations in reasoning abilities. In this paper we propose a mathematical framework to compare reasoning abilities of KGE methods. We show that STransE has a higher capability than TransComplEx, and then present new STransCoRe method, which improves the STransE by combining it with the TransCoRe insights, which can reduce the STransE space complexity.
Detecting the anomaly of human behavior is paramount to timely recognizing endangering situations, such as street fights or elderly falls. However, anomaly detection is complex since anomalous events are rare and because it is an open set recognition task, i.e., what is anomalous at inference has not been observed at training. We propose COSKAD, a novel model that encodes skeletal human motion by a graph convolutional network and learns to COntract SKeletal kinematic embeddings onto a latent hypersphere of minimum volume for Video Anomaly Detection. We propose three latent spaces: the commonly-adopted Euclidean and the novel spherical and hyperbolic. All variants outperform the state-of-the-art on the most recent UBnormal dataset, for which we contribute a human-related version with annotated skeletons. COSKAD sets a new state-of-the-art on the human-related versions of ShanghaiTech Campus and CUHK Avenue, with performance comparable to video-based methods. Source code and dataset will be released upon acceptance.
A mechanism is described that addresses the fundamental trade off between media producers who want to increase reach and consumers who provide attention based on the rate of utility received, and where overreach negatively impacts that rate. An optimal solution can be achieved when the media source considers the impact of overreach in a cost function used in determining the optimal distribution of content to maximize individual consumer utility and participation. The result is a Nash equilibrium between producer and consumer that is also Pareto efficient. Comparison with the literature on Recommender systems highlights the advantages of the mechanism, including identifying an optimal content volume for the consumer and improvements for optimizing with multiple objectives. A practical algorithm for generating the optimal distribution for each consumer is provided.
Recent contrastive representation learning methods rely on estimating mutual information (MI) between multiple views of an underlying context. E.g., we can derive multiple views of a given image by applying data augmentation, or we can split a sequence into views comprising the past and future of some step in the sequence. Contrastive lower bounds on MI are easy to optimize, but have a strong underestimation bias when estimating large amounts of MI. We propose decomposing the full MI estimation problem into a sum of smaller estimation problems by splitting one of the views into progressively more informed subviews and by applying the chain rule on MI between the decomposed views. This expression contains a sum of unconditional and conditional MI terms, each measuring modest chunks of the total MI, which facilitates approximation via contrastive bounds. To maximize the sum, we formulate a contrastive lower bound on the conditional MI which can be approximated efficiently. We refer to our general approach as Decomposed Estimation of Mutual Information (DEMI). We show that DEMI can capture a larger amount of MI than standard non-decomposed contrastive bounds in a synthetic setting, and learns better representations in a vision domain and for dialogue generation.
Embedding entities and relations into a continuous multi-dimensional vector space have become the dominant method for knowledge graph embedding in representation learning. However, most existing models ignore to represent hierarchical knowledge, such as the similarities and dissimilarities of entities in one domain. We proposed to learn a Domain Representations over existing knowledge graph embedding models, such that entities that have similar attributes are organized into the same domain. Such hierarchical knowledge of domains can give further evidence in link prediction. Experimental results show that domain embeddings give a significant improvement over the most recent state-of-art baseline knowledge graph embedding models.
Benefit from the quick development of deep learning techniques, salient object detection has achieved remarkable progresses recently. However, there still exists following two major challenges that hinder its application in embedded devices, low resolution output and heavy model weight. To this end, this paper presents an accurate yet compact deep network for efficient salient object detection. More specifically, given a coarse saliency prediction in the deepest layer, we first employ residual learning to learn side-output residual features for saliency refinement, which can be achieved with very limited convolutional parameters while keep accuracy. Secondly, we further propose reverse attention to guide such side-output residual learning in a top-down manner. By erasing the current predicted salient regions from side-output features, the network can eventually explore the missing object parts and details which results in high resolution and accuracy. Experiments on six benchmark datasets demonstrate that the proposed approach compares favorably against state-of-the-art methods, and with advantages in terms of simplicity, efficiency (45 FPS) and model size (81 MB).
The amount of publicly available biomedical literature has been growing rapidly in recent years, yet question answering systems still struggle to exploit the full potential of this source of data. In a preliminary processing step, many question answering systems rely on retrieval models for identifying relevant documents and passages. This paper proposes a weighted cosine distance retrieval scheme based on neural network word embeddings. Our experiments are based on publicly available data and tasks from the BioASQ biomedical question answering challenge and demonstrate significant performance gains over a wide range of state-of-the-art models.