Cardiac magnetic resonance (CMR) sequences visualise cardiac function voxel-wise over time. Simultaneously, deep-learning-based deformable image registration can estimate, in a self-supervised manner, discrete vector fields that warp one time step of a CMR sequence to the next. However, despite the rich information contained in these 3D+t vector fields, a standardised interpretation is challenging and clinical applications have remained limited so far. In this work, we show how to efficiently use a deformable vector field to describe the underlying dynamic process of a cardiac cycle in the form of a derived 1D motion descriptor. Additionally, based on the expected physiological properties of a contracting or relaxing ventricle, we define a set of rules that enables the identification of five cardiovascular phases, including end-systole (ES) and end-diastole (ED), without the use of labels. We evaluate the plausibility of the motion descriptor on two challenging multi-disease, multi-center, multi-scanner short-axis CMR datasets: first, by reporting quantitative measures such as the periodic frame difference for the extracted phases; second, by qualitatively comparing the general pattern after temporally resampling and aligning the motion descriptors of all instances across both datasets. The average periodic frame difference for the ED and ES key phases of our approach is $0.80\pm0.85$ and $0.69\pm0.79$, which is slightly better than the inter-observer variability ($1.07\pm0.86$, $0.91\pm1.6$) and the supervised baseline method ($1.18\pm1.91$, $1.21\pm1.78$). Code and labels will be made available on our GitHub repository: https://github.com/Cardio-AI/cmr-phase-detection
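To make the idea concrete, here is a minimal sketch of how a 3D+t deformation field could be reduced to a 1D motion descriptor and scanned for key phases. The array layout, the magnitude-based reduction, and the simple extremum rule are our simplifying assumptions; the paper's rule set identifies five phases and uses directional information.

```python
import numpy as np

def motion_descriptor(flow):
    """Reduce a 3D+t deformation field to one scalar per frame.

    flow: array of shape (T, X, Y, Z, 3) holding frame-to-frame
    displacement vectors (an illustrative layout, not necessarily
    the paper's). Returns a (T,) curve of mean displacement magnitude.
    """
    return np.linalg.norm(flow, axis=-1).mean(axis=(1, 2, 3))

def detect_es_ed(descriptor):
    """Toy rule: ES near the motion peak, ED near the motion minimum.

    Only illustrates the idea of rule-based key-phase extraction;
    the paper's rules are physiologically richer.
    """
    es = int(np.argmax(descriptor))
    ed = int(np.argmin(descriptor))
    return es, ed
```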
We consider the problem of finding the matching map between two sets of $d$-dimensional noisy feature-vectors. The distinctive feature of our setting is that we do not assume that all the vectors of the first set have their corresponding vector in the second set. If $n$ and $m$ are the sizes of these two sets, we assume that the matching map that should be recovered is defined on a subset of unknown cardinality $k^*\le \min(n,m)$. We show that, in the high-dimensional setting, if the signal-to-noise ratio is larger than $5(d\log(4nm/\alpha))^{1/4}$, then the true matching map can be recovered with probability $1-\alpha$. Interestingly, this threshold does not depend on $k^*$ and is the same as the one obtained in prior work in the case $k^* = \min(n,m)$. The procedure for which the aforementioned property is proved is obtained by a data-driven selection among candidate mappings $\{\hat\pi_k:k\in[\min(n,m)]\}$. Each $\hat\pi_k$ minimizes the sum of squared distances between two sets of size $k$. The resulting optimization problem can be formulated as a minimum-cost flow problem, and thus solved efficiently. Finally, we report the results of numerical experiments on both synthetic and real-world data that illustrate our theoretical results and provide further insight into the properties of the algorithms studied in this work.
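As an illustration of the min-cost flow formulation for a single candidate $\hat\pi_k$, the following sketch builds the flow network with networkx. The function name, node labels, and integer cost scaling are our choices (networkx's flow routines expect integer weights); the paper's data-driven selection of $\hat k$ among the candidates is not shown.

```python
import networkx as nx
import numpy as np

def match_k(X, Y, k, scale=10**6):
    """Size-k matching minimizing the sum of squared distances,
    solved as a min-cost flow with k units pushed from source to sink."""
    n, m = len(X), len(Y)
    # pairwise squared Euclidean distances
    cost = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    G = nx.DiGraph()
    G.add_node("s", demand=-k)   # source supplies k units of flow
    G.add_node("t", demand=k)    # sink absorbs them
    for i in range(n):
        G.add_edge("s", ("x", i), capacity=1, weight=0)
    for j in range(m):
        G.add_edge(("y", j), "t", capacity=1, weight=0)
    for i in range(n):
        for j in range(m):
            G.add_edge(("x", i), ("y", j), capacity=1,
                       weight=int(round(cost[i, j] * scale)))
    flow = nx.min_cost_flow(G)
    return {i: j for i in range(n)
            for (_, j), f in flow[("x", i)].items() if f > 0}
```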
We propose coordinating guiding vector fields to achieve two tasks simultaneously with a team of robots: first, the guidance and navigation of multiple robots to possibly different paths or surfaces, typically embedded in 2D or 3D; second, their motion coordination while tracking their prescribed paths or surfaces. The motion coordination is defined by desired parametric displacements between robots on the path or surface. Such a desired displacement is achieved by controlling the virtual coordinates, which correspond to the parameters of the path or surface, between guiding vector fields. Rigorous mathematical guarantees underpinned by dynamical systems theory and Lyapunov theory are provided for the effective distributed motion coordination and navigation of robots on paths or surfaces from all initial positions. As an example of a practical robotic application, we derive a control algorithm from the proposed coordinating guiding vector fields for a Dubins-car-like model with actuation saturation. Our proposed algorithm is distributed and scalable to an arbitrary number of robots. Furthermore, extensive illustrative simulations and outdoor experiments with fixed-wing aircraft validate the effectiveness and robustness of our algorithm.
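For intuition, here is a classical single-robot 2D guiding vector field for following the unit circle, combining a tangential propagation term with a convergence term. The gains and the path are illustrative; the paper's coordinated construction additionally introduces shared virtual coordinates between robots.

```python
import numpy as np

def guiding_vector_field(p, k=1.0):
    """Classical 2D guiding vector field for the unit circle, given as
    the zero level set of phi(x, y) = x^2 + y^2 - 1. The tangential
    term moves the robot along the path; the gradient term drives the
    path error phi to zero."""
    x, y = p
    phi = x**2 + y**2 - 1.0                    # signed path error
    grad = np.array([2.0 * x, 2.0 * y])        # gradient of phi
    E = np.array([[0.0, -1.0], [1.0, 0.0]])    # 90-degree rotation
    return E @ grad - k * phi * grad           # tangent - convergence
```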
Recently, the availability of remote sensing imagery from aerial vehicles and satellites has steadily improved. For automated interpretation of such data, deep-learning-based object detectors achieve state-of-the-art performance. However, established object detectors require complete, precise, and correct bounding box annotations for training. In order to create the necessary training annotations for object detectors, imagery can be georeferenced and combined with data from other sources, such as points of interest localized by GPS sensors. Unfortunately, this combination often leads to poor object localization and missing annotations. Therefore, training object detectors with such data often results in insufficient detection performance. In this paper, we present a novel approach for training object detectors with extremely noisy and incomplete annotations. Our method is based on a teacher-student learning framework and a correction module accounting for imprecise and missing annotations. Thus, our method is easy to use and can be combined with arbitrary object detectors. We demonstrate that our approach improves standard detectors by 37.1\% $AP_{50}$ on a noisy real-world remote-sensing dataset. Furthermore, our method achieves substantial performance gains on two datasets with synthetic noise. Code is available at \url{https://github.com/mxbh/robust_object_detection}.
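As a sketch of the teacher-student backbone such frameworks typically rely on (the paper's correction module for imprecise and missing boxes is paper-specific and not shown), the teacher is commonly maintained as an exponential moving average of the student. The function name and momentum value below are our illustrative choices.

```python
import torch

@torch.no_grad()
def update_teacher(teacher: torch.nn.Module, student: torch.nn.Module,
                   momentum: float = 0.999) -> None:
    """EMA update: teacher weights drift slowly toward the student's,
    yielding more stable pseudo-labels for the noisy training data."""
    for p_t, p_s in zip(teacher.parameters(), student.parameters()):
        p_t.mul_(momentum).add_(p_s, alpha=1.0 - momentum)
```

In a typical setup the teacher starts as a frozen deep copy of the student and `update_teacher` is called once per training step after the student's optimizer update.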
Neural Radiance Fields (NeRF) have received wide attention in Sparse-View (SV) CT reconstruction problems as a self-supervised deep learning framework. NeRF-based SVCT methods model the desired CT image as a continuous function that maps coordinates to intensities and then train a Multi-Layer Perceptron (MLP) to learn the function by minimizing the loss on the SV measurements. Thanks to the continuous representation provided by NeRF, the function can be approximated well and thus a high-quality CT image can be reconstructed. However, existing NeRF-based SVCT methods strictly assume that there is no relative motion during the CT acquisition, because they require accurate projection poses to simulate the X-rays that scan the SV sinogram. Therefore, these methods suffer from severe performance drops for real SVCT imaging with motion. To this end, this work proposes a self-calibrating neural field that recovers the artifact-free image from the rigid-motion-corrupted SV measurements without using any external data. Specifically, we parametrize the coarse projection poses caused by rigid motion as trainable variables and then jointly optimize these variables and the MLP. We perform numerical experiments on a public COVID-19 CT dataset. The results indicate that our model significantly outperforms two recent NeRF-based methods for SVCT reconstruction with four different levels of rigid motion.
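A minimal PyTorch sketch of the joint-optimization idea, with projection poses as trainable parameters alongside the MLP. The 2D parallel-beam, one-angle-per-view parametrization and the learning rates are our simplifying assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class SelfCalibratingField(nn.Module):
    """Coordinate-to-intensity MLP plus trainable per-view angles."""
    def __init__(self, init_angles: torch.Tensor):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )
        # coarse, motion-corrupted poses become trainable variables
        self.angles = nn.Parameter(init_angles.clone())

field = SelfCalibratingField(torch.linspace(0.0, torch.pi, 60))
# one optimizer jointly updates the MLP weights and the pose variables
optimizer = torch.optim.Adam([
    {"params": field.mlp.parameters(), "lr": 1e-3},
    {"params": [field.angles], "lr": 1e-4},
])
```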
A common approach to transfer learning under distribution shift is to fine-tune the last few layers of a pre-trained model, preserving learned features while also adapting to the new task. This paper shows that in such settings, selectively fine-tuning a subset of layers (which we term surgical fine-tuning) matches or outperforms commonly used fine-tuning approaches. Moreover, the type of distribution shift influences which subset of layers is most effective to tune: for example, for image corruptions, fine-tuning only the first few layers works best. We validate our findings systematically across seven real-world data tasks spanning three types of distribution shifts. Theoretically, we prove that for two-layer neural networks in an idealized setting, first-layer tuning can outperform fine-tuning all layers. Intuitively, fine-tuning more parameters on a small target dataset can cause the information learned during pre-training to be forgotten, and which information is relevant depends on the type of shift.
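A minimal sketch of what surgically fine-tuning the first layers looks like in PyTorch, assuming a torchvision ResNet-50; the layer names `conv1`, `bn1`, and `layer1` are specific to that model family and would differ for other architectures.

```python
import torchvision

def surgical_finetune_first_block(model):
    """Freeze everything, then unfreeze only the earliest layers,
    the regime found best for image-corruption shifts."""
    for p in model.parameters():
        p.requires_grad = False
    for name, p in model.named_parameters():
        if name.startswith(("conv1", "bn1", "layer1")):
            p.requires_grad = True
    return model

model = surgical_finetune_first_block(
    torchvision.models.resnet50(weights="DEFAULT"))
trainable = [p for p in model.parameters() if p.requires_grad]
```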
Wireless sensor networks are among the most promising technologies of the current era because of their small size, low cost, and ease of deployment. With the increasing number of wireless sensors, the probability of generating missing data also rises. This incomplete data could lead to disastrous consequences if used for decision-making. There is a rich literature dealing with this problem. However, most approaches show performance degradation when a sizable amount of data is lost. Inspired by the emerging field of graph signal processing, this paper performs a new study of a Sobolev reconstruction algorithm in wireless sensor networks. Experimental comparisons on several publicly available datasets demonstrate that the algorithm surpasses multiple state-of-the-art techniques by up to 54%. We further show that this algorithm consistently retrieves the missing data even during massive data loss.
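For concreteness, a minimal sketch of a Sobolev-regularized reconstruction in the spirit of such graph-signal methods; the objective form, the parameter names, and the restriction of beta to integers are our simplifying assumptions, and the paper's algorithm may differ in detail.

```python
import numpy as np

def sobolev_reconstruct(L, y, mask, lam=1e-2, eps=1e-2, beta=1):
    """Recover a graph signal from partial samples.

    Solves min_x ||x[mask] - y||^2 + lam * x^T (L + eps*I)^beta x,
    a Sobolev-smoothness-regularized least squares problem.
    L: (N, N) graph Laplacian; y: observed values at sampled nodes;
    mask: boolean array marking which nodes were sampled.
    """
    N = L.shape[0]
    J = np.diag(mask.astype(float))                    # sampling operator
    S = np.linalg.matrix_power(L + eps * np.eye(N), beta)
    rhs = np.zeros(N)
    rhs[mask] = y
    # normal equations: (J + lam * S) x = J^T y
    return np.linalg.solve(J + lam * S, rhs)
```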
The mean field variational inference (MFVI) formulation restricts the general Bayesian inference problem to the subspace of product measures. We present a framework to analyze MFVI algorithms, inspired by a similar development for general variational Bayesian formulations. Our approach enables the MFVI problem to be represented in three different ways: as a gradient flow on Wasserstein space, as a system of Fokker-Planck-like equations, and as a diffusion process. Rigorous guarantees are established to show that a time-discretized implementation of the coordinate ascent variational inference algorithm in the product Wasserstein space of measures yields a gradient flow in the limit. A similar result is obtained for the associated densities, with the limit given by a quasi-linear partial differential equation. A popular class of practical algorithms falls in this framework, which provides tools to establish convergence. We hope this framework can be used to guarantee convergence of algorithms in a variety of approaches, old and new, to solve variational inference problems.
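Schematically, and in our notation rather than the paper's, the gradient-flow representation of MFVI can be sketched as follows: with objective $F(\mu) = \mathrm{KL}(\mu_1 \otimes \cdots \otimes \mu_d \,\|\, \pi)$ minimized over product measures, each factor follows a Wasserstein gradient flow.

```latex
% Schematic gradient-flow form of MFVI (our notation, not the paper's):
\[
  \partial_t \mu_i
    \;=\; \nabla \cdot \Big( \mu_i \, \nabla \frac{\delta F}{\delta \mu_i}(\mu) \Big),
  \qquad i = 1, \dots, d,
\]
% written for the densities, this yields a coupled system of
% Fokker-Planck-like (quasi-linear) PDEs, which coordinate ascent
% variational inference discretizes in time.
```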
Shoe tread impressions are one of the most common types of evidence left at crime scenes. However, the utility of such evidence is limited by the lack of databases of footwear prints that cover the large and growing number of distinct shoe models. Moreover, such a database would ideally contain the 3D shape, or depth, of shoe treads so that shoeprints can be extracted and matched against a query (crime-scene) print. We propose to address this gap by leveraging shoe-tread photos collected by online retailers. The core challenge is to predict depth maps for these photos. Since they lack the ground-truth 3D shape needed to train depth predictors, we exploit synthetic data that provides it. We develop a method termed ShoeRinsics that learns to predict depth by leveraging a mix of fully supervised synthetic data and unsupervised retail image data. In particular, we find that domain adaptation and intrinsic image decomposition techniques effectively mitigate the synthetic-real domain gap and yield significantly better depth prediction. To validate our method, we introduce two validation sets consisting of shoe-tread image and print pairs and define a benchmarking protocol to quantify the quality of predicted depth. On this benchmark, ShoeRinsics outperforms existing methods of depth prediction and synthetic-to-real domain adaptation.
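A schematic of how such a mixed objective might combine supervised synthetic data with unsupervised real data via intrinsic image decomposition; the loss form, weights, and argument names are illustrative assumptions, not ShoeRinsics' exact objective.

```python
import torch.nn.functional as F

def mixed_depth_loss(pred_syn_depth, gt_syn_depth,
                     pred_albedo, pred_shading, real_image):
    """(1) supervised depth loss on synthetic data, plus
    (2) unsupervised reconstruction loss on real retail photos,
    using the intrinsic model image ~ albedo * shading."""
    supervised = F.l1_loss(pred_syn_depth, gt_syn_depth)
    reconstruction = F.l1_loss(pred_albedo * pred_shading, real_image)
    return supervised + 0.5 * reconstruction
```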
Variable selection is crucial for sparse modeling in this age of big data. Missing values are common in data and make variable selection more complicated. The approach of multiple imputation (MI) produces multiply imputed datasets for the missing values and has been widely applied in various variable selection procedures. However, directly performing variable selection on the whole MI data or bootstrapped MI data may not be worthwhile in terms of computational cost. To quickly identify the active variables in the linear regression model, we propose the adaptive grafting procedure with three pooling rules on MI data. The proposed methods proceed iteratively, starting from the active variables found on the complete-case subset and then expanding the working data matrix in both the number of active variables and the number of available observations. A comprehensive simulation study shows the selection accuracy in different aspects and the computational efficiency of the proposed methods. Two real-life examples illustrate the strength of the proposed methods.
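For intuition, a minimal grafting loop for the lasso; this is our simplified illustration of the generic grafting idea, and the paper's adaptive variant with MI pooling rules is more involved.

```python
import numpy as np

def grafting_lasso(X, y, lam=0.1, max_vars=20):
    """Repeatedly add the inactive variable whose gradient most
    violates the KKT condition, then refit on the active set."""
    n, p = X.shape
    active, beta = [], np.zeros(p)
    for _ in range(max_vars):
        grad = X.T @ (X @ beta - y) / n        # gradient of 0.5 * MSE
        grad[active] = 0.0                     # only inspect inactive vars
        j = int(np.argmax(np.abs(grad)))
        if np.abs(grad[j]) <= lam:             # KKT satisfied: stop
            break
        active.append(j)
        # refit on the active set (plain least squares for simplicity)
        beta[:] = 0.0
        beta[active], *_ = np.linalg.lstsq(X[:, active], y, rcond=None)
    return beta, active
```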
While consolidation strategies form the backbone of many supply chain optimisation problems, the exploitation of multi-tier material relationships through consolidation remains an understudied area, despite being a prominent feature of industries that produce complex made-to-order products. In this paper, we propose an optimisation framework for exploiting multi-to-multi relationships between tiers of a supply chain. The resulting formulation is flexible, allowing quantity discounts, inventory holding, and transport costs to be included. The framework introduces a new trade-off between the tiers, with cost reductions at one tier coming at the expense of increased costs in the other, which helps to reduce the overall procurement cost in the supply chain. A mixed integer linear programming model is developed and tested on a range of small- to large-scale test problems from aerospace manufacturing. Our comparison to benchmark results shows that there is indeed a cost trade-off between the two tiers, and that the overall cost can be reduced using a holistic approach to reconfiguration. Costs decrease when second-tier fixed ordering costs and the number of machining options increase. Consolidation results in lower inventory holding costs in all cases. A number of secondary effects, such as simplified supplier selection, may also be observed.
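To make the flavour of such a formulation concrete, here is a toy two-tier MILP in PuLP with unit costs, fixed ordering costs, and a supplier-usage linking constraint. The data and structure are our illustrative assumptions; the paper's model additionally covers quantity discounts, inventory holding, and transport costs.

```python
import pulp

# toy instance: two second-tier suppliers, three parts
suppliers, parts = ["s1", "s2"], ["p1", "p2", "p3"]
unit_cost = {("s1", "p1"): 4, ("s1", "p2"): 6, ("s1", "p3"): 5,
             ("s2", "p1"): 5, ("s2", "p2"): 4, ("s2", "p3"): 6}
fixed_order_cost = {"s1": 30, "s2": 25}   # second-tier fixed ordering cost
demand = {"p1": 10, "p2": 8, "p3": 12}

m = pulp.LpProblem("tier_consolidation", pulp.LpMinimize)
x = pulp.LpVariable.dicts("qty", unit_cost, lowBound=0)       # order quantities
use = pulp.LpVariable.dicts("use", suppliers, cat="Binary")   # supplier used?

# objective: variable purchase cost plus fixed ordering cost
m += (pulp.lpSum(unit_cost[s, p] * x[s, p] for s, p in unit_cost)
      + pulp.lpSum(fixed_order_cost[s] * use[s] for s in suppliers))
for p in parts:                        # meet demand for every part
    m += pulp.lpSum(x[s, p] for s in suppliers) == demand[p]
for s, p in unit_cost:                 # only order from a used supplier
    m += x[s, p] <= demand[p] * use[s]
m.solve(pulp.PULP_CBC_CMD(msg=False))
```

Consolidating orders onto fewer suppliers trades higher unit costs against saved fixed ordering costs, which is the basic tension the full framework balances across tiers.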