The point spread function (PSF) plays a crucial role in many computational imaging applications, such as shape from focus/defocus, depth estimation, and fluorescence microscopy. However, the mathematical model of the defocus process remains unclear. In this work, we develop an alternative method to estimate a precise mathematical model of the point spread function that describes the defocus process. We first derive the mathematical form of the PSF, which is used to generate simulated focused images at different focus depths. We then define a loss function measuring the similarity between the simulated and real focused images, for which we design a novel and efficient metric based on the defocus histogram to evaluate the difference between focused images. Minimizing this loss function yields the optimal parameters of the PSF. We also construct a hardware system, consisting of a focusing system and a structured-light system, to acquire the all-in-focus image, focused images with their corresponding focus depths, and the depth map of the same view. These three types of images form a dataset used to obtain the precise PSF. Experiments on standard planes and real objects show that the proposed algorithm accurately describes the defocus process. Its accuracy is further confirmed by evaluating the differences among the actual focused images, the focused images generated by our algorithm, and those generated by other methods: the loss of our algorithm is on average 40% lower than that of the alternatives.
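The estimation loop described above (simulate a focused image from the all-in-focus image with a candidate PSF, score it against the real focused image, keep the parameters that minimise the loss) can be sketched in Python. This toy uses a Gaussian PSF and a plain intensity-histogram L1 distance as stand-ins; the paper's PSF model and defocus-histogram metric are more elaborate, and all names and values here are illustrative.

```python
import numpy as np

def gaussian_psf(radius, sigma):
    """Isotropic Gaussian PSF kernel; sigma would be tied to focus depth."""
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def simulate_defocus(aif, sigma, radius=4):
    """Blur the all-in-focus image with the candidate PSF (naive convolution)."""
    k = gaussian_psf(radius, sigma)
    padded = np.pad(aif, radius, mode="edge")
    out = np.empty_like(aif)
    h, w = aif.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + 2*radius + 1, j:j + 2*radius + 1] * k)
    return out

def histogram_loss(a, b, bins=32):
    """L1 distance between intensity histograms (toy stand-in for the
    defocus-histogram metric)."""
    ha, _ = np.histogram(a, bins=bins, range=(0.0, 1.0), density=True)
    hb, _ = np.histogram(b, bins=bins, range=(0.0, 1.0), density=True)
    return float(np.abs(ha - hb).mean())

# Grid-search the PSF parameter against a "real" focused image.
rng = np.random.default_rng(0)
aif = rng.random((32, 32))
real = simulate_defocus(aif, sigma=1.5)   # pretend this came from the camera
sigmas = [0.5, 1.0, 1.5, 2.0]
losses = [histogram_loss(simulate_defocus(aif, s), real) for s in sigmas]
best_sigma = sigmas[int(np.argmin(losses))]
```

In a real system the grid search would be replaced by a proper optimiser over all PSF parameters, and the comparison would run over many focus depths simultaneously.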
We present a novel monocular localization framework that jointly trains deep learning-based depth prediction and Bayesian filtering-based pose reasoning. The proposed cross-modal framework significantly outperforms deep learning-only prediction in model scalability and tolerance to environmental variations. Specifically, we show little to no degradation of pose accuracy even with extremely poor depth estimates from a lightweight depth predictor. Our framework also maintains high pose accuracy under extreme lighting variations, unlike standard deep learning, even without explicit domain adaptation. By explicitly representing the map and intermediate feature maps (such as depth estimates), our framework also allows for faster map updates and for reusing intermediate predictions in other tasks, such as obstacle avoidance, resulting in much higher resource efficiency.
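As a caricature of the filtering side, a scalar Bayesian (Kalman) update shows why poor depth-derived measurements degrade a pose estimate gracefully rather than catastrophically: their stated uncertainty simply down-weights them. This is a hypothetical one-dimensional stand-in, not the paper's filter over full poses and maps.

```python
import numpy as np

def kalman_update(mu, var, z, r):
    """Fuse a scalar belief (mu, var) with a measurement z of variance r."""
    k = var / (var + r)          # small r -> trust z; large r -> ignore z
    return mu + k * (z - mu), (1.0 - k) * var

rng = np.random.default_rng(0)
true_pose = 5.0
mu, var = 0.0, 100.0                      # broad prior over the pose
for _ in range(50):
    z = true_pose + rng.normal(0.0, 2.0)  # noisy depth-derived pseudo-measurement
    mu, var = kalman_update(mu, var, z, r=4.0)
```

Even with individually poor measurements, the posterior concentrates near the true pose as evidence accumulates.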
The miniaturization of inertial measurement units (IMUs) facilitates their widespread use in a growing number of application domains. Orientation estimation is a prerequisite for most further data processing steps in inertial motion tracking, such as position/velocity estimation, joint angle estimation, and 3D visualization. Errors in the estimated orientations severely affect all further processing steps. Recent systematic comparisons of existing algorithms show that out-of-the-box accuracy is often low and that application-specific tuning is required to obtain high accuracy. In the present work, we propose and extensively evaluate a quaternion-based orientation estimation algorithm that is based on a novel approach of filtering the acceleration measurements in an almost-inertial frame, and that includes extensions for gyroscope bias estimation and magnetic disturbance rejection, as well as a variant for offline data processing. In contrast to all existing work, we perform an extensive evaluation using a large collection of publicly available datasets and eight literature methods for comparison. The proposed method consistently outperforms all literature methods and achieves an average RMSE of 2.9°, while the errors obtained with the literature methods range from 5.3° to 16.7°. Since the evaluation was performed with a single fixed parametrization across a very diverse dataset collection, we conclude that the proposed method provides unprecedented out-of-the-box performance for a broad range of motions, sensor hardware, and environmental conditions. This gain in orientation estimation accuracy is expected to advance the field of IMU-based motion analysis and to provide performance benefits in numerous applications. The provided open-source implementation makes it easy to employ the proposed method.
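For context, the strapdown gyroscope integration that any quaternion-based orientation estimator builds on can be sketched as follows; the paper's actual contributions (acceleration filtering in an almost-inertial frame, bias estimation, magnetic disturbance rejection) sit on top of this prediction step and are not shown.

```python
import numpy as np

def quat_mult(q, r):
    """Hamilton product of two quaternions (w, x, y, z)."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def quat_from_rotvec(v):
    """Unit quaternion for a rotation vector (axis * angle)."""
    angle = np.linalg.norm(v)
    if angle < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    axis = v / angle
    return np.concatenate(([np.cos(angle / 2)], np.sin(angle / 2) * axis))

def predict(q, gyro, dt):
    """One strapdown integration step of the gyroscope reading."""
    return quat_mult(q, quat_from_rotvec(gyro * dt))

# Integrate a constant 90 deg/s rotation about z for one second at 100 Hz.
q = np.array([1.0, 0.0, 0.0, 0.0])
gyro = np.array([0.0, 0.0, np.pi / 2])
for _ in range(100):
    q = predict(q, gyro, 0.01)
```

After one second the quaternion encodes a 90° rotation about z; in practice this prediction drifts with gyroscope bias and noise, which is exactly what the accelerometer and magnetometer corrections are for.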
Since batch algorithms suffer from a lack of robustness to model mismatches and disturbances, this contribution proposes an adaptive scheme based on a continuous Lyapunov function for online identification of robot dynamics. The paper proposes stable update rules for training neural networks, inspired by the model reference adaptive control paradigm. The network structure consists of three parallel self-driving neural networks that individually estimate the terms of the robot dynamics. A Lyapunov candidate is selected to construct an energy surface for a convex optimization framework, and the learning rules are derived directly from the Lyapunov function so that its time derivative is negative. Finally, experimental results on the 3-DOF Phantom Omni haptic device demonstrate the efficiency of the proposed method.
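A minimal scalar sketch of the idea, assuming a textbook model-reference adaptive setup rather than the paper's three-network robot-dynamics structure: the update rule `theta_hat_dot = gamma * e * x` is derived from a quadratic Lyapunov candidate precisely so that its derivative along trajectories is negative semi-definite.

```python
import numpy as np

# Plant: xdot = a_p*x + u with unknown a_p; control u = -theta_hat*x + r;
# reference model: xm_dot = -a_m*xm + r.  For the Lyapunov candidate
# V = e**2/2 + (theta_hat - theta_star)**2/(2*gamma), with e = x - xm and
# ideal gain theta_star = a_p + a_m, the rule theta_hat_dot = gamma*e*x
# yields Vdot = -a_m*e**2 <= 0.
a_p, a_m, gamma, dt = 1.0, 2.0, 2.0, 0.01
theta_star = a_p + a_m
theta_hat, x, xm = 0.0, 0.0, 0.0
V0 = (theta_hat - theta_star) ** 2 / (2 * gamma)       # e = 0 initially
for k in range(10000):
    r = np.sin(dt * k)                                  # exciting reference
    e = x - xm
    theta_hat += dt * gamma * e * x                     # Lyapunov-derived rule
    x += dt * (a_p * x - theta_hat * x + r)
    xm += dt * (-a_m * xm + r)
V_end = (x - xm) ** 2 / 2 + (theta_hat - theta_star) ** 2 / (2 * gamma)
```

The energy V shrinks over the run and the tracking error stays bounded, which is the discrete analogue of the negative-derivative argument in the abstract.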
Continuum robots have the potential to enable new applications in medicine, inspection, and countless other areas due to their unique shape, compliance, and size. Excellent progress has been made in the mechanical design and dynamic modelling of continuum robots, to the point that there are some canonical designs, although new concepts continue to be explored. In this paper, we turn to the problem of state estimation for continuum robots that can be modelled with the common Cosserat rod model. Sensing for continuum robots might comprise external camera observations, embedded tracking coils, or strain gauges. We repurpose a Gaussian process (GP) regression approach to state estimation, initially developed for continuous-time trajectory estimation in $SE(3)$. In our case, the continuous variable is not time but arclength, and we show how to estimate the continuous shape (and strain) of the robot (along with associated uncertainties) given discrete, noisy measurements of both pose and strain along the length. We demonstrate our approach quantitatively through simulations as well as through experiments. Our evaluations show that accurate and continuous estimates of a continuum robot's shape can be achieved, resulting in average end-effector errors between the estimated and ground-truth shape as low as 3.5mm and 0.016$^\circ$ in simulation, or 3.3mm and 0.035$^\circ$ for unloaded configurations and 6.2mm and 0.041$^\circ$ for loaded ones in experiments, when using discrete pose measurements.
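A toy one-dimensional GP regression over arclength illustrates the mechanism: discrete, noisy measurements along the length yield a continuous posterior mean and uncertainty. A squared-exponential kernel and a scalar lateral deflection stand in for the paper's $SE(3)$ state and physically informed prior.

```python
import numpy as np

def sqexp(a, b, ell=0.2, sf=1.0):
    """Squared-exponential kernel over arclength (illustrative choice)."""
    d = a[:, None] - b[None, :]
    return sf**2 * np.exp(-0.5 * (d / ell) ** 2)

# Noisy scalar "shape" measurements (lateral deflection) at discrete arclengths.
rng = np.random.default_rng(0)
s_meas = np.linspace(0.0, 1.0, 6)
y_meas = 0.1 * np.sin(2 * np.pi * s_meas) + rng.normal(0.0, 0.01, 6)

# GP posterior (mean and covariance) on a dense arclength grid.
s_query = np.linspace(0.0, 1.0, 51)
K = sqexp(s_meas, s_meas) + 1e-4 * np.eye(6)     # 1e-4 = measurement noise var
Ks = sqexp(s_query, s_meas)
mean = Ks @ np.linalg.solve(K, y_meas)
cov = sqexp(s_query, s_query) - Ks @ np.linalg.solve(K, Ks.T)
var = np.diag(cov)
```

As expected, the posterior variance dips at measurement locations and grows between them, which is the continuous uncertainty the abstract refers to.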
The high dimensionality of hyperspectral images, which consist of several hundred bands, often imposes a large computational burden on image processing. Spectral band selection is therefore an essential step for removing irrelevant, noisy, and redundant bands, and consequently for increasing classification accuracy. However, identifying useful bands among hundreds or even thousands of correlated bands is a nontrivial task. This paper aims at identifying a small set of highly discriminative bands to improve computational speed and prediction accuracy. We propose a new strategy based on joint mutual information to measure the statistical dependence and correlation between the selected bands and to evaluate the relative utility of each band for classification. The proposed filter approach is compared to effective reproduced filters based on mutual information. Simulation results on the hyperspectral image HSI AVIRIS 92AV3C using an SVM classifier show that the proposed algorithm outperforms the reproduced filter strategies. Keywords: hyperspectral images, classification, band selection, joint mutual information, dimensionality reduction, correlation, SVM.
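A minimal sketch of mutual-information-based greedy band selection. The redundancy term here is a simple mRMR-style average rather than the paper's joint-mutual-information criterion, and the data are synthetic discrete bands; it only illustrates the relevance-versus-redundancy trade-off.

```python
from math import log

def mutual_info(x, y):
    """Empirical mutual information between two discrete sequences (nats)."""
    n = len(x)
    px, py, pxy = {}, {}, {}
    for a, b in zip(x, y):
        px[a] = px.get(a, 0) + 1
        py[b] = py.get(b, 0) + 1
        pxy[(a, b)] = pxy.get((a, b), 0) + 1
    return sum((c / n) * log(c * n / (px[a] * py[b])) for (a, b), c in pxy.items())

def select_bands(bands, labels, k):
    """Greedy forward selection: relevance to the labels minus average
    redundancy with already-selected bands."""
    chosen, remaining = [], list(range(len(bands)))
    for _ in range(k):
        def score(j):
            rel = mutual_info(bands[j], labels)
            red = (sum(mutual_info(bands[j], bands[i]) for i in chosen) / len(chosen)
                   if chosen else 0.0)
            return rel - red
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Synthetic check: band 1 duplicates band 0, band 2 carries complementary
# information, and the class label encodes both bits.
u = [0, 1] * 50
v = [0, 0, 1, 1] * 25
labels = [2 * a + b for a, b in zip(u, v)]
picked = select_bands([u, list(u), v], labels, 2)
```

The selector skips the duplicated band in favour of the complementary one, which is the behaviour a good redundancy-aware criterion should exhibit.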
In this paper, we study the infinite-depth limit of finite-width residual neural networks with random Gaussian weights. With proper scaling, we show that by fixing the width and taking the depth to infinity, the pre-activations converge in distribution to a zero-drift diffusion process. Unlike the infinite-width limit, where the pre-activations converge weakly to a Gaussian random variable, the infinite-depth limit yields different distributions depending on the choice of the activation function. We document two cases where these distributions have distinct closed-form expressions. We further show an intriguing change-of-regime phenomenon of the post-activation norms when the width increases from 3 to 4. Lastly, we study the sequential infinite-depth-then-infinite-width limit and compare it with the more commonly studied infinite-width-then-infinite-depth limit.
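The finite-width, large-depth regime in question can be simulated directly. The sketch below uses tanh activations and a $1/\sqrt{L}$ scaling of the residual branch, which is one natural choice of the "proper scaling" under which pre-activations approach a diffusion; the specific scaling in the paper may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
n, L = 4, 1000                        # fixed width n, large depth L
x = np.ones(n) / np.sqrt(n)           # unit-norm input pre-activation
norms = []
for _ in range(L):
    W = rng.standard_normal((n, n)) / np.sqrt(n)   # random Gaussian weights
    x = x + (W @ np.tanh(x)) / np.sqrt(L)          # residual step, 1/sqrt(L) scale
    norms.append(float(np.linalg.norm(x)))
```

With this scaling the per-layer updates are O(1/sqrt(L)), so the sequence of pre-activations behaves like a discretised stochastic process whose norm neither explodes nor collapses as L grows.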
Numerous studies have examined the associations between long-term exposure to fine particulate matter (PM2.5) and adverse health outcomes. Recently, many of these studies have begun to employ high-resolution predicted PM2.5 concentrations, which are subject to measurement error. Previous approaches for exposure measurement error correction have either been applied in non-causal settings or have only considered a categorical exposure. Moreover, most procedures have failed to account for uncertainty induced by error correction when fitting an exposure-response function (ERF). To remedy these deficiencies, we develop a multiple imputation framework that combines regression calibration and Bayesian techniques to estimate a causal ERF. We demonstrate how the output of the measurement error correction steps can be seamlessly integrated into a Bayesian additive regression trees (BART) estimator of the causal ERF. We also demonstrate how locally-weighted smoothing of the posterior samples from BART can be used to create a more accurate ERF estimate. Our proposed approach also properly propagates the exposure measurement error uncertainty to yield accurate standard error estimates. We assess the robustness of our proposed approach in an extensive simulation study. We then apply our methodology to estimate the effects of PM2.5 on all-cause mortality among Medicare enrollees in New England from 2000 to 2012.
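A simplified numerical sketch of the error-correction idea, using linear regression calibration with bootstrap-based multiple imputation in place of the paper's Bayesian draws and BART estimator. All data are synthetic; the point is only that the naive slope is attenuated while the calibrated, pooled slope recovers the truth and propagates calibration uncertainty across imputations.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
x = rng.normal(0.0, 1.0, n)             # true exposure (standardised)
w = x + rng.normal(0.0, 0.5, n)         # error-prone measured exposure
y = 2.0 * x + rng.normal(0.0, 1.0, n)   # outcome; true slope is 2

def slope(z, t):
    A = np.vstack([np.ones(len(z)), z]).T
    return float(np.linalg.lstsq(A, t, rcond=None)[0][1])

naive = slope(w, y)                      # attenuated towards zero

# Regression calibration fitted on a validation subsample where x is known,
# with bootstrap resampling across imputations to propagate calibration
# uncertainty; the M estimates are pooled Rubin-style by averaging.
val = rng.choice(n, 400, replace=False)
M, estimates = 20, []
for _ in range(M):
    bs = rng.choice(val, val.size, replace=True)
    A = np.vstack([np.ones(bs.size), w[bs]]).T
    c0, c1 = np.linalg.lstsq(A, x[bs], rcond=None)[0]
    x_cal = c0 + c1 * w                  # E[x | w] under the calibration model
    estimates.append(slope(x_cal, y))
pooled = float(np.mean(estimates))
```

The spread of `estimates` is what a full Rubin's-rules variance calculation would add to the within-imputation standard error, which is the uncertainty propagation the abstract emphasises.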
In this work, we present a method for synthetic CT (sCT) generation from zero-echo-time (ZTE) MRI aimed at structural and quantitative accuracy of the image, with a particular focus on accurate prediction of bone density values. We propose a loss function that favors a spatially sparse region in the image. We harness the ability of a multi-task network to produce correlated outputs as a framework to enable localisation of a region of interest (RoI) via classification, emphasize regression of values within the RoI, and still retain overall accuracy via global regression. The network is optimized by a composite loss function that combines a dedicated loss from each task. We demonstrate how the multi-task network with the RoI-focused loss offers an advantage over other configurations of the network in achieving higher prediction accuracy. This is relevant to sCT generation, where failure to accurately estimate the high Hounsfield unit values of bone could impair accuracy in clinical applications. We compare dose calculation maps from the proposed sCT and the real CT in a radiation therapy treatment planning setup.
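A sketch of what such a composite loss could look like, combining an RoI classification term, an RoI-restricted regression term, and a global regression term. The choice of BCE/L1 and the unit weights are illustrative assumptions, not the paper's exact losses.

```python
import numpy as np

def composite_loss(pred_val, roi_logit, target_val, roi_mask,
                   w_cls=1.0, w_roi=1.0, w_glob=1.0):
    """Composite loss = RoI classification (BCE) + L1 regression inside the
    RoI + global L1 regression. Terms and weights are illustrative."""
    eps = 1e-7
    p = 1.0 / (1.0 + np.exp(-roi_logit))                       # sigmoid
    bce = -np.mean(roi_mask * np.log(p + eps)
                   + (1.0 - roi_mask) * np.log(1.0 - p + eps))
    roi_l1 = np.sum(roi_mask * np.abs(pred_val - target_val)) / max(roi_mask.sum(), 1.0)
    glob_l1 = np.mean(np.abs(pred_val - target_val))
    return w_cls * bce + w_roi * roi_l1 + w_glob * glob_l1

# A perfect prediction scores near zero; a biased one is penalised twice,
# globally and again inside the RoI (here the RoI plays the role of bone).
target = np.array([[0.0, 800.0], [100.0, 1200.0]])    # HU-like toy values
roi = np.array([[0.0, 1.0], [0.0, 1.0]])              # "bone" mask
good = composite_loss(target, 10.0 * (2 * roi - 1), target, roi)
bad = composite_loss(target - 50.0, 10.0 * (2 * roi - 1), target, roi)
```

The double penalty on RoI voxels is what biases training towards accurate high-value (bone) predictions without sacrificing the global fit.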
This work is concerned with epidemiological models defined on networks, which highlight the prominent role of the social contact network of a given population in the spread of infectious diseases. In particular, we address the modelling and analysis of very large networks. As a basic epidemiological model, we focus on an SEIR (Susceptible-Exposed-Infective-Removed) model governing the behaviour of an infectious disease in a population of individuals partitioned into sub-populations. We study the long-time behaviour of the dynamics of this model, which accounts for the heterogeneity of the infections and of the social network. Relying on the theory of graphons, we explore the natural question of the large-population limit and investigate the behaviour of the model as the size of the network tends to infinity. After establishing the existence and uniqueness of solutions to the models under consideration, we discuss the possibility of using the graphon-based limit model as a generative model for a network with particular statistical properties relating to the distribution of connections. We also provide some preliminary numerical tests.
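The network SEIR dynamics can be sketched with explicit Euler steps over the sub-population compartments. The contact matrix `A` and the rate constants below are illustrative; the graphon limit in the abstract corresponds to letting the number of sub-populations grow while `A` samples a fixed kernel.

```python
import numpy as np

def seir_network_step(S, E, I, R, A, beta, sigma, gamma, dt):
    """One explicit-Euler step of a network SEIR model; A[i, j] is the
    contact weight between sub-populations i and j."""
    force = beta * S * (A @ I)        # force of infection on each node
    dS = -force
    dE = force - sigma * E
    dI = sigma * E - gamma * I
    dR = gamma * I
    return S + dt*dS, E + dt*dE, I + dt*dI, R + dt*dR

# Small random symmetric contact network, all nodes slightly infective.
rng = np.random.default_rng(0)
n = 10
A = rng.random((n, n))
A = (A + A.T) / 2
np.fill_diagonal(A, 0.0)
S = np.full(n, 0.99); E = np.zeros(n); I = np.full(n, 0.01); R = np.zeros(n)
for _ in range(2000):
    S, E, I, R = seir_network_step(S, E, I, R, A, 0.3, 0.2, 0.1, 0.05)
```

Each compartment vector has one entry per sub-population, so heterogeneity enters only through `A`; the per-node fractions are conserved exactly by construction.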
Human head pose estimation has become an essential problem in facial analysis, with many computer vision applications such as gaze estimation, virtual reality, and driver assistance. Given the importance of the head pose estimation problem, it is necessary to design a compact model that solves this task at low computational cost, while maintaining accuracy, for deployment in facial-analysis applications such as large camera surveillance systems and AI cameras. In this work, we propose a lightweight model that effectively addresses the head pose estimation problem. Our approach has two main steps. 1) We first train many teacher models on the synthetic 300W-LPA dataset to obtain head pose pseudo-labels. 2) We design an architecture with a ResNet18 backbone and train our proposed model on the ensemble of these pseudo-labels via knowledge distillation. To evaluate the effectiveness of our model, we use AFLW-2000 and BIWI, two real-world head pose datasets. Experimental results show that our proposed model significantly improves accuracy in comparison with state-of-the-art head pose estimation methods. Furthermore, our model achieves a real-time speed of $\sim$300 FPS when inferring on a Tesla V100.
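The two-step pipeline above (ensemble pseudo-labelling, then distillation) can be caricatured in a few lines. A linear "student" on synthetic features stands in for the ResNet18 model, and all numbers are illustrative; the point is that averaging noisy teachers yields a cleaner target than any single teacher.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, T = 200, 8, 5
feats = rng.normal(0.0, 1.0, (N, D))            # stand-in image features
W_true = rng.normal(0.0, 1.0, (D, 3))
pose = (feats @ W_true) * 10.0                  # (yaw, pitch, roll) per image

# Step 1: several teachers produce noisy head-pose pseudo-labels;
# the ensemble mean serves as the distillation target.
teacher_preds = pose[None] + rng.normal(0.0, 5.0, (T, N, 3))
pseudo = teacher_preds.mean(axis=0)

# Step 2: a toy linear "student" is fitted to the pseudo-labels.
W_student = np.linalg.lstsq(feats, pseudo, rcond=None)[0]
student_mse = float(np.mean((feats @ W_student - pose) ** 2))
teacher_mse = float(np.mean((teacher_preds[0] - pose) ** 2))
```

Ensembling divides the teacher noise variance by the number of teachers, and fitting a compact student to the averaged targets reduces it further, mirroring how the distilled model can match or beat its individual teachers.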