It is highly desirable to know how uncertain a model's predictions are, especially for models that are complex and hard to understand as in deep learning. Although there has been a growing interest in using deep learning methods in diffusion-weighted MRI, prior works have not addressed the issue of model uncertainty. Here, we propose a deep learning method to estimate the diffusion tensor and compute the estimation uncertainty. Data-dependent uncertainty is computed directly by the network and learned via loss attenuation. Model uncertainty is computed using Monte Carlo dropout. We also propose a new method for evaluating the quality of predicted uncertainties. We compare the new method with the standard least-squares tensor estimation and bootstrap-based uncertainty computation techniques. Our experiments show that when the number of measurements is small the deep learning method is more accurate and its uncertainty predictions are better calibrated than the standard methods. We show that the estimation uncertainties computed by the new method can highlight the model's biases, detect domain shift, and reflect the strength of noise in the measurements. Our study shows the importance and practical value of modeling prediction uncertainties in deep learning-based diffusion MRI analysis.
Research on machine learning for channel estimation, especially neural network solutions for wireless communications, is attracting significant current interest. This is because conventional methods cannot meet the present demands of the high speed communication. In the paper, we deploy a general residual convolutional neural network to achieve channel estimation for the orthogonal frequency-division multiplexing (OFDM) signals in a downlink scenario. Our method also deploys a simple interpolation layer to replace the transposed convolutional layer used in other networks to reduce the computation cost. The proposed method is more easily adapted to different pilot patterns and packet sizes. Compared with other deep learning methods for channel estimation, our results for 3GPP channel models suggest improved mean squared error performance for our approach.
We study the shape reconstruction of an inclusion from the {faraway} measurement of the associated electric field. This is an inverse problem of practical importance in biomedical imaging and is known to be notoriously ill-posed. By incorporating Drude's model of the permittivity parameter, we propose a novel reconstruction scheme by using the plasmon resonance with a significantly enhanced resonant field. We conduct a delicate sensitivity analysis to establish a sharp relationship between the sensitivity of the reconstruction and the plasmon resonance. It is shown that when plasmon resonance occurs, the sensitivity functional blows up and hence ensures a more robust and effective construction. Then we combine the Tikhonov regularization with the Laplace approximation to solve the inverse problem, which is an organic hybridization of the deterministic and stochastic methods and can quickly calculate the minimizer while capture the uncertainty of the solution. We conduct extensive numerical experiments to illustrate the promising features of the proposed reconstruction scheme.
The saddlepoint approximation gives an approximation to the density of a random variable in terms of its moment generating function. When the underlying random variable is itself the sum of $n$ unobserved i.i.d. terms, the basic classical result is that the relative error in the density is of order $1/n$. If instead the approximation is interpreted as a likelihood and maximised as a function of model parameters, the result is an approximation to the maximum likelihood estimate (MLE) that can be much faster to compute than the true MLE. This paper proves the analogous basic result for the approximation error between the saddlepoint MLE and the true MLE: subject to certain explicit identifiability conditions, the error has asymptotic size $O(1/n^2)$ for some parameters, and $O(1/n^{3/2})$ or $O(1/n)$ for others. In all three cases, the approximation errors are asymptotically negligible compared to the inferential uncertainty. The proof is based on a factorisation of the saddlepoint likelihood into an exact and approximate term, along with an analysis of the approximation error in the gradient of the log-likelihood. This factorisation also gives insight into alternatives to the saddlepoint approximation, including a new and simpler saddlepoint approximation, for which we derive analogous error bounds. As a corollary of our results, we also obtain the asymptotic size of the MLE error approximation when the saddlepoint approximation is replaced by the normal approximation.
This paper presents an algorithm for iterative joint channel parameter (carrier phase, Doppler shift and Doppler rate) estimation and decoding of transmission over channels affected by Doppler shift and Doppler rate using a distributed receiver. This algorithm is derived by applying the sum-product algorithm (SPA) to a factor graph representing the joint a posteriori distribution of the information symbols and channel parameters given the channel output. In this paper, we present two methods for dealing with intractable messages of the sum-product algorithm. In the first approach, we use particle filtering with sequential importance sampling (SIS) for the estimation of the unknown parameters. We also propose a method for fine-tuning of particles for improved convergence. In the second approach, we approximate our model with a random walk phase model, followed by a phase tracking algorithm and polynomial regression algorithm to estimate the unknown parameters. We derive the Weighted Bayesian Cramer-Rao Bounds (WBCRBs) for joint carrier phase, Doppler shift and Doppler rate estimation, which take into account the prior distribution of the estimation parameters and are accurate lower bounds for all considered Signal to Noise Ratio (SNR) values. Numerical results (of bit error rate (BER) and the mean-square error (MSE) of parameter estimation) suggest that phase tracking with the random walk model slightly outperforms particle filtering. However, particle filtering has a lower computational cost than the random walk model based method.
We study the off-policy evaluation (OPE) problem in an infinite-horizon Markov decision process with continuous states and actions. We recast the $Q$-function estimation into a special form of the nonparametric instrumental variables (NPIV) estimation problem. We first show that under one mild condition the NPIV formulation of $Q$-function estimation is well-posed in the sense of $L^2$-measure of ill-posedness with respect to the data generating distribution, bypassing a strong assumption on the discount factor $\gamma$ imposed in the recent literature for obtaining the $L^2$ convergence rates of various $Q$-function estimators. Thanks to this new well-posed property, we derive the first minimax lower bounds for the convergence rates of nonparametric estimation of $Q$-function and its derivatives in both sup-norm and $L^2$-norm, which are shown to be the same as those for the classical nonparametric regression (Stone, 1982). We then propose a sieve two-stage least squares estimator and establish its rate-optimality in both norms under some mild conditions. Our general results on the well-posedness and the minimax lower bounds are of independent interest to study not only other nonparametric estimators for $Q$-function but also efficient estimation on the value of any target policy in off-policy settings.
In countries where population census and sample survey data are limited, generating accurate subnational estimates of health and demographic indicators is challenging. Existing model-based geostatistical methods leverage covariate information and spatial smoothing to reduce the variability of estimates but often assume the survey design is ignorable, which may be inappropriate given the complex design of household surveys typically used in this context. On the other hand, small area estimation approaches common in the survey statistics literature do not incorporate both unit-level covariate information and spatial smoothing in a design-consistent way. We propose a new smoothed model-assisted estimator that accounts for survey design and leverages both unit-level covariates and spatial smoothing, bridging the survey statistics and model-based geostatistics perspectives. Under certain assumptions, the new estimator can be viewed as both design-consistent and model-consistent, offering potential benefits from both perspectives. We demonstrate our estimator's performance using both real and simulated data, comparing it with existing design-based and model-based estimators.
Continuous determinantal point processes (DPPs) are a class of repulsive point processes on $\mathbb{R}^d$ with many statistical applications. Although an explicit expression of their density is known, it is too complicated to be used directly for maximum likelihood estimation. In the stationary case, an approximation using Fourier series has been suggested, but it is limited to rectangular observation windows and no theoretical results support it. In this contribution, we investigate a different way to approximate the likelihood by looking at its asymptotic behaviour when the observation window grows towards $\mathbb{R}^d$. This new approximation is not limited to rectangular windows, is faster to compute than the previous one, does not require any tuning parameter, and some theoretical justifications are provided. It moreover provides an explicit formula for estimating the asymptotic variance of the associated estimator. The performances are assessed in a simulation study on standard parametric models on $\mathbb{R}^d$ and compare favourably to common alternative estimation methods for continuous DPPs.
The additive hazards model specifies the effect of covariates on the hazard in an additive way, in contrast to the popular Cox model, in which it is multiplicative. As non-parametric model, it offers a very flexible way of modeling time-varying covariate effects. It is most commonly estimated by ordinary least squares. In this paper we consider the case where covariates are bounded, and derive the maximum likelihood estimator under the constraint that the hazard is non-negative for all covariate values in their domain. We describe an efficient algorithm to find the maximum likelihood estimator. The method is contrasted with the ordinary least squares approach in a simulation study, and the method is illustrated on a realistic data set.
Data augmentation has been widely used for training deep learning systems for medical image segmentation and plays an important role in obtaining robust and transformation-invariant predictions. However, it has seldom been used at test time for segmentation and not been formulated in a consistent mathematical framework. In this paper, we first propose a theoretical formulation of test-time augmentation for deep learning in image recognition, where the prediction is obtained through estimating its expectation by Monte Carlo simulation with prior distributions of parameters in an image acquisition model that involves image transformations and noise. We then propose a novel uncertainty estimation method based on the formulated test-time augmentation. Experiments with segmentation of fetal brains and brain tumors from 2D and 3D Magnetic Resonance Images (MRI) showed that 1) our test-time augmentation outperforms a single-prediction baseline and dropout-based multiple predictions, and 2) it provides a better uncertainty estimation than calculating the model-based uncertainty alone and helps to reduce overconfident incorrect predictions.
Detecting objects and estimating their pose remains as one of the major challenges of the computer vision research community. There exists a compromise between localizing the objects and estimating their viewpoints. The detector ideally needs to be view-invariant, while the pose estimation process should be able to generalize towards the category-level. This work is an exploration of using deep learning models for solving both problems simultaneously. For doing so, we propose three novel deep learning architectures, which are able to perform a joint detection and pose estimation, where we gradually decouple the two tasks. We also investigate whether the pose estimation problem should be solved as a classification or regression problem, being this still an open question in the computer vision community. We detail a comparative analysis of all our solutions and the methods that currently define the state of the art for this problem. We use PASCAL3D+ and ObjectNet3D datasets to present the thorough experimental evaluation and main results. With the proposed models we achieve the state-of-the-art performance in both datasets.