This paper presents a scalable approximate Bayesian method for image restoration using total variation (TV) priors. In contrast to most optimization methods based on maximum a posteriori estimation, we use the expectation propagation (EP) framework to approximate minimum mean squared error (MMSE) estimators and marginal (pixel-wise) variances, without resorting to Monte Carlo sampling. For the classical anisotropic TV-based prior, we also propose an iterative scheme to automatically adjust the regularization parameter via expectation-maximization (EM). Using Gaussian approximating densities with diagonal covariance matrices, the resulting method yields highly parallelizable steps and can scale to large images for denoising, deconvolution and compressive sensing (CS) problems. The simulation results illustrate that such EP methods can provide posterior estimates on par with those obtained via sampling methods, but at a fraction of the computational cost. Moreover, EP does not exhibit the strong underestimation of posterior variances observed with variational Bayes alternatives.
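For concreteness, a schematic version of the model and the EM update is given below; the normalization assumed for the TV prior (a product of M independent Laplace factors over the rows of the difference operator D) is an assumption made here for illustration and may differ from the paper's exact treatment.

\begin{align*}
y &= Hx + n, \quad n \sim \mathcal{N}(0,\sigma^2 I), \qquad p(x \mid \lambda) \propto \exp\!\big(-\lambda \|Dx\|_1\big),\\
q(x) &= \mathcal{N}\!\big(x;\, m,\, \mathrm{diag}(v)\big) \approx p(x \mid y, \lambda),\\
\lambda^{(t+1)} &= \arg\max_{\lambda}\; \mathbb{E}_{q^{(t)}}\!\big[\log p(x, y \mid \lambda)\big] \;=\; \frac{M}{\mathbb{E}_{q^{(t)}}\!\big[\|Dx\|_1\big]},
\end{align*}

where D stacks the horizontal and vertical first-order differences and the expectation in the denominator is available in closed form under the diagonal Gaussian approximation q.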
We consider mixed finite element methods with exactly symmetric stress tensors. We derive a new quasi-optimal a priori error estimate that is uniformly valid with respect to the compressibility. For the a posteriori error analysis, we consider the Prager-Synge hypercircle principle and introduce a new estimate that is uniformly valid in the incompressible limit. All estimates are validated by numerical examples.
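For reference, the classical Prager-Synge identity behind such a posteriori estimates reads, in schematic form for linear elasticity with elasticity tensor C and exact solution (u, \sigma = C\varepsilon(u)),

\[
\big\|C^{1/2}\varepsilon(u-u_h)\big\|^{2} \;+\; \big\|C^{-1/2}(\sigma-\tau_h)\big\|^{2}
\;=\; \big\|C^{-1/2}\big(\tau_h - C\,\varepsilon(u_h)\big)\big\|^{2},
\]

valid for any conforming displacement approximation u_h and any symmetric, equilibrated stress \tau_h with \operatorname{div}\tau_h + f = 0; the new estimate mentioned above builds on this principle while remaining uniformly valid in the incompressible limit.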
Replication analysis is widely used in many fields of study. Once a study is published, other researchers often conduct the same or very similar analyses to confirm the reliability of the published findings. However, what if the data are confidential? In particular, if the data sets used in the studies are confidential, the results of replication analyses cannot be released to any entity lacking permission to access those data sets; otherwise, serious privacy leakage may result, especially when the published study and the replication studies use similar or common data sets. For example, examining the influence of a treatment on outliers can seriously leak information about those outliers. In this paper, we build two frameworks for replication analysis based on a differentially private Bayesian approach. We formalize our questions of interest and illustrate the properties of our methods through a combination of theoretical analysis and simulation, demonstrating the feasibility of our approach. We also provide guidance on the choice of parameters and the interpretation of the results.
We are interested in privatizing an approximate posterior inference algorithm called Expectation Propagation (EP). EP approximates the posterior by iteratively refining approximations to the local likelihoods, and is known to provide better posterior uncertainties than variational inference (VI). However, using EP on large-scale datasets poses a challenge in terms of memory, as it needs to keep each of the local approximations in memory. To overcome this problem, stochastic expectation propagation (SEP) was proposed, which maintains a single local factor capturing the average effect of each likelihood term on the posterior and refines it in a way analogous to EP. In terms of privacy, SEP is more tractable than EP because, at each refinement step, the remaining factors are fixed to the same value and do not depend on other datapoints as they do in EP, which makes the sensitivity analysis tractable. We provide a theoretical analysis of the privacy-accuracy trade-off in the posterior estimates under differentially private stochastic expectation propagation (DP-SEP). Furthermore, we demonstrate the performance of our DP-SEP algorithm on both synthetic and real-world datasets, evaluating the quality of the posterior estimates at different levels of guaranteed privacy.
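As a point of reference (following the original SEP formulation, not necessarily the exact notation of this paper), SEP keeps a single site factor f(\theta) whose N-th power stands in for all likelihood contributions, and refines it as

\begin{align*}
q(\theta) &\propto p(\theta)\, f(\theta)^{N}, &
q_{\setminus 1}(\theta) &\propto p(\theta)\, f(\theta)^{N-1},\\
\tilde{p}_n(\theta) &\propto q_{\setminus 1}(\theta)\, p(x_n \mid \theta), &
q_{\mathrm{new}}(\theta) &= \operatorname{proj}\!\big[\tilde{p}_n(\theta)\big],\\
f(\theta) &\leftarrow f(\theta)^{1-1/N} \left(\frac{q_{\mathrm{new}}(\theta)}{q_{\setminus 1}(\theta)}\right)^{1/N}.
\end{align*}

Because the cavity q_{\setminus 1} is the same for every datapoint, the per-datapoint change enters only through the moment-matching step, which is what makes the sensitivity analysis, and hence the noise calibration of DP-SEP, tractable.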
This study introduces a new spline dimensional decomposition (SDD) for uncertainty quantification of high-dimensional functions, including those exhibiting high nonlinearity and nonsmoothness. The decomposition creates a hierarchical expansion of an output random variable of interest with respect to measure-consistent orthonormalized basis splines (B-splines) in independent input random variables. A dimensionwise decomposition of a spline space into orthogonal subspaces, each spanned by a reduced set of such orthonormal splines, results in SDD. Exploiting the modulus of smoothness, the SDD approximation is shown to converge in mean-square to the correct limit. The computational complexity of the SDD method is polynomial, as opposed to exponential, thus alleviating the curse of dimensionality to the extent possible. Analytical formulae are proposed to calculate the second-moment properties of a truncated SDD approximation for a general output random variable in terms of the expansion coefficients involved. Numerical results indicate that a low-order SDD approximation of nonsmooth functions calculates the probabilistic characteristics of an output variable with an accuracy matching or surpassing that of high-order approximations from several existing methods. Finally, a 34-dimensional random eigenvalue analysis demonstrates the utility of SDD in solving practical problems.
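Schematically, and glossing over the paper's precise index sets and spline degrees, a truncated expansion of this kind and its second-moment properties take the form

\[
y(\mathbf{X}) \;\approx\; y_{\emptyset} \;+\; \sum_{\substack{\emptyset \neq u \subseteq \{1,\dots,N\} \\ |u| \le S}} \sum_{j} C_{u,j}\, \psi_{u,j}(\mathbf{X}_u),
\qquad
\mathbb{E}[y] \approx y_{\emptyset},
\qquad
\operatorname{Var}[y] \approx \sum_{u,j} C_{u,j}^{2},
\]

where the \psi_{u,j} are products of measure-consistent orthonormal B-splines in the variables indexed by u, and the variance formula follows directly from their orthonormality.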
We develop a generalized hybrid iterative approach for computing solutions to large-scale Bayesian inverse problems. We consider a hybrid algorithm based on the generalized Golub-Kahan bidiagonalization for computing Tikhonov regularized solutions to problems where explicit computation of the square root and inverse of the prior covariance matrix is not feasible. This is useful for large-scale problems where covariance kernels are defined on irregular grids or are only available via matrix-vector multiplication, e.g., those from the Mat\'{e}rn class. We show that the iterates are equivalent to LSQR iterates applied to a directly regularized Tikhonov problem after a transformation of variables, and we provide connections to a generalized singular value decomposition filtered solution. Our approach shares many benefits of standard hybrid methods, such as avoiding semi-convergence and automatically estimating the regularization parameter. Numerical examples from image processing demonstrate the effectiveness of the described approaches.
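The equivalence to LSQR after a change of variables can be illustrated with a small sketch. For illustration only, the prior covariance factor L (with Q = L L^T) is formed explicitly here, which is exactly what the generalized Golub-Kahan approach avoids by working only with matrix-vector products with Q; the variable names and the toy covariance are placeholders.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, lsqr

# Toy illustration of the change of variables x = L z (with Q = L L^T) that turns
# the general-form Tikhonov problem  min_x ||A x - b||^2 + lam^2 ||x||_{Q^{-1}}^2
# into the standard-form problem     min_z ||A L z - b||^2 + lam^2 ||z||^2,
# which LSQR solves via its `damp` parameter.

rng = np.random.default_rng(0)
m, n = 100, 200
A = rng.standard_normal((m, n))
x_true = np.sin(np.linspace(0, 4 * np.pi, n))
b = A @ x_true + 0.01 * rng.standard_normal(m)

# A smooth prior covariance on a 1-D grid (a stand-in for a Matern-type kernel).
t = np.linspace(0.0, 1.0, n)
Q = np.exp(-((t[:, None] - t[None, :]) ** 2) / (2 * 0.05**2)) + 1e-8 * np.eye(n)
L = np.linalg.cholesky(Q)  # explicit factor, for the toy example only

AL = LinearOperator((m, n), matvec=lambda z: A @ (L @ z),
                    rmatvec=lambda y: L.T @ (A.T @ y))
lam = 0.1  # fixed here; hybrid methods estimate it during the iterations
z_hat = lsqr(AL, b, damp=lam)[0]
x_hat = L @ z_hat  # map back to the original variables
```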
This paper shows that masked autoencoders (MAE) are scalable self-supervised learners for computer vision. Our MAE approach is simple: we mask random patches of the input image and reconstruct the missing pixels. It is based on two core designs. First, we develop an asymmetric encoder-decoder architecture, with an encoder that operates only on the visible subset of patches (without mask tokens), along with a lightweight decoder that reconstructs the original image from the latent representation and mask tokens. Second, we find that masking a high proportion of the input image, e.g., 75%, yields a nontrivial and meaningful self-supervisory task. Coupling these two designs enables us to train large models efficiently and effectively: we accelerate training (by 3x or more) and improve accuracy. Our scalable approach allows for learning high-capacity models that generalize well: e.g., a vanilla ViT-Huge model achieves the best accuracy (87.8%) among methods that use only ImageNet-1K data. Transfer performance in downstream tasks outperforms supervised pre-training and shows promising scaling behavior.
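A minimal NumPy sketch of the random-masking step is given below; the released MAE code is in PyTorch and differs in details, and the shapes used here (a 14x14 patch grid with 768-dimensional tokens) are only for illustration.

```python
import numpy as np

def random_masking(patches, mask_ratio=0.75, rng=None):
    """Keep a random subset of patches per sample, as in MAE-style pre-training.

    patches: array of shape (batch, num_patches, dim).
    Returns the visible patches, a binary mask (1 = masked), and the
    permutation needed to restore the original patch order.
    """
    rng = rng or np.random.default_rng()
    B, N, D = patches.shape
    n_keep = int(N * (1.0 - mask_ratio))

    noise = rng.random((B, N))                # random score per patch
    ids_shuffle = np.argsort(noise, axis=1)   # ascending: first n_keep are kept
    ids_restore = np.argsort(ids_shuffle, axis=1)

    ids_keep = ids_shuffle[:, :n_keep]
    visible = np.take_along_axis(patches, ids_keep[:, :, None], axis=1)

    mask = np.ones((B, N))
    mask[:, :n_keep] = 0                      # 0 = kept, 1 = masked
    mask = np.take_along_axis(mask, ids_restore, axis=1)
    return visible, mask, ids_restore

# Example: 196 patches (14x14 grid), 75% masked -> encoder sees only 49 patches.
x = np.random.randn(2, 196, 768)
visible, mask, ids_restore = random_masking(x)
```

The encoder would then run only on `visible`, and `ids_restore` lets the decoder re-insert learned mask tokens at the masked positions before reconstructing pixels.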
This work addresses the novel and challenging problem of estimating the full 3D hand shape and pose from a single RGB image. Most current methods for 3D hand analysis from monocular RGB images focus only on estimating the 3D locations of hand keypoints, which cannot fully express the 3D shape of the hand. In contrast, we propose a Graph Convolutional Neural Network (Graph CNN) based method to reconstruct a full 3D mesh of the hand surface that contains richer information about both 3D hand shape and pose. To train the networks with full supervision, we create a large-scale synthetic dataset containing both ground-truth 3D meshes and 3D poses. When fine-tuning the networks on real-world datasets without 3D ground truth, we propose a weakly-supervised approach that leverages the depth map as weak supervision during training. Through extensive evaluations on our proposed new datasets and two public datasets, we show that our method produces accurate and plausible 3D hand meshes and achieves superior 3D hand pose estimation accuracy compared with state-of-the-art methods.
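To convey the idea of regressing per-vertex quantities on a mesh graph, here is a generic first-order graph-convolution layer in NumPy; it is a simplified stand-in and not necessarily the spectral graph-convolution operator or mesh hierarchy used by the authors.

```python
import numpy as np

def normalized_adjacency(edges, num_vertices):
    """Symmetrically normalized adjacency (with self-loops) of a mesh graph."""
    A = np.zeros((num_vertices, num_vertices))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    A += np.eye(num_vertices)                      # self-loops
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A @ D_inv_sqrt

def graph_conv(X, A_hat, W):
    """One generic graph-convolution layer: aggregate neighbors, then project."""
    return np.maximum(A_hat @ X @ W, 0.0)          # ReLU activation

# Toy example: 4 mesh vertices with 16-dim features mapped to 3-D coordinates.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
A_hat = normalized_adjacency(edges, 4)
X = np.random.randn(4, 16)
W = np.random.randn(16, 3)
vertices_3d = graph_conv(X, A_hat, W)              # per-vertex 3-D output
```

Stacking such layers and upsampling the mesh allows a network to output 3D coordinates for every vertex of the hand surface.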
The eigendecomposition of nearest-neighbor (NN) graph Laplacian matrices is the main computational bottleneck in spectral clustering. In this work, we introduce a highly scalable, spectrum-preserving graph sparsification algorithm for building ultra-sparse NN (u-NN) graphs with guaranteed preservation of the original graph's spectral properties, such as the first few eigenvectors of the original graph Laplacian. Our approach immediately leads to scalable spectral clustering of large data networks without sacrificing solution quality. The proposed method starts by constructing low-stretch spanning trees (LSSTs) from the original graphs, and then iteratively recovers small portions of "spectrally critical" off-tree edges to the LSSTs by leveraging a spectral off-tree embedding scheme. To determine a suitable number of off-tree edges to recover, we propose an eigenvalue stability checking scheme that robustly preserves the first few Laplacian eigenvectors within the sparsified graph. Additionally, we propose an incremental graph densification scheme for identifying extra edges that are missing from the original NN graphs but can still play important roles in spectral clustering. Our experimental results on a variety of well-known data sets show that the proposed method can dramatically reduce the complexity of NN graphs, leading to significant speedups in spectral clustering.
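The sparsification pipeline itself (LSST construction, off-tree edge recovery, and the stability check) is not reproduced here; the sketch below only illustrates the downstream step that benefits from it, namely spectral clustering driven by the first few Laplacian eigenvectors of a (sparsified) NN graph. Function and variable names are illustrative.

```python
import numpy as np
from scipy.sparse import csgraph
from scipy.sparse.linalg import eigsh
from sklearn.cluster import KMeans
from sklearn.neighbors import kneighbors_graph

def spectral_clustering(adjacency, k):
    """Cluster using the first k eigenvectors of the normalized graph Laplacian.

    If sparsification preserves these eigenvectors, the clustering result is
    preserved while the eigendecomposition becomes much cheaper.
    """
    L = csgraph.laplacian(adjacency, normed=True)
    _, vecs = eigsh(L, k=k, which="SA")           # smallest eigenpairs
    rows = vecs / (np.linalg.norm(vecs, axis=1, keepdims=True) + 1e-12)
    return KMeans(n_clusters=k, n_init=10).fit_predict(rows)

# Toy usage: a plain NN graph; a spectrum-preserving sparsifier would replace A
# with a much sparser graph whose first k eigenvectors are (nearly) the same.
X = np.random.randn(500, 10)
A = kneighbors_graph(X, n_neighbors=10, mode="connectivity", include_self=False)
A = 0.5 * (A + A.T)                               # symmetrize
labels = spectral_clustering(A, k=3)
```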
Data augmentation has been widely used for training deep learning systems for medical image segmentation and plays an important role in obtaining robust and transformation-invariant predictions. However, it has seldom been used at test time for segmentation and has not been formulated in a consistent mathematical framework. In this paper, we first propose a theoretical formulation of test-time augmentation for deep learning in image recognition, where the prediction is obtained by estimating its expectation via Monte Carlo simulation, using prior distributions over the parameters of an image acquisition model that involves image transformations and noise. We then propose a novel uncertainty estimation method based on this formulation of test-time augmentation. Experiments with segmentation of fetal brains and brain tumors from 2D and 3D Magnetic Resonance Images (MRI) show that 1) our test-time augmentation outperforms a single-prediction baseline and dropout-based multiple predictions, and 2) it provides better uncertainty estimation than model-based uncertainty alone and helps to reduce overconfident incorrect predictions.
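As a minimal sketch of the Monte Carlo procedure, with horizontal flips and additive Gaussian noise standing in for the paper's full image acquisition model (the `model` callable, the transform set, and the sample count are placeholders):

```python
import numpy as np

def tta_predict(model, image, n_samples=20, noise_std=0.01, rng=None):
    """Monte Carlo test-time augmentation: sample flips and additive noise,
    predict, undo the spatial transform, and aggregate.

    `model` is any callable mapping an image to a per-pixel prediction with the
    same spatial layout. Returns the mean prediction and a per-pixel variance.
    """
    rng = rng or np.random.default_rng()
    preds = []
    for _ in range(n_samples):
        flip = rng.integers(0, 2)                          # sample a random flip
        aug = image[:, ::-1] if flip else image
        aug = aug + rng.normal(0.0, noise_std, aug.shape)  # sample intensity noise
        p = model(aug)
        if flip:
            p = p[:, ::-1]                                 # invert the transform
        preds.append(p)
    preds = np.stack(preds)
    return preds.mean(axis=0), preds.var(axis=0)
```

The variance (or an entropy computed from the mean probabilities) then serves as the pixel-wise uncertainty map.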
In this paper, we propose an improved quantitative evaluation framework for Generative Adversarial Networks (GANs) on generating domain-specific images, improving conventional evaluation methods on two levels: the feature representation and the evaluation metric. Unlike most existing evaluation frameworks, which transfer the representation of an ImageNet Inception model to map images onto the feature space, our framework uses a specialized encoder to acquire fine-grained domain-specific representations. Moreover, for datasets with multiple classes, we propose the Class-Aware Frechet Distance (CAFD), which employs a Gaussian mixture model on the feature space to better fit the multi-manifold feature distribution. Experiments and analysis at both the feature level and the image level demonstrate the improvements of our proposed framework over the recently proposed state-of-the-art FID metric. To the best of our knowledge, we are the first to provide counterexamples in which FID gives results inconsistent with human judgment. The experiments show that our framework is able to overcome the shortcomings of FID and improves robustness. Code will be made available.
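A simplified sketch of a class-aware Frechet-style score is shown below: it averages per-class Frechet distances between Gaussians fitted to real and generated features. The CAFD described above is defined via a Gaussian mixture model on the feature space (with class memberships for generated samples typically coming from a classifier), so this code only illustrates the idea and is not the exact metric.

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feat_a, feat_b):
    """Frechet distance between Gaussians fitted to two feature sets."""
    mu_a, mu_b = feat_a.mean(0), feat_b.mean(0)
    cov_a = np.cov(feat_a, rowvar=False)
    cov_b = np.cov(feat_b, rowvar=False)
    covmean = sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):
        covmean = covmean.real                     # drop tiny imaginary parts
    return float(np.sum((mu_a - mu_b) ** 2) + np.trace(cov_a + cov_b - 2 * covmean))

def class_aware_fd(real_feats, fake_feats, real_labels, fake_labels):
    """Average per-class Frechet distance (a simplified stand-in for CAFD)."""
    classes = np.unique(real_labels)
    return float(np.mean([
        frechet_distance(real_feats[real_labels == c], fake_feats[fake_labels == c])
        for c in classes
    ]))
```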