Polar slice sampling, a Markov chain construction for approximate sampling, performs, under suitable assumptions on the target and initial distribution, provably independently of the state space dimension. We extend this result of Roberts & Rosenthal (2002) by developing a theory that identifies conditions, in terms of a generalized level set function, which imply an explicit lower bound on the spectral gap even in a general slice sampling context. Verifying these conditions for polar slice sampling yields a lower bound of 1/2 on the spectral gap in arbitrary dimension if the target density is rotationally invariant, log-concave along rays emanating from the origin, and sufficiently smooth. The general theoretical result is potentially applicable beyond the polar slice sampling framework.
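As a rough illustration of the sampler discussed in the abstract above, the following sketch runs polar slice sampling on a standard Gaussian target in dimension d >= 2. It assumes the usual factorization of the target density p(x) into |x|^(1-d) times f1(x) = |x|^(d-1) p(x): draw a level t uniformly below f1 at the current state, then draw the new radius uniformly on the radial level set {f1 > t} and the new direction uniformly on the sphere. The Gaussian target, the bisection-based level set search, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import math
import numpy as np

def log_f1(r, d):
    # log of r^(d-1) * exp(-r^2 / 2), the radial part of f1 for a standard Gaussian
    return (d - 1) * math.log(r) - 0.5 * r * r

def level_set_interval(log_t, d, iters=80):
    """Endpoints of {r > 0 : log_f1(r) > log_t}, found by bisection (requires d >= 2)."""
    mode = math.sqrt(d - 1)  # log_f1 increases on (0, mode) and decreases afterwards
    lo, hi = 0.0, mode
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid > 0.0 and log_f1(mid, d) > log_t:
            hi = mid
        else:
            lo = mid
    r_min = hi
    lo, hi = mode, 2.0 * mode
    while log_f1(hi, d) > log_t:  # widen the bracket until it leaves the level set
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if log_f1(mid, d) > log_t:
            lo = mid
        else:
            hi = mid
    r_max = lo
    return r_min, r_max

def polar_slice_step(x, rng):
    """One transition: sample a level, then a radius and a direction."""
    d = x.shape[0]
    r = float(np.linalg.norm(x))
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    log_t = log_f1(r, d) + math.log(u)
    r_min, r_max = level_set_interval(log_t, d)
    new_r = rng.uniform(r_min, r_max)  # uniform radius on the level set
    direction = rng.standard_normal(d)
    direction /= np.linalg.norm(direction)  # uniform direction on the sphere
    return new_r * direction

def run_chain(x0, n_steps, rng):
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_steps, x.shape[0]))
    for i in range(n_steps):
        x = polar_slice_step(x, rng)
        samples[i] = x
    return samples
```

For a rotationally invariant target the level set is a single radial interval, which is why a one-dimensional bisection suffices here; a general target would need a different way to sample from the level set.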
Many asymptotically minimax procedures for function estimation rely on somewhat arbitrary and restrictive assumptions such as isotropy or spatial homogeneity. This work enhances the theoretical understanding of Bayesian additive regression trees under substantially relaxed smoothness assumptions. We provide a comprehensive study of asymptotic optimality and posterior contraction of Bayesian forests when the regression function has anisotropic smoothness that possibly varies over the function domain. The regression function may also be discontinuous. We introduce a new class of sparse {\em piecewise heterogeneous anisotropic} H\"{o}lder functions and derive their minimax lower bound of estimation in high-dimensional scenarios under the $L_2$-loss. We then find that the Bayesian tree priors, coupled with a Dirichlet subset selection prior for sparse estimation in high-dimensional scenarios, adapt to unknown heterogeneous smoothness, discontinuity, and sparsity. These results show that Bayesian forests are uniquely suited for more general estimation problems that would render other default machine learning tools, such as Gaussian processes, suboptimal. Our numerical study shows that Bayesian forests often outperform competitors such as random forests and deep neural networks, which are believed to work well for discontinuous or complicated smooth functions. Beyond nonparametric regression, we also examine posterior contraction of Bayesian forests for density estimation and binary classification using the technique developed in this study.
Understanding whether and how treatment effects vary across subgroups is crucial to inform clinical practice and recommendations. Accordingly, the assessment of heterogeneous treatment effects (HTE) based on pre-specified potential effect modifiers has become a common goal in modern randomized trials. However, when one or more potential effect modifiers are missing, complete-case analysis may lead to bias and under-coverage. While statistical methods for handling missing data have been proposed and compared for individually randomized trials with missing effect modifier data, few guidelines exist for the cluster-randomized setting, where intracluster correlations in the effect modifiers, outcomes, or even missingness mechanisms may introduce further threats to accurate assessment of HTE. In this article, the performance of several missing data methods is compared through a simulation study of cluster-randomized trials with continuous outcome and missing binary effect modifier data, and further illustrated using real data from the Work, Family, and Health Study. Our results suggest that multilevel multiple imputation (MMI) and Bayesian MMI perform better than other available methods, and that Bayesian MMI has lower bias and closer to nominal coverage than standard MMI when there are model specification or compatibility issues.
Image zooming or upsampling is a widely used tool in image processing and an essential step in many algorithms. Upsampling increases the number of pixels and introduces new information into the image, which can lead to numerical effects such as ringing artifacts, aliasing effects, and blurring of the image. In this paper, we propose an efficient polynomial interpolation algorithm based on the WENO algorithm for image upsampling that provides high accuracy in smooth regions, preserves edges and reduces aliasing effects. Although this is not the first application of WENO interpolation for image resampling, it is designed to have comparable complexity and memory load with better image quality than the separable WENO algorithm. We show that the algorithm performs equally well on smooth 2D functions, artificial pixel art, and real digital images. Comparison with similar methods on test images shows good results on standard metrics and also provides visually satisfactory results. Moreover, the low complexity of the algorithm is ensured by a small local approximation stencil and the appropriate choice of smoothness indicators.
We present ParrotTTS, a modularized text-to-speech synthesis model leveraging disentangled self-supervised speech representations. It can train a multi-speaker variant effectively using transcripts from a single speaker. ParrotTTS adapts to a new language in a low-resource setup and generalizes to languages not seen while training the self-supervised backbone. Moreover, without training on bilingual or parallel examples, ParrotTTS can transfer voices across languages while preserving speaker-specific characteristics, e.g., synthesizing fluent Hindi speech using a French speaker's voice and accent. We present extensive results in monolingual and multi-lingual scenarios. ParrotTTS outperforms state-of-the-art multi-lingual TTS models while using only a fraction of the paired data that the latter require.
Model sparsification in deep learning promotes simpler, more interpretable models with fewer parameters. This not only reduces the model's memory footprint and computational needs but also shortens inference time. This work focuses on creating sparse models optimized for multiple tasks with fewer parameters. These parsimonious models also have the potential to match or outperform dense models in terms of performance. We introduce channel-wise l1/l2 group sparsity in the parameters (or weights) of the shared convolutional layers of the multi-task learning model. This approach facilitates the removal of extraneous groups, i.e., channels (due to l1 regularization), and also imposes a penalty on the weights, further enhancing the learning efficiency for all tasks (due to l2 regularization). We analyze the results of group sparsity in both single-task and multi-task settings on two widely used Multi-Task Learning (MTL) datasets: NYU-v2 and CelebAMask-HQ. On both datasets, each of which consists of three different computer vision tasks, multi-task models with approximately 70% sparsity outperform their dense equivalents. We also investigate how changing the degree of sparsification influences the model's performance, the overall sparsity percentage, the patterns of sparsity, and the inference time.
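The channel-wise l1/l2 group penalty mentioned in the abstract above can be sketched as a sum, over output channels, of the l2 norm of each channel's weights. This is a minimal NumPy sketch under assumptions: the weight layout (out_channels, in_channels, kernel_h, kernel_w), the regularization strength `lam`, and the function names are illustrative and not taken from the paper.

```python
import numpy as np

def group_sparsity_penalty(weights: np.ndarray, lam: float = 1.0) -> float:
    """l1 norm across channels of the per-channel l2 norms.

    Assumed layout: (out_channels, in_channels, kernel_h, kernel_w).
    """
    flat = weights.reshape(weights.shape[0], -1)       # one row per output channel
    channel_norms = np.sqrt((flat ** 2).sum(axis=1))   # l2 norm within each group
    return float(lam * channel_norms.sum())            # l1 sum over groups

def channel_sparsity(weights: np.ndarray, tol: float = 1e-8) -> float:
    """Fraction of output channels whose weights are all (numerically) zero."""
    flat = weights.reshape(weights.shape[0], -1)
    return float(np.mean(np.abs(flat).max(axis=1) <= tol))
```

Because the norm is taken per channel before summing, the penalty tends to push whole channels to zero rather than scattered individual weights, which is what allows entire channels to be removed.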
Providing a promising pathway to link the human brain with external devices, Brain-Computer Interfaces (BCIs) have seen notable advancements in decoding capabilities, primarily driven by increasingly sophisticated techniques, especially deep learning. However, achieving high accuracy in real-world scenarios remains a challenge due to the distribution shift between sessions and subjects. In this paper, we explore the concept of online test-time adaptation (OTTA) to continuously adapt the model in an unsupervised fashion during inference time. Our approach preserves privacy by eliminating the requirement to access the source data during the adaptation process. Additionally, OTTA achieves calibration-free operation by not requiring any session- or subject-specific data. We investigate the task of electroencephalography (EEG) motor imagery decoding using a lightweight architecture together with different OTTA techniques such as alignment, adaptive batch normalization, and entropy minimization. We examine two datasets and three distinct data settings for a comprehensive analysis. Our adaptation methods produce state-of-the-art results, potentially instigating a shift in transfer learning for BCI decoding towards online adaptation.
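Of the OTTA techniques named in the abstract above, entropy minimization is the simplest to illustrate: the unsupervised objective is the mean Shannon entropy of the model's predictions on an unlabeled test batch. The sketch below only computes that objective from raw logits; the gradient update of the model parameters is omitted, and the function names are assumptions made for illustration.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def mean_prediction_entropy(logits: np.ndarray) -> float:
    """Mean entropy (in nats) of the predicted class distributions of a batch."""
    probs = softmax(logits)
    # the small constant guards against log(0) for saturated predictions
    return float(-(probs * np.log(probs + 1e-12)).sum(axis=1).mean())
```

A batch of uniform predictions over K classes gives the maximum value log(K), while confident predictions give a value near zero, so lowering this quantity sharpens the model's outputs on the test data without using any labels.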
High-dimensional variable selection, with many more covariates than observations, is widely documented in standard regression models, but there are still few tools to address it in non-linear mixed-effects models where data are collected repeatedly on several individuals. In this work, variable selection is approached from a Bayesian perspective and a selection procedure is proposed, combining the use of a spike-and-slab prior and the Stochastic Approximation version of the Expectation Maximisation (SAEM) algorithm. Similarly to Lasso regression, the set of relevant covariates is selected by exploring a grid of values for the penalisation parameter. The SAEM approach is much faster than a classical MCMC (Markov chain Monte Carlo) algorithm and our method shows very good selection performance on simulated data. Its flexibility is demonstrated by implementing it for a variety of non-linear mixed-effects models. The usefulness of the proposed method is illustrated on a genetic marker identification problem relevant for genomic-assisted selection in plant breeding.
In the realm of tensor optimization, low-rank tensor decomposition, particularly Tucker decomposition, stands as a pivotal technique for reducing the number of parameters and for saving storage. We embark on an exploration of Tucker tensor varieties -- the set of tensors with bounded Tucker rank -- in which the geometry is notably more intricate than the well-explored geometry of matrix varieties. We give an explicit parametrization of the tangent cone of Tucker tensor varieties and leverage its geometry to develop provable gradient-related line-search methods for optimization on Tucker tensor varieties. The search directions are computed from approximate projections of the antigradient onto the tangent cone, which circumvents the calculation of intractable metric projections. To the best of our knowledge, this is the first work concerning geometry and optimization on Tucker tensor varieties. In practice, low-rank tensor optimization suffers from the difficulty of choosing a reliable rank parameter. To this end, we incorporate the established geometry and propose a Tucker rank-adaptive method that is capable of identifying an appropriate rank during iterations while convergence is also guaranteed. Numerical experiments on tensor completion with synthetic and real-world datasets reveal that the proposed methods outperform other state-of-the-art methods in recovery performance. Moreover, the rank-adaptive method performs the best across various rank parameter selections and is indeed able to find an appropriate rank.
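To make the notion of Tucker rank in the abstract above concrete, the following sketch computes it as the tuple of matrix ranks of the mode unfoldings and builds a rank-bounded approximation with a truncated higher-order SVD (HOSVD). The truncated HOSVD is a standard quasi-optimal approximation swapped in for illustration only; it is neither the metric projection nor the line-search method of the paper, and all function names are assumptions.

```python
import numpy as np

def unfold(tensor: np.ndarray, mode: int) -> np.ndarray:
    """Mode unfolding: the chosen mode indexes rows, all other modes are flattened."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def tucker_rank(tensor: np.ndarray, tol: float = 1e-8) -> tuple:
    """Tuple of the ranks of all mode unfoldings (tol is an absolute threshold)."""
    return tuple(int(np.linalg.matrix_rank(unfold(tensor, m), tol=tol))
                 for m in range(tensor.ndim))

def mode_product(tensor: np.ndarray, matrix: np.ndarray, mode: int) -> np.ndarray:
    """Multiply `matrix` (shape J x I_mode) into `tensor` along the given mode."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)

def truncated_hosvd(tensor: np.ndarray, ranks: tuple):
    """Core tensor and orthonormal factors with Tucker rank at most `ranks`."""
    factors = []
    for mode, r in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :r])  # leading left singular vectors of the unfolding
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core, factors

def reconstruct(core: np.ndarray, factors: list) -> np.ndarray:
    """Multiply the core by each factor matrix to recover the full tensor."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out
```

Storing the core and the factor matrices instead of the full tensor is where the parameter and storage savings of the Tucker format come from.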
One essential problem in quantifying the collective behaviors of molecular systems lies in the accurate construction of free energy surfaces (FESs). The main challenges arise from the prevalence of energy barriers and the high dimensionality. Existing approaches are often based on sophisticated enhanced sampling methods to establish efficient exploration of the full phase space. On the other hand, the collection of optimal sample points for the numerical approximation of FESs remains largely under-explored, where the discretization error could become dominant for systems with a large number of collective variables (CVs). We propose a consensus sampling-based approach by reformulating the construction as a minimax problem which simultaneously optimizes the function representation and the training set. In particular, the maximization step establishes a stochastic interacting particle system to achieve the adaptive sampling of the max-residue regime by modulating the exploitation of the Laplace approximation of the current loss function and the exploration of the uncharted phase space; the minimization step updates the FES approximation with the new training set. By iteratively solving the minimax problem, the present method essentially achieves an adversarial learning of the FESs with unified tasks for both phase space exploration and posterior error-enhanced sampling. We demonstrate the method by constructing the FESs of molecular systems with up to 30 CVs.
Hashing has been widely used in approximate nearest neighbor search for large-scale database retrieval because of its computation and storage efficiency. Deep hashing, which devises convolutional neural network architectures to exploit and extract the semantic information or features of images, has received increasing attention recently. In this survey, several deep supervised hashing methods for image retrieval are evaluated, and I identify three main directions for deep supervised hashing methods. Several comments are made at the end. Moreover, to break through the bottleneck of the existing hashing methods, I propose a Shadow Recurrent Hashing (SRH) method as a first attempt. Specifically, I devise a CNN architecture to extract the semantic features of images and design a loss function that encourages similar images to be projected close together. To this end, I propose a concept: the shadow of the CNN output. During the optimization process, the CNN output and its shadow guide each other so as to approach the optimal solution as closely as possible. Several experiments on the CIFAR-10 dataset show the satisfactory performance of SRH.
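The retrieval step that makes hashing efficient, as described in the abstract above, can be sketched as follows: real-valued feature vectors are binarized into hash codes, and database items are ranked by Hamming distance to the query code. This is a generic illustration, not the SRH method; the thresholding at zero and the function names are assumptions, and the feature vectors stand in for the output of a trained network.

```python
import numpy as np

def to_hash_codes(features: np.ndarray) -> np.ndarray:
    """Binarize real-valued features: 1 where the feature is positive, else 0."""
    return (features > 0).astype(np.uint8)

def hamming_distances(query_code: np.ndarray, database_codes: np.ndarray) -> np.ndarray:
    """Number of differing bits between the query and each database code."""
    return np.count_nonzero(database_codes != query_code, axis=1)

def retrieve(query_code: np.ndarray, database_codes: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k database items closest to the query in Hamming distance."""
    distances = hamming_distances(query_code, database_codes)
    return np.argsort(distances, kind="stable")[:k]
```

Since the codes are short binary vectors, storage is small and the distance computation reduces to counting differing bits, which is the source of the computation and storage efficiency.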