Clustering in high dimensions poses many statistical challenges. While traditional distance-based clustering methods are computationally feasible, they lack probabilistic interpretation and rely on heuristics for estimation of the number of clusters. On the other hand, probabilistic model-based clustering techniques often fail to scale, and devising algorithms that are able to effectively explore the posterior space is an open problem. Based on recent developments in Bayesian distance-based clustering, we propose a hybrid solution that entails defining a likelihood on pairwise distances between observations. The novelty of the approach consists in including both cohesion and repulsion terms in the likelihood, which allows for cluster identifiability. This implies that clusters are composed of objects which have small "dissimilarities" among themselves (cohesion) and similar dissimilarities to observations in other clusters (repulsion). We show how this modelling strategy has interesting connections with existing proposals in the literature as well as a decision-theoretic interpretation. The proposed method is computationally efficient and applicable to a wide variety of scenarios. We demonstrate the approach in a simulation study and an application in digital numismatics.
Functional data are ubiquitous in scientific modeling. For instance, quantities of interest are modeled as functions of time, space, energy, density, etc. Uncertainty quantification methods for computer models with functional response have resulted in tools for emulation, sensitivity analysis, and calibration that are widely used. However, many of these tools do not perform well when the model's parameters control both the amplitude variation of the functional output and its alignment (or phase variation). This paper introduces a framework for Bayesian model calibration when the model responses are misaligned functional data. The approach generates two types of data out of the misaligned functional responses: one that isolates the amplitude variation and one that isolates the phase variation. These two types of data are created for the computer simulation data (both of which may be emulated) and the experimental data. The calibration approach uses both types so that it seeks to match both the amplitude and phase of the experimental data. The framework is careful to respect constraints that arise especially when modeling phase variation, while remaining implementable with readily available calibration software. We demonstrate the techniques on a simulated data example and on two dynamic material science problems: a strength model calibration using flyer plate experiments and an equation of state model calibration using experiments performed on the Sandia National Laboratories' Z-machine.
Modern statistical learning algorithms are capable of amazing flexibility, but struggle with interpretability. One possible solution is sparsity: making inference such that many of the parameters are estimated as being identically 0, which may be imposed through the use of nonsmooth penalties such as the $\ell_1$ penalty. However, the $\ell_1$ penalty introduces significant bias when high sparsity is desired. In this article, we retain the $\ell_1$ penalty, but define learnable penalty weights $\lambda_p$ endowed with hyperpriors. We start the article by investigating the optimization problem this poses, developing a proximal operator associated with the $\ell_1$ norm. We then study the theoretical properties of this variable-coefficient $\ell_1$ penalty in the context of penalized likelihood. Next, we investigate application of this penalty to Variational Bayes, developing a model we call the Sparse Bayesian Lasso which allows for behavior qualitatively like Lasso regression to be applied to arbitrary variational models. In simulation studies, this gives us the Uncertainty Quantification and low bias properties of simulation-based approaches with an order of magnitude less computation. Finally, we apply our methodology to a Bayesian lagged spatiotemporal regression model of internal displacement that occurred during the Iraqi Civil War of 2013-2017.
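To make the role of the penalty weights $\lambda_p$ concrete, the following Python sketch shows the standard proximal operator of a weighted $\ell_1$ penalty with the weights held fixed, i.e. coordinate-wise soft-thresholding. It is an illustration only: the operator developed in the article treats the weights as learnable, which this sketch does not attempt, and the function name `soft_threshold` and the `step` parameter are assumptions introduced here.

```python
def soft_threshold(x, lam, step=1.0):
    """Proximal operator of step * sum_p lam[p] * |x[p]| for FIXED weights.

    Assumption: the weights lam are held constant; the article's
    variable-coefficient operator is not reproduced here.
    """
    out = []
    for x_p, lam_p in zip(x, lam):
        threshold = step * lam_p
        if x_p > threshold:
            out.append(x_p - threshold)   # shrink positive values toward 0
        elif x_p < -threshold:
            out.append(x_p + threshold)   # shrink negative values toward 0
        else:
            out.append(0.0)               # small values become exactly 0
    return out


if __name__ == "__main__":
    print(soft_threshold([3.0, -0.5, -2.0], [1.0, 1.0, 0.5]))
```

Coordinates whose magnitude falls below `step * lam[p]` are set exactly to zero, which is the mechanism by which an $\ell_1$ penalty produces sparse estimates; a larger weight zeroes out its coordinate more readily.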
Deep learning models, including modern systems like large language models, are well known to offer unreliable estimates of the uncertainty of their decisions. In order to improve the quality of the confidence levels, also known as calibration, of a model, common approaches entail the addition of either data-dependent or data-independent regularization terms to the training loss. Data-dependent regularizers have been recently introduced in the context of conventional frequentist learning to penalize deviations between confidence and accuracy. In contrast, data-independent regularizers are at the core of Bayesian learning, enforcing adherence of the variational distribution in the model parameter space to a prior density. The former approach is unable to quantify epistemic uncertainty, while the latter is severely affected by model misspecification. In light of the limitations of both methods, this paper proposes an integrated framework, referred to as calibration-aware Bayesian neural networks (CA-BNNs), that applies both regularizers while optimizing over a variational distribution as in Bayesian learning. Numerical results validate the advantages of the proposed approach in terms of expected calibration error (ECE) and reliability diagrams.
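As a reference point for the expected calibration error (ECE) mentioned above, here is a minimal Python sketch of the commonly used binned estimator: predictions are grouped by confidence, and the gap between average confidence and accuracy in each bin is averaged with weights proportional to bin size. The equal-width binning, the default of 10 bins, and the function name are assumptions made for illustration; the paper's exact evaluation protocol may differ.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sum over bins of (bin share) * |accuracy - mean confidence|.

    confidences: predicted probabilities of the chosen class, in [0, 1].
    correct:     booleans (or 0/1), True where the prediction was right.
    Assumption: equal-width bins; a confidence of exactly 1.0 goes in the last bin.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, bool(ok)))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


if __name__ == "__main__":
    # Four predictions at 90% confidence, three of them correct.
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0]))
```

A reliability diagram plots the same per-bin quantities (accuracy against mean confidence) instead of collapsing them into a single number.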
The recently proposed training-free NAS methods abandon the training phase and design various zero-cost proxies as scores to identify excellent architectures, bringing extreme computational efficiency to neural architecture search. In this paper, we raise an interesting problem: can we properly measure operation importance in DARTS in a training-free way while avoiding the parameter-intensive bias? We investigate this question through the lens of edge connectivity, and provide an affirmative answer by defining a connectivity concept, ZERo-cost Operation Sensitivity (ZEROS), to score the importance of candidate operations in DARTS at initialization. By devising an iterative and data-agnostic way of utilizing ZEROS for NAS, our approach leads to a framework called training free differentiable architecture search (FreeDARTS). Based on the theory of the Neural Tangent Kernel (NTK), we show that the proposed connectivity score is provably negatively correlated with the generalization bound of the DARTS supernet after convergence under gradient descent training. In addition, we theoretically explain how ZEROS implicitly avoids parameter-intensive bias in selecting architectures, and empirically show that the architectures searched by FreeDARTS are of comparable size. Extensive experiments have been conducted on a series of search spaces, and the results demonstrate that FreeDARTS is a reliable and efficient baseline for neural architecture search.
We present a statistical inference approach to estimate the frequency noise characteristics of ultra-narrow linewidth lasers from delayed self-heterodyne beat note measurements using Bayesian inference. Particular emphasis is on estimation of the intrinsic (Lorentzian) laser linewidth. The approach is based on a statistical model of the measurement process, taking into account the effects of the interferometer as well as the detector noise. Our method therefore yields accurate results even when the intrinsic linewidth plateau is obscured by detector noise. The regression is performed on periodogram data in the frequency domain using a Markov-chain Monte Carlo method. By using explicit knowledge about the statistical distribution of the observed data, the method yields good results already from a single time series and does not rely on averaging over many realizations, since the information in the available data is evaluated very thoroughly. The approach is demonstrated for simulated time series data from a stochastic laser rate equation model with 1/f-type non-Markovian noise.
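One common way to exploit the statistical distribution of periodogram data is the Whittle approximation, in which each periodogram ordinate is treated as independently exponentially distributed around the true power spectral density. The Python sketch below evaluates that log-likelihood. It is an assumption that this reflects the spirit of the approach described above; the paper's actual measurement model (interferometer response, detector noise) is not reproduced, and `plateau_plus_flicker` is a hypothetical toy spectrum introduced only for illustration.

```python
import math


def plateau_plus_flicker(freq, plateau, flicker):
    """Hypothetical toy PSD: a white-noise plateau plus a 1/f term."""
    return plateau + flicker / freq


def whittle_log_likelihood(periodogram, model_psd):
    """Log-likelihood assuming I_k ~ Exponential(mean = S_k), independently.

    periodogram: observed ordinates I_k (positive frequencies only).
    model_psd:   model values S_k at the same frequencies, all > 0.
    """
    total = 0.0
    for i_k, s_k in zip(periodogram, model_psd):
        total -= math.log(s_k) + i_k / s_k
    return total


if __name__ == "__main__":
    freqs = [1.0, 2.0, 4.0]
    model = [plateau_plus_flicker(f, 1.0, 4.0) for f in freqs]
    observed = [5.2, 2.7, 2.1]  # made-up ordinates, for illustration only
    print(whittle_log_likelihood(observed, model))
```

In a Markov-chain Monte Carlo scheme, a function of this kind would be evaluated at each proposed parameter set (here `plateau` and `flicker`) and combined with a prior to form the target density.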
Several fundamental problems in science and engineering consist of global optimization tasks involving unknown high-dimensional (black-box) functions that map a set of controllable variables to the outcomes of an expensive experiment. Bayesian Optimization (BO) techniques are known to be effective in tackling global optimization problems using a relatively small number of objective function evaluations, but their performance suffers when dealing with high-dimensional outputs. To overcome the major challenge of dimensionality, here we propose a deep learning framework for BO and sequential decision making based on bootstrapped ensembles of neural architectures with randomized priors. Using appropriate architecture choices, we show that the proposed framework can approximate functional relationships between design variables and quantities of interest, even in cases where the latter take values in high-dimensional vector spaces or even infinite-dimensional function spaces. In the context of BO, we augment the proposed probabilistic surrogates with re-parameterized Monte Carlo approximations of multiple-point (parallel) acquisition functions, as well as methodological extensions for accommodating black-box constraints and multi-fidelity information sources. We test the proposed framework against state-of-the-art methods for BO and demonstrate superior performance across several challenging tasks with high-dimensional outputs, including a constrained multi-fidelity optimization task involving shape optimization of rotor blades in turbo-machinery.
In applications where the study data are collected within cluster units (e.g., patients within transplant centers), it is often of interest to estimate and perform inference on the treatment effects of the cluster units. However, it is well-established that cluster-level confounding variables can bias these assessments, and many of these confounding factors may be unobservable. In healthcare settings, data sharing restrictions often make it impossible to directly fit conventional risk-adjustment models on patient-level data, and existing privacy-preserving approaches cannot adequately adjust for both observed and unobserved cluster-level confounding factors. In this paper, we propose a privacy-preserving model for cluster-level confounding that only depends on publicly-available summary statistics, can be fit using a single optimization routine, and is robust to outlying cluster unit effects. In addition, we develop a Pseudo-Bayesian inference procedure that accounts for the estimated cluster-level confounding effects and corrects for the impact of unobservable factors. Simulations show that our estimates are robust and accurate, and the proposed inference approach has better Frequentist properties than existing methods. Motivated by efforts to improve equity in transplant care, we apply these methods to evaluate transplant centers while adjusting for observed geographic disparities in donor organ availability and unobservable confounders.
Several precise and computationally efficient results for pointing errors models in two asymptotic cases are derived in this paper. The normalized mean-squared error (NMSE) performance metric is employed to quantify the accuracy of different models. For the case in which the beam width is larger than the detection aperture, we propose three kinds of models that have the form $c_1\exp(-c_2r^2)$. It is shown that the modified intensity uniform model not only achieves accuracy comparable to that of the best linearized model, but also has a more elegant mathematical expression than the traditional Farid model. This indicates that the modified intensity uniform model is preferable in the performance analysis of free space optical (FSO) systems considering the effects of the pointing errors. By treating the beam spot as a point in the case in which the beam width is smaller than the detection aperture, the solution of the pointing errors model is transformed into a smooth function approximation problem, and we find that, in some scenarios, the proposed point approximation model achieves a more accurate approximation than the model induced from the Vasylyev model.
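The comparison described above can be sketched in Python as follows: a candidate model of the form $c_1\exp(-c_2r^2)$ is evaluated on a grid of radial displacements $r$ and scored against reference values with an NMSE. The normalization used here (squared error divided by the energy of the reference values) is one common definition and is an assumption; the paper may normalize differently. The function names and the sample numbers are likewise illustrative only.

```python
import math


def gaussian_form(r, c1, c2):
    """Model of the form c1 * exp(-c2 * r^2) at radial displacement r."""
    return c1 * math.exp(-c2 * r * r)


def nmse(exact, approx):
    """Normalized mean-squared error: sum (e - a)^2 / sum e^2.

    Assumption: normalization by the energy of the reference values.
    """
    numerator = sum((e - a) ** 2 for e, a in zip(exact, approx))
    denominator = sum(e ** 2 for e in exact)
    return numerator / denominator


if __name__ == "__main__":
    radii = [0.0, 0.5, 1.0, 1.5]
    # Stand-in reference curve; a real study would use the exact collected power.
    reference = [gaussian_form(r, 0.50, 1.00) for r in radii]
    candidate = [gaussian_form(r, 0.48, 1.05) for r in radii]
    print(nmse(reference, candidate))
```

A lower NMSE indicates that the candidate coefficients $(c_1, c_2)$ track the reference curve more closely over the chosen range of $r$.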
Graph clustering, which aims to divide the nodes in the graph into several distinct clusters, is a fundamental and challenging task. In recent years, deep graph clustering methods have been increasingly proposed and have achieved promising performance. However, survey papers on the topic are scarce, and a summary of the field is urgently needed. Motivated by this, this paper presents the first comprehensive survey of deep graph clustering. First, the detailed definition of deep graph clustering and the important baseline methods are introduced. Next, a taxonomy of deep graph clustering methods is proposed based on four different criteria: graph type, network architecture, learning paradigm, and clustering method. In addition, through careful analysis of the existing works, the challenges and opportunities from five perspectives are summarized. Finally, the applications of deep graph clustering in four domains are presented. It is worth mentioning that a collection of state-of-the-art deep graph clustering methods including papers, code, and datasets is available on GitHub. We hope this work will serve as a quick guide and help researchers to overcome challenges in this vibrant field.
Clustering is a fundamental machine learning task which has been widely studied in the literature. Classic clustering methods follow the assumption that data are represented as features in a vectorized form through various representation learning techniques. As the data become increasingly complex, the shallow (traditional) clustering methods can no longer handle the high-dimensional data type. With the huge success of deep learning, especially deep unsupervised learning, many representation learning techniques with deep architectures have been proposed in the past decade. Recently, the concept of Deep Clustering, i.e., jointly optimizing the representation learning and clustering, has been proposed and has attracted growing attention in the community. Motivated by the tremendous success of deep learning in clustering, one of the most fundamental machine learning tasks, and the large number of recent advances in this direction, in this paper we conduct a comprehensive survey on deep clustering by proposing a new taxonomy of different state-of-the-art approaches. We summarize the essential components of deep clustering and categorize existing methods by the ways they design interactions between deep representation learning and clustering. Moreover, this survey also provides the popular benchmark datasets, evaluation metrics and open-source implementations to clearly illustrate various experimental settings. Last but not least, we discuss the practical applications of deep clustering and suggest challenging topics deserving further investigation as future directions.