Analog over-the-air computation (OAC) is an efficient solution to a class of uplink data aggregation tasks over a multiple-access channel (MAC), wherein the receiver, dubbed the fusion center, aims to reconstruct a function of the data distributed at edge devices rather than the individual data themselves. Existing OAC relies exclusively on maximum likelihood (ML) estimation at the fusion center to recover the arithmetic sum of the transmitted signals from different devices. ML estimation, however, is highly susceptible to noise. In particular, in misaligned OAC, where there are channel misalignments among the transmitted signals, ML estimation suffers from severe error propagation and noise enhancement. To address these challenges, this paper puts forth a Bayesian approach to OAC by letting each edge device transmit two pieces of prior information to the fusion center. Three OAC systems are studied: the aligned OAC with perfectly aligned signals; the synchronous OAC with channel-gain misalignments among the received signals; and the asynchronous OAC with both channel-gain and time misalignments. Using the prior information, we devise linear minimum mean squared error (LMMSE) estimators and a sum-product maximum a posteriori (SP-MAP) estimator for the three OAC systems. Numerical results verify that: 1) for the aligned and synchronous OAC, our LMMSE estimator significantly outperforms the ML estimator; in the low signal-to-noise ratio (SNR) regime, the LMMSE estimator reduces the mean squared error (MSE) by at least 6 dB, and in the high SNR regime, it lowers the error floor on the MSE by 86.4%; 2) for the asynchronous OAC, our LMMSE and SP-MAP estimators are on an equal footing in terms of MSE performance and are significantly better than the ML estimator.
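As a concrete illustration of why the Bayesian view helps, the following minimal Python sketch contrasts ML and LMMSE recovery of the arithmetic sum in an idealized aligned OAC setup. The Gaussian data model, the specific prior statistics, and all variable names are illustrative assumptions, not the paper's exact system model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy aligned OAC: K devices transmit simultaneously; the fusion center
# observes y = sum_k x_k + n and wants the arithmetic sum s = sum_k x_k.
K, sigma_n = 10, 1.0
x = rng.normal(loc=1.0, scale=2.0, size=K)    # device data (assumed Gaussian)
s = x.sum()                                    # target: arithmetic sum
y = s + rng.normal(scale=sigma_n)              # noisy superposed observation

# Prior information aggregated at the fusion center (here: the exact
# prior mean and variance of the sum, as if reported by the devices).
mu_s, var_s = K * 1.0, K * 2.0**2

s_ml = y                                       # ML estimate: the raw observation
gain = var_s / (var_s + sigma_n**2)            # LMMSE shrinkage factor
s_lmmse = mu_s + gain * (y - mu_s)             # shrink toward the prior mean

print(f"truth={s:.3f}  ML={s_ml:.3f}  LMMSE={s_lmmse:.3f}")
```

The LMMSE estimate trades a small bias for a large variance reduction, which is exactly where its low-SNR advantage over ML comes from.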
We present an analytical framework for channel estimation and data detection in massive multiple-input multiple-output uplink systems with 1-bit analog-to-digital converters (ADCs) and i.i.d. Rayleigh fading. First, we provide closed-form expressions for the mean squared error (MSE) of the channel estimation, considering the state-of-the-art linear minimum MSE estimator and the class of scaled least-squares estimators. For the data detection, we provide closed-form expressions for the expected value and the variance of the estimated symbols when maximum ratio combining is adopted, which can be exploited to efficiently implement minimum distance detection and, potentially, to design the set of transmit symbols. Our analytical findings depend explicitly on key system parameters such as the signal-to-noise ratio (SNR), the number of user equipments, and the pilot length, thus enabling a precise characterization of the performance of channel estimation and data detection with 1-bit ADCs. The proposed analysis highlights a fundamental SNR trade-off, according to which operating at the right noise level significantly enhances the system performance.
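To make the quantities concrete, here is a small Monte-Carlo sketch of 1-bit channel estimation with a Bussgang-type scaled least-squares estimate. The all-ones pilot, the particular scaling gain, and the parameter values are illustrative assumptions; the paper derives the corresponding MSE in closed form rather than by simulation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy single-user SIMO link: N antennas, tau pilot symbols, 1-bit ADCs,
# i.i.d. Rayleigh fading; all parameter values are illustrative.
N, tau, snr = 64, 16, 1.0
h = (rng.normal(size=N) + 1j * rng.normal(size=N)) / np.sqrt(2)
noise = (rng.normal(size=(N, tau)) + 1j * rng.normal(size=(N, tau))) / np.sqrt(2)
r = np.sqrt(snr) * h[:, None] + noise          # received pilots (all-ones pilot)

# 1-bit ADCs: keep only the signs of the in-phase and quadrature parts.
q = (np.sign(r.real) + 1j * np.sign(r.imag)) / np.sqrt(2)

# Scaled least squares with a Bussgang-type gain: model q ~ G*r + distortion,
# where G is the linear gain of a sign() quantizer driven by a Gaussian input.
G = np.sqrt(2.0 / (np.pi * (snr + 1.0)))
h_hat = q.sum(axis=1) / (G * np.sqrt(snr) * tau)

print(f"per-antenna channel estimation MSE: {np.mean(np.abs(h - h_hat) ** 2):.3f}")
```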
We consider the Bayesian multiple hypothesis testing problem with independent and identically distributed observations. The classical analysis of the error probability, based on Sanov's theorem, allows one to characterize the best achievable error exponent. However, this analysis does not generalize to the case where the true distributions of the hypotheses are not known exactly but only partially, via some nominal distributions. This problem has practical significance because the nominal distributions may be quantized versions of the true distributions in a hardware implementation, or they may be estimates of the true distributions obtained from labeled training sequences, as in statistical classification. In this paper, we develop a type-based analysis of the Bayesian multiple hypothesis testing problem. Our analysis allows one to explicitly calculate the error exponent of a given type and extends the classical analysis. As a generalization of the proposed method, we derive a robust test and obtain its error exponent for the case where the hypothesis distributions are not known, but there exist nominal distributions that are close to the true distributions in variational distance.
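For intuition, the sketch below numerically computes the classical best achievable error exponent in the binary special case, namely the Chernoff information. The distributions are toy examples; the paper's type-based, multiple-hypothesis analysis goes beyond this baseline.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Chernoff information between two finite-alphabet distributions: the best
# achievable Bayesian error exponent for *binary* hypothesis testing with
# i.i.d. observations (toy distributions, chosen for illustration).
p0 = np.array([0.5, 0.3, 0.2])
p1 = np.array([0.2, 0.3, 0.5])

def neg_exponent(lam):
    # Log of the Chernoff coefficient at mixing parameter lam in (0, 1).
    return np.log(np.sum(p0**lam * p1**(1.0 - lam)))

res = minimize_scalar(neg_exponent, bounds=(1e-9, 1 - 1e-9), method="bounded")
print(f"Chernoff information (best error exponent): {-res.fun:.4f}")
```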
In this paper, an abstract framework for the error analysis of discontinuous finite element methods is developed for distributed and Neumann boundary control problems governed by the stationary Stokes equation with control constraints. {\it A~priori} error estimates of optimal order are derived for the velocity and the pressure in the energy norm and the $L^2$-norm, respectively. Moreover, a reliable and efficient {\it a~posteriori} error estimator is derived. The results are applicable to a variety of problems under only the minimal regularity guaranteed by the well-posedness of the problem. In particular, we instantiate the abstract results with suitable stable pairs of velocity and pressure spaces, such as the lowest-order Crouzeix-Raviart finite element paired with piecewise constants, and piecewise linear finite elements paired with piecewise constants. The theoretical results are illustrated by numerical experiments.
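For orientation, an optimal-order {\it a~priori} velocity-pressure estimate of the kind referred to above typically takes the following generic shape (the precise norms, regularity assumptions, and constants are those established in the paper and are not reproduced here):
\[
\|u - u_h\|_{1,h} + \|p - p_h\|_{0} \;\le\; C\,h\,\bigl(\|u\|_{H^{2}(\Omega)} + \|p\|_{H^{1}(\Omega)}\bigr).
\]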
Graphs have been commonly used to represent complex data structures. In models dealing with graph-structured data, multivariate parameters may not only exhibit sparse patterns but also display structured sparsity and smoothness, in the sense that both zero and non-zero parameters tend to cluster together. We propose a new prior for high-dimensional parameters with graphical relations, referred to as the Tree-based Low-rank Horseshoe (T-LoHo) model, which generalizes the popular univariate Bayesian horseshoe shrinkage prior to the multivariate setting to detect structured sparsity and smoothness simultaneously. The T-LoHo prior can be embedded in many high-dimensional hierarchical models. To illustrate its utility, we apply it to regularize a Bayesian high-dimensional regression problem where the regression coefficients are linked by a graph, so that the resulting clusters have flexible shapes and satisfy the cluster contiguity constraint with respect to the graph. We design an efficient Markov chain Monte Carlo algorithm that delivers full Bayesian inference with uncertainty measures for model parameters such as the number of clusters. We offer theoretical investigations of the clustering effects and posterior concentration results. Finally, we illustrate the performance of the model with simulation studies and a real data application for anomaly detection on a road network. The results indicate substantial improvements over other competing methods such as the sparse fused lasso.
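As a point of reference for the prior being generalized, the following short Python sketch draws from the univariate horseshoe prior and shows its characteristic behavior (a spike of near-zero draws plus heavy tails). The T-LoHo prior itself acts on tree-induced clusters rather than individual coefficients, so this is only the univariate building block.

```python
import numpy as np

rng = np.random.default_rng(2)

# Draws from the univariate horseshoe prior that T-LoHo generalizes:
# beta_j | lambda_j, tau ~ N(0, (lambda_j * tau)^2),  lambda_j ~ C+(0, 1).
p, tau = 1000, 0.1
lam = np.abs(rng.standard_cauchy(size=p))       # half-Cauchy local scales
beta = rng.normal(scale=lam * tau)              # horseshoe draws

# Characteristic shape: most draws essentially zero, a few very large.
print(f"share with |beta| < 0.01: {np.mean(np.abs(beta) < 0.01):.2f}")
print(f"largest |beta|: {np.abs(beta).max():.2f}")
```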
A performance prediction method for massively parallel computation is proposed. The method is based on performance modeling and Bayesian inference to predict the elapsed time $T$ as a function of the number of nodes used, $P$, i.e., $T = T(P)$. The focus is on extrapolation to larger values of $P$ from the perspective of application researchers. The proposed method offers several improvements over the method developed in a previous paper, and its application to a real-symmetric generalized eigenvalue problem shows promising prediction results. The method is generalizable and applicable to many other computations.
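A minimal sketch of the underlying idea, under an assumed two-term performance model (the paper's actual model and inference machinery may differ): fit a posterior over model parameters from timings at small $P$ and extrapolate to a large $P$.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed performance model: T(P) = a/P + b*log2(P) (parallel work plus
# a communication term). Synthetic timings at small node counts P.
P_obs = np.array([1, 2, 4, 8, 16])
T_obs = 100.0 / P_obs + 0.5 * np.log2(P_obs) + rng.normal(scale=0.3, size=P_obs.size)

# Grid posterior over (a, b) with flat priors and Gaussian noise (sigma known).
a_grid = np.linspace(80.0, 120.0, 201)
b_grid = np.linspace(0.0, 1.0, 201)
A, B = np.meshgrid(a_grid, b_grid, indexing="ij")
resid = T_obs - (A[..., None] / P_obs + B[..., None] * np.log2(P_obs))
log_post = -0.5 * np.sum(resid**2, axis=-1) / 0.3**2
post = np.exp(log_post - log_post.max())
post /= post.sum()

# Posterior-mean extrapolation to a much larger machine, P = 1024.
P_new = 1024
T_pred = np.sum(post * (A / P_new + B * np.log2(P_new)))
print(f"predicted T(P={P_new}): {T_pred:.2f}")
```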
In recent years, local differential privacy (LDP) has emerged as the technique of choice for privacy-preserving data collection in scenarios where the aggregator is not trustworthy. LDP provides client-side privacy by adding noise at the user's end, so clients need not rely on the trustworthiness of the aggregator. The stronger privacy protection provided by LDP protocols, compared to the central model, comes at the cost of a much harsher privacy-utility trade-off. In this work, we provide a noise-aware probabilistic modeling framework that allows Bayesian inference to take into account the noise added for privacy under LDP, conditioned on the locally perturbed observations. Our framework tackles several computational and statistical challenges posed by LDP for accurate uncertainty quantification in Bayesian settings. We demonstrate the efficacy of our framework in parameter estimation for univariate and multivariate distributions as well as for logistic and linear regression.
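The following toy Python example conveys the core idea in the simplest possible case: noise-aware Bayesian estimation of a Bernoulli parameter under $\varepsilon$-LDP randomized response, with the LDP mechanism folded into the likelihood. The mechanism choice and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

# epsilon-LDP randomized response: each user reports their true bit with
# probability e^eps / (1 + e^eps), otherwise the flipped bit.
eps, theta_true, n = 1.0, 0.3, 2000
p_keep = np.exp(eps) / (1.0 + np.exp(eps))

x = rng.random(n) < theta_true                 # sensitive bits (never sent)
z = np.where(rng.random(n) < p_keep, x, ~x)    # locally perturbed reports
k = int(z.sum())

# Noise-aware posterior on a grid: the mechanism enters the likelihood
# through P(z = 1 | theta) = theta*p_keep + (1 - theta)*(1 - p_keep).
theta = np.linspace(0.0, 1.0, 1001)
q = theta * p_keep + (1.0 - theta) * (1.0 - p_keep)
log_post = k * np.log(q) + (n - k) * np.log1p(-q)   # flat prior on theta
post = np.exp(log_post - log_post.max())
post /= post.sum()

naive = k / n                                   # ignores the LDP noise entirely
print(f"true={theta_true}  naive={naive:.3f}  noise-aware={np.sum(theta * post):.3f}")
```

The naive frequency is biased toward 1/2 because it ignores the mechanism; the noise-aware posterior corrects for it by construction.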
In this paper we study properties of the Laplace approximation of the posterior distribution arising in nonlinear Bayesian inverse problems. Our work is motivated by Schillings et al. (2020), where it is shown that in such a setting the Laplace approximation error in Hellinger distance converges to zero in the order of the noise level. Here, we prove novel error estimates for a given noise level that also quantify the effect due to the nonlinearity of the forward mapping and the dimension of the problem. In particular, we are interested in settings in which a linear forward mapping is perturbed by a small nonlinear mapping. Our results indicate that in this case, the Laplace approximation error is of the size of the perturbation. The paper provides insight into Bayesian inference in nonlinear inverse problems, where linearization of the forward mapping has suitable approximation properties.
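As a minimal illustration of the object under study, the sketch below forms the Laplace approximation for a one-dimensional inverse problem whose forward map is a linear map plus a small nonlinear perturbation, the regime highlighted above. The specific forward map, prior, and noise level are toy assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# 1-D toy inverse problem y = G(u) + noise, with G a linear map plus a
# small nonlinear perturbation (delta controls the perturbation size).
delta, sigma, sigma0, y_obs = 0.1, 0.1, 1.0, 1.2
G = lambda u: u + delta * u**2

def neg_log_post(u):
    # Gaussian likelihood and Gaussian prior N(0, sigma0^2).
    return 0.5 * ((y_obs - G(u)) / sigma) ** 2 + 0.5 * (u / sigma0) ** 2

u_map = minimize_scalar(neg_log_post).x

# Laplace approximation: Gaussian centered at the MAP with covariance
# given by the inverse Hessian (computed here via central differences).
h = 1e-5
hess = (neg_log_post(u_map + h) - 2 * neg_log_post(u_map) + neg_log_post(u_map - h)) / h**2
print(f"Laplace approximation: N(mean={u_map:.4f}, var={1.0 / hess:.5f})")
```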
The stringent reliability and processing-latency requirements of ultra-reliable low-latency communication (URLLC) traffic make the design of linear massive multiple-input multiple-output (M-MIMO) receivers very challenging. Recently, Bayesian concepts have been used to increase the detection reliability of minimum mean-square-error (MMSE) linear receivers. However, processing latency remains a major concern due to the high computational complexity of the matrix inversion operations in MMSE schemes. This paper proposes an iterative M-MIMO receiver that combines a Bayesian concept with a parallel interference cancellation (PIC) scheme, referred to as a linear Bayesian learning (LBL) receiver. PIC has linear complexity, as it uses a combination of maximum ratio combining (MRC) and decision statistic combining (DSC) schemes to avoid matrix inversion operations. Simulation results show that the proposed receiver outperforms the MMSE and the best Bayesian-based receivers by at least $2$ dB in bit-error-rate (BER) performance and by a factor of $19$ in processing latency for various M-MIMO system configurations.
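To illustrate the inversion-free ingredient, here is a small Python sketch of MRC-based parallel interference cancellation for an uplink M-MIMO system. It omits the Bayesian symbol statistics and DSC weighting of the actual LBL receiver, and all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

# Small uplink M-MIMO system: N receive antennas, K single-antenna users,
# BPSK symbols. Illustrative parameters only.
N, K, snr = 64, 8, 1.0
H = (rng.normal(size=(N, K)) + 1j * rng.normal(size=(N, K))) / np.sqrt(2)
x = rng.choice([-1.0, 1.0], size=K).astype(complex)
n = (rng.normal(size=N) + 1j * rng.normal(size=N)) / np.sqrt(2)
y = np.sqrt(snr) * H @ x + n

g = np.sum(np.abs(H) ** 2, axis=0)          # per-user MRC gains ||h_k||^2
x_hat = np.zeros(K, dtype=complex)          # start from zero estimates
for _ in range(5):                          # PIC iterations: no matrix inverse
    r = y / np.sqrt(snr) - H @ x_hat        # residual after cancelling all users
    x_new = np.array([H[:, k].conj() @ (r + H[:, k] * x_hat[k]) / g[k]
                      for k in range(K)])   # re-add user k, then MRC
    x_hat = np.sign(x_new.real).astype(complex)   # hard BPSK decisions

print(f"bit errors: {int(np.sum(x_hat.real != x.real))} / {K}")
```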
Optimizing multiple competing black-box objectives is a challenging problem in many fields, including science, engineering, and machine learning. Multi-objective Bayesian optimization (MOBO) is a sample-efficient approach for identifying the optimal trade-offs between the objectives. However, many existing methods perform poorly when the observations are corrupted by noise. We propose a novel acquisition function, NEHVI, that overcomes this important practical limitation by applying a Bayesian treatment to the popular expected hypervolume improvement (EHVI) criterion and integrating over this uncertainty in the Pareto frontier. We argue that, even in the noiseless setting, generating multiple candidates in parallel is an incarnation of EHVI with uncertainty in the Pareto frontier and therefore can be addressed using the same underlying technique. Through this lens, we derive a natural parallel variant, $q$NEHVI, that reduces computational complexity of parallel EHVI from exponential to polynomial with respect to the batch size. $q$NEHVI is one-step Bayes-optimal for hypervolume maximization in both noisy and noiseless environments, and we show that it can be optimized effectively with gradient-based methods via sample average approximation. Empirically, we demonstrate not only that $q$NEHVI is substantially more robust to observation noise than existing MOBO approaches, but also that it achieves state-of-the-art optimization performance and competitive wall-times in large-batch environments.
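For concreteness, the quantity that EHVI-type acquisition functions integrate is the hypervolume improvement of candidate points over the current Pareto frontier; the numpy sketch below computes it exactly in two dimensions. qNEHVI additionally averages this quantity over posterior samples of the frontier to handle noise, which is not modeled here.

```python
import numpy as np

def hypervolume_2d(points, ref):
    """Hypervolume dominated by a 2-D maximization front w.r.t. ref."""
    pts = points[np.all(points > ref, axis=1)]
    if pts.size == 0:
        return 0.0
    pts = pts[np.argsort(-pts[:, 0])]       # sweep by f1, descending
    hv, y_prev = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 > y_prev:                     # point extends the front upward
            hv += (f1 - ref[0]) * (f2 - y_prev)
            y_prev = f2
    return hv

front = np.array([[3.0, 1.0], [2.0, 2.0], [1.0, 3.0]])   # current Pareto front
ref = np.array([0.0, 0.0])                                # reference point
cand = np.array([[2.5, 2.5]])                             # candidate design
hvi = hypervolume_2d(np.vstack([front, cand]), ref) - hypervolume_2d(front, ref)
print(f"hypervolume improvement of the candidate: {hvi:.2f}")   # 1.25
```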
In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.
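The local smoothing idea behind DRS can be sketched in a few lines: replace $f$ by the Gaussian smoothing $f_\gamma(x) = \mathbb{E}[f(x + \gamma Z)]$ and descend along Monte-Carlo gradient estimates of $f_\gamma$. The test function, step sizes, and sample counts below are illustrative assumptions, and the distributed/communication aspects of the paper are not modeled.

```python
import numpy as np

rng = np.random.default_rng(6)

# Gaussian smoothing of a non-smooth convex f: f_gamma(x) = E[f(x + gamma*Z)].
# A two-point Monte-Carlo estimator gives unbiased gradients of f_gamma.
d, gamma, m = 10, 0.1, 64
f = lambda v: np.max(np.abs(v))             # non-smooth convex test function

def smoothed_grad(x):
    Z = rng.normal(size=(m, d))
    fz = np.array([f(x + gamma * z) for z in Z])
    return ((fz - f(x))[:, None] * Z).mean(axis=0) / gamma

x = rng.normal(size=d)
for t in range(1, 201):                     # subgradient-style descent on f_gamma
    x -= (0.5 / np.sqrt(t)) * smoothed_grad(x)
print(f"f(x) after 200 smoothed-gradient steps: {f(x):.4f}")
```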