We study a general model for continuous spin systems with hard-core interactions. Our model allows for a mixture of $q$ types of particles on a $d$-dimensional Euclidean region $\mathbb{V}$ of volume $\nu(\mathbb{V})$. For each type, particle positions are distributed according to a Poisson point process. The Gibbs distribution over possible system states is given by the mixture of these point processes, conditioned on no two particles being closer than a type-dependent distance specified by a $q \times q$ matrix. This model encompasses classical continuous spin systems, such as the hard-sphere model and the Widom-Rowlinson model. We present sufficient conditions for approximating the partition function of this model, which is the normalizing factor of its Gibbs measure. For the hard-sphere model, our method yields a randomized approximation algorithm with running time polynomial in $\nu(\mathbb{V})$ throughout the known uniqueness regime of the Gibbs measure. In the same parameter regime, we obtain a quasi-polynomial deterministic approximation, which, to our knowledge, is the first rigorous deterministic approximation algorithm for a continuous spin system. We obtain similar approximation results for all continuous spin systems captured by our model and, in particular, the first explicit approximation bounds for the Widom-Rowlinson model. Additionally, we show how to obtain efficient approximate samplers for the Gibbs distributions of the respective spin systems within the same parameter regimes. Key to our method is a reduction of the continuous model to a discrete instance of the hard-core model whose size is polynomial in $\nu(\mathbb{V})$. This generalizes existing discretization schemes for the hard-sphere model and, in addition, reduces the required number of vertices of the generated graph from super-exponential to quadratic in $\nu(\mathbb{V})$, which we argue to be tight.
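To make the discretization idea concrete, here is a minimal Python sketch of a grid-based reduction in two dimensions: grid points inside the region become vertices, and two points are joined whenever they are closer than the sphere diameter, so independent sets of the resulting graph correspond to hard-sphere configurations restricted to the grid. The box side `L`, radius `r`, spacing `h`, and the function name are illustrative assumptions; the construction in the paper differs (in particular, it achieves the quadratic vertex count stated above).

```python
# Illustrative sketch (not the paper's construction): discretize a 2-d box of
# side L into a grid of spacing h and connect grid points closer than the
# sphere diameter 2*r.  Independent sets of the resulting graph correspond to
# valid hard-sphere configurations restricted to grid positions.
import itertools
import math


def hardcore_graph(L, r, h):
    points = [(i * h, j * h)
              for i in range(int(L / h) + 1)
              for j in range(int(L / h) + 1)]
    edges = [(p, q) for p, q in itertools.combinations(points, 2)
             if math.dist(p, q) < 2 * r]
    return points, edges


vertices, edges = hardcore_graph(L=5.0, r=0.5, h=0.25)
print(len(vertices), len(edges))
```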
Measurement error is a pervasive issue that can render the results of an analysis unreliable. The measurement error literature contains numerous correction techniques, which can be broadly divided into those that aim to produce exactly consistent estimators and those that are only approximately consistent. While consistency is a desirable property, it is typically attained only under specific model assumptions. Two techniques, regression calibration and simulation extrapolation, are used frequently in a wide variety of parametric and semiparametric settings; however, in many of these settings the methods are only approximately consistent. We generalize both corrections, relaxing the assumptions placed on replicate measurements. Under regularity conditions, the resulting estimators are shown to be asymptotically normal, with a sandwich estimator for the asymptotic variance. Through simulation, we demonstrate the improved performance of the modified estimators over the standard techniques when these assumptions are violated. We motivate these corrections using the Framingham Heart Study and apply the generalized techniques to an analysis of these data.
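For readers unfamiliar with simulation extrapolation, the following sketch shows the textbook SIMEX procedure for a simple linear regression with additive measurement error of known variance; it is not the generalized correction developed here, and all names, the lambda grid, and the quadratic extrapolant are illustrative choices.

```python
# Minimal SIMEX sketch for simple linear regression with additive measurement
# error of known standard deviation sigma_u; illustrative only.
import numpy as np


def simex_slope(w, y, sigma_u, lambdas=(0.5, 1.0, 1.5, 2.0), B=100, seed=0):
    rng = np.random.default_rng(seed)
    lam_grid, naive = [0.0], [np.polyfit(w, y, 1)[0]]  # naive fit at lambda = 0
    for lam in lambdas:
        slopes = []
        for _ in range(B):
            # add extra noise with variance lam * sigma_u^2 and refit naively
            w_b = w + np.sqrt(lam) * sigma_u * rng.standard_normal(len(w))
            slopes.append(np.polyfit(w_b, y, 1)[0])
        lam_grid.append(lam)
        naive.append(np.mean(slopes))
    # quadratic extrapolation back to lambda = -1 (the "no error" point)
    coef = np.polyfit(lam_grid, naive, 2)
    return np.polyval(coef, -1.0)
```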
We consider the classic online problem of scheduling on a single machine to minimize total flow time. In STOC 2021, the concept of robustness to distortion in processing times was introduced: for every distortion factor $\mu$, an $O(\mu^2)$-competitive algorithm $\operatorname{ALG}_{\mu}$ which handles distortions up to $\mu$ was presented. However, using that result requires one to know the distortion of the input in advance, which is impractical. We present the first \emph{distortion-oblivious} algorithms: algorithms which are competitive for \emph{every} input of \emph{every} distortion, and thus do not require knowledge of the distortion in advance. Moreover, the competitive ratios of our algorithms are $\tilde{O}(\mu)$, which is a quadratic improvement over the algorithm from STOC 2021, and is nearly optimal (we show a randomized lower bound of $\Omega(\mu)$ on competitiveness).
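As a point of reference for the objective, the sketch below computes the total flow time (completion time minus release time, summed over jobs) of the clairvoyant shortest-remaining-processing-time schedule in unit time steps; the job data are made up, and the sketch does not model the distorted processing-time estimates handled by the algorithms above.

```python
# Sketch of the objective only: total flow time of a clairvoyant SRPT schedule,
# simulated in unit time steps.  Illustrative, not the paper's algorithm.
def srpt_total_flow_time(jobs):
    # jobs: list of (release_time, processing_time) with integer values
    remaining = {j: p for j, (_, p) in enumerate(jobs)}
    completion, t = {}, 0
    while len(completion) < len(jobs):
        pending = [j for j, rem in remaining.items()
                   if rem > 0 and jobs[j][0] <= t]
        if pending:
            j = min(pending, key=lambda j: remaining[j])  # shortest remaining first
            remaining[j] -= 1
            if remaining[j] == 0:
                completion[j] = t + 1
        t += 1
    return sum(completion[j] - jobs[j][0] for j in completion)


print(srpt_total_flow_time([(0, 3), (1, 1), (2, 2)]))  # prints 9
```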
Throughput is a key performance objective in communication networks. This paper considers a fundamental maximum-throughput routing problem, the all-or-nothing multicommodity flow (ANF) problem, in arbitrary directed graphs and in the practically relevant but challenging setting where demands can be (much) larger than the edge capacities. Hence, in addition to assigning requests to valid flows for each routed commodity, an admission control mechanism is required to prevent overloading the network when routing commodities. We make several contributions. On the theoretical side, we obtain substantially improved bi-criteria approximation algorithms for this NP-hard problem. We present two non-trivial linear programming relaxations and show how to convert their fractional solutions into integer solutions via randomized rounding. One is an exponential-size formulation (solvable in polynomial time using a separation oracle) that takes a ``packing'' view and allows a more flexible approach, while the other is a generalization of the compact LP formulation of Liu et al. (INFOCOM'19) that can be solved directly with standard LP solvers. We obtain a polynomial-time randomized algorithm that yields an arbitrarily good approximation of the weighted throughput while violating the edge capacity constraints by only a small multiplicative factor. We also describe a deterministic rounding algorithm obtained by derandomization, using the method of pessimistic estimators. We complement our theoretical results with a proof-of-concept empirical evaluation.
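The generic randomized-rounding step can be pictured as follows: assuming each commodity $k$ comes with a fractional path decomposition whose weights sum to $x_k \le 1$, admit $k$ with probability proportional to $x_k$ (scaled by a factor $\alpha$ for the capacity analysis) and route it along one element of the decomposition drawn with probability proportional to its weight. The data structures, the scaling factor, and the single-path simplification below are illustrative assumptions, not the paper's exact procedure.

```python
# Hedged sketch of a generic randomized-rounding step for admission control
# plus routing; all names and the scaling factor alpha are hypothetical.
import random


def round_commodities(decompositions, alpha=0.5, seed=0):
    # decompositions: {commodity: {path: weight}} with weights summing to x_k
    rng = random.Random(seed)
    routing = {}
    for k, paths in decompositions.items():
        x_k = sum(paths.values())
        if rng.random() < alpha * x_k:          # admit commodity k
            chosen = rng.choices(list(paths), weights=list(paths.values()))[0]
            routing[k] = chosen                 # route k along the chosen path
    return routing
```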
Within a Bayesian framework, a comprehensive investigation of mixtures of finite mixtures (MFMs), i.e., finite mixtures with a prior on the number of components, is performed. This model class has applications in model-based clustering as well as in semi-parametric density estimation and requires suitable prior specifications and inference methods to exploit its full potential. We contribute by considering a generalized class of MFMs where the hyperparameter $\gamma_K$ of a symmetric Dirichlet prior on the weight distribution depends on the number of components. We show that this model class may be regarded as a Bayesian non-parametric mixture outside the class of Gibbs-type priors. We emphasize the distinction between the number of components $K$ of a mixture and the number of clusters $K_+$, i.e., the number of filled components given the data. In the MFM model, $K_+$ is a random variable and its prior depends on the prior on $K$ and on the hyperparameter $\gamma_K$. We employ a flexible prior distribution for the number of components $K$ and derive the corresponding prior on the number of clusters $K_+$ for generalized MFMs. For posterior inference, we propose the novel telescoping sampler which allows Bayesian inference for mixtures with arbitrary component distributions without resorting to reversible jump Markov chain Monte Carlo (MCMC) methods. The telescoping sampler explicitly samples the number of components, but otherwise requires only the usual MCMC steps of a finite mixture model. The ease of its application using different component distributions is demonstrated on several data sets.
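The interplay between $K$, $\gamma_K$, and $K_+$ can be explored with a small Monte-Carlo sketch of the generative prior: draw $K$, draw the component weights from a symmetric Dirichlet with parameter $\gamma_K$, assign $n$ observations to components, and count the filled ones. The particular prior on $K$ and the choice $\gamma_K = 1/K$ below are illustrative, not the paper's recommended specification.

```python
# Monte-Carlo sketch of the induced prior on the number of clusters K_+ in a
# generalized MFM; prior choices are illustrative.
import numpy as np


def sample_K_plus(n, prior_K, gamma_of_K, n_draws=10_000, seed=0):
    rng = np.random.default_rng(seed)
    draws = []
    for _ in range(n_draws):
        K = prior_K(rng)                                   # number of components
        weights = rng.dirichlet(np.full(K, gamma_of_K(K))) # symmetric Dirichlet
        labels = rng.choice(K, size=n, p=weights)          # component assignments
        draws.append(len(np.unique(labels)))               # K_+ = filled components
    return np.bincount(draws, minlength=n + 1)[1:] / n_draws


# e.g. K - 1 ~ Poisson(3) and gamma_K = 1 / K (one possible specification)
p_K_plus = sample_K_plus(n=100,
                         prior_K=lambda rng: 1 + rng.poisson(3),
                         gamma_of_K=lambda K: 1.0 / K)
```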
The greedy algorithm for monotone submodular function maximization subject to a cardinality constraint is guaranteed to approximate the optimal solution to within a $1-1/e$ factor. Although this guarantee is known to be essentially tight in the worst case, for greedy and in fact for any efficient algorithm, experiments show that greedy performs better in practice. We observe that for many applications in practice, the empirical distribution of the budgets (i.e., cardinality constraints) is supported on a wide range, and moreover, all existing hardness results in theory break under a large perturbation of the budget. To understand the effect of the budget from both the algorithmic and the hardness perspectives, we introduce a new notion of budget smoothed analysis. We prove that greedy is optimal for every budget distribution, and we give a characterization of the worst-case submodular functions. Based on these results, we show that, on the algorithmic side, under realistic budget distributions greedy and related algorithms enjoy provably better approximation guarantees that hold even for worst-case functions, and, on the hardness side, there exist hard functions that are fairly robust to all budget distributions.
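For concreteness, the standard greedy routine referred to above repeatedly adds the element with the largest marginal gain until the budget is exhausted; the toy coverage function in the sketch is illustrative.

```python
# Standard greedy for monotone submodular maximization under a cardinality
# constraint (budget) k; f is a set function given as a Python callable.
def greedy(f, ground_set, k):
    S = set()
    for _ in range(k):
        best = max((e for e in ground_set if e not in S),
                   key=lambda e: f(S | {e}) - f(S),   # marginal gain of e
                   default=None)
        if best is None:
            break
        S.add(best)
    return S


# toy coverage function: f(S) = size of the union of the chosen sets
sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6}}
cover = lambda S: len(set().union(*(sets[i] for i in S))) if S else 0
print(greedy(cover, sets.keys(), k=2))   # {0, 2}, covering 6 elements
```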
In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.
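The smoothing idea behind DRS can be illustrated with the classical Gaussian-smoothing gradient estimator: replace the non-smooth $f$ by $f_\gamma(x) = \mathbb{E}[f(x + \gamma Z)]$ with $Z \sim \mathcal{N}(0, I)$ and estimate its gradient from finite differences along random directions. The sketch below shows only this local smoothing step, not the decentralized algorithm itself; the parameter choices are illustrative.

```python
# Sketch of a Gaussian-smoothing gradient estimator (illustrative of the local
# smoothing step only, not the full decentralized DRS algorithm).
import numpy as np


def smoothed_gradient(f, x, gamma=0.1, m=100, seed=0):
    rng = np.random.default_rng(seed)
    g = np.zeros_like(x, dtype=float)
    for _ in range(m):
        z = rng.standard_normal(x.shape)
        g += (f(x + gamma * z) - f(x)) / gamma * z   # finite difference along z
    return g / m


# e.g. a subgradient-free step on f(x) = ||x||_1
f = lambda x: np.abs(x).sum()
x = np.array([1.0, -2.0, 0.5])
x -= 0.1 * smoothed_gradient(f, x)
```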
Many resource allocation problems in the cloud can be described as a basic Virtual Network Embedding Problem (VNEP): finding mappings of request graphs (describing the workloads) onto a substrate graph (describing the physical infrastructure). In the offline setting, the two natural objectives are profit maximization, i.e., embedding a maximal number of request graphs subject to the resource constraints, and cost minimization, i.e., embedding all requests at minimal overall cost. The VNEP can be seen as a generalization of classic routing and call admission problems in which requests are arbitrary graphs whose communication endpoints are not fixed. Due to its applications, the problem has been studied intensively in the networking community; however, the underlying algorithmic problem remains poorly understood. This paper presents the first fixed-parameter tractable approximation algorithms for the VNEP. Our algorithms are based on randomized rounding. Due to the flexible mapping options and the arbitrary request graph topologies, we show that a novel linear programming formulation is required: only with this formulation can fractional solutions be decomposed into convex combinations of valid mappings, as the formulation must account for the structure of the request graphs. Accordingly, to capture this structure, we introduce the graph-theoretic notions of extraction orders and extraction width and show that our algorithms run in time exponential in the request graphs' maximal extraction width. Hence, for request graphs of fixed extraction width, we obtain the first polynomial-time approximations. Studying the new notion of extraction orders, we show (i) that computing extraction orders of minimal width is NP-hard and (ii) that computing decomposable LP solutions is in general NP-hard, even when request graphs are restricted to planar ones.
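The rounding step described above can be sketched as follows: once the LP solution for a request has been decomposed into a convex combination of valid mappings (weights summing to at most one), pick one mapping with probability equal to its weight and reject the request otherwise. The data structures below are hypothetical placeholders.

```python
# Hedged sketch of rounding one request given a convex combination of valid
# mappings; {mapping: weight} is a hypothetical representation.
import random


def round_request(decomposition, rng=random):
    u = rng.random()
    acc = 0.0
    for mapping, weight in decomposition.items():
        acc += weight
        if u < acc:
            return mapping      # embed the request using this valid mapping
    return None                 # request rejected
```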
We consider the task of learning the parameters of a {\em single} component of a mixture model, for the case when we are given {\em side information} about that component; we call this the "search problem" in mixture models. We would like to solve this with computational and sample complexity lower than that of solving the overall original problem, where one learns the parameters of all components. Our main contributions are the development of a simple but general model for the notion of side information, and a corresponding simple matrix-based algorithm for solving the search problem in this general setting. We then specialize this model and algorithm to four common scenarios: Gaussian mixture models, LDA topic models, subspace clustering, and mixed linear regression. For each of these, we show that if (and only if) the side information is informative, we obtain parameter estimates with greater accuracy, as well as improved computational complexity compared to existing moment-based mixture model algorithms (e.g., tensor methods). We also illustrate several natural ways one can obtain such side information for specific problem instances. Our experiments on real data sets (NY Times, Yelp, BSDS500) further demonstrate the practicality of our algorithms, showing significant improvements in runtime and accuracy.
Methods that align distributions by minimizing an adversarial distance between them have recently achieved impressive results. However, these approaches are difficult to optimize with gradient descent, and they often do not converge well without careful hyperparameter tuning and proper initialization. We investigate whether turning the adversarial min-max problem into an optimization problem, by replacing the maximization part with its dual, improves the quality of the resulting alignment, and we explore its connections to Maximum Mean Discrepancy (MMD). Our empirical results suggest that using the dual formulation for the restricted family of linear discriminators results in more stable convergence to a desirable solution when compared with the performance of a primal min-max GAN-like objective and an MMD objective under the same restrictions. We test our hypothesis on the problem of aligning two synthetic point clouds on a plane and on a real-image domain adaptation problem on digits. In both cases, the dual formulation yields an iterative procedure that gives more stable and monotonic improvement over time.
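For reference, the (biased) RBF-kernel estimate of the squared MMD between two samples, one of the objectives compared above, can be computed as in the sketch below; the bandwidth and toy data are illustrative.

```python
# Biased V-statistic estimate of squared MMD with an RBF kernel.
import numpy as np


def mmd2_rbf(X, Y, bandwidth=1.0):
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
        return np.exp(-d2 / (2 * bandwidth ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()


X = np.random.default_rng(0).normal(0.0, 1.0, size=(200, 2))
Y = np.random.default_rng(1).normal(0.5, 1.0, size=(200, 2))
print(mmd2_rbf(X, Y))
```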
In this paper we introduce a covariance framework for the analysis of EEG and MEG data that takes into account observed temporal stationarity on small time scales and trial-to-trial variations. We formulate a model for the covariance matrix, which is a Kronecker product of three components that correspond to space, time and epochs/trials, and consider maximum likelihood estimation of the unknown parameter values. An iterative algorithm that finds approximations of the maximum likelihood estimates is proposed. We perform a simulation study to assess the performance of the estimator and investigate the influence of different assumptions about the covariance factors on the estimated covariance matrix and on its components. In addition, we illustrate our method on real EEG and MEG data sets. The proposed covariance model is applicable in a variety of cases where spontaneous EEG or MEG acts as source of noise and realistic noise covariance estimates are needed for accurate dipole localization, such as in evoked activity studies, or where the properties of spontaneous EEG or MEG are themselves the topic of interest, such as in combined EEG/fMRI experiments in which the correlation between EEG and fMRI signals is investigated.
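The structure of the covariance model can be written down directly: the full covariance over trials, time points, and sensors is the Kronecker product of three factor covariances. The sketch below only constructs such a matrix for illustrative sizes; the factor ordering is a convention, and the random factors stand in for estimated ones.

```python
# Sketch of a Kronecker-structured covariance over trials x time x space.
import numpy as np


def kronecker_covariance(sigma_trials, sigma_time, sigma_space):
    return np.kron(sigma_trials, np.kron(sigma_time, sigma_space))


rng = np.random.default_rng(0)

def random_spd(n):
    # random symmetric positive-definite matrix standing in for an estimated factor
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

# e.g. 5 trials, 10 time samples, 8 sensors -> a 400 x 400 covariance matrix
Sigma = kronecker_covariance(random_spd(5), random_spd(10), random_spd(8))
print(Sigma.shape)   # (400, 400)
```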