The Lov\'{a}sz Local Lemma (LLL) is a keystone principle in probability theory, guaranteeing the existence of configurations that avoid a collection $\mathcal B$ of "bad" events which are mostly independent and have low probability. In its simplest "symmetric" form, it asserts that if each bad-event has probability at most $p$ and affects at most $d$ bad-events, and $e p d < 1$, then a configuration avoiding all events in $\mathcal B$ exists. A seminal algorithm of Moser & Tardos (2010) gives nearly-automatic randomized algorithms for most constructions based on the LLL. However, deterministic algorithms have lagged behind. We address three specific shortcomings of the prior deterministic algorithms. First, our algorithm applies to the LLL criterion of Shearer (1985); this criterion is more powerful than alternative LLL criteria, removes a number of nuisance parameters, and leads to cleaner and more legible bounds. Second, we provide parallel algorithms with much greater flexibility in the functional form of the bad-events. Third, we provide a derandomized version of the MT-distribution, that is, the distribution of the variables at the termination of the MT algorithm. We show applications to non-repetitive vertex coloring, independent transversals, strong coloring, and other problems. These give deterministic algorithms which essentially match the best previous randomized sequential and parallel algorithms.
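To make the randomized baseline concrete, the following is a minimal sketch of the Moser--Tardos resampling loop in the variable-based setting; it is not the paper's derandomized algorithm, and the function names and the toy instance are illustrative assumptions.

```python
import random

# Minimal sketch of the randomized Moser-Tardos resampling loop (not the paper's
# derandomized algorithm). Each bad event is a predicate over a subset of independent
# random variables; while some bad event holds, resample only the variables it depends on.
def moser_tardos(num_vars, sample_var, bad_events, max_rounds=10**6):
    """bad_events: list of (variable_indices, predicate) pairs; predicate takes the
    current assignment and returns True if the bad event occurs."""
    assignment = [sample_var(i) for i in range(num_vars)]
    for _ in range(max_rounds):
        violated = next(((vbl, pred) for vbl, pred in bad_events if pred(assignment)), None)
        if violated is None:
            return assignment                    # no bad event occurs: done
        for i in violated[0]:                    # resample the violated event's variables
            assignment[i] = sample_var(i)
    raise RuntimeError("did not converge within max_rounds")

# Toy instance: 2-color 21 variables so that no length-6 window (stride 3) is monochromatic.
# Here p = 2^{-5} = 1/32 and each event shares variables with d = 2 others, so e*p*d < 1.
if __name__ == "__main__":
    n = 21
    windows = [list(range(s, s + 6)) for s in range(0, n - 5, 3)]
    events = [(w, (lambda a, w=w: len({a[i] for i in w}) == 1)) for w in windows]
    print(moser_tardos(n, lambda i: random.randint(0, 1), events))
```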
We consider stochastic approximations of sampling algorithms, such as Stochastic Gradient Langevin Dynamics (SGLD) and the Random Batch Method (RBM) for Interacting Particle Dynamics (IPD). We observe that the noise introduced by the stochastic approximation is nearly Gaussian due to the Central Limit Theorem (CLT), while the driving Brownian motion is exactly Gaussian. We harness this structure to absorb the stochastic approximation error inside the diffusion process and obtain improved convergence guarantees for these algorithms. For SGLD, we prove the first stable convergence rate in KL divergence without requiring a uniform warm start, assuming the target density satisfies a Log-Sobolev Inequality. Our result implies superior first-order oracle complexity compared to prior works, under significantly milder assumptions. We also prove the first guarantees for SGLD under even weaker conditions, such as H\"{o}lder smoothness and a Poincar\'{e} Inequality, thus bridging the gap between the state-of-the-art guarantees for LMC and SGLD. Our analysis motivates a new algorithm called covariance correction, which corrects for the additional noise introduced by the stochastic approximation by rescaling the strength of the diffusion. Finally, we apply our techniques to analyze RBM and significantly improve upon the guarantees in prior works (such as removing the exponential dependence on the horizon), under minimal assumptions.
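As an illustration of the idea, the following sketch shows a standard SGLD step together with a covariance-style correction that shrinks the injected Gaussian noise by an estimate of the minibatch-gradient covariance; the paper's covariance correction may take a different form, and the per-example gradient access assumed here is an illustrative choice.

```python
import numpy as np

# Minimal SGLD sketch with an illustrative "covariance correction": the injected Gaussian
# noise is rescaled so that (stochastic-gradient noise) + (injected noise) has total
# covariance close to the 2*eta*I of exact Langevin dynamics. This is a sketch of the
# idea only; the covariance-correction algorithm in the paper may differ.
def sgld_step(x, grad_batch, eta, rng, correct_covariance=False):
    """grad_batch: array of shape (batch, dim) of per-example gradients of U."""
    g = grad_batch.mean(axis=0)
    cov_inject = 2.0 * eta * np.eye(x.size)
    if correct_covariance:
        # Estimated covariance of the minibatch gradient estimate (assumes per-example
        # gradients are available), scaled by eta^2 as it enters the update.
        sigma_hat = np.atleast_2d(np.cov(grad_batch, rowvar=False)) / grad_batch.shape[0]
        cov_inject = cov_inject - eta**2 * sigma_hat
        # Project back to the positive semidefinite cone.
        w, v = np.linalg.eigh(cov_inject)
        cov_inject = (v * np.clip(w, 0.0, None)) @ v.T
    noise = rng.multivariate_normal(np.zeros(x.size), cov_inject)
    return x - eta * g + noise
```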
We consider Bayesian analysis on high-dimensional spheres with angular central Gaussian priors. These priors model antipodally symmetric directional data, are easily defined in Hilbert spaces and occur, for instance, in Bayesian binary classification and level set inversion. In this paper we derive efficient Markov chain Monte Carlo methods for approximate sampling of posteriors with respect to these priors. Our approaches rely on lifting the sampling problem to the ambient Hilbert space and exploit existing dimension-independent samplers in linear spaces. By a push-forward Markov kernel construction we then obtain Markov chains on the sphere, which inherit reversibility and spectral gap properties from samplers in linear spaces. Moreover, our proposed algorithms show dimension-independent efficiency in numerical experiments.
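As a rough illustration of the lifting idea (not the paper's reversibility-preserving push-forward construction), one can run a preconditioned Crank--Nicolson (pCN) chain in the ambient space, evaluate the likelihood on normalized states, and map samples to the sphere by normalization; the function and parameter names below are assumptions.

```python
import numpy as np

# Caricature of the lifting approach: a pCN chain in the ambient linear space with a
# Gaussian reference measure, whose states are pushed forward to the sphere by
# normalization. The paper's push-forward kernels inherit reversibility and spectral
# gaps by construction; this sketch only conveys the mechanics.
def pcn_on_sphere(neg_log_lik, x0, n_steps, beta=0.2, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    samples = []
    for _ in range(n_steps):
        xi = rng.standard_normal(x.shape)
        y = np.sqrt(1.0 - beta**2) * x + beta * xi        # pCN proposal (prior-reversible)
        # The lifted likelihood depends only on the direction of the state.
        log_acc = neg_log_lik(x / np.linalg.norm(x)) - neg_log_lik(y / np.linalg.norm(y))
        if np.log(rng.uniform()) < log_acc:
            x = y
        samples.append(x / np.linalg.norm(x))             # push forward to the sphere
    return np.array(samples)
```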
We study Langevin-type algorithms for Gibbs distributions whose potentials are dissipative and whose weak gradients have finite moduli of continuity. Our main result is a non-asymptotic upper bound on the 2-Wasserstein distance between the Gibbs distribution and the law of general Langevin-type algorithms, based on the Liptser--Shiryaev theory and functional inequalities. We apply this bound to show that dissipativity of the potential and $\alpha$-H\"{o}lder continuity of the gradient with $\alpha>1/3$ are sufficient for the convergence of the Langevin Monte Carlo algorithm with appropriate control of the parameters. We also propose Langevin-type algorithms with spherical smoothing for potentials without convexity or continuous differentiability.
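For concreteness, the following sketch shows an unadjusted Langevin (LMC) step together with a standard two-point spherical-smoothing gradient estimate that can stand in for the gradient when the potential lacks continuous differentiability; the smoothing radius, sample count, and estimator form are illustrative choices rather than the paper's exact scheme.

```python
import numpy as np

# Unadjusted Langevin step: x_{k+1} = x_k - eta * grad U(x_k) + sqrt(2*eta) * xi.
def lmc_step(x, grad_U, eta, rng):
    return x - eta * grad_U(x) + np.sqrt(2.0 * eta) * rng.standard_normal(x.shape)

# Two-point zeroth-order gradient estimate via smoothing over the unit sphere; this is a
# standard construction and only illustrates the spherical-smoothing idea.
def spherical_smoothing_grad(U, x, r, rng, n_samples=64):
    d = x.size
    g = np.zeros(d)
    for _ in range(n_samples):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)                      # uniform direction on the sphere
        g += (U(x + r * u) - U(x - r * u)) * u      # two-point estimate
    return (d / (2.0 * r)) * g / n_samples

# A smoothed LMC step could then be taken as, e.g.:
#   x = lmc_step(x, lambda y: spherical_smoothing_grad(U, y, r, rng), eta, rng)
```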
Models with intractable normalising functions have numerous applications. Because the normalising constants are functions of the parameters of interest, standard Markov chain Monte Carlo cannot be used for Bayesian inference for these models. Many algorithms have been developed for such models; some have the posterior distribution as their asymptotic distribution, while other ``asymptotically inexact'' algorithms do not possess this property. There is limited guidance for evaluating the approximations produced by either class of algorithms. We propose two new diagnostics that address this problem. We provide theoretical justification for our methods and apply them to several algorithms in the context of challenging examples.
Three asymptotic limits exist for the Euler equations at low Mach number: purely convective, purely acoustic, and mixed convective-acoustic. Standard collocated density-based numerical schemes for compressible flow are known to fail at low Mach number due to the incorrect asymptotic scaling of the artificial diffusion. Previous studies of this class of schemes have shown a variety of behaviours across the different limits and proposed guidelines for the design of low-Mach schemes. However, these studies have primarily focused on specific discretisations and/or only the convective limit. In this paper, we review the low-Mach behaviour using the modified equations, i.e., the continuous Euler equations augmented with artificial diffusion terms, which are representative of a wide range of schemes in this class. By considering both convective and acoustic effects, we show that three diffusion scalings naturally arise. Single- and multiple-scale asymptotic analysis of these scalings shows that many of the important low-Mach features of this class of schemes can be reproduced in a straightforward manner in the continuous setting. As an example, we show that many existing low-Mach Roe-type finite-volume schemes match one of these three scalings. Our analysis corroborates previous analysis of these schemes, and we are able to refine previous guidelines on the design of low-Mach schemes by including both convective and acoustic effects. Discrete analysis and numerical examples demonstrate the behaviour of minimal Roe-type schemes with each of the three scalings for convective, acoustic, and mixed flows.
We take a random matrix theory approach to random sketching and show an asymptotic first-order equivalence of the regularized sketched pseudoinverse of a positive semidefinite matrix to a certain evaluation of the resolvent of the same matrix. We focus on real-valued regularization and extend previous results on an asymptotic equivalence of random matrices to the real setting, providing a precise characterization of the equivalence even under negative regularization, including a characterization of the smallest nonzero eigenvalue of the sketched matrix, which may be of independent interest. We then further characterize the second-order equivalence of the sketched pseudoinverse. We also apply our results to the analysis of the sketch-and-project method and to sketched ridge regression. Lastly, we propose a conjecture that these results generalize to asymptotically free sketching matrices, obtaining the resulting equivalence for orthogonal sketching matrices and comparing our results to several common sketches used in practice.
Typical cooperative multi-agent systems (MASs) exchange information to coordinate their motion in proximity-based consensus control schemes to complete a common objective. However, in the event of faults or cyber attacks on the on-board positioning sensors of agents, global control performance may be compromised, resulting in a hijacking of the entire MAS. For systems that operate in unknown or landmark-free environments (e.g., open terrain, sea, or air) and beyond the range/proximity sensing of nearby agents, compromised agents lose localization capabilities. To maintain resilience in these scenarios, we propose a method to recover compromised agents by utilizing Received Signal Strength Indication (RSSI) from nearby agents (i.e., mobile landmarks) to provide reliable position measurements for localization. To minimize estimation error: i) a multilateration scheme leverages RSSI and position information received from neighboring agents acting as mobile landmarks, and ii) a Kalman filtering method, robust to unreliable state estimates, adaptively updates the unknown RSSI-based position measurement covariance matrix at runtime. The proposed framework is demonstrated with simulations of MAS formations in the presence of faults and cyber attacks on on-board position sensors.
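A minimal sketch of the RSSI-based multilateration step is given below, assuming a log-distance path-loss model for converting RSSI to range and a standard linearized least-squares solve; the path-loss parameters and the formulation are illustrative assumptions, not the paper's exact scheme (the adaptive Kalman-filter covariance update is omitted).

```python
import numpy as np

# RSSI-based multilateration with neighboring agents as mobile landmarks (2D sketch).
def rssi_to_range(rssi, rssi_ref=-40.0, path_loss_exp=2.0, d_ref=1.0):
    # Log-distance path-loss model: rssi = rssi_ref - 10 * n * log10(d / d_ref).
    return d_ref * 10.0 ** ((rssi_ref - rssi) / (10.0 * path_loss_exp))

def multilaterate(anchor_positions, rssi_values):
    """anchor_positions: (m, 2) array of neighbor positions; rssi_values: length-m list."""
    p = np.asarray(anchor_positions, dtype=float)
    r = np.array([rssi_to_range(v) for v in rssi_values])
    # Linearize by subtracting the first range equation from the others:
    # 2*(p_i - p_0) . x = r_0^2 - r_i^2 + ||p_i||^2 - ||p_0||^2
    A = 2.0 * (p[1:] - p[0])
    b = (r[0]**2 - r[1:]**2) + np.sum(p[1:]**2, axis=1) - np.sum(p[0]**2)
    est, *_ = np.linalg.lstsq(A, b, rcond=None)
    return est  # estimated (x, y) position of the compromised agent
```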
In high-dimensional time-series analysis, it is essential to have a set of key factors (namely, the style factors) that explain the changes in the observed variable. For example, volatility modeling in finance relies on a set of risk factors, and climate change studies in climatology rely on a set of causal factors. The ideal low-dimensional style factors should balance significance (high explanatory power) and stability (consistency, without significant fluctuations). However, previous supervised and unsupervised feature extraction methods can hardly address this tradeoff. In this paper, we propose Style Miner, a reinforcement learning method to generate style factors. We first formulate the problem as a Constrained Markov Decision Process with explanatory power as the return and stability as the constraint. Then, we design fine-grained immediate rewards and costs and use a Lagrangian heuristic to balance them adaptively. Experiments on real-world financial data sets show that Style Miner outperforms existing learning-based methods by a large margin and achieves a relative 10% gain in R-squared explanatory power compared to the industry-renowned factors proposed by human experts.
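The Lagrangian mechanism can be sketched as follows: the policy optimizes a reward penalized by a multiplier times the stability cost, and the multiplier is adjusted by dual ascent on the constraint violation. Style Miner's fine-grained rewards and adaptive balancing are richer than this; the names and step size below are illustrative.

```python
# Illustrative Lagrangian-heuristic updates for a constrained MDP (not Style Miner's
# exact scheme): the policy is trained on reward - lambda * cost, and the multiplier
# follows dual ascent on the constraint violation.
def lagrangian_update(lmbda, avg_cost, cost_limit, lr=0.01):
    # Increase lambda when the stability constraint is violated, decrease otherwise.
    return max(0.0, lmbda + lr * (avg_cost - cost_limit))

def shaped_reward(reward, cost, lmbda):
    # Scalarized objective optimized by the policy at the current multiplier.
    return reward - lmbda * cost
```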
We study the corrupted bandit problem, i.e. a stochastic multi-armed bandit problem with $k$ unknown reward distributions that are heavy-tailed and corrupted by a history-independent adversary or Nature. To be specific, the reward obtained by playing an arm comes from the corresponding heavy-tailed reward distribution with probability $1-\varepsilon \in (0.5,1]$ and from an arbitrary corruption distribution of unbounded support with probability $\varepsilon \in [0,0.5)$. First, we provide $\textit{a problem-dependent lower bound on the regret}$ of any corrupted bandit algorithm. The lower bounds indicate that the corrupted bandit problem is harder than the classical stochastic bandit problem with sub-Gaussian or heavy-tailed rewards. Following that, we propose a novel UCB-type algorithm for corrupted bandits, namely HubUCB, that builds on Huber's estimator for robust mean estimation. Leveraging a novel concentration inequality for Huber's estimator, we prove that HubUCB achieves a near-optimal regret upper bound. Since computing Huber's estimator has quadratic complexity, we further introduce a sequential version of Huber's estimator that exhibits linear complexity. We leverage this sequential estimator to design SeqHubUCB, which enjoys similar regret guarantees while reducing the computational burden. Finally, we experimentally illustrate the efficiency of HubUCB and SeqHubUCB in solving corrupted bandits for different reward distributions and different levels of corruption.
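For reference, Huber's M-estimator of the mean can be computed by iteratively reweighted averaging, as in the sketch below; a UCB-style index then adds a confidence radius to such an estimate. The threshold and stopping rule here are illustrative defaults, not the tuning used by HubUCB or SeqHubUCB.

```python
import numpy as np

# Huber's M-estimator of the mean via iteratively reweighted averaging: observations
# far from the current estimate are down-weighted, which is what gives robustness to
# heavy tails and corruption.
def huber_mean(x, c=1.345, n_iter=50, tol=1e-8):
    theta = np.median(x)                      # robust initialization
    for _ in range(n_iter):
        resid = x - theta
        w = np.minimum(1.0, c / np.maximum(np.abs(resid), 1e-12))
        new_theta = np.sum(w * x) / np.sum(w)
        if abs(new_theta - theta) < tol:
            break
        theta = new_theta
    return theta
```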
Hierarchical learning algorithms that gradually approximate a solution to a data-driven optimization problem are essential to decision-making systems, especially under limitations on time and computational resources. In this study, we introduce a general-purpose hierarchical learning architecture that is based on the progressive partitioning of a possibly multi-resolution data space. The optimal partition is gradually approximated by solving a sequence of optimization sub-problems that yield a sequence of partitions with an increasing number of subsets. We show that the solution of each optimization problem can be estimated online using gradient-free stochastic approximation updates. As a consequence, a function approximation problem can be defined within each subset of the partition and solved using the theory of two-timescale stochastic approximation algorithms. This simulates an annealing process and defines a robust and interpretable heuristic method to gradually increase the complexity of the learning architecture in a task-agnostic manner, giving emphasis to regions of the data space that are considered more important according to a predefined criterion. Finally, by imposing a tree structure on the progression of the partitions, we provide a means to incorporate potential multi-resolution structure of the data space into this approach, significantly reducing its complexity, while introducing hierarchical variable-rate feature extraction properties similar to those of certain classes of deep learning architectures. Asymptotic convergence analysis and experimental results are provided for supervised and unsupervised learning problems.
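A toy sketch of the two-timescale mechanism is given below: a Voronoi-type partition is adapted through prototype updates at one step size, while a simple per-cell predictor is learned at another. Which update is faster, and the omission of annealing, progressive growth of the partition, and the tree structure, are simplifications for illustration only.

```python
import numpy as np

# Toy two-timescale stochastic-approximation sketch: nearest-prototype partition updates
# and per-cell predictions (running means of targets) with different step sizes.
class TwoTimescalePartition:
    def __init__(self, prototypes, fast_lr=0.05, slow_lr=0.005):
        self.mu = np.array(prototypes, dtype=float)   # partition prototypes
        self.f = np.zeros(len(self.mu))               # per-cell predictions
        self.fast_lr, self.slow_lr = fast_lr, slow_lr

    def update(self, x, y):
        j = int(np.argmin(np.linalg.norm(self.mu - x, axis=1)))   # active cell
        self.mu[j] += self.fast_lr * (x - self.mu[j])             # faster timescale
        self.f[j] += self.slow_lr * (y - self.f[j])               # slower timescale
        return j

    def predict(self, x):
        return self.f[int(np.argmin(np.linalg.norm(self.mu - x, axis=1)))]
```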