The qubit routing problem, also known as the swap minimization problem, is a (classical) combinatorial optimization problem that arises in the design of compilers for quantum programs. We study the qubit routing problem from the viewpoint of theoretical computer science, whereas most existing studies have investigated its practical aspects. We concentrate on linear nearest neighbor (LNN) architectures of quantum computers, in which the graph topology is a path. Our results are three-fold. (1) We prove that the qubit routing problem is NP-hard. (2) We give a fixed-parameter algorithm parameterized by the number of two-qubit gates. (3) We give a polynomial-time algorithm for the case where each qubit is involved in at most one two-qubit gate.
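To make the routing task concrete, the sketch below greedily inserts SWAPs along a path (an LNN architecture) until the two operands of each two-qubit gate are adjacent, and counts the SWAPs used. This is a minimal illustration of the problem, not any of the paper's algorithms; the gate list and the greedy step-toward-each-other policy are illustrative assumptions.

```python
def greedy_swap_count(num_qubits, gates):
    """Count SWAPs inserted so that every gate (a, b) acts on adjacent
    positions of the path.  pos[q] is the current position of logical
    qubit q; at[p] is the qubit currently sitting at position p."""
    pos = list(range(num_qubits))
    at = list(range(num_qubits))
    swaps = 0
    for a, b in gates:
        # Step qubit a toward qubit b, one neighbor swap at a time,
        # until the two operands occupy adjacent positions.
        while abs(pos[a] - pos[b]) > 1:
            p = pos[a] + (1 if pos[a] < pos[b] else -1)  # next position for a
            q = at[p]                                    # neighbor to swap with
            at[pos[a]], at[p] = q, a
            pos[q], pos[a] = pos[a], p
            swaps += 1
    return swaps

print(greedy_swap_count(3, [(0, 2)]))  # one SWAP makes qubits 0 and 2 adjacent → 1
```

Note that the greedy count is only an upper bound on the optimum; the NP-hardness result concerns minimizing the total number of SWAPs over all insertion strategies.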
We develop an efficient algorithmic approach for approximate counting and sampling in the low-temperature regime of a broad class of statistical physics models on finite subsets of the lattice $\mathbb Z^d$ and on the torus $(\mathbb Z/n \mathbb Z)^d$. Our approach is based on combining contour representations from Pirogov-Sinai theory with Barvinok's approach to approximate counting using truncated Taylor series. Some consequences of our main results include an FPTAS for approximating the partition function of the hard-core model at sufficiently high fugacity on subsets of $\mathbb Z^d$ with appropriate boundary conditions and an efficient sampling algorithm for the ferromagnetic Potts model on the discrete torus $(\mathbb Z/n \mathbb Z)^d$ at sufficiently low temperature.
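The Taylor-truncation half of the approach can be illustrated on a toy instance: below, the exact hard-core partition function on a small path is computed by brute force, and Barvinok's idea is mimicked by expanding $\log Z$ as a power series from the polynomial coefficients of $Z$ and truncating. This is only a schematic one-dimensional illustration under assumed inputs; the contour representations from Pirogov-Sinai theory, which make the method work at low temperature, are not shown.

```python
import itertools
import math

def hardcore_Z(n, lam):
    """Exact hard-core partition function on a path with n vertices:
    the sum of lam^{|I|} over all independent sets I (brute force)."""
    total = 0.0
    for s in itertools.product([0, 1], repeat=n):
        if all(not (s[i] and s[i + 1]) for i in range(n - 1)):
            total += lam ** sum(s)
    return total

def truncated_logZ(a, lam, m):
    """Given the coefficients a_k of Z(x) = 1 + a_1 x + a_2 x^2 + ...
    (a[0] must be 1), compute the Taylor coefficients b_k of log Z via
    the recursion k*b_k = k*a_k - sum_{j<k} j*b_j*a_{k-j}, and return
    the series evaluated at lam, truncated at degree m."""
    get = lambda k: a[k] if k < len(a) else 0.0
    b = [0.0] * (m + 1)
    for k in range(1, m + 1):
        b[k] = get(k) - sum(j * b[j] * get(k - j) for j in range(1, k)) / k
    return sum(b[k] * lam ** k for k in range(1, m + 1))
```

For the path on 3 vertices, $Z(\lambda) = 1 + 3\lambda + \lambda^2$, and for small $\lambda$ a degree-5 truncation of $\log Z$ already matches the exact value closely; Barvinok's method turns such truncations into an FPTAS whenever the truncation error can be controlled via zero-freeness of $Z$.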
In this paper we study the threshold model of \emph{geometric inhomogeneous random graphs} (GIRGs), a generative random graph model that is closely related to \emph{hyperbolic random graphs} (HRGs). These models have been observed to capture complex real-world networks well with respect to their structural and algorithmic properties. Following comprehensive studies of their \emph{connectivity}, i.e., which parts of the graphs are connected, we have a good understanding of the circumstances under which a \emph{giant} component (containing a constant fraction of the graph) emerges. While previous results are rather technical and challenging to work with, the goal of this paper is to provide more accessible proofs. At the same time, we significantly improve the previously known probabilistic guarantees, showing that GIRGs contain a giant component with probability $1 - \exp(-\Omega(n^{(3-\tau)/2}))$ for graph size $n$ and a degree distribution with power-law exponent $\tau \in (2, 3)$. Based on that, we additionally derive insights about the connectivity of certain induced subgraphs of GIRGs.
Seven years ago, researchers proposed a postprocessing method to equalize the error rates of a model across different demographic groups. The work launched hundreds of papers purporting to improve over the postprocessing baseline. We empirically evaluate these claims through thousands of model evaluations on several tabular datasets. We find that the fairness-accuracy Pareto frontier achieved by postprocessing contains all other methods we were feasibly able to evaluate. In doing so, we address two common methodological errors that have confounded previous observations. One relates to the comparison of methods with different unconstrained base models. The other concerns methods achieving different levels of constraint relaxation. At the heart of our study is a simple idea we call unprocessing that roughly corresponds to the inverse of postprocessing. Unprocessing allows for a direct comparison of methods using different underlying models and levels of relaxation. Interpreting our findings, we recall a widely overlooked theoretical argument, presented seven years ago, that accurately predicted what we observe.
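The flavor of error-rate-equalizing postprocessing can be conveyed with a toy sketch: given a trained model's scores, pick a per-group decision threshold so that each group ends up with (approximately) the same false positive rate. This is only an illustrative stand-in, not the evaluated method or the paper's unprocessing procedure; the quantile rule and all names are assumptions.

```python
def equalize_fpr_thresholds(scores, labels, groups, target_fpr):
    """Per-group score thresholds such that predicting positive when
    score >= threshold yields roughly the same false positive rate
    (target_fpr) in every group.  A toy sketch of group-wise
    threshold postprocessing."""
    thresholds = {}
    for g in set(groups):
        # Scores of this group's true negatives, sorted ascending.
        neg = sorted(s for s, y, gg in zip(scores, labels, groups)
                     if gg == g and y == 0)
        # The (1 - target_fpr) quantile: about a target_fpr fraction
        # of the group's negatives sits at or above this score.
        k = min(int((1 - target_fpr) * len(neg)), len(neg) - 1)
        thresholds[g] = neg[k]
    return thresholds
```

Full equalized-odds postprocessing additionally randomizes between thresholds to match true positive rates, but the core operation — adjusting decisions on top of a fixed base model — is what unprocessing inverts to enable like-for-like comparisons.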
Classical game theory is a powerful tool focusing on optimized resource distribution, allocation, and sharing in classical wired and wireless networks. As quantum networks emerge as a means of providing true connectivity between quantum computers, it is crucial to exploit game theory for addressing challenges such as entanglement distribution and access, routing, and topology extraction and inference in quantum networks. Quantum networks offer a promising opportunity for employing quantum games owing to their inherent capability of generating and sharing quantum states. Moreover, quantum games offer enhanced payoffs and winning probabilities, as well as new strategies and equilibria that have no counterpart in classical games. Employing quantum game theory to solve fundamental challenges in quantum networks opens a new research direction necessitating interdisciplinary effort. In this article, we introduce a novel game-theoretical framework for exploiting quantum strategies to solve, as an archetypal example, one of the key functionalities of a quantum network, namely entanglement distribution. We compare quantum strategies with classical ones, showing quantum advantages in terms of improved link fidelity and reduced communication latency. In future work, we will generalize our game framework to optimize entanglement distribution and access over arbitrary quantum network topologies. We will also explore how quantum games can be leveraged to address other challenges, such as routing, optimization of quantum operations, and topology design.
We study a variant of the popular, fast, and widely used family of community detection procedures known as label propagation algorithms. These mechanisms also exhibit many parallels with models of opinion exchange dynamics and consensus mechanisms in distributed computing. Initially, given a network, each vertex starts with a random label in the interval $[0,1]$. Then, in each round of the algorithm, every vertex switches its label to the majority label in its neighborhood (including its own label). In the first round, ties are broken towards smaller labels, while in each subsequent round, ties are broken uniformly at random. We investigate the performance of this algorithm on the binomial random graph $\mathcal{G}(n,p)$. We show that for $np \ge n^{5/8+\varepsilon}$, the algorithm terminates with a single label a.a.s. (which was previously known only for $np\ge n^{3/4+\varepsilon}$). Moreover, we show that if $np\gg n^{2/3}$, a.a.s.\ this label is the smallest one, whereas if $n^{5/8+\varepsilon}\le np\ll n^{2/3}$, the surviving label is a.a.s. not the smallest one.
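The dynamics described above can be sketched directly; the function names and the round cap are assumptions, but the update rule (synchronous majority over closed neighborhoods, round-1 ties to the smaller label, later ties uniform) follows the description.

```python
import random
from collections import Counter

def label_propagation(adj, max_rounds=100, rng=random):
    """Synchronous majority label propagation: every vertex starts with a
    uniform random label in [0, 1]; in each round it adopts the majority
    label of its closed neighborhood.  Round-1 ties go to the smallest
    label; later ties are broken uniformly at random."""
    n = len(adj)
    labels = [rng.random() for _ in range(n)]
    for t in range(max_rounds):
        new = []
        for v in range(n):
            counts = Counter([labels[v]] + [labels[u] for u in adj[v]])
            best = max(counts.values())
            cands = [lab for lab, c in counts.items() if c == best]
            new.append(min(cands) if t == 0 else rng.choice(cands))
        labels = new  # synchronous update: all vertices switch at once
        if len(set(labels)) == 1:
            break
    return labels
```

On a complete graph, all initial labels are distinct with probability 1, so round 1 is an all-way tie and every vertex adopts the global minimum label — a degenerate case of the "smallest label survives" phenomenon the abstract quantifies for dense $\mathcal{G}(n,p)$.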
We study the Electrical Impedance Tomography Bayesian inverse problem of recovering the conductivity given noisy measurements of the voltage on some boundary surface electrodes. The uncertain conductivity depends linearly on a countable number of uniformly distributed random parameters in a compact interval, with the coefficient functions in the linear expansion decaying at an algebraic rate. We analyze the surrogate Markov Chain Monte Carlo (MCMC) approach for sampling the posterior probability measure, where multivariate sparse adaptive interpolation, with interpolating points chosen according to a lower index set, is used for approximating the forward map. The forward equation is approximated once, before running the MCMC for all the realizations, using interpolation on the finite element (FE) approximation at the parametric interpolating points. When evaluation of the solution is needed for a realization, we only need to evaluate a polynomial, thus drastically cutting the computation time. We contribute a rigorous error estimate for the MCMC convergence. In particular, we show that there is a nested sequence of interpolating lower index sets for which we can derive an interpolation error estimate in terms of the cardinality of these sets, uniformly over all parameter realizations. An explicit convergence rate for the MCMC sampling of the posterior expectation of the conductivity is rigorously derived, in terms of the number of interpolation points, the accuracy of the FE approximation of the forward equation, and the number of MCMC samples. We perform numerical experiments using an adaptive greedy approach to construct the sets of interpolation points. We show the benefits of this approach over the simple MCMC, where the forward equation is repeatedly solved for all the samples, and over the non-adaptive surrogate MCMC with an isotropic index set treating all the random parameters equally.
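The computational benefit of a surrogate MCMC can be seen in a toy one-parameter analogue: the expensive forward map is replaced by a cheap precomputed surrogate inside a random-walk Metropolis sampler. Everything below (the uniform prior on $[-1,1]$, the Gaussian likelihood, the identity surrogate used in the example) is an illustrative assumption, not the paper's EIT setup or its sparse polynomial interpolant.

```python
import math
import random

def metropolis_surrogate(surrogate, y, sigma, steps, step_size=0.1, seed=0):
    """Random-walk Metropolis on a uniform prior over [-1, 1].  The forward
    map is evaluated only through `surrogate`, standing in for a polynomial
    fitted once before sampling, so each MCMC step is cheap."""
    rng = random.Random(seed)

    def log_post(t):
        if not -1.0 <= t <= 1.0:
            return -math.inf          # outside the prior's support
        return -((y - surrogate(t)) ** 2) / (2 * sigma ** 2)

    t, samples = 0.0, []
    for _ in range(steps):
        prop = t + rng.gauss(0.0, step_size)
        if math.log(rng.random()) < log_post(prop) - log_post(t):
            t = prop                   # accept the proposal
        samples.append(t)
    return samples
```

With the identity surrogate and data $y = 0.5$, the chain concentrates near $0.5$; in the paper's setting the surrogate is the sparse adaptive interpolant of the FE forward solution, and the error analysis accounts for both the interpolation and FE errors.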
The vehicle routing problem with time windows (VRPTW) is a classic optimization problem that arises in many areas, such as logistics and transportation. The goal of the VRPTW is to find the shortest possible route for a fleet of vehicles to visit a set of destinations, subject to time-window constraints. In recent years, there has been growing interest in using variational quantum algorithms (VQAs) to find approximate solutions to problems that can be formulated as quadratic unconstrained binary optimization (QUBO) problems. In this work, we formulate the VRPTW as a QUBO and apply a variational quantum approach using the encoding scheme we previously proposed in [1], which drastically reduces the number of qubits required. We evaluate our approach on a set of VRPTW instances ranging from 11 to 3964 routes, constructed with data provided by researchers from ExxonMobil. We compare our solutions with those of standard full encoding approaches, for which the maximum problem size feasible in the NISQ era is on the order of 20-30 routes. We run our algorithms on simulators as well as on cloud quantum hardware provided by IBMQ, AWS (Rigetti), and IonQ, and benchmark the results against each other and against the simulators. We show that our approach can find approximate solutions to the VRPTW that are comparable to those found by quantum algorithms using the full encoding. Our results suggest that our encoding approach provides a promising route to drastically reducing the number of qubits required to find decent approximate solutions for industrially relevant optimization problems.
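A common way to cast route selection as a QUBO (a generic set-partitioning reduction, not necessarily the paper's exact formulation or its compressed encoding) is one binary variable per candidate route, with a quadratic penalty forcing each customer to be covered exactly once. The sketch below builds such a QUBO and solves a tiny instance by brute force; all instance data are made up for illustration.

```python
import itertools

def route_qubo(costs, covers, num_customers, penalty):
    """Set-partitioning QUBO: binary x_r selects candidate route r.
    Objective: sum_r costs[r]*x_r + penalty * sum_c (sum_{r covers c} x_r - 1)^2.
    Returned as a dict {(r, s): coefficient} with r <= s; the additive
    constant penalty*num_customers is dropped."""
    Q = {}

    def add(r, s, val):
        key = (min(r, s), max(r, s))
        Q[key] = Q.get(key, 0.0) + val

    for r in range(len(costs)):
        add(r, r, costs[r])
    for c in range(num_customers):
        rs = [r for r in range(len(costs)) if c in covers[r]]
        # (sum x_r - 1)^2 = sum x_r + 2*sum_{r<s} x_r x_s - 2*sum x_r + 1
        # for binary x (x^2 = x), i.e. -sum x_r + 2*cross terms + constant.
        for r in rs:
            add(r, r, -penalty)
        for r, s in itertools.combinations(rs, 2):
            add(r, s, 2 * penalty)
    return Q

def brute_force(Q, num_routes):
    """Exhaustively minimize the QUBO energy (feasible only for tiny R)."""
    best, best_x = float("inf"), None
    for bits in itertools.product([0, 1], repeat=num_routes):
        val = sum(coef * bits[r] * bits[s] for (r, s), coef in Q.items())
        if val < best:
            best, best_x = val, bits
    return best_x
```

With routes covering {0,1} at cost 2, {2} at cost 1, and {0,1,2} at cost 4, the minimizer selects the first two routes (total cost 3). On hardware, the same QUBO would be handed to a VQA such as QAOA instead of `brute_force`; the abstract's encoding scheme compresses the variables so far larger route sets fit on NISQ devices.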
Piecewise constant priors are routinely used in the Bayesian Cox proportional hazards model for survival analysis. Despite their popularity, the large sample properties of this Bayesian method are not yet well understood. This work provides a unified theory for posterior distributions in this setting that does not require the priors to be conjugate. We first derive contraction rate results for wide classes of histogram priors on the unknown hazard function and prove asymptotic normality of linear functionals of the posterior hazard in the form of Bernstein--von Mises theorems. Second, using recently developed multiscale techniques, we derive functional limiting results for the cumulative hazard and survival function. Frequentist coverage properties of Bayesian credible sets are investigated: we prove that certain easily computable credible bands for the survival function are optimal frequentist confidence bands. We conduct simulation studies that confirm these predictions, showing excellent behavior particularly in finite samples. Our results suggest that the Bayesian approach can provide an easy way to obtain both coefficient estimates and credible bands for the survival function in practice.
The adaptive processing of structured data is a long-standing research topic in machine learning that investigates how to automatically learn a mapping from a structured input to outputs of various nature. Recently, there has been an increasing interest in the adaptive processing of graphs, which led to the development of different neural network-based methodologies. In this thesis, we take a different route and develop a Bayesian Deep Learning framework for graph learning. The dissertation begins with a review of the principles over which most of the methods in the field are built, followed by a study on graph classification reproducibility issues. We then proceed to bridge the basic ideas of deep learning for graphs with the Bayesian world, by building our deep architectures in an incremental fashion. This framework allows us to consider graphs with discrete and continuous edge features, producing unsupervised embeddings rich enough to reach the state of the art on several classification tasks. Our approach is also amenable to a Bayesian nonparametric extension that automates the choice of almost all of the model's hyper-parameters. Two real-world applications demonstrate the efficacy of deep learning for graphs. The first concerns the prediction of information-theoretic quantities for molecular simulations with supervised neural models. After that, we exploit our Bayesian models to solve a malware-classification task while being robust to intra-procedural code obfuscation techniques. We conclude the dissertation with an attempt to blend the best of the neural and Bayesian worlds together. The resulting hybrid model is able to predict multimodal distributions conditioned on input graphs, with the consequent ability to model stochasticity and uncertainty better than most works. Overall, we aim to provide a Bayesian perspective on the articulated research field of deep learning for graphs.
This book develops an effective theory approach to understanding deep neural networks of practical relevance. Beginning from a first-principles component-level picture of networks, we explain how to determine an accurate description of the output of trained networks by solving layer-to-layer iteration equations and nonlinear learning dynamics. A main result is that the predictions of networks are described by nearly-Gaussian distributions, with the depth-to-width aspect ratio of the network controlling the deviations from the infinite-width Gaussian description. We explain how these effectively-deep networks learn nontrivial representations from training and more broadly analyze the mechanism of representation learning for nonlinear models. From a nearly-kernel-methods perspective, we find that the dependence of such models' predictions on the underlying learning algorithm can be expressed in a simple and universal way. To obtain these results, we develop the notion of representation group flow (RG flow) to characterize the propagation of signals through the network. By tuning networks to criticality, we give a practical solution to the exploding and vanishing gradient problem. We further explain how RG flow leads to near-universal behavior and lets us categorize networks built from different activation functions into universality classes. Altogether, we show that the depth-to-width ratio governs the effective model complexity of the ensemble of trained networks. By using information-theoretic techniques, we estimate the optimal aspect ratio at which we expect the network to be practically most useful and show how residual connections can be used to push this scale to arbitrary depths. With these tools, we can learn in detail about the inductive bias of architectures, hyperparameters, and optimizers.