We study the performance of medium-length quantum LDPC (QLDPC) codes in the depolarizing channel. We consider only degenerate codes whose maximal stabilizer weight is much smaller than their minimum distance. It is shown that with the help of ordered statistics decoding (OSD)-like post-processing the performance of the standard belief propagation (BP) decoder on many QLDPC codes can be improved by several orders of magnitude. Using this new BP-OSD decoder we study the performance of several known classes of degenerate QLDPC codes, including hypergraph product codes, hyperbicycle codes, homological product codes, and Haah's cubic codes. We also construct several interesting examples of short generalized bicycle codes. Some of them have the additional property that their syndromes are protected by small BCH codes, which may be useful for fault-tolerant syndrome measurement. We also propose a new large family of QLDPC codes that contains the class of hypergraph product codes in which one of the constituent parity-check matrices is square. It is shown that in some cases such codes have better performance than hypergraph product codes. Finally, we demonstrate that for some of the constructed codes the performance of the proposed BP-OSD decoder is better than that of a relatively large surface code decoded by a near-optimal decoder.
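The full BP-OSD combination is more involved than can be shown here, but the core of an OSD-style post-processing pass is short. Below is a minimal, illustrative OSD-0 sketch over GF(2), assuming BP has already produced per-qubit error probabilities; the names `H`, `syndrome`, and `probs` are our own, and the residual consistency check is omitted for brevity.

```python
import numpy as np

def osd0(H, syndrome, probs):
    """OSD-0 sketch: sort columns by decreasing error probability, find an
    information set by Gaussian elimination over GF(2), and solve for the
    error restricted to the pivot columns (all other bits set to zero)."""
    m, n = H.shape
    order = np.argsort(-probs)                 # most suspicious qubits first
    A = np.hstack([H[:, order], syndrome.reshape(-1, 1)]).astype(int) % 2
    pivots, r = [], 0
    for c in range(n):
        rows = np.nonzero(A[r:, c])[0]
        if rows.size == 0:
            continue
        A[[r, r + rows[0]]] = A[[r + rows[0], r]]   # swap pivot row up
        for rr in range(m):
            if rr != r and A[rr, c]:
                A[rr] ^= A[r]                        # eliminate column c
        pivots.append(c)
        r += 1
        if r == m:
            break
    e = np.zeros(n, dtype=int)
    e[order[pivots]] = A[:len(pivots), n]            # read off the solution
    return e

# Tiny usage example: the most reliable explanation of syndrome (1, 0)
# under these probabilities is an error on the first qubit.
H = np.array([[1, 1, 0], [0, 1, 1]])
s = np.array([1, 0])
print(osd0(H, s, probs=np.array([0.3, 0.05, 0.02])))  # -> [1 0 0]
```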
We introduce a computational origami problem which we call the segment folding problem: given a set of $n$ line segments in the plane, the aim is to make creases along all segments in the minimum number of folding steps. Note that a folding might alter the relative position between the segments, and a segment could split into two. We show that it is NP-hard to determine whether $n$ line segments can be folded in $n$ simple folding operations.
In order to meet mobile cellular users' ever-increasing data demands, today's 4G and 5G networks are designed mainly with the goal of maximizing spectral efficiency. While they have made progress in this regard, controlling the carbon footprint and operational costs of such networks remains a long-standing problem among network designers. This paper takes a long view on this problem, envisioning a NextG scenario where the network leverages quantum annealing for cellular baseband processing. We gather and synthesize insights on power consumption, computational throughput and latency, spectral efficiency, operational cost, and feasibility timelines surrounding quantum technology. Armed with these data, we analyze and project the quantitative performance targets future quantum annealing hardware must meet in order to provide a computational and power advantage over CMOS hardware while matching its whole-network spectral efficiency. Our quantitative analysis predicts that with quantum annealing hardware operating at a 102 $\mu$s problem latency and 3.1M qubits, quantum annealing will achieve spectral efficiency equal to that of CMOS computation while reducing power consumption by 41 kW (45% lower) in a representative 5G base station scenario with 400 MHz bandwidth and 64 antennas, and will achieve an 8 kW power reduction (16% lower) using 1.5M qubits in a 200 MHz-bandwidth 5G scenario.
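As a quick sanity check on the quoted figures, the implied baseline powers follow from simple arithmetic: a saving that is a stated fraction of the baseline pins down the baseline itself. The sketch below uses only the numbers in the abstract; everything else is derived.

```python
# Back-of-the-envelope check: if a reduction of `saved_kw` corresponds to
# fraction `frac` of the CMOS baseline, the baseline is saved_kw / frac.
for saved_kw, frac, scenario in [(41, 0.45, "400 MHz / 64 antennas"),
                                 (8, 0.16, "200 MHz")]:
    baseline = saved_kw / frac
    print(f"{scenario}: implied CMOS baseline ~{baseline:.0f} kW, "
          f"quantum ~{baseline - saved_kw:.0f} kW")
```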
A connected dominating set is a widely adopted model for the virtual backbone of a wireless sensor network. In this paper, we design an evolutionary algorithm for the minimum connected dominating set problem (MinCDS), whose performance is theoretically guaranteed in terms of both computation time and approximation ratio. Given a connected graph $G=(V,E)$, a connected dominating set (CDS) is a subset $C\subseteq V$ such that every vertex in $V\setminus C$ has a neighbor in $C$, and the subgraph of $G$ induced by $C$ is connected. The goal of MinCDS is to find a CDS of $G$ with the minimum cardinality. We show that our evolutionary algorithm can find a CDS in expected $O(n^3)$ time that approximates the optimal value within a factor of $(2+\ln\Delta)$, where $n$ and $\Delta$ are the number of vertices and the maximum degree of the graph $G$, respectively.
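The paper's algorithm comes with guaranteed expected running time and approximation ratio; the sketch below is only a generic (1+1)-EA illustration of the usual setup, assuming a penalty-style fitness that first enforces domination and connectivity and then minimizes cardinality. The fitness design, all names, and the `networkx` dependency are our choices, not the paper's.

```python
import random
import networkx as nx

def fitness(G, mask):
    """Penalty fitness: infeasibility (undominated vertices, extra
    connected components) strictly dominates the set's cardinality."""
    S = {v for v, b in zip(G.nodes, mask) if b}
    undominated = sum(1 for v in G.nodes
                      if v not in S and not S & set(G[v]))
    comps = nx.number_connected_components(G.subgraph(S)) if S else len(G)
    return (undominated + comps - 1) * len(G) + len(S)

def one_plus_one_ea(G, steps=100_000):
    """(1+1) EA with standard bit mutation, starting from the full set."""
    n = len(G)
    x = [1] * n
    fx = fitness(G, x)
    for _ in range(steps):
        y = [b ^ (random.random() < 1 / n) for b in x]  # flip w.p. 1/n
        fy = fitness(G, y)
        if fy <= fx:
            x, fx = y, fy
    return [v for v, b in zip(G.nodes, x) if b]

print(one_plus_one_ea(nx.petersen_graph(), steps=20_000))
```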
This paper is concerned with the asymptotic distribution of the largest eigenvalues of a nonlinear random matrix ensemble stemming from the study of neural networks. More precisely, we consider $M= \frac{1}{m} YY^\top$ with $Y=f(WX)$, where $W$ and $X$ are random rectangular matrices with i.i.d. centered entries. This models the data covariance matrix or the Conjugate Kernel of a single-layer random feed-forward neural network. The function $f$ is applied entrywise and can be seen as the activation function of the neural network. We show that the largest eigenvalue has the same limit (in probability) as that of some well-known linear random matrix ensembles. In particular, we relate the asymptotic limit of the largest eigenvalue for the nonlinear model to that of an information-plus-noise random matrix, establishing a possible phase transition depending on the function $f$ and the distribution of $W$ and $X$. This may be of interest for applications to machine learning.
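For concreteness, the ensemble is straightforward to simulate numerically. The sketch below draws a small instance of $M = \frac{1}{m} YY^\top$ with $Y = f(WX)$ and reports the top eigenvalue; the widths, the $1/\sqrt{p}$ scaling of $W$, and the choice $f = \tanh$ are illustrative assumptions of ours.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, m = 500, 600, 700                  # layer widths (arbitrary here)
W = rng.standard_normal((n, p)) / np.sqrt(p)
X = rng.standard_normal((p, m))
f = np.tanh                              # entrywise activation
Y = f(W @ X)
M = (Y @ Y.T) / m                        # conjugate-kernel-type matrix
top = np.linalg.eigvalsh(M)[-1]          # largest eigenvalue
print(f"largest eigenvalue of M: {top:.3f}")
```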
In this paper we look at the problem of adjacency labeling of graphs. Given a family of undirected graphs, the problem is to determine an encoding-decoding scheme for each member of the family such that we can decode the adjacency information of any pair of vertices from their encoded labels alone. Further, we want the length of each label to be short (logarithmic in $n$, the number of vertices) and the encoding-decoding scheme to be computationally efficient. We propose a simple tree-decomposition-based encoding scheme and use it to give an adjacency labeling of $O(k \log k \log n)$ bits, where $k$ is the clique-width of the graph family. We also extend the result to a certain family of $k$-probe graphs.
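To make the interface concrete, here is a toy adjacency-labeling scheme, not the paper's tree-decomposition construction: for a graph of maximum degree $d$, storing each vertex's id together with its neighbor ids yields $O(d \log n)$-bit labels from which adjacency is decodable with no access to the graph itself.

```python
def encode(adj):
    """adj: dict mapping vertex -> set of neighbors.
    Each label records the vertex id and its neighbor set."""
    return {v: (v, frozenset(nbrs)) for v, nbrs in adj.items()}

def adjacent(label_u, label_v):
    """Decide adjacency from the two labels alone."""
    u, nbrs_u = label_u
    v, nbrs_v = label_v
    return v in nbrs_u or u in nbrs_v

adj = {0: {1}, 1: {0, 2}, 2: {1}}        # a path on three vertices
labels = encode(adj)
assert adjacent(labels[0], labels[1]) and not adjacent(labels[0], labels[2])
```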
In this study, we analyze the efficiency of a protocol with discrete modulation of continuous-variable non-Gaussian states, namely coherent states having one photon added and then one photon subtracted (PASCS). We calculate the secure key generation rate against collective attacks using the fact that Eve's information can be bounded via the protocol with Gaussian modulation, which in turn is unconditionally secure. Our results for a four-state protocol show that the PASCS protocol always outperforms the equivalent coherent-state protocol under the same environmental conditions. Interestingly, we find that for the protocol using discrete-modulated PASCS, the noisier the line, the better its performance relative to the protocol using coherent states. Thus, our proposal proves advantageous for performing quantum key distribution in non-ideal situations.
Although recent neural conversation models have shown great potential, they often generate bland and generic responses. While various approaches have been explored to diversify the output of the conversation model, the improvement often comes at the cost of decreased relevance. In this paper, we propose a method to jointly optimize diversity and relevance that essentially fuses the latent space of a sequence-to-sequence model and that of an autoencoder model by leveraging novel regularization terms. As a result, our approach induces a latent space in which the distance and direction from the predicted response vector roughly match the relevance and diversity, respectively. This property also lends itself well to an intuitive visualization of the latent space. Both automatic and human evaluation results demonstrate that the proposed approach brings significant improvement compared to strong baselines in both diversity and relevance.
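The abstract does not spell out the regularization terms, so the sketch below is only one plausible form of the idea it describes, under our own assumptions: given paired latent codes `z_s2s` (predicted from the dialogue context) and `z_ae` (encoded from the reference response), pull the two codes together and require points interpolated between them to still decode to the reference. The callable `decoder_nll` and all names are hypothetical.

```python
import torch

def fusion_terms(z_s2s, z_ae, decoder_nll, response):
    """z_s2s, z_ae: (batch, dim) latent codes; decoder_nll(z, y) returns
    per-example negative log-likelihood of decoding y from z."""
    d_match = (z_s2s - z_ae).norm(dim=1).mean()        # pull codes together
    u = torch.rand(z_s2s.size(0), 1, device=z_s2s.device)
    z_interp = u * z_s2s + (1 - u) * z_ae              # random interpolation
    l_interp = decoder_nll(z_interp, response).mean()  # keep path decodable
    return d_match + l_interp
```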
While Generative Adversarial Networks (GANs) have empirically produced impressive results on learning complex real-world distributions, recent work has shown that they suffer from lack of diversity or mode collapse. The theoretical work of Arora et al.~\cite{AroraGeLiMaZh17} suggests a dilemma about GANs' statistical properties: powerful discriminators cause overfitting, whereas weak discriminators cannot detect mode collapse. In contrast, we show in this paper that GANs can in principle learn distributions in Wasserstein distance (or KL-divergence in many cases) with polynomial sample complexity, if the discriminator class has strong distinguishing power against the particular generator class (instead of against all possible generators). For various generator classes such as mixtures of Gaussians, exponential families, and invertible neural network generators, we design corresponding discriminators (which are often neural nets of specific architectures) such that the Integral Probability Metric (IPM) induced by the discriminators can provably approximate the Wasserstein distance and/or KL-divergence. This implies that if the training is successful, then the learned distribution is close to the true distribution in Wasserstein distance or KL divergence, and thus cannot drop modes. Our preliminary experiments show that on synthetic datasets the test IPM is well correlated with KL divergence, indicating that the lack of diversity may be caused by the sub-optimality in optimization instead of statistical inefficiency.
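As a minimal numeric illustration of an IPM (much simpler than the discriminator classes designed in the paper): restricting discriminators to linear functions $f(x) = \langle w, x\rangle$ with $\|w\|_2 \le 1$ gives $\mathrm{IPM}(P,Q) = \sup_w \mathbb{E}_P f - \mathbb{E}_Q f = \|\mathbb{E}_P x - \mathbb{E}_Q x\|_2$ by duality of the Euclidean norm. The distributions below are our own toy choices.

```python
import numpy as np

rng = np.random.default_rng(1)
P = rng.normal(loc=0.0, size=(10_000, 2))          # samples from P
Q = rng.normal(loc=[1.0, 0.0], size=(10_000, 2))   # samples from Q
# Linear-class IPM = distance between empirical means (dual-norm identity).
ipm_linear = np.linalg.norm(P.mean(axis=0) - Q.mean(axis=0))
print(f"linear-class IPM estimate: {ipm_linear:.3f}")  # ~1.0
```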
Stochastic gradient Markov chain Monte Carlo (SGMCMC) has become a popular method for scalable Bayesian inference. These methods are based on sampling a discrete-time approximation to a continuous-time process, such as the Langevin diffusion. When applied to distributions defined on a constrained space, such as the simplex, the time-discretisation error can dominate when we are near the boundary of the space. We demonstrate that while current SGMCMC methods for the simplex perform well in certain cases, they struggle with sparse simplex spaces, i.e., when many of the components are close to zero. However, most popular large-scale applications of Bayesian inference on simplex spaces, such as network or topic models, are sparse. We argue that this poor performance is due to the biases of SGMCMC caused by the discretisation error. To get around this, we propose the stochastic CIR process, which removes all discretisation error, and we prove that samples from the stochastic CIR process are asymptotically unbiased. Use of the stochastic CIR process within a SGMCMC algorithm is shown to give substantially better performance for a topic model and a Dirichlet process mixture model than existing SGMCMC approaches.
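The discretisation-free property rests on a classical fact: the transition density of the CIR diffusion $dX_t = a(b - X_t)\,dt + \sigma\sqrt{X_t}\,dW_t$ is a scaled noncentral chi-squared, so each step can be sampled exactly. The sketch below illustrates this exact transition (not the paper's full stochastic-gradient algorithm); parameter names and values are ours.

```python
import numpy as np

def cir_exact_step(x, h, a, b, sigma, rng):
    """One exact CIR transition over time h: X_{t+h} | X_t = x is
    c * noncentral-chi-squared(df, nonc), with no discretisation error."""
    c = sigma**2 * (1 - np.exp(-a * h)) / (4 * a)   # scale factor
    df = 4 * a * b / sigma**2                        # degrees of freedom
    nonc = x * np.exp(-a * h) / c                    # noncentrality
    return c * rng.noncentral_chisquare(df, nonc)

rng = np.random.default_rng(0)
x = 0.01                        # start near the boundary, the hard regime
for _ in range(1000):
    x = cir_exact_step(x, h=0.1, a=2.0, b=0.5, sigma=1.0, rng=rng)
print(f"sample after 1000 exact steps: {x:.4f}")
```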
We consider the task of learning the parameters of a {\em single} component of a mixture model, for the case when we are given {\em side information} about that component; we call this the "search problem" in mixture models. We would like to solve this with computational and sample complexity lower than that of solving the overall original problem, where one learns the parameters of all components. Our main contributions are the development of a simple but general model for the notion of side information, and a corresponding simple matrix-based algorithm for solving the search problem in this general setting. We then specialize this model and algorithm to four common scenarios: Gaussian mixture models, LDA topic models, subspace clustering, and mixed linear regression. For each of these we show that if (and only if) the side information is informative, we obtain parameter estimates with greater accuracy and improved computational complexity compared to existing moment-based mixture model algorithms (e.g., tensor methods). We also illustrate several natural ways one can obtain such side information for specific problem instances. Our experiments on real data sets (NY Times, Yelp, BSDS500) further demonstrate the practicality of our algorithms, showing significant improvement in runtime and accuracy.