The unextendibility or monogamy of entangled states is a key property of quantum entanglement. Unlike conventional ways of expressing entanglement monogamy via entanglement measure inequalities, we introduce a state-dependent set of free states to quantify the unextendibility of a bipartite quantum state. First, we define a family of entanglement measures called \textit{unextendible entanglement}. Given a bipartite state $\rho_{AB}$, the key idea behind these measures is to minimize a divergence between $\rho_{AB}$ and any possible reduced state $\rho_{AB'}$ of an extension $\rho_{ABB'}$ of $\rho_{AB}$. These measures are intuitively motivated by the fact that the more entangled a bipartite state is, the less entangled each of its individual systems can be with a third party. Second, we show that the unextendible entanglement is an entanglement monotone under two-extendible operations, which include local operations and one-way classical communication as a special case. Unextendible entanglement has several other desirable properties, including normalization and faithfulness. As practical applications, we show that the unextendible entanglement provides efficiently computable benchmarks for the rates of exact secret key distillation and entanglement distillation, as well as for the overhead of probabilistic secret key or entanglement distillation.
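For orientation, the family of measures can be written schematically as follows (with notation, and up to the normalization conventions adopted in the paper; $\mathbf{D}$ denotes a generalized divergence):
\[
E^{u}(A;B)_{\rho} \;=\; \inf_{\rho_{ABB'}} \mathbf{D}\!\left(\rho_{AB}\,\Vert\,\rho_{AB'}\right),
\]
where the infimum ranges over all extensions $\rho_{ABB'}$ satisfying $\operatorname{Tr}_{B'}[\rho_{ABB'}] = \rho_{AB}$, and $\rho_{AB'} = \operatorname{Tr}_{B}[\rho_{ABB'}]$. Intuitively, the more entangled $\rho_{AB}$ is, the farther it must be from any state that system $A$ can share with an extending system $B'$.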
We leverage state-of-the-art machine learning methods and a decade's worth of archival data from CFHT to predict observatory image quality (IQ) from environmental conditions and observatory operating parameters. Specifically, we develop accurate and interpretable models of the complex dependence between data features and observed IQ for CFHT's wide-field camera, MegaCam. Our contributions are several-fold. First, we collect, collate and reprocess several disparate data sets gathered by CFHT scientists. Second, we predict probability distribution functions (PDFs) of IQ and achieve a mean absolute error of $\sim0.07''$ for the predicted medians. Third, we explore the data-driven actuation of the 12 dome "vents" installed in 2013-14 to accelerate the flushing of hot air from the dome. We leverage epistemic and aleatoric uncertainties in conjunction with probabilistic generative modeling to identify candidate vent adjustments that are in-distribution (ID); for the optimal configuration of each ID sample, we predict the reduction in observing time required to achieve a fixed SNR. On average, the reduction is $\sim12\%$. Finally, we rank input features by their Shapley values to identify the most predictive variables for each observation. Our long-term goal is to construct reliable and real-time models that can forecast optimal observatory operating parameters to optimize IQ. We can then feed such forecasts into scheduling protocols and predictive maintenance routines. We anticipate that such approaches will become standard in automating observatory operations and maintenance by the time CFHT's successor, the Maunakea Spectroscopic Explorer, is installed in the next decade.
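As a rough illustration of the kind of distributional prediction and evaluation described above (this is not the authors' pipeline; the features, model choice, and synthetic data are placeholders), one can fit one quantile regressor per quantile of IQ and report the mean absolute error of the predicted median:

```python
# Minimal sketch (not the paper's model): approximate the predictive distribution of image
# quality (IQ) with per-quantile regressors and evaluate the median prediction.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Hypothetical features: wind speed, dome-ambient temperature difference, vent state, airmass.
X = rng.normal(size=(5000, 4))
iq = 0.7 + 0.10 * X[:, 0] - 0.15 * X[:, 1] + 0.05 * rng.normal(size=5000)  # synthetic IQ (arcsec)

X_tr, X_te, y_tr, y_te = train_test_split(X, iq, random_state=0)

# One quantile regressor per quantile gives a coarse predictive distribution of IQ.
quantiles = [0.16, 0.50, 0.84]
models = {q: GradientBoostingRegressor(loss="quantile", alpha=q).fit(X_tr, y_tr)
          for q in quantiles}

median_pred = models[0.50].predict(X_te)
print("MAE of predicted median IQ:", mean_absolute_error(y_te, median_pred))
```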
As one of the most important functions in quantum networks, entanglement routing, i.e., how to efficiently establish a remote entanglement connection between two arbitrary quantum nodes, is a critical problem worth studying. However, entanglement fidelity, arguably the most important metric for evaluating the quality of a connection, is rarely considered in existing works. In this paper, we therefore propose purification-enabled entanglement routing designs that provide fidelity guarantees for multiple Source-Destination (S-D) pairs in quantum networks. To find the routing path with the minimum entangled-pair cost, we first design an iterative routing algorithm for a single S-D pair, called Q-PATH, that finds the optimal solution. Because of its relatively high computational complexity, we also design a low-complexity routing algorithm based on an extended Dijkstra algorithm, called Q-LEAP, which efficiently finds a near-optimal solution. Building on these two algorithms, we design a utility metric to address the resource allocation problem for multiple S-D pairs, and further design a greedy algorithm that handles resource allocation and re-routing for routing requests from multiple S-D pairs. To verify the effectiveness and superiority of the proposed algorithms, we conduct extensive simulations against an existing purification-enabled routing algorithm. The simulation results show that, compared with the traditional routing scheme, the proposed algorithms not only provide fidelity-guaranteed routing solutions under various scenarios, but also achieve superior performance in terms of throughput, fidelity of the end-to-end entanglement connection, and resource utilization ratio.
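The following is a minimal sketch of a fidelity-aware, Dijkstra-style search in the spirit of Q-LEAP; it is not the paper's algorithm, and it uses the simplifying assumption that end-to-end fidelity can be approximated by the product of link fidelities:

```python
# Sketch (not Q-PATH/Q-LEAP): pop partial paths in order of entangled-pair cost and prune any
# extension whose fidelity estimate drops below the requirement f_min.
import heapq

def fidelity_aware_route(graph, src, dst, f_min):
    # graph[u] = list of (v, pair_cost, link_fidelity)
    heap = [(0, 1.0, src, [src])]            # (cumulative cost, fidelity estimate, node, path)
    while heap:
        cost, fid, u, path = heapq.heappop(heap)
        if u == dst:
            return cost, fid, path           # first pop of dst = minimum-cost feasible path
        for v, c, f in graph.get(u, []):
            nf = fid * f
            if nf >= f_min and v not in path:  # prune infeasible or cyclic extensions
                heapq.heappush(heap, (cost + c, nf, v, path + [v]))
    return None

net = {"S": [("A", 1, 0.98), ("B", 2, 0.99)],
       "A": [("D", 1, 0.97)], "B": [("D", 1, 0.98)]}
print(fidelity_aware_route(net, "S", "D", f_min=0.94))
```

Because costs are non-negative and fidelity only decreases along a path, every prefix of a feasible path is feasible, so the first time the destination is popped the returned path is cost-minimal among paths meeting the fidelity requirement.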
To understand how deep learning works, it is crucial to understand the training dynamics of neural networks. Several interesting hypotheses about these dynamics have been made based on empirically observed phenomena, but there exists a limited theoretical understanding of when and why such phenomena occur. In this paper, we consider the training dynamics of gradient flow on kernel least-squares objectives, which is a limiting dynamics of SGD-trained neural networks. Using precise high-dimensional asymptotics, we characterize the dynamics of the fitted model in two "worlds": in the Oracle World the model is trained on the population distribution, and in the Empirical World the model is trained on a sampled dataset. We show that, under mild conditions on the kernel and the $L^2$ target regression function, the training dynamics undergo three stages characterized by the behaviors of the models in the two worlds. Our theoretical results also mathematically formalize some interesting deep learning phenomena. Specifically, in our setting we show that SGD progressively learns more complex functions and that there is a "deep bootstrap" phenomenon: during the second stage, the test errors of the two worlds remain close despite the empirical training error being much smaller. Finally, we give a concrete example comparing the dynamics of two different kernels which shows that faster training is not necessary for better generalization.
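For intuition about how fitting complexity unfolds over time, recall the standard closed form for kernel gradient flow in the oracle (population) setting; using notation not fixed by the abstract, write the kernel's eigendecomposition as $K(x,x') = \sum_j \lambda_j \phi_j(x)\phi_j(x')$ and the target as $f^\star = \sum_j c_j \phi_j$. Gradient flow started from zero then satisfies (up to time reparameterization)
\[
f_t \;=\; \sum_j \bigl(1 - e^{-\lambda_j t}\bigr)\, c_j\, \phi_j ,
\]
so eigendirections with larger eigenvalues are fitted on shorter time scales, which is the sense in which progressively more complex functions are learned as training proceeds.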
State-of-the-art probabilistic model checkers perform verification on explicit-state Markov models defined in a high-level programming formalism like the PRISM modeling language. Typically, the low-level models resulting from such program-like specifications exhibit a great deal of structure, such as repeating subpatterns. Established techniques like probabilistic bisimulation minimization are able to exploit these structures; however, they operate directly on the explicit-state model. On the other hand, methods for reducing structured state spaces by reasoning about the high-level program have received comparatively little attention. In this paper, we present a new, simple, and fully automatic program-level technique to reduce the underlying Markov model. Our approach aims at computing the summary behavior of adjacent locations in the program's control-flow graph, thereby obtaining a program with fewer "control states". This reduction is immediately reflected in the program's operational semantics, enabling more efficient model checking. A key insight is that, in principle, each (combination of) program variable(s) with finite domain can play the role of the program counter that defines the flow structure. Unlike most other reduction techniques, our approach is property-directed and naturally supports unspecified model parameters. Experiments demonstrate that our simple method yields state-space reductions of up to 80% on practically relevant benchmarks.
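As a toy illustration of collapsing adjacent control locations (this captures only the control-structure skeleton of the idea, not the paper's technique, which also summarizes the variable updates along the collapsed chain):

```python
# Toy sketch: fold away control locations that have a single, probability-one successor,
# so chains of trivial locations disappear from the control-flow graph.
def collapse_chains(cfg):
    # cfg: location -> list of (probability, successor location)
    def resolve(loc):
        while len(cfg.get(loc, [])) == 1 and cfg[loc][0][0] == 1.0:
            loc = cfg[loc][0][1]
        return loc
    return {loc: [(p, resolve(succ)) for p, succ in edges]
            for loc, edges in cfg.items()
            if not (len(edges) == 1 and edges[0][0] == 1.0)}

cfg = {0: [(0.5, 1), (0.5, 2)], 1: [(1.0, 3)], 2: [(1.0, 3)], 3: []}
print(collapse_chains(cfg))   # locations 1 and 2 are folded into the edges leaving 0
```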
We prove several new tight distributed lower bounds for classic symmetry breaking graph problems. As a basic tool, we first provide a new insightful proof that any deterministic distributed algorithm that computes a $\Delta$-coloring on $\Delta$-regular trees requires $\Omega(\log_\Delta n)$ rounds and any randomized algorithm requires $\Omega(\log_\Delta\log n)$ rounds. We prove this result by showing that a natural relaxation of the $\Delta$-coloring problem is a fixed point in the round elimination framework. As a first application, we show that our $\Delta$-coloring lower bound proof directly extends to arbdefective colorings. We exactly characterize which variants of the arbdefective coloring problem are "easy", and which of them instead are "hard". As a second application, which we see as our main contribution, we use the structure of the fixed point as a building block to prove lower bounds as a function of $\Delta$ for a large class of distributed symmetry breaking problems. For example, we obtain a tight lower bound for the fundamental problem of computing a $(2,\beta)$-ruling set. This is an exponential improvement over the best existing lower bound for the problem, which was proven in [FOCS '20]. Our lower bound even applies to a much more general family of problems that allows for almost arbitrary combinations of natural constraints from coloring problems, orientation problems, and independent set problems, and provides a single unified proof for known and new lower bound results for these types of problems. Our lower bounds as a function of $\Delta$ also imply lower bounds as a function of $n$. We obtain, for example, that maximal independent set, on trees, requires $\Omega(\log n / \log \log n)$ rounds for deterministic algorithms, which is tight.
Understanding the inner workings of deep neural networks (DNNs) is essential to provide trustworthy artificial intelligence techniques for practical applications. Existing studies typically involve linking semantic concepts to units or layers of DNNs, but fail to explain the inference process. In this paper, we introduce neural architecture disentanglement (NAD) to fill the gap. Specifically, NAD learns to disentangle a pre-trained DNN into sub-architectures according to independent tasks, forming information flows that describe the inference processes. We investigate whether, where, and how the disentanglement occurs through experiments conducted with handcrafted and automatically-searched network architectures, on both object-based and scene-based datasets. Based on the experimental results, we present three new findings that provide fresh insights into the inner logic of DNNs. First, DNNs can be divided into sub-architectures for independent tasks. Second, deeper layers do not always correspond to higher semantics. Third, the connection type in a DNN affects how the information flows across layers, leading to different disentanglement behaviors. With NAD, we further explain why DNNs sometimes give wrong predictions. Experimental results show that misclassified images have a high probability of being assigned to task sub-architectures similar to the correct ones. Code will be available at: //github.com/hujiecpp/NAD.
We propose a novel approach to disentangle the generative factors of variation underlying a given set of observations. Our method builds upon the idea that the (unknown) low-dimensional manifold underlying the data space can be explicitly modeled as a product of submanifolds. This gives rise to a new definition of disentanglement, and to a novel weakly-supervised algorithm for recovering the unknown explanatory factors behind the data. At training time, our algorithm only requires pairs of non-i.i.d. data samples whose elements share at least one, possibly multidimensional, generative factor of variation. We require no knowledge of the nature of these transformations, and do not make any limiting assumption on the properties of each subspace. Our approach is easy to implement, and can be successfully applied to different kinds of data (from images to 3D surfaces) undergoing arbitrary transformations. In addition to standard synthetic benchmarks, we showcase our method in challenging real-world applications, where we compare favorably with the state of the art.
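A hypothetical sketch of how such pair-based weak supervision can be used (this is not the paper's objective; the architecture and loss are placeholders): split the latent code into per-factor blocks and encourage a pair that shares factor $k$ to agree on block $k$ while both samples are reconstructed.

```python
# Placeholder sketch of pair-based weak supervision with a factorized latent space.
import torch
import torch.nn as nn

class FactorizedAutoencoder(nn.Module):
    def __init__(self, dim_in=64, n_factors=4, dim_factor=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim_in, 128), nn.ReLU(),
                                     nn.Linear(128, n_factors * dim_factor))
        self.decoder = nn.Sequential(nn.Linear(n_factors * dim_factor, 128), nn.ReLU(),
                                     nn.Linear(128, dim_in))
        self.n_factors, self.dim_factor = n_factors, dim_factor

    def forward(self, x):
        z = self.encoder(x)
        # Return the latent code reshaped into per-factor blocks, plus the reconstruction.
        return z.view(-1, self.n_factors, self.dim_factor), self.decoder(z)

def pair_loss(model, x1, x2, shared_factor):
    z1, rec1 = model(x1)
    z2, rec2 = model(x2)
    recon = ((rec1 - x1) ** 2).mean() + ((rec2 - x2) ** 2).mean()
    # The pair shares generative factor `shared_factor`, so its latent block should match.
    align = ((z1[:, shared_factor] - z2[:, shared_factor]) ** 2).mean()
    return recon + align

model = FactorizedAutoencoder()
x1, x2 = torch.randn(32, 64), torch.randn(32, 64)
pair_loss(model, x1, x2, shared_factor=2).backward()
```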
Recent studies have shown the vulnerability of reinforcement learning (RL) models in noisy settings. The sources of noise differ across scenarios. For instance, in practice, the observed reward channel is often subject to noise (e.g., when observed rewards are collected through sensors), and thus observed rewards may not be credible. Also, in applications such as robotics, a deep reinforcement learning (DRL) algorithm can be manipulated to produce arbitrary errors. In this paper, we consider noisy RL problems in which the rewards observed by RL agents are generated through a reward confusion matrix; we refer to such observed rewards as perturbed rewards. We develop a robust RL framework, aided by an unbiased reward estimator, that enables RL agents to learn in noisy environments while observing only perturbed rewards. Our framework draws upon approaches for supervised learning with noisy data. The core ideas of our solution are to estimate the reward confusion matrix and to define a set of unbiased surrogate rewards. We prove the convergence and sample complexity of our approach. Extensive experiments on different DRL platforms show that policies based on our estimated surrogate rewards achieve higher expected rewards and converge faster than existing baselines. For instance, the state-of-the-art PPO algorithm obtains 67.5% and 46.7% improvements on average over five Atari games when the error rates are 10% and 30%, respectively.
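For the two-level reward case, the surrogate construction follows the standard unbiased estimator used for learning with noisy binary labels; the sketch below (with flip rates $e_+$, $e_-$ read off an estimated confusion matrix) is illustrative rather than the paper's exact implementation.

```python
def surrogate_reward(observed, r_minus, r_plus, e_minus, e_plus):
    """Unbiased surrogate for a two-level reward.

    e_plus  = estimated P(observe r_minus | true reward is r_plus)
    e_minus = estimated P(observe r_plus  | true reward is r_minus)
    Requires e_plus + e_minus < 1.
    """
    denom = 1.0 - e_minus - e_plus
    if observed == r_plus:
        return ((1.0 - e_minus) * r_plus - e_plus * r_minus) / denom
    return ((1.0 - e_plus) * r_minus - e_minus * r_plus) / denom

# Sanity check: if the true reward is always r_plus = 1 (with r_minus = 0), the expected
# surrogate equals the true reward despite the noisy observations.
e_minus, e_plus = 0.1, 0.2
expected = (1 - e_plus) * surrogate_reward(1, 0, 1, e_minus, e_plus) \
           + e_plus * surrogate_reward(0, 0, 1, e_minus, e_plus)
print(expected)  # ~1.0
```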
We consider the task of learning the parameters of a {\em single} component of a mixture model for the case when we are given {\em side information} about that component; we call this the "search problem" in mixture models. We would like to solve this with computational and sample complexity lower than those of solving the overall original problem, in which one learns the parameters of all components. Our main contributions are the development of a simple but general model for the notion of side information, and a corresponding simple matrix-based algorithm for solving the search problem in this general setting. We then specialize this model and algorithm to four common scenarios: Gaussian mixture models, LDA topic models, subspace clustering, and mixed linear regression. For each of these we show that if (and only if) the side information is informative, we obtain parameter estimates with greater accuracy and lower computational complexity than existing moment-based mixture model algorithms (e.g., tensor methods). We also illustrate several natural ways one can obtain such side information for specific problem instances. Our experiments on real data sets (NY Times, Yelp, BSDS500) further demonstrate the practicality of our algorithms, showing significant improvements in runtime and accuracy.
Generating novel, yet realistic, images of persons is a challenging task due to the complex interplay between different image factors, such as the foreground, background, and pose information. In this work, we aim to generate such images based on a novel, two-stage reconstruction pipeline that learns a disentangled representation of the aforementioned image factors and generates novel person images at the same time. First, a multi-branched reconstruction network is proposed to disentangle and encode the three factors into embedding features, which are then combined to re-compose the input image itself. Second, three corresponding mapping functions are learned in an adversarial manner, one for each factor, in order to map Gaussian noise to the learned embedding feature space. Using the proposed framework, we can manipulate the foreground, background, and pose of the input image, and also sample new embedding features to generate such targeted manipulations, which provides more control over the generation process. Experiments on the Market-1501 and Deepfashion datasets show that our model not only generates realistic person images with new foregrounds, backgrounds, and poses, but also manipulates the generated factors and interpolates between the in-between states. Another set of experiments on Market-1501 shows that our model can also be beneficial for the person re-identification task.