Neutral atoms are a promising choice for scalable quantum computing architectures. Features such as long-distance interactions and native multiqubit gates offer reductions in communication costs and operation count. However, the trapped atoms used as qubits can be lost over the course of a computation and due to adverse environmental factors. The state of a lost computational qubit cannot be recovered; recovery requires reloading the array and rerunning the computation, greatly increasing the number of runs of a circuit. Software mitigation strategies exist, but they slowly exhaust the circuit's originally mapped locations and create increasingly spread-out clusters of qubits across the architecture, decreasing the probability of success. First, we increase flexibility by developing strategies that find all reachable qubits, rather than only adjacent hardware qubits. Second, we divide the architecture into separate sections and run the circuit in each section, free of lost atoms. Provided the architecture is large enough, this resets the circuit without having to reload the entire architecture. This increases the number of effective shots before reloading by a factor of two for a circuit that utilizes 30% of the architecture. We also explore using these sections to parallelize execution of circuits, reducing the overall runtime by a total of 50% for a 30-qubit circuit. These techniques contribute to a dynamic new set of strategies to combat the detrimental effects of lost computational space.
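As a back-of-the-envelope illustration of the sectioning arithmetic (the numbers here are hypothetical except for the 30% utilization figure from the abstract):

```python
# A circuit using 30% of the atom array leaves room for up to
# floor(1/0.3) = 3 disjoint sections, each a fresh, loss-free copy of
# the mapped circuit; the abstract reports a ~2x gain in effective
# shots once practical constraints are accounted for.
utilization = 0.30
sections = int(1 / utilization)   # up to 3 candidate sections
print(sections)                   # 3
```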
In turbulence modeling, and more particularly in the Large-Eddy Simulation (LES) framework, we are concerned with finding closure models that represent the effect of the unresolved subgrid-scales on the resolved scales. Recent approaches gravitate towards machine learning techniques to construct such models. However, the stability of machine-learned closure models and their adherence to physical structure (e.g. symmetries, conservation laws) are still open problems. To tackle both issues, we take the `discretize first, filter next' approach, in which we apply a spatial averaging filter to existing energy-conserving (fine-grid) discretizations. The main novelty is that we extend the system of equations describing the filtered solution with a set of equations that describe the evolution of (a compressed version of) the energy of the subgrid-scales. Having an estimate of this energy, we can use the concept of energy conservation and derive stability. The compressed variables are determined via a data-driven technique in such a way that the energy of the subgrid-scales is matched. For the extended system, the closure model should be energy-conserving, and a new skew-symmetric convolutional neural network architecture is proposed that has this property. Stability is thus guaranteed, independent of the actual weights and biases of the network. Importantly, our framework allows energy exchange between resolved scales and compressed subgrid scales and thus enables backscatter. To model dissipative systems (e.g. viscous flows), the framework is extended with a diffusive component. The introduced neural network architecture is constructed such that it also satisfies momentum conservation. We apply the new methodology to both the viscous Burgers' equation and the Korteweg-de Vries equation in 1D and show superior stability properties when compared to a vanilla convolutional neural network.
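To make the stability argument concrete, here is a minimal numpy sketch (not the paper's actual architecture) of why a skew-symmetric convolution operator conserves energy regardless of its weights: for $\dot{h} = Kh$ with $K = -K^T$, the energy $\frac{1}{2}h^Th$ satisfies $\frac{d}{dt}\frac{1}{2}h^Th = h^TKh = 0$.

```python
import numpy as np

def skew_symmetric_conv_matrix(kernel, n):
    """Build a periodic 1-D convolution matrix C from `kernel`,
    then antisymmetrize: K = C - C.T is skew-symmetric by construction."""
    C = np.zeros((n, n))
    r = len(kernel) // 2
    for i in range(n):
        for j, w in enumerate(kernel):
            C[i, (i + j - r) % n] += w
    return C - C.T

rng = np.random.default_rng(0)
n = 64
K = skew_symmetric_conv_matrix(rng.normal(size=5), n)  # arbitrary "weights"
h = rng.normal(size=n)

# The energy change rate h^T K h vanishes for any weights and any state:
print(abs(h @ (K @ h)) < 1e-10)  # True
```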
In the quantum computation verification problem, a quantum server wants to convince a client that the output of evaluating a quantum circuit $C$ is the result that it claims. This problem is considered very important both theoretically and practically in quantum computation [arXiv:1709.06984], [arXiv:1704.04487], [arXiv:1209.0449]. The client is considered to be limited in computational power, and one desirable property is that the client can be completely classical, which leads to the classical verification of quantum computation (CVQC) problem. In terms of total time complexity, the fastest single-server CVQC protocol so far has complexity $O(poly(\kappa)|C|^3)$, where $|C|$ is the size of the circuit to be verified and $\kappa$ is the security parameter, given by Mahadev [arXiv:1804.01082]. In this work, by developing new techniques, we give a new CVQC protocol with complexity $O(poly(\kappa)|C|)$, which is significantly faster than existing protocols. Our protocol is secure in the quantum random oracle model [arXiv:1008.0931] assuming the existence of noisy trapdoor claw-free functions [arXiv:1804.00640], both of which are extensively used assumptions in quantum cryptography. Along the way, we also give a new classical channel remote state preparation protocol for states in $\{|+_\theta\rangle=\frac{1}{\sqrt{2}}(|0\rangle+e^{i\theta\pi/4}|1\rangle):\theta\in \{0,1,\cdots,7\}\}$, another basic primitive in quantum cryptography. Our protocol allows for parallel verifiable preparation of $L$ independently random states of this form (up to a constant overall error and a possibly unbounded server-side simulator), and runs in only $O(poly(\kappa)L)$ time and constant rounds; for comparison, existing works (even for possibly simpler state families) all require very large or unestimated time and round complexities [arXiv:1904.06320], [arXiv:1904.06303], [arXiv:2201.13445], [arXiv:2201.13430].
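For concreteness, the eight candidate states can be written down directly; a small numpy check (illustrative only, not part of the protocol itself):

```python
import numpy as np

# |+_theta> = (|0> + e^{i*theta*pi/4}|1>) / sqrt(2), theta in {0,...,7}.
def plus_theta(theta: int) -> np.ndarray:
    return np.array([1.0, np.exp(1j * theta * np.pi / 4)]) / np.sqrt(2)

states = [plus_theta(t) for t in range(8)]
# Pairwise overlaps obey |<+_a|+_b>|^2 = cos^2((a - b) * pi / 8);
# in particular theta = 0 gives |+> and theta = 4 gives |->:
print(round(abs(np.vdot(states[0], states[4])), 12))  # 0.0
```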
Near-term quantum computers provide a promising platform for finding ground states of quantum systems, which is an essential task in physics, chemistry, and materials science. Near-term approaches, however, are constrained by the effects of noise as well as the limited resources of near-term quantum hardware. We introduce "neural error mitigation," which uses neural networks to improve estimates of ground states and ground-state observables obtained using near-term quantum simulations. To demonstrate our method's broad applicability, we employ neural error mitigation to find the ground states of the H$_2$ and LiH molecular Hamiltonians, as well as the lattice Schwinger model, prepared via the variational quantum eigensolver (VQE). Our results show that neural error mitigation improves numerical and experimental VQE computations to yield low energy errors, high fidelities, and accurate estimations of more complex observables such as order parameters and entanglement entropy, without requiring additional quantum resources. Furthermore, neural error mitigation is agnostic with respect to the quantum state preparation algorithm used, the quantum hardware it is implemented on, and the particular noise channel affecting the experiment, contributing to its versatility as a tool for quantum simulation.
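The following toy sketch (a hypothetical one-qubit example, not the paper's neural-network ansatz) illustrates the two-stage idea: a noisy state-preparation result is taken as a starting point and then refined by variational minimization of the energy.

```python
import numpy as np

def state(phi):   # stand-in for a neural parameterization of the state
    return np.array([np.cos(phi), np.sin(phi)])

def energy(phi):  # <psi|H|psi> for the toy Hamiltonian H = Z
    psi = state(phi)
    return psi[0]**2 - psi[1]**2   # ground state |1>, E0 = -1

# Stage 1: pretend a noisy VQE run only reached P(|1>) = 0.9
phi = np.arcsin(np.sqrt(0.9))
# Stage 2: variational refinement recovers the ground-state energy
for _ in range(200):
    g = (energy(phi + 1e-5) - energy(phi - 1e-5)) / 2e-5  # finite-diff grad
    phi -= 0.1 * g
print(round(energy(phi), 6))  # -1.0
```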
Precision radial velocity (RV) measurements continue to be a key tool to detect and characterise extrasolar planets. While instrumental precision keeps improving, stellar activity remains a barrier to obtaining reliable measurements below 1-2 m/s accuracy. Using simulations and real data, we investigate the capabilities of a Deep Neural Network approach to produce activity-free Doppler measurements of stars. As case studies we use observations of two known stars (Eps Eridani and AU Microscopii), both with clear signals of activity-induced RV variability. Synthetic data using the starsim code are generated for the observables (inputs) and the resulting RV signal (labels), and used to train a Deep Neural Network algorithm. We identify an architecture consisting of convolutional and fully connected layers that is adequate for the task. The indices investigated are mean line-profile parameters (width, bisector, contrast) and multi-band photometry. We demonstrate that the RV-independent approach can drastically reduce spurious Doppler variability from known physical effects such as spots, rotation and convective blueshift. We identify the combinations of activity indices with the most predictive power. When applied to real observations, we observe a good match of the correction with the observed variability, but we also find that the noise reduction is not as good as in the simulations, probably due to the lack of detail in the simulated physics. We demonstrate that a model-driven machine learning approach is sufficient to clean Doppler signals of activity-induced variability for well-known physical effects. There are dozens of known activity-related observables whose inversion power remains unexplored, indicating that the use of additional indicators, more complete models, and more observations with optimised sampling strategies can lead to significant improvements in our detrending capabilities.
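A hypothetical sketch of the kind of architecture described (layer sizes and index count are assumptions, not the paper's exact configuration), using PyTorch:

```python
import torch
import torch.nn as nn

class ActivityRVNet(nn.Module):
    """1-D convolutions over time series of activity indices (line width,
    bisector, contrast, multi-band photometry), followed by fully
    connected layers that regress the activity-induced RV signal."""
    def __init__(self, n_indices=6, n_epochs=50):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_indices, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * n_epochs, 64), nn.ReLU(),
            nn.Linear(64, n_epochs),   # predicted activity RV per epoch
        )

    def forward(self, x):              # x: (batch, n_indices, n_epochs)
        return self.head(self.conv(x))

model = ActivityRVNet()
fake_batch = torch.randn(8, 6, 50)     # simulated training inputs
print(model(fake_batch).shape)         # torch.Size([8, 50])
```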
The causal set and Wolfram model approaches to discrete quantum gravity both permit the formulation of a manifestly covariant notion of entanglement entropy for quantum fields. In the causal set case, this is given by a construction (due to Sorkin and Johnston) of a 2-point correlation function for a Gaussian scalar field from causal set Feynman propagators and Pauli-Jordan functions, from which an eigendecomposition, and hence an entanglement entropy, can be computed. In the Wolfram model case, it is given instead in terms of the Fubini-Study metric on branchial graphs, whose tensor product structure is inherited functorially from that of finite-dimensional Hilbert spaces. In both cases, the entanglement entropies in question are most naturally defined over an extended spacetime region (hence the manifest covariance), in contrast to the generically non-covariant definitions over single spacelike hypersurfaces common to most continuum quantum field theories. In this article, we show how an axiomatic field theory for a free, massless scalar field (obeying the appropriate bosonic commutation relations) may be rigorously constructed over multiway causal graphs: a combinatorial structure sufficiently general as to encompass both causal sets and Wolfram model evolutions as special cases. We proceed to show numerically that the entanglement entropies computed using both the Sorkin-Johnston approach and the branchial graph approach are monotonically related for a large class of Wolfram model evolution rules. We also prove a special case of this monotonic relationship using a recent geometrical entanglement monotone proposed by Cocchiarella et al.
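As a sketch of the Sorkin-Johnston step described above (a simplified reading, with regularization details omitted): given the Wightman matrix $W$ and the Pauli-Jordan matrix $i\Delta$ restricted to a region, one solves the generalized eigenproblem $Wv = \lambda\, i\Delta\, v$ and sums $\lambda\ln|\lambda|$ over the solutions.

```python
import numpy as np
from scipy.linalg import eig

def sj_entropy(W, iDelta, tol=1e-9):
    """Entanglement entropy from the generalized eigenvalues of
    (W, i*Delta); for a Gaussian state the solutions come in
    pairs (lam, 1 - lam)."""
    lam = eig(W, iDelta, right=False).real   # exact eigenvalues are real
    lam = lam[np.abs(lam) > tol]             # discard kernel directions
    return float(np.sum(lam * np.log(np.abs(lam))))
```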
Predicting how different interventions will causally affect a specific individual is important in a variety of domains such as personalized medicine, public policy, and online marketing. However, most existing causal methods cannot generalize to predicting the effects of previously unseen interventions (e.g., a newly invented drug), because they require data for individuals who received the intervention. Here, we consider zero-shot causal learning: predicting the personalized effects of novel, previously unseen interventions. To tackle this problem, we propose CaML, a causal meta-learning framework which formulates the personalized prediction of each intervention's effect as a task. Rather than training a separate model for each intervention, CaML trains a single meta-model across thousands of tasks, each constructed by sampling an intervention and individuals who either did or did not receive it. By leveraging both intervention information (e.g., a drug's attributes) and individual features (e.g., a patient's history), CaML is able to predict the personalized effects of unseen interventions. Experimental results on real-world datasets in large-scale medical claims and cell-line perturbations demonstrate the effectiveness of our approach. Most strikingly, CaML zero-shot predictions outperform even strong baselines which have direct access to data for the target interventions considered.
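A minimal sketch of the task-construction step (all names and shapes hypothetical): each task pairs one intervention's attribute vector with the individuals who did or did not receive it, and a single meta-model is trained across many such tasks.

```python
import numpy as np

def sample_task(interventions, individuals, assignments, rng):
    """Sample one meta-learning task: an intervention plus the
    individuals who did / did not receive it."""
    w = rng.choice(len(interventions))
    treated = individuals[assignments[:, w] == 1]
    control = individuals[assignments[:, w] == 0]
    return interventions[w], treated, control

rng = np.random.default_rng(0)
interventions = rng.normal(size=(1000, 16))          # e.g. drug attributes
individuals = rng.normal(size=(5000, 32))            # e.g. patient histories
assignments = rng.integers(0, 2, size=(5000, 1000))  # who received what
iv, treated, control = sample_task(interventions, individuals, assignments, rng)
print(iv.shape, len(treated) + len(control))         # (16,) 5000
```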
Quantum error correction codes (QECC) are a key component for realizing the potential of quantum computing. Like its classical counterpart (ECC), QECC enables the reduction of error rates by distributing quantum logical information across redundant physical qubits, such that errors can be detected and corrected. In this work, we efficiently train novel deep quantum error decoders. We resolve the quantum measurement collapse by augmenting syndrome decoding to predict an initial estimate of the system noise, which is then refined iteratively through a deep neural network. The logical error rates calculated over finite fields are directly optimized via a differentiable objective, enabling efficient decoding under the constraints imposed by the code. Finally, our architecture is extended to support faulty syndrome measurements, allowing efficient decoding over repeated syndrome sampling. The proposed method demonstrates the power of neural decoders for QECC by achieving state-of-the-art accuracy, outperforming, for a broad range of topological codes, existing neural and classical decoders, which are often computationally prohibitive.
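A minimal sketch of the first stage (sizes hypothetical, not the paper's architecture): a network maps a measured syndrome to a per-qubit noise estimate, which the iterative refinement described above would then improve.

```python
import torch
import torch.nn as nn

class SyndromeDecoder(nn.Module):
    """Map a binary syndrome to per-qubit error probabilities."""
    def __init__(self, n_syndrome=8, n_qubits=9):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_syndrome, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, n_qubits),          # per-qubit error logits
        )

    def forward(self, syndrome):               # (batch, n_syndrome)
        return torch.sigmoid(self.net(syndrome))

decoder = SyndromeDecoder()
syndromes = torch.randint(0, 2, (32, 8)).float()   # random test syndromes
print(decoder(syndromes).shape)                    # torch.Size([32, 9])
```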
The time and effort involved in hand-designing deep neural networks is immense. This has prompted the development of Neural Architecture Search (NAS) techniques to automate this design. However, NAS algorithms tend to be slow and expensive: they need to train vast numbers of candidate networks to inform the search process. This could be alleviated if we could partially predict a network's trained accuracy from its initial state. In this work, we examine the overlap of activations between datapoints in untrained networks and motivate how this can give a measure which is usefully indicative of a network's trained performance. We incorporate this measure into a simple algorithm that allows us to search for powerful networks without any training, in a matter of seconds on a single GPU, and verify its effectiveness on NAS-Bench-101, NAS-Bench-201, NATS-Bench, and Network Design Spaces. Our approach can be readily combined with more expensive search methods; we examine a simple adaptation of regularised evolutionary search. Code for reproducing our experiments is available at https://github.com/BayesWatch/nas-without-training.
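A simplified sketch of the scoring idea (details condensed from the paper's method): the binary ReLU on/off patterns of a mini-batch define a similarity kernel, and its log-determinant rewards networks whose untrained activations distinguish the inputs.

```python
import numpy as np

def naswot_score(codes):
    """codes: (N, n_units) binary ReLU activation patterns for N inputs."""
    n_units = codes.shape[1]
    hamming = (codes[:, None, :] != codes[None, :, :]).sum(-1)
    K = (n_units - hamming).astype(float)   # similarity kernel K_H
    return np.linalg.slogdet(K)[1]          # score = log |K_H|

rng = np.random.default_rng(0)
codes = rng.integers(0, 2, size=(16, 512))  # stand-in activation patterns
print(naswot_score(codes))
```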
Since hardware resources are limited, the objective of training deep learning models is typically to maximize accuracy subject to the time and memory constraints of training and inference. We study the impact of model size in this setting, focusing on Transformer models for NLP tasks that are limited by compute: self-supervised pretraining and high-resource machine translation. We first show that even though smaller Transformer models execute faster per iteration, wider and deeper models converge in significantly fewer steps. Moreover, this acceleration in convergence typically outpaces the additional computational overhead of using larger models. Therefore, the most compute-efficient training strategy is, counterintuitively, to train extremely large models but stop after a small number of iterations. This leads to an apparent trade-off between the training efficiency of large Transformer models and the inference efficiency of small Transformer models. However, we show that large models are more robust to compression techniques such as quantization and pruning than small models. Consequently, one can get the best of both worlds: heavily compressed, large models achieve higher accuracy than lightly compressed, small models.
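A toy illustration of the two compression steps mentioned above (magnitude pruning and uniform quantization; the sparsity and bit-width here are arbitrary choices for the example):

```python
import numpy as np

def prune(w, sparsity=0.9):
    """Zero out the smallest-magnitude fraction `sparsity` of weights."""
    thresh = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) >= thresh, w, 0.0)

def quantize(w, bits=4):
    """Round weights to a symmetric grid of 2^bits levels."""
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

w = np.random.default_rng(0).normal(size=(512, 512))  # stand-in weights
w_small = quantize(prune(w, 0.9), bits=4)
print(round((w_small != 0).mean(), 3))  # ~0.1: 90% of weights removed
```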
Reinforcement learning is one of the core components in designing an artificially intelligent system emphasizing real-time response. Reinforcement learning enables such a system to take actions within an arbitrary environment, either with or without prior knowledge of the environment model. In this paper, we present a comprehensive study of reinforcement learning focusing on various dimensions, including challenges, the recent development of different state-of-the-art techniques, and future directions. The fundamental objective of this paper is to provide a framework for the presentation of available reinforcement learning methods that is informative and simple to follow for new researchers and academics in this domain, taking the latest concerns into account. First, we illustrate the core techniques of reinforcement learning in an easily understandable and comparable way. We then analyze and discuss the recent developments in reinforcement learning approaches. Our analysis points out that most of the models focus on tuning policy values rather than tuning other aspects of a particular state of reasoning.
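As one concrete instance of "tuning policy values", here is a minimal tabular Q-learning loop on a hypothetical 5-state chain (illustrative only, not drawn from the surveyed works):

```python
import numpy as np

n_states, n_actions = 5, 2            # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))   # the "policy values" being tuned
alpha, gamma, eps = 0.5, 0.9, 0.1
rng = np.random.default_rng(0)

for _ in range(200):                  # episodes; reward 1 at the right end
    s = 0
    while s < n_states - 1:
        if rng.random() < eps or Q[s].max() == 0:   # explore
            a = int(rng.integers(n_actions))
        else:                                       # exploit
            a = int(Q[s].argmax())
        s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s2 == n_states - 1 else 0.0
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])  # TD update
        s = s2
print(Q.argmax(axis=1))  # [1 1 1 1 0]: go right (state 4 is terminal)
```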