The emergence of quantum computers as a new computational paradigm has been accompanied by speculation concerning the scope and timeline of their anticipated revolutionary changes. While quantum computing is still in its infancy, the variety of different architectures used to implement quantum computations make it difficult to reliably measure and compare performance. This problem motivates our introduction of SupermarQ, a scalable, hardware-agnostic quantum benchmark suite which uses application-level metrics to measure performance. SupermarQ is the first attempt to systematically apply techniques from classical benchmarking methodology to the quantum domain. We define a set of feature vectors to quantify coverage, select applications from a variety of domains to ensure the suite is representative of real workloads, and collect benchmark results from the IBM, IonQ, and AQT@LBNL platforms. Looking forward, we envision that quantum benchmarking will encompass a large cross-community effort built on open source, constantly evolving benchmark suites. We introduce SupermarQ as an important step in this direction.
In the coming years, quantum networks will allow quantum applications to thrive thanks to the new opportunities offered by end-to-end entanglement of qubits on remote hosts via quantum repeaters. On a geographical scale, this will lead to the dawn of the Quantum Internet. While a full-blown deployment is yet to come, the research community is already working on a variety of individual enabling technologies and solutions. In this paper, with the guidance of extensive simulations, we take a broader view and investigate the problems of Quality of Service (QoS) and provisioning in the context of quantum networks, which are very different from their counterparts in classical data networks due to some of their fundamental properties. Our work leads the way towards a new class of studies that will allow the research community to better understand the challenges of quantum networks and their potential commercial exploitation.
Common tasks encountered in epidemiology, including disease incidence estimation and causal inference, rely on predictive modeling. Constructing a predictive model can be thought of as learning a prediction function, i.e., a function that takes as input covariate data and outputs a predicted value. Many strategies for learning these functions from data are available, from parametric regressions to machine learning algorithms. It can be challenging to choose an approach, as it is impossible to know in advance which one is the most suitable for a particular dataset and prediction task at hand. The super learner (SL) is an algorithm that alleviates concerns over selecting the one "right" strategy while providing the freedom to consider many of them, such as those recommended by collaborators, used in related research, or specified by subject-matter experts. It is an entirely pre-specified and data-adaptive strategy for predictive modeling. To ensure the SL is well-specified for learning the prediction function, the analyst does need to make a few important choices. In this Education Corner article, we provide step-by-step guidelines for making these choices, walking the reader through each of them and providing intuition along the way. In doing so, we aim to empower the analyst to tailor the SL specification to their prediction task, thereby ensuring their SL performs as well as possible. A flowchart provides a concise, easy-to-follow summary of key suggestions and heuristics, based on our accumulated experience, and guided by theory.
Existing quantum compilers optimize quantum circuits by applying circuit transformations designed by experts. This approach requires significant manual effort to design and implement circuit transformations for different quantum devices, which use different gate sets, and can miss optimizations that are hard to find manually. We propose Quartz, a quantum circuit superoptimizer that automatically generates and verifies circuit transformations for arbitrary quantum gate sets. For a given gate set, Quartz generates candidate circuit transformations by systematically exploring small circuits and verifies the discovered transformations using an automated theorem prover. To optimize a quantum circuit, Quartz uses a cost-based backtracking search that applies the verified transformations to the circuit. Our evaluation on three popular gate sets shows that Quartz can effectively generate and verify transformations for different gate sets. The generated transformations cover manually designed transformations used by existing optimizers and also include new transformations. Quartz is therefore able to optimize a broad range of circuits for diverse gate sets, outperforming or matching the performance of hand-tuned circuit optimizers.
In this work a quantum analogue of Bayesian statistical inference is considered. Based on the notion of instrument, we propose a sequential measurement scheme from which observations needed for statistical inference are obtained. We further put forward a quantum analogue of Bayes rule, which states how the prior normal state of a quantum system updates under those observations. We next generalize the fundamental notions and results of Bayesian statistics according to the quantum Bayes rule. It is also note that our theory retains the classical one as its special case. Finally, we investigate the limit of posterior normal state as the number of observations tends to infinity.
Generating a test suite for a quantum program such that it has the maximum number of failing tests is an optimization problem. For such optimization, search-based testing has shown promising results in the context of classical programs. To this end, we present a test generation tool for quantum programs based on a genetic algorithm, called QuSBT (Search-based Testing of Quantum Programs). QuSBT automates the testing of quantum programs, with the aim of finding a test suite having the maximum number of failing test cases. QuSBT utilizes IBM's Qiskit as the simulation framework for quantum programs. We present the tool architecture in addition to the implemented methodology (i.e., the encoding of the search individual, the definition of the fitness function expressing the search problem, and the test assessment w.r.t. two types of failures). Finally, we report results of the experiments in which we tested a set of faulty quantum programs with QuSBT to assess its effectiveness. Repository (code and experimental results): //github.com/Simula-COMPLEX/qusbt-tool Video: //youtu.be/3apRCtluAn4
In today's computing environment, where Artificial Intelligence (AI) and data processing are moving toward the Internet of Things (IoT) and the Edge computing paradigm, benchmarking resource-constrained devices is a critical task to evaluate their suitability and performance. The literature has extensively explored the performance of IoT devices when running high-level benchmarks specialized in particular application scenarios, such as AI or medical applications. However, lower-level benchmarking applications and datasets that analyze the hardware components of each device are needed. This low-level device understanding enables new AI solutions for network, system and service management based on device performance, such as individual device identification, so it is an area worth exploring more in detail. In this paper, we present LwHBench, a low-level hardware benchmarking application for Single-Board Computers that measures the performance of CPU, GPU, Memory and Storage taking into account the component constraints in these types of devices. LwHBench has been implemented for Raspberry Pi devices and run for 100 days on a set of 45 devices to generate an extensive dataset that allows the usage of AI techniques in different application scenarios. Finally, to demonstrate the inter-scenario capability of the created dataset, a series of AI-enabled use cases about device identification and context impact on performance are presented as examples and exploration of the published data.
The paper presents a collection of analytical benchmark problems specifically selected to provide a set of stress tests for the assessment of multifidelity optimization methods. In addition, the paper discusses a comprehensive ensemble of metrics and criteria recommended for the rigorous and meaningful assessment of the performance of multifidelity strategies and algorithms.
It has long been observed that the performance of evolutionary algorithms and other randomized search heuristics can benefit from a non-static choice of the parameters that steer their optimization behavior. Mechanisms that identify suitable configurations on the fly ("parameter control") or via a dedicated training process ("dynamic algorithm configuration") are therefore an important component of modern evolutionary computation frameworks. Several approaches to address the dynamic parameter setting problem exist, but we barely understand which ones to prefer for which applications. As in classical benchmarking, problem collections with a known ground truth can offer very meaningful insights in this context. Unfortunately, settings with well-understood control policies are very rare. One of the few exceptions for which we know which parameter settings minimize the expected runtime is the LeadingOnes problem. We extend this benchmark by analyzing optimal control policies that can select the parameters only from a given portfolio of possible values. This also allows us to compute optimal parameter portfolios of a given size. We demonstrate the usefulness of our benchmarks by analyzing the behavior of the DDQN reinforcement learning approach for dynamic algorithm configuration.
Deep neural networks have been able to outperform humans in some cases like image recognition and image classification. However, with the emergence of various novel categories, the ability to continuously widen the learning capability of such networks from limited samples, still remains a challenge. Techniques like Meta-Learning and/or few-shot learning showed promising results, where they can learn or generalize to a novel category/task based on prior knowledge. In this paper, we perform a study of the existing few-shot meta-learning techniques in the computer vision domain based on their method and evaluation metrics. We provide a taxonomy for the techniques and categorize them as data-augmentation, embedding, optimization and semantics based learning for few-shot, one-shot and zero-shot settings. We then describe the seminal work done in each category and discuss their approach towards solving the predicament of learning from few samples. Lastly we provide a comparison of these techniques on the commonly used benchmark datasets: Omniglot, and MiniImagenet, along with a discussion towards the future direction of improving the performance of these techniques towards the final goal of outperforming humans.
In this paper we address issues with image retrieval benchmarking on standard and popular Oxford 5k and Paris 6k datasets. In particular, annotation errors, the size of the dataset, and the level of challenge are addressed: new annotation for both datasets is created with an extra attention to the reliability of the ground truth. Three new protocols of varying difficulty are introduced. The protocols allow fair comparison between different methods, including those using a dataset pre-processing stage. For each dataset, 15 new challenging queries are introduced. Finally, a new set of 1M hard, semi-automatically cleaned distractors is selected. An extensive comparison of the state-of-the-art methods is performed on the new benchmark. Different types of methods are evaluated, ranging from local-feature-based to modern CNN based methods. The best results are achieved by taking the best of the two worlds. Most importantly, image retrieval appears far from being solved.