The past decade has seen considerable progress in quantum hardware in terms of the speed, number of qubits and quantum volume which is defined as the maximum size of a quantum circuit that can be effectively implemented on a near-term quantum device. Consequently, there has also been a rise in the number of works based on the applications of Quantum Machine Learning (QML) on real hardware to attain quantum advantage over their classical counterparts. In this survey, our primary focus is on selected supervised and unsupervised learning applications implemented on quantum hardware, specifically targeting real-world scenarios. Our survey explores and highlights the current limitations of QML implementations on quantum hardware. We delve into various techniques to overcome these limitations, such as encoding techniques, ansatz structure, error mitigation, and gradient methods. Additionally, we assess the performance of these QML implementations in comparison to their classical counterparts. Finally, we conclude our survey with a discussion on the existing bottlenecks associated with applying QML on real quantum devices and propose potential solutions for overcoming these challenges in the future.
Getting standard multigrid to work efficiently for the high-frequency Helmholtz equation has been an open problem in applied mathematics for years. Much effort has been dedicated to finding solution methods which can use multigrid components to obtain solvers with a linear time complexity. In this work we present one among the first stand-alone multigrid solvers for the 2D Helmholtz equation using both a constant and non-constant wavenumber model problem. We use standard smoothing techniques and do not impose any restrictions on the number of grid points per wavelength on the coarse-grid. As a result we are able to obtain a full V- and W-cycle algorithm. The key features of the algorithm are the use of higher-order inter-grid transfer operators combined with a complex constant in the coarsening process. Using weighted-Jacobi smoothing, we obtain a solver which is $h-$independent and scales linearly with the wavenumber $k$. Numerical results using 1 to 5 GMRES(3) smoothing steps approach $k-$ and $h-$ independent convergence, when combined with the higher-order inter-grid transfer operators and a small or even zero complex shift. The proposed algorithm provides an important step towards the perpetuating branch of research in finding scalable solvers for challenging wave propagation problems.
Whilst contrastive learning yields powerful representations by matching different augmented views of the same instance, it lacks the ability to capture the similarities between different instances. One popular way to address this limitation is by learning global features (after the global pooling) to capture inter-instance relationships based on knowledge distillation, where the global features of the teacher are used to guide the learning of the global features of the student. Inspired by cross-modality learning, we extend this existing framework that only learns from global features by encouraging the global features and intermediate layer features to learn from each other. This leads to our novel self-supervised framework: cross-context learning between global and hypercolumn features (CGH), that enforces the consistency of instance relations between low- and high-level semantics. Specifically, we stack the intermediate feature maps to construct a hypercolumn representation so that we can measure instance relations using two contexts (hypercolumn and global feature) separately, and then use the relations of one context to guide the learning of the other. This cross-context learning allows the model to learn from the differences between the two contexts. The experimental results on linear classification and downstream tasks show that our method outperforms the state-of-the-art methods.
Recently, there has been significant progress in the development of large models. Following the success of ChatGPT, numerous language models have been introduced, demonstrating remarkable performance. Similar advancements have also been observed in image generation models, such as Google's Imagen model, OpenAI's DALL-E 2, and stable diffusion models, which have exhibited impressive capabilities in generating images. However, similar to large language models, these models still encounter unresolved challenges. Fortunately, the availability of open-source stable diffusion models and their underlying mathematical principles has enabled the academic community to extensively analyze the performance of current image generation models and make improvements based on this stable diffusion framework. This survey aims to examine the existing issues and the current solutions pertaining to image generation models.
In the realm of Large Language Models, the balance between instruction data quality and quantity has become a focal point. Recognizing this, we introduce a self-guided methodology for LLMs to autonomously discern and select cherry samples from vast open-source datasets, effectively minimizing manual curation and potential cost for instruction tuning an LLM. Our key innovation, the Instruction-Following Difficulty (IFD) metric, emerges as a pivotal tool to identify discrepancies between a model's expected responses and its autonomous generation prowess. Through the adept application of IFD, cherry samples are pinpointed, leading to a marked uptick in model training efficiency. Empirical validations on renowned datasets like Alpaca and WizardLM underpin our findings; with a mere 10% of conventional data input, our strategy showcases improved results. This synthesis of self-guided cherry-picking and the IFD metric signifies a transformative leap in the optimization of LLMs, promising both efficiency and resource-conscious advancements.
Honeypots play a crucial role in implementing various cyber deception techniques as they possess the capability to divert attackers away from valuable assets. Careful strategic placement of honeypots in networks should consider not only network aspects but also attackers' preferences. The allocation of honeypots in tactical networks under network mobility is of great interest. To achieve this objective, we present a game-theoretic approach that generates optimal honeypot allocation strategies within an attack/defense scenario. Our proposed approach takes into consideration the changes in network connectivity. In particular, we introduce a two-player dynamic game model that explicitly incorporates the future state evolution resulting from changes in network connectivity. The defender's objective is twofold: to maximize the likelihood of the attacker hitting a honeypot and to minimize the cost associated with deception and reconfiguration due to changes in network topology. We present an iterative algorithm to find Nash equilibrium strategies and analyze the scalability of the algorithm. Finally, we validate our approach and present numerical results based on simulations, demonstrating that our game model successfully enhances network security. Additionally, we have proposed additional enhancements to improve the scalability of the proposed approach.
When solving compressible multi-material flow problems, an unresolved challenge is the computation of advective fluxes across material interfaces that separate drastically different thermodynamic states and relations. A popular idea in this regard is to locally construct bimaterial Riemann problems, and to apply their exact solutions in flux computation. For general equations of state, however, finding the exact solution of a Riemann problem is expensive as it requires nested loops. Multiplied by the large number of Riemann problems constructed during a simulation, the computational cost often becomes prohibitive. The work presented in this paper aims to accelerate the solution of bimaterial Riemann problems without introducing approximations or offline precomputation tasks. The basic idea is to exploit some special properties of the Riemann problem equations, and to recycle previous solutions as much as possible. Following this idea, four acceleration methods are developed, including (1) a change of integration variable through rarefaction fans, (2) storing and reusing integration trajectory data, (3) step size adaptation, and (4) constructing an R-tree on the fly to generate initial guesses. The performance of these acceleration methods are assessed using four example problems in underwater explosion, laser-induced cavitation, and hypervelocity impact. These problems exhibit strong shock waves, large interface deformation, contact of multiple (>2) interfaces, and interaction between gases and condensed matters. In these challenging cases, the solution of bimaterial Riemann problems is accelerated by 37 to 87 times. As a result, the total cost of advective flux computation, which includes the exact Riemann problem solution at material interfaces and the numerical flux calculation over the entire computational domain, is accelerated by 18 to 81 times.
We characterize learnability for quantum measurement classes by establishing matching necessary and sufficient conditions for their PAC learnability, along with corresponding sample complexity bounds, in the setting where the learner is given access only to prepared quantum states. We first probe the results from previous works on this setting. We show that the empirical risk defined in previous works and matching the definition in the classical theory fails to satisfy the uniform convergence property enjoyed in the classical setting for some learnable classes. Moreover, we show that VC dimension generalization upper bounds in previous work are frequently infinite, even for finite-dimensional POVM classes. To surmount the failure of the standard ERM to satisfy uniform convergence, we define a new learning rule -- denoised ERM. We show this to be a universal learning rule for POVM and probabilistically observed concept classes, and the condition for it to satisfy uniform convergence is finite fat shattering dimension of the class. We give quantitative sample complexity upper and lower bounds for learnability in terms of finite fat-shattering dimension and a notion of approximate finite partitionability into approximately jointly measurable subsets, which allow for sample reuse. We then show that finite fat shattering dimension implies finite coverability by approximately jointly measurable subsets, leading to our matching conditions. We also show that every measurement class defined on a finite-dimensional Hilbert space is PAC learnable. We illustrate our results on several example POVM classes.
The existence of representative datasets is a prerequisite of many successful artificial intelligence and machine learning models. However, the subsequent application of these models often involves scenarios that are inadequately represented in the data used for training. The reasons for this are manifold and range from time and cost constraints to ethical considerations. As a consequence, the reliable use of these models, especially in safety-critical applications, is a huge challenge. Leveraging additional, already existing sources of knowledge is key to overcome the limitations of purely data-driven approaches, and eventually to increase the generalization capability of these models. Furthermore, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-based models with existing knowledge. The identified approaches are structured according to the categories integration, extraction and conformity. Special attention is given to applications in the field of autonomous driving.
Deep neural models in recent years have been successful in almost every field, including extremely complex problem statements. However, these models are huge in size, with millions (and even billions) of parameters, thus demanding more heavy computation power and failing to be deployed on edge devices. Besides, the performance boost is highly dependent on redundant labeled data. To achieve faster speeds and to handle the problems caused by the lack of data, knowledge distillation (KD) has been proposed to transfer information learned from one model to another. KD is often characterized by the so-called `Student-Teacher' (S-T) learning framework and has been broadly applied in model compression and knowledge transfer. This paper is about KD and S-T learning, which are being actively studied in recent years. First, we aim to provide explanations of what KD is and how/why it works. Then, we provide a comprehensive survey on the recent progress of KD methods together with S-T frameworks typically for vision tasks. In general, we consider some fundamental questions that have been driving this research area and thoroughly generalize the research progress and technical details. Additionally, we systematically analyze the research status of KD in vision applications. Finally, we discuss the potentials and open challenges of existing methods and prospect the future directions of KD and S-T learning.
Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch will lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc, and 2) the instance-level shift, such as object appearance, size, etc. We build our approach based on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, on image level and instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory, and are implemented by learning a domain classifier in adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets including Cityscapes, KITTI, SIM10K, etc. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.