The interest in quantum computing is growing, and with it, the importance of software platforms to develop quantum programs. Ensuring the correctness of such platforms is important, and it requires a thorough understanding of the bugs they typically suffer from. To address this need, this paper presents the first in-depth study of bugs in quantum computing platforms. We gather and inspect a set of 223 real-world bugs from 18 open-source quantum computing platforms. Our study shows that a significant fraction of these bugs (39.9%) are quantum-specific, calling for dedicated approaches to prevent and find them. The bugs are spread across various components, but quantum-specific bugs occur particularly often in components that represent, compile, and optimize quantum programming abstractions. Many quantum-specific bugs manifest through unexpected outputs, rather than more obvious signs of misbehavior, such as crashes. Finally, we present a hierarchy of recurrent bug patterns, including ten novel, quantum-specific patterns. Our findings not only show the importance and prevalence bugs in quantum computing platforms, but they help developers to avoid common mistakes and tool builders to tackle the challenge of preventing, finding, and fixing these bugs.
Language is essentially a complex, intricate system of human expressions governed by grammatical rules. It poses a significant challenge to develop capable AI algorithms for comprehending and grasping a language. As a major approach, language modeling has been widely studied for language understanding and generation in the past two decades, evolving from statistical language models to neural language models. Recently, pre-trained language models (PLMs) have been proposed by pre-training Transformer models over large-scale corpora, showing strong capabilities in solving various NLP tasks. Since researchers have found that model scaling can lead to performance improvement, they further study the scaling effect by increasing the model size to an even larger size. Interestingly, when the parameter scale exceeds a certain level, these enlarged language models not only achieve a significant performance improvement but also show some special abilities that are not present in small-scale language models. To discriminate the difference in parameter scale, the research community has coined the term large language models (LLM) for the PLMs of significant size. Recently, the research on LLMs has been largely advanced by both academia and industry, and a remarkable progress is the launch of ChatGPT, which has attracted widespread attention from society. The technical evolution of LLMs has been making an important impact on the entire AI community, which would revolutionize the way how we develop and use AI algorithms. In this survey, we review the recent advances of LLMs by introducing the background, key findings, and mainstream techniques. In particular, we focus on four major aspects of LLMs, namely pre-training, adaptation tuning, utilization, and capacity evaluation. Besides, we also summarize the available resources for developing LLMs and discuss the remaining issues for future directions.
Additive Noise Models (ANM) encode a popular functional assumption that enables learning causal structure from observational data. Due to a lack of real-world data meeting the assumptions, synthetic ANM data are often used to evaluate causal discovery algorithms. Reisach et al. (2021) show that, for common simulation parameters, a variable ordering by increasing variance is closely aligned with a causal order and introduce var-sortability to quantify the alignment. Here, we show that not only variance, but also the fraction of a variable's variance explained by all others, as captured by the coefficient of determination $R^2$, tends to increase along the causal order. Simple baseline algorithms can use $R^2$-sortability to match the performance of established methods. Since $R^2$-sortability is invariant under data rescaling, these algorithms perform equally well on standardized or rescaled data, addressing a key limitation of algorithms exploiting var-sortability. We characterize and empirically assess $R^2$-sortability for different simulation parameters. We show that all simulation parameters can affect $R^2$-sortability and must be chosen deliberately to control the difficulty of the causal discovery task and the real-world plausibility of the simulated data. We provide an implementation of the sortability measures and sortability-based algorithms in our library CausalDisco (//github.com/CausalDisco/CausalDisco).
Without writing a single line of code by a human, an example Monte Carlo simulation based application for stochastic dependence modeling with copulas is developed using a state-of-the-art large language model (LLM) fine-tuned for conversations. This includes interaction with ChatGPT in natural language and using mathematical formalism, which, under careful supervision by a human-expert, led to producing a working code in MATLAB, Python and R for sampling from a given copula model, evaluation of the model's density, performing maximum likelihood estimation, optimizing the code for parallel computing for CPUs as well as for GPUs, and visualization of the computed results. In contrast to other emerging studies that assess the accuracy of LLMs like ChatGPT on tasks from a selected area, this work rather investigates ways how to achieve a successful solution of a standard statistical task in a collaboration of a human-expert and artificial intelligence (AI). Particularly, through careful prompt engineering, we separate successful solutions generated by ChatGPT from unsuccessful ones, resulting in a comprehensive list of related pros and cons. It is demonstrated that if the typical pitfalls are avoided, we can substantially benefit from collaborating with an AI partner. For example, we show that if ChatGPT is not able to provide a correct solution due to a lack of or incorrect knowledge, the human-expert can feed it with the correct knowledge, e.g., in the form of mathematical theorems and formulas, and make it to apply the gained knowledge in order to provide a solution that is correct. Such ability presents an attractive opportunity to achieve a programmed solution even for users with rather limited knowledge of programming techniques.
To realize reliable quantum software, techniques to automatically ensure the quantum software's correctness have recently been investigated. However, they primarily focus on fixed quantum circuits rather than the procedure of building quantum circuits. Despite being a common approach, the correctness of building circuits using different parameters following the same procedure is not guaranteed. To this end, we propose a design-by-contract framework for quantum software. Our framework provides a python-embedded language to write assertions on the input and output states of all quantum circuits built by certain procedures. Additionally, it provides a method to write assertions about the statistical processing of measurement results to ensure the procedure's correctness for obtaining the final result. These assertions are automatically checked using a quantum computer simulator. For evaluation, we implemented our framework and wrote assertions for some widely used quantum algorithms. Consequently, we found that our framework has sufficient expressive power to verify the whole procedure of quantum software.
While code-mixing is a common linguistic practice in many parts of the world, collecting high-quality and low-cost code-mixed data remains a challenge for natural language processing (NLP) research. The proliferation of Large Language Models (LLMs) in recent times compels one to ask: can these systems be used for data generation? In this article, we explore prompting multilingual LLMs in a zero-shot manner to create code-mixed data for five languages in South East Asia (SEA) -- Indonesian, Malay, Chinese, Tagalog, Vietnamese, as well as the creole language Singlish. We find that ChatGPT shows the most potential, capable of producing code-mixed text 68% of the time when the term "code-mixing" is explicitly defined. Moreover, both ChatGPT's and InstructGPT's (davinci-003) performances in generating Singlish texts are noteworthy, averaging a 96% success rate across a variety of prompts. Their code-mixing proficiency, however, is dampened by word choice errors that lead to semantic inaccuracies. Other multilingual models such as BLOOMZ and Flan-T5-XXL are unable to produce code-mixed texts altogether. By highlighting the limited promises of LLMs in a specific form of low-resource data generation, we call for a measured approach when applying similar techniques to other data-scarce NLP contexts.
Globalization has introduced many new challenges making Supply chain management (SCM) complex and huge, for which improvement is needed in many industries. The Internet of Things (IoT) has solved many problems by providing security and traceability with a promising solution for supply chain management. SCM is segregated into different processes, each requiring different types of solutions. IoT devices can solve distributed system problems by creating trustful relationships. Since the whole business industry depends on the trust between different supply chain actors, IoT can provide this trust by making the entire ecosystem much more secure, reliable, and traceable. This paper will discuss how IoT technology has solved problems related to SCM in different areas. Supply chains in different industries, from pharmaceuticals to agriculture supply chain, have different issues and require different solutions. We will discuss problems such as security, tracking, traceability, and warehouse issues. All challenges faced by independent industries regarding the supply chain and how the amalgamation of IoT with other technology will be provided with solutions.
Reasoning is a fundamental aspect of human intelligence that plays a crucial role in activities such as problem solving, decision making, and critical thinking. In recent years, large language models (LLMs) have made significant progress in natural language processing, and there is observation that these models may exhibit reasoning abilities when they are sufficiently large. However, it is not yet clear to what extent LLMs are capable of reasoning. This paper provides a comprehensive overview of the current state of knowledge on reasoning in LLMs, including techniques for improving and eliciting reasoning in these models, methods and benchmarks for evaluating reasoning abilities, findings and implications of previous research in this field, and suggestions on future directions. Our aim is to provide a detailed and up-to-date review of this topic and stimulate meaningful discussion and future work.
Knowledge graph reasoning (KGR), aiming to deduce new facts from existing facts based on mined logic rules underlying knowledge graphs (KGs), has become a fast-growing research direction. It has been proven to significantly benefit the usage of KGs in many AI applications, such as question answering and recommendation systems, etc. According to the graph types, the existing KGR models can be roughly divided into three categories, \textit{i.e.,} static models, temporal models, and multi-modal models. The early works in this domain mainly focus on static KGR and tend to directly apply general knowledge graph embedding models to the reasoning task. However, these models are not suitable for more complex but practical tasks, such as inductive static KGR, temporal KGR, and multi-modal KGR. To this end, multiple works have been developed recently, but no survey papers and open-source repositories comprehensively summarize and discuss models in this important direction. To fill the gap, we conduct a survey for knowledge graph reasoning tracing from static to temporal and then to multi-modal KGs. Concretely, the preliminaries, summaries of KGR models, and typical datasets are introduced and discussed consequently. Moreover, we discuss the challenges and potential opportunities. The corresponding open-source repository is shared on GitHub: //github.com/LIANGKE23/Awesome-Knowledge-Graph-Reasoning.
Causal discovery and causal reasoning are classically treated as separate and consecutive tasks: one first infers the causal graph, and then uses it to estimate causal effects of interventions. However, such a two-stage approach is uneconomical, especially in terms of actively collected interventional data, since the causal query of interest may not require a fully-specified causal model. From a Bayesian perspective, it is also unnatural, since a causal query (e.g., the causal graph or some causal effect) can be viewed as a latent quantity subject to posterior inference -- other unobserved quantities that are not of direct interest (e.g., the full causal model) ought to be marginalized out in this process and contribute to our epistemic uncertainty. In this work, we propose Active Bayesian Causal Inference (ABCI), a fully-Bayesian active learning framework for integrated causal discovery and reasoning, which jointly infers a posterior over causal models and queries of interest. In our approach to ABCI, we focus on the class of causally-sufficient, nonlinear additive noise models, which we model using Gaussian processes. We sequentially design experiments that are maximally informative about our target causal query, collect the corresponding interventional data, and update our beliefs to choose the next experiment. Through simulations, we demonstrate that our approach is more data-efficient than several baselines that only focus on learning the full causal graph. This allows us to accurately learn downstream causal queries from fewer samples while providing well-calibrated uncertainty estimates for the quantities of interest.
In the past few decades, artificial intelligence (AI) technology has experienced swift developments, changing everyone's daily life and profoundly altering the course of human society. The intention of developing AI is to benefit humans, by reducing human labor, bringing everyday convenience to human lives, and promoting social good. However, recent research and AI applications show that AI can cause unintentional harm to humans, such as making unreliable decisions in safety-critical scenarios or undermining fairness by inadvertently discriminating against one group. Thus, trustworthy AI has attracted immense attention recently, which requires careful consideration to avoid the adverse effects that AI may bring to humans, so that humans can fully trust and live in harmony with AI technologies. Recent years have witnessed a tremendous amount of research on trustworthy AI. In this survey, we present a comprehensive survey of trustworthy AI from a computational perspective, to help readers understand the latest technologies for achieving trustworthy AI. Trustworthy AI is a large and complex area, involving various dimensions. In this work, we focus on six of the most crucial dimensions in achieving trustworthy AI: (i) Safety & Robustness, (ii) Non-discrimination & Fairness, (iii) Explainability, (iv) Privacy, (v) Accountability & Auditability, and (vi) Environmental Well-Being. For each dimension, we review the recent related technologies according to a taxonomy and summarize their applications in real-world systems. We also discuss the accordant and conflicting interactions among different dimensions and discuss potential aspects for trustworthy AI to investigate in the future.