We discuss the adequacy of tests for intelligent systems and practical problems raised by their implementation. We propose the replacement test as the ability of a system to replace successfully another system performing a task in a given context. We show how it can characterize salient aspects of human intelligence that cannot be taken into account by the Turing test. We argue that building intelligent systems passing the replacement test involves a series of technical problems that are outside the scope of current AI. We present a framework for implementing the proposed test and validating the properties of the intelligent systems. We discuss the inherent limitations of intelligent system validation and advocate new theoretical foundations for extending existing rigorous test methods. We suggest that the replacement test, based on the complementarity of skills between human and machine, can lead to a multitude of intelligence concepts reflecting the ability to combine data-based and symbolic knowledge to varying degrees.
Explainable Artificial Intelligence (XAI) plays a crucial role in enabling human understanding and trust in deep learning systems, often defined as determining which features are most important to a model's prediction. As models get larger, more ubiquitous, and pervasive in aspects of daily life, explainability is necessary to avoid or minimize adverse effects of model mistakes. Unfortunately, current approaches in human-centric XAI (e.g. predictive tasks in healthcare, education, or personalized ads) tend to rely on a single explainer. This is a particularly concerning trend when considering that recent work has identified systematic disagreement in explainability methods when applied to the same points and underlying black-box models. In this paper, we therefore present a call for action to address the limitations of current state-of-the-art explainers. We propose to shift from post-hoc explainability to designing interpretable neural network architectures; moving away from approximation techniques in human-centric and high impact applications. We identify five needs of human-centric XAI (real-time, accurate, actionable, human-interpretable, and consistent) and propose two schemes for interpretable-by-design neural network workflows (adaptive routing for interpretable conditional computation and diagnostic benchmarks for iterative model learning). We postulate that the future of human-centric XAI is neither in explaining black-boxes nor in reverting to traditional, interpretable models, but in neural networks that are intrinsically interpretable.
This work reviews the human auditory system, elucidating some of the specialized mechanisms and non-linear pathways along the chain of events between physical sound and its perception. Customary relationships between frequency, time, and phase--such as the uncertainty principle--that hold for linear systems, do not apply straightforwardly to the hearing process. Auditory temporal resolution for certain processes can be a hundredth of the period of the signal, and can extend down to the microseconds time scale. The astonishingly large number of variations that correspond to the neural excitation pattern of 30000 auditory nerve fibers, originating from 3500 inner hair cells, explicates the vast capacity of the auditory system for the resolution of sonic detail. And the ear is sensitive enough to detect a basilar-membrane amplitude at the level of a picometer, or about a hundred times smaller than an atom. This article surveys and provides new insights into some of the impressive capabilities of the human auditory system and explores their relationship to fidelity in reproduced sound.
Multi-agent reinforcement learning (MARL) is a widely used Artificial Intelligence (AI) technique. However, current studies and applications need to address its scalability, non-stationarity, and trustworthiness. This paper aims to review methods and applications and point out research trends and visionary prospects for the next decade. First, this paper summarizes the basic methods and application scenarios of MARL. Second, this paper outlines the corresponding research methods and their limitations on safety, robustness, generalization, and ethical constraints that need to be addressed in the practical applications of MARL. In particular, we believe that trustworthy MARL will become a hot research topic in the next decade. In addition, we suggest that considering human interaction is essential for the practical application of MARL in various societies. Therefore, this paper also analyzes the challenges while MARL is applied to human-machine interaction.
Automated Driving Systems (ADS) have made great achievements in recent years thanks to the efforts from both academia and industry. A typical ADS is composed of multiple modules, including sensing, perception, planning and control, which brings together the latest advances in multiple domains. Despite these achievements, safety assurance of the systems is still of great significance, since the unsafe behavior of ADS can bring catastrophic consequences and unacceptable economic and social losses. Testing is an important approach to system validation for the deployment in practice; in the context of ADS, it is extremely challenging, due to the system complexity and multidisciplinarity. There has been a great deal of literature that focuses on the testing of ADS, and a number of surveys have also emerged to summarize the technical advances. However, most of these surveys focus on the system-level testing that is performed within software simulators, and thereby ignore the distinct features of individual modules. In this paper, we provide a comprehensive survey on the existing ADS testing literature, which takes into account both module-level and system-level testing. Specifically, we make the following contributions: (1) we build a threat model that reveals the potential safety threats for each module of an ADS; (2) we survey the module-level testing techniques for ADS and highlight the technical differences affected by the properties of the modules; (3) we also survey the system-level testing techniques, but we focus on empirical studies that take a bird's-eye view on the system, the problems due to the collaborations between modules, and the gaps between ADS testing in simulators and real world; (4) we identify the challenges and opportunities in ADS testing, which facilitates the future research in this field.
Along with the massive growth of the Internet from the 1990s until now, various innovative technologies have been created to bring users breathtaking experiences with more virtual interactions in cyberspace. Many virtual environments with thousands of services and applications, from social networks to virtual gaming worlds, have been developed with immersive experience and digital transformation, but most are incoherent instead of being integrated into a platform. In this context, metaverse, a term formed by combining meta and universe, has been introduced as a shared virtual world that is fueled by many emerging technologies, such as fifth-generation networks and beyond, virtual reality, and artificial intelligence (AI). Among such technologies, AI has shown the great importance of processing big data to enhance immersive experience and enable human-like intelligence of virtual agents. In this survey, we make a beneficial effort to explore the role of AI in the foundation and development of the metaverse. We first deliver a preliminary of AI, including machine learning algorithms and deep learning architectures, and its role in the metaverse. We then convey a comprehensive investigation of AI-based methods concerning six technical aspects that have potentials for the metaverse: natural language processing, machine vision, blockchain, networking, digital twin, and neural interface, and being potential for the metaverse. Subsequently, several AI-aided applications, such as healthcare, manufacturing, smart cities, and gaming, are studied to be deployed in the virtual worlds. Finally, we conclude the key contribution of this survey and open some future research directions in AI for the metaverse.
The combination of Reinforcement Learning (RL) with deep learning has led to a series of impressive feats, with many believing (deep) RL provides a path towards generally capable agents. However, the success of RL agents is often highly sensitive to design choices in the training process, which may require tedious and error-prone manual tuning. This makes it challenging to use RL for new problems, while also limits its full potential. In many other areas of machine learning, AutoML has shown it is possible to automate such design choices and has also yielded promising initial results when applied to RL. However, Automated Reinforcement Learning (AutoRL) involves not only standard applications of AutoML but also includes additional challenges unique to RL, that naturally produce a different set of methods. As such, AutoRL has been emerging as an important area of research in RL, providing promise in a variety of applications from RNA design to playing games such as Go. Given the diversity of methods and environments considered in RL, much of the research has been conducted in distinct subfields, ranging from meta-learning to evolution. In this survey we seek to unify the field of AutoRL, we provide a common taxonomy, discuss each area in detail and pose open problems which would be of interest to researchers going forward.
Games and simulators can be a valuable platform to execute complex multi-agent, multiplayer, imperfect information scenarios with significant parallels to military applications: multiple participants manage resources and make decisions that command assets to secure specific areas of a map or neutralize opposing forces. These characteristics have attracted the artificial intelligence (AI) community by supporting development of algorithms with complex benchmarks and the capability to rapidly iterate over new ideas. The success of artificial intelligence algorithms in real-time strategy games such as StarCraft II have also attracted the attention of the military research community aiming to explore similar techniques in military counterpart scenarios. Aiming to bridge the connection between games and military applications, this work discusses past and current efforts on how games and simulators, together with the artificial intelligence algorithms, have been adapted to simulate certain aspects of military missions and how they might impact the future battlefield. This paper also investigates how advances in virtual reality and visual augmentation systems open new possibilities in human interfaces with gaming platforms and their military parallels.
Artificial Intelligence (AI) is rapidly becoming integrated into military Command and Control (C2) systems as a strategic priority for many defence forces. The successful implementation of AI is promising to herald a significant leap in C2 agility through automation. However, realistic expectations need to be set on what AI can achieve in the foreseeable future. This paper will argue that AI could lead to a fragility trap, whereby the delegation of C2 functions to an AI could increase the fragility of C2, resulting in catastrophic strategic failures. This calls for a new framework for AI in C2 to avoid this trap. We will argue that antifragility along with agility should form the core design principles for AI-enabled C2 systems. This duality is termed Agile, Antifragile, AI-Enabled Command and Control (A3IC2). An A3IC2 system continuously improves its capacity to perform in the face of shocks and surprises through overcompensation from feedback during the C2 decision-making cycle. An A3IC2 system will not only be able to survive within a complex operational environment, it will also thrive, benefiting from the inevitable shocks and volatility of war.
Edge intelligence refers to a set of connected systems and devices for data collection, caching, processing, and analysis in locations close to where data is captured based on artificial intelligence. The aim of edge intelligence is to enhance the quality and speed of data processing and protect the privacy and security of the data. Although recently emerged, spanning the period from 2011 to now, this field of research has shown explosive growth over the past five years. In this paper, we present a thorough and comprehensive survey on the literature surrounding edge intelligence. We first identify four fundamental components of edge intelligence, namely edge caching, edge training, edge inference, and edge offloading, based on theoretical and practical results pertaining to proposed and deployed systems. We then aim for a systematic classification of the state of the solutions by examining research results and observations for each of the four components and present a taxonomy that includes practical problems, adopted techniques, and application goals. For each category, we elaborate, compare and analyse the literature from the perspectives of adopted techniques, objectives, performance, advantages and drawbacks, etc. This survey article provides a comprehensive introduction to edge intelligence and its application areas. In addition, we summarise the development of the emerging research field and the current state-of-the-art and discuss the important open issues and possible theoretical and technical solutions.
Deep learning has revolutionized speech recognition, image recognition, and natural language processing since 2010, each involving a single modality in the input signal. However, many applications in artificial intelligence involve more than one modality. It is therefore of broad interest to study the more difficult and complex problem of modeling and learning across multiple modalities. In this paper, a technical review of the models and learning methods for multimodal intelligence is provided. The main focus is the combination of vision and natural language, which has become an important area in both computer vision and natural language processing research communities. This review provides a comprehensive analysis of recent work on multimodal deep learning from three new angles - learning multimodal representations, the fusion of multimodal signals at various levels, and multimodal applications. On multimodal representation learning, we review the key concept of embedding, which unifies the multimodal signals into the same vector space and thus enables cross-modality signal processing. We also review the properties of the many types of embedding constructed and learned for general downstream tasks. On multimodal fusion, this review focuses on special architectures for the integration of the representation of unimodal signals for a particular task. On applications, selected areas of a broad interest in current literature are covered, including caption generation, text-to-image generation, and visual question answering. We believe this review can facilitate future studies in the emerging field of multimodal intelligence for the community.