Dreams of machines that rival human intelligence have shaped the field of AI since its inception. Yet there remains no agreed-upon conception of what human-level AI or artificial general intelligence (AGI) means. We investigate key social, political, and ethical assumptions made by influential conceptions of AGI and human-level AI. We then draw on feminist, STS, and social science scholarship on the political and social character of intelligence in both humans and machines to defend a pluralistic, democratic, and participatory conception of the topic. We argue that framing AGI or human-level AI as a technical or value-neutral topic leads to political, ethical, and epistemic harm. AGI should not be developed without explicit attention to the values they encode, the people they include or exclude, and a view toward epistemic justice.
Unfair behaviors of Machine Learning (ML) software have garnered increasing attention and concern among software engineers. To tackle this issue, extensive research has been dedicated to conducting fairness testing of ML software, and this paper offers a comprehensive survey of existing studies in this field. We collect 100 papers and organize them based on the testing workflow (i.e., how to test) and testing components (i.e., what to test). Furthermore, we analyze the research focus, trends, and promising directions in the realm of fairness testing. We also identify widely-adopted datasets and open-source tools for fairness testing.
Large language models (LLMs) are recognized as systems that closely mimic aspects of human intelligence. This capability has attracted attention from the social science community, who see the potential in leveraging LLMs to replace human participants in experiments, thereby reducing research costs and complexity. In this paper, we introduce a framework for large language models personification, including a strategy for constructing virtual characters' life stories from the ground up, a Multi-Agent Cognitive Mechanism capable of simulating human cognitive processes, and a psychology-guided evaluation method to assess human simulations from both self and observational perspectives. Experimental results demonstrate that our constructed simulacra can produce personified responses that align with their target characters. Our work is a preliminary exploration which offers great potential in practical applications. All the code and datasets will be released, with the hope of inspiring further investigations.
Quantum computing promises to help humanity solve problems that would otherwise be intractable on classical computers. Unlike today's machines, quantum computers use a novel computing process that leverages the foundational quantum mechanical laws of nature. This unlocks unparalleled compute power for certain applications and promises to help solve some of our generation's gravest challenges, including the climate crisis, food insecurity, and widespread disease. No one entity will be able to realize this end state alone. Developing a fault-tolerant quantum supercomputer and a vibrant ecosystem around it will require deep partnerships between industry, governments, and academia. It will also require collective action to enable and promote positive applications of quantum computing and ensure that the safe and responsible use of the technology is at the center of its development and deployment. Achieving these objectives will require focusing on three priorities: 1. Impact. Ensure quantum computing benefits all of humankind by developing quantum solutions to solve critical, global problems. 2. Use. Protect against malicious use by accelerating the deployment of quantum-safe cryptography and developing governance processes and controls for the responsible use of quantum machines. 3. Access. Democratize the potential for economic growth across all of society through skilling, workforce and ecosystem development, and digital infrastructure. This paper discusses each in turn.
Free-proxies have been widespread since the early days of the Web, helping users bypass geo-blocked content and conceal their IP addresses. Various proxy providers promise faster Internet or increased privacy while advertising their lists comprised of hundreds of readily available free proxies. However, while paid proxy services advertise the support of encrypted connections and high stability, free proxies often lack such guarantees, making them prone to malicious activities such as eavesdropping or modifying content. Furthermore, there is a market that encourages exploiting devices to install proxies. In this paper, we present a 30-month longitudinal study analyzing the stability, security, and potential manipulation of free web proxies that we collected from 11 providers. Our collection resulted in over 640,600 proxies, that we cumulatively tested daily. We find that only 34.5% of proxies were active at least once during our tests, showcasing the general instability of free proxies. Geographically, a majority of proxies originate from the US and China. Leveraging the Shodan search engine, we identified 4,452 distinct vulnerabilities on the proxies' IP addresses, including 1,755 vulnerabilities that allow unauthorized remote code execution and 2,036 that enable privilege escalation on the host device. Through the software analysis on the proxies' IP addresses, we find that 42,206 of them appear to run on MikroTik routers. Worryingly, we also discovered 16,923 proxies that manipulate content, indicating potential malicious intent by proxy owners. Ultimately, our research reveals that the use of free web proxies poses significant risks to users' privacy and security. The instability, vulnerabilities, and potential for malicious actions uncovered in our analysis lead us to strongly caution users against relying on free proxies.
The design space of current quantum computers is expansive with no obvious winning solution. This leaves practitioners with a clear question: "What is the optimal system configuration to run an algorithm?". This paper explores hardware design trade-offs across NISQ systems to guide algorithm and hardware design choices. The evaluation is driven by algorithmic workloads and algorithm fidelity models which capture architectural features such as gate expressivity, fidelity, and crosstalk. We also argue that the criteria for gate design and selection should be extended from maximizing average fidelity to a more comprehensive approach that takes into account the gate expressivity with respect to algorithmic structures. We consider native entangling gates (CNOT, ECR, CZ, ZZ, XX, Sycamore, $\sqrt{\text{iSWAP}}$), proposed gates (B Gate, $\sqrt[4]{\text{CNOT}}$, $\sqrt[8]{\text{CNOT}}$), as well as parameterized gates (FSim, XY). Our methodology is driven by a custom synthesis driven circuit compilation workflow, which is able to produce minimal circuit representations for a given system configuration. By providing a method to evaluate the suitability of algorithms for hardware platforms, this work emphasizes the importance of hardware-software co-design for quantum computing.
Chain-of-thought reasoning, a cognitive process fundamental to human intelligence, has garnered significant attention in the realm of artificial intelligence and natural language processing. However, there still remains a lack of a comprehensive survey for this arena. To this end, we take the first step and present a thorough survey of this research field carefully and widely. We use X-of-Thought to refer to Chain-of-Thought in a broad sense. In detail, we systematically organize the current research according to the taxonomies of methods, including XoT construction, XoT structure variants, and enhanced XoT. Additionally, we describe XoT with frontier applications, covering planning, tool use, and distillation. Furthermore, we address challenges and discuss some future directions, including faithfulness, multi-modal, and theory. We hope this survey serves as a valuable resource for researchers seeking to innovate within the domain of chain-of-thought reasoning.
Graph Neural Networks (GNNs) have been studied from the lens of expressive power and generalization. However, their optimization properties are less well understood. We take the first step towards analyzing GNN training by studying the gradient dynamics of GNNs. First, we analyze linearized GNNs and prove that despite the non-convexity of training, convergence to a global minimum at a linear rate is guaranteed under mild assumptions that we validate on real-world graphs. Second, we study what may affect the GNNs' training speed. Our results show that the training of GNNs is implicitly accelerated by skip connections, more depth, and/or a good label distribution. Empirical results confirm that our theoretical results for linearized GNNs align with the training behavior of nonlinear GNNs. Our results provide the first theoretical support for the success of GNNs with skip connections in terms of optimization, and suggest that deep GNNs with skip connections would be promising in practice.
Machine reading comprehension (MRC) aims to teach machines to read and comprehend human languages, which is a long-standing goal of natural language processing (NLP). With the burst of deep neural networks and the evolution of contextualized language models (CLMs), the research of MRC has experienced two significant breakthroughs. MRC and CLM, as a phenomenon, have a great impact on the NLP community. In this survey, we provide a comprehensive and comparative review on MRC covering overall research topics about 1) the origin and development of MRC and CLM, with a particular focus on the role of CLMs; 2) the impact of MRC and CLM to the NLP community; 3) the definition, datasets, and evaluation of MRC; 4) general MRC architecture and technical methods in the view of two-stage Encoder-Decoder solving architecture from the insights of the cognitive process of humans; 5) previous highlights, emerging topics, and our empirical analysis, among which we especially focus on what works in different periods of MRC researches. We propose a full-view categorization and new taxonomies on these topics. The primary views we have arrived at are that 1) MRC boosts the progress from language processing to understanding; 2) the rapid improvement of MRC systems greatly benefits from the development of CLMs; 3) the theme of MRC is gradually moving from shallow text matching to cognitive reasoning.
Commonsense knowledge and commonsense reasoning are some of the main bottlenecks in machine intelligence. In the NLP community, many benchmark datasets and tasks have been created to address commonsense reasoning for language understanding. These tasks are designed to assess machines' ability to acquire and learn commonsense knowledge in order to reason and understand natural language text. As these tasks become instrumental and a driving force for commonsense research, this paper aims to provide an overview of existing tasks and benchmarks, knowledge resources, and learning and inference approaches toward commonsense reasoning for natural language understanding. Through this, our goal is to support a better understanding of the state of the art, its limitations, and future challenges.
Many natural language processing tasks solely rely on sparse dependencies between a few tokens in a sentence. Soft attention mechanisms show promising performance in modeling local/global dependencies by soft probabilities between every two tokens, but they are not effective and efficient when applied to long sentences. By contrast, hard attention mechanisms directly select a subset of tokens but are difficult and inefficient to train due to their combinatorial nature. In this paper, we integrate both soft and hard attention into one context fusion model, "reinforced self-attention (ReSA)", for the mutual benefit of each other. In ReSA, a hard attention trims a sequence for a soft self-attention to process, while the soft attention feeds reward signals back to facilitate the training of the hard one. For this purpose, we develop a novel hard attention called "reinforced sequence sampling (RSS)", selecting tokens in parallel and trained via policy gradient. Using two RSS modules, ReSA efficiently extracts the sparse dependencies between each pair of selected tokens. We finally propose an RNN/CNN-free sentence-encoding model, "reinforced self-attention network (ReSAN)", solely based on ReSA. It achieves state-of-the-art performance on both Stanford Natural Language Inference (SNLI) and Sentences Involving Compositional Knowledge (SICK) datasets.