Training advanced AI models requires large investments in computational resources, or compute. Yet, as hardware innovation reduces the price of compute and algorithmic advances make its use more efficient, the cost of training an AI model to a given level of performance falls over time, a trend we describe as increasing compute efficiency. We find that while an access effect increases the number of actors who can train models to a given performance over time, a performance effect simultaneously increases the performance available to each actor. This potentially enables large compute investors to pioneer new capabilities and maintain a performance advantage even as capabilities diffuse. Since large compute investors tend to develop new capabilities first, it will be particularly important that they share information about their AI models, evaluate them for emerging risks, and, more generally, make responsible development and release decisions. Further, as compute efficiency increases, governments will need to prepare for a world where dangerous AI capabilities are widely available, for instance by developing defenses against harmful AI models or by actively intervening in the diffusion of particularly dangerous capabilities.
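To make the two effects concrete, here is a toy calculation (all numbers are hypothetical and purely illustrative, not taken from the paper): if the cost of effective compute halves on a fixed schedule, the budget needed to reach a fixed capability level falls over time (access effect), while the capability reachable with a fixed budget rises (performance effect).

```python
# Toy illustration of the access and performance effects (hypothetical numbers).
# Assume the cost of a unit of "effective compute" halves every 2 years, and that
# performance grows with the log of effective compute (a stand-in scaling law).
import math

halving_period_years = 2.0       # hypothetical efficiency doubling time
cost_per_unit_year0 = 1.0        # normalized cost of one compute unit at year 0

def cost_per_unit(year):
    """Cost of one unit of effective compute after `year` years."""
    return cost_per_unit_year0 * 0.5 ** (year / halving_period_years)

def performance(effective_compute):
    """Hypothetical scaling law: performance grows logarithmically with compute."""
    return math.log10(effective_compute)

fixed_budget = 1_000.0           # an actor with a constant budget
target_compute = 1_000.0         # compute needed for a fixed capability level

for year in (0, 2, 4, 6):
    c = cost_per_unit(year)
    budget_needed = target_compute * c          # access effect: keeps shrinking
    perf = performance(fixed_budget / c)        # performance effect: keeps rising
    print(f"year {year}: budget for fixed capability = {budget_needed:7.1f}, "
          f"performance at fixed budget = {perf:.2f}")
```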
While much work has been done recently in the realm of model-based control of soft robots and soft-rigid hybrids, most works examine robots that have an inherently serial structure. Although these systems have been prevalent in the literature, there is an increasing trend toward designing soft-rigid hybrids with intrinsically coupled elasticity between various degrees of freedom. In this work, we address the problems of modeling and controlling such structures, particularly when they are underactuated. We introduce several simple models of elastic coupling typical of those seen in these systems. We then propose a controller that compensates for the elasticity, and we prove its stability with Lyapunov methods without relying on the elastic dominance assumption. This controller is applicable to the general class of underactuated soft robots. After evaluating the controller in simulated cases, we develop a simple hardware platform to evaluate both the models and the controller. Finally, using the hardware, we demonstrate a novel use case for underactuated, elastically coupled systems in "sensorless" force control.
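As a minimal illustration of elasticity compensation (a two-mass toy system, not the paper's model, controller, or stability proof), the sketch below cancels the coupling force acting on the actuated coordinate so that a simple PD law can regulate it despite the attached unactuated mass.

```python
# Toy example: one actuated mass q1 elastically coupled to an unactuated mass q2.
# The controller cancels the coupling force it would otherwise have to fight
# (elasticity compensation) and applies PD feedback on the actuated coordinate.
m1, m2 = 1.0, 0.5        # masses of actuated / unactuated coordinates (hypothetical)
k, d = 20.0, 0.5         # coupling stiffness and viscous damping (hypothetical)
Kp, Kd = 25.0, 10.0      # PD gains on the actuated coordinate
q1_des = 1.0             # regulation target for the actuated coordinate

q1 = q1d = q2 = q2d = 0.0
dt = 1e-3
for _ in range(int(5.0 / dt)):
    elastic = k * (q1 - q2)                      # coupling force between the masses
    u = elastic + Kp * (q1_des - q1) - Kd * q1d  # PD + elasticity compensation
    q1dd = (u - elastic - d * q1d) / m1
    q2dd = (elastic - d * q2d) / m2
    q1 += q1d * dt; q1d += q1dd * dt             # explicit Euler integration
    q2 += q2d * dt; q2d += q2dd * dt

print(f"q1 = {q1:.3f} (target {q1_des}), q2 = {q2:.3f}")
```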
This paper makes the case that a powerful new discipline, which we term perception engineering, is steadily emerging. It follows from a progression of ideas that involve creating illusions, from historical paintings and film to video games and virtual reality in modern times. Rather than creating physical artifacts such as bridges, airplanes, or computers, perception engineers create illusory perceptual experiences. The scope is defined over any agent that interacts with the physical world, including both biological organisms (humans, animals) and engineered systems (robots, autonomous systems). The key idea is that an agent, called a producer, alters the environment with the intent to alter the perceptual experience of another agent, called a receiver. Most importantly, the paper introduces a precise mathematical formulation of this process, based on the von Neumann-Morgenstern notion of information, to help scope and define the discipline. It is then applied to the cases of engineered and biological agents, with discussion of its implications for existing fields such as virtual reality, robotics, and even social media. Finally, open challenges and opportunities for involvement are identified.
The development of deep learning architectures is a resource-demanding process, due to a vast design space, long prototyping times, and the high compute costs associated with at-scale model training and evaluation. We set out to simplify this process by grounding it in an end-to-end mechanistic architecture design (MAD) pipeline, encompassing small-scale capability unit tests predictive of scaling laws. Through a suite of synthetic token manipulation tasks such as compression and recall, designed to probe capabilities, we identify and test new hybrid architectures constructed from a variety of computational primitives. We experimentally validate the resulting architectures via an extensive compute-optimal and a new state-optimal scaling law analysis, training over 500 language models from 70M to 7B parameters. Surprisingly, we find MAD synthetics to correlate with compute-optimal perplexity, enabling accurate evaluation of new architectures via isolated proxy tasks. The new architectures found via MAD, based on simple ideas such as hybridization and sparsity, outperform state-of-the-art Transformer, convolutional, and recurrent architectures (Transformer++, Hyena, Mamba) in scaling, both at compute-optimal budgets and in overtrained regimes. Overall, these results provide evidence that performance on curated synthetic tasks can be predictive of scaling laws, and that an optimal architecture should leverage specialized layers via a hybrid topology.
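To give a flavor of the synthetic token manipulation tasks mentioned above (the task definition here is an assumption for illustration, not the paper's exact benchmark), an associative-recall unit test asks a model to return the value that was paired with a queried key earlier in the sequence:

```python
# Illustrative associative-recall synthetic task of the kind used to unit-test
# architectures (assumed setup, not the paper's exact benchmark definition).
import random

def make_recall_example(num_pairs=8, vocab_size=64, seed=None):
    """Build one key-value recall sequence and its target.

    The input is [k1, v1, k2, v2, ..., kN, vN, query_key]; the label is the
    value that was paired with query_key earlier in the sequence.
    """
    rng = random.Random(seed)
    keys = rng.sample(range(vocab_size), num_pairs)           # distinct keys
    values = [rng.randrange(vocab_size) for _ in range(num_pairs)]
    tokens = [t for kv in zip(keys, values) for t in kv]      # interleave k, v
    query = rng.choice(keys)
    target = values[keys.index(query)]
    return tokens + [query], target

inputs, label = make_recall_example(seed=0)
print("input tokens:", inputs)
print("target value:", label)
```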
What do artificial neural networks (ANNs) learn? The machine learning (ML) community shares the narrative that ANNs must develop abstract human concepts to perform complex tasks. Some go even further and believe that these concepts are stored in individual units of the network. Based on current research, I systematically investigate the assumptions underlying this narrative. I conclude that ANNs are indeed capable of performing complex prediction tasks, and that they may learn human and non-human concepts to do so. However, evidence indicates that ANNs do not represent these concepts in individual units.
Modern hardware designs have grown increasingly efficient and complex. However, they are often susceptible to Common Weakness Enumerations (CWEs). This paper focuses on the formal verification of CWEs in a dataset of hardware designs written in SystemVerilog and produced by generative Artificial Intelligence (AI) powered by Large Language Models (LLMs). We applied formal verification to categorize each hardware design as vulnerable or CWE-free. The dataset was generated by 4 different LLMs and features a unique set of designs for each of the 10 CWEs we target in our paper. We associated the identified vulnerabilities with CWE numbers for a dataset of 60,000 generated SystemVerilog Register Transfer Level (RTL) code samples. We also found that most LLMs are not aware of hardware CWEs, so these weaknesses are usually not considered when generating hardware code. Our study reveals that approximately 60% of the hardware designs generated by LLMs are prone to CWEs, posing potential safety and security risks. The dataset could be ideal for training LLMs and Machine Learning (ML) algorithms to abstain from generating CWE-prone hardware designs.
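A sketch of how such a labeling pass over generated RTL could be organized is shown below; the run_cwe_check helper, the CWE list, and the file layout are hypothetical placeholders standing in for whichever formal verification flow and properties are actually used.

```python
# Hypothetical sketch of labeling LLM-generated RTL files as vulnerable or CWE-free.
# `run_cwe_check` is a placeholder for a real formal verification flow, not a tool API.
from pathlib import Path
import csv

TARGET_CWES = ["CWE-1231", "CWE-1245", "CWE-1271"]  # example hardware CWE IDs

def run_cwe_check(rtl_path: Path, cwe_id: str) -> bool:
    """Placeholder: return True if the formal check finds the given weakness."""
    raise NotImplementedError("hook up the formal verification tool here")

def label_dataset(rtl_dir: Path, out_csv: Path) -> None:
    """Run every targeted CWE check on every design and record the verdicts."""
    with out_csv.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["design", "label", "cwes_found"])
        for rtl in sorted(rtl_dir.glob("*.sv")):
            found = [c for c in TARGET_CWES if run_cwe_check(rtl, c)]
            writer.writerow([rtl.name,
                             "vulnerable" if found else "cwe_free",
                             ";".join(found)])
```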
Topic modelling, as a well-established unsupervised technique, has found extensive use in automatically detecting significant topics within a corpus of documents. However, classic topic modelling approaches (e.g., LDA) have certain drawbacks, such as the lack of semantic understanding and the presence of overlapping topics. In this work, we investigate the untapped potential of large language models (LLMs) as an alternative for uncovering the underlying topics within extensive text corpora. To this end, we introduce a framework that prompts LLMs to generate topics from a given set of documents and establish evaluation protocols to assess the clustering efficacy of LLMs. Our findings indicate that LLMs with appropriate prompts can stand out as a viable alternative, capable of generating relevant topic titles and adhering to human guidelines to refine and merge topics. Through in-depth experiments and evaluation, we summarise the advantages and constraints of employing LLMs in topic extraction.
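A minimal sketch of the prompting side of such a framework might look as follows (the prompt wording is illustrative and the complete callback is a placeholder for whichever LLM API is used, not the paper's actual protocol):

```python
# Minimal sketch of prompting an LLM for topic extraction (illustrative only;
# `complete` is a placeholder for whichever LLM API is actually used).
from typing import Callable, List

def build_topic_prompt(documents: List[str], max_topics: int = 10) -> str:
    """Assemble a single prompt asking the model to name the topics in a corpus."""
    joined = "\n".join(f"- {doc[:300]}" for doc in documents)  # truncate long docs
    return (
        f"Below are {len(documents)} documents. Identify at most {max_topics} "
        "distinct topics that cover them, one short topic title per line.\n\n"
        f"{joined}\n\nTopics:"
    )

def extract_topics(documents: List[str], complete: Callable[[str], str]) -> List[str]:
    """Send the prompt to an LLM and parse one topic title per output line."""
    response = complete(build_topic_prompt(documents))
    return [line.strip("- ").strip() for line in response.splitlines() if line.strip()]
```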
Autonomous systems, including generative AI, have been adopted faster than previous digital innovations. Their impact on society may well be more profound, radically restructuring the economy of knowledge and carrying dramatic consequences for social and institutional balances. Different attitudes toward controlling these systems have emerged, rooted in the classical pillars of legal systems, proprietary rights, and social responsibility. We show how an illusion of control might be guiding governments and regulators, while autonomous systems might be driving us toward inescapable delusion.
The Metaverse aims to construct a large, unified, immersive, and shared digital realm by combining various technologies, such as XR (extended reality), blockchain, and digital twins. This article explores the Metaverse from the perspective of multimedia communication by conducting and analyzing real-world experiments on four different Metaverse platforms: VR (virtual reality) Vircadia, VR Mozilla Hubs, VRChat, and MR (mixed reality) Virtual City. We first investigate the traffic patterns and network performance of the three VR platforms. After identifying the challenges of Metaverse streaming and investigating potential methods to enhance Metaverse performance, we propose a remote rendering architecture and verify its advantages over local rendering through a prototype involving a campus network and MR multimodal interaction.
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations). Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities, and their effectiveness across so many tasks incentivizes homogenization. Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature.
Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy-, computation-, and memory-intensive. This impedes the deployment of large DNNs in low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirement, energy consumption, and number of operations without significantly decreasing accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically with regard to inference, and discusses methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. For each category, we analyze the accuracy, advantages, and disadvantages of the techniques, along with potential solutions to their problems. We also discuss new evaluation metrics as a guideline for future research.
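As one concrete instance of the first category, post-training dynamic quantization in PyTorch swaps selected floating-point layers for int8 equivalents (a minimal sketch; the two-layer model below is only a stand-in for a trained network):

```python
# Minimal example of parameter quantization (category 1): dynamic int8 quantization
# of the linear layers in a small PyTorch model.
import torch
import torch.nn as nn

model = nn.Sequential(          # stand-in model; any trained network would do
    nn.Linear(128, 256), nn.ReLU(),
    nn.Linear(256, 10),
)

quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8   # quantize only the Linear layers
)

x = torch.randn(1, 128)
print(quantized(x).shape)       # same interface, smaller weights, int8 matmuls
```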