Simultaneous localisation and mapping (SLAM) algorithms are commonly used in robotic systems for learning maps of novel environments. Brains also appear to learn maps, but the mechanisms are not known and it is unclear how to infer these maps from neural activity data. We present BrainSLAM; a method for performing SLAM using only population activity (local field potential, LFP) data simultaneously recorded from three brain regions in rats: hippocampus, prefrontal cortex, and parietal cortex. This system uses a convolutional neural network (CNN) to decode velocity and familiarity information from wavelet scalograms of neural local field potential data recorded from rats as they navigate a 2D maze. The CNN's output drives a RatSLAM-inspired architecture, powering an attractor network which performs path integration plus a separate system which performs `loop closure' (detecting previously visited locations and correcting map aliasing errors). Together, these three components can construct faithful representations of the environment while simultaneously tracking the animal's location. This is the first demonstration of inference of a spatial map from brain recordings. Our findings expand SLAM to a new modality, enabling a new method of mapping environments and facilitating a better understanding of the role of cognitive maps in navigation and decision making.
This paper explores the problem of planning for visual search without prior map information. We leverage the pixel-wise environment perception problem where one is given wide Field of View 2D scan data and must perform LiDAR segmentation to contextually label points in the surroundings. These pixel classifications provide an informed prior on which to plan next best viewpoints during visual search tasks. We present LIVES: LiDAR Informed Visual Search, a method aimed at finding objects of interest in unknown indoor environments. A map-generalizable classifier is trained from expert-data collected using a simple cart platform equipped with a map-based classifier. An autonomous exploration planner takes the contextual data from scans and uses that prior to plan viewpoints more likely to yield detection of the search target. In order to achieve this, we propose a utility function that accounts for traditional metrics like information gain and path cost and also for the additional contextual information from the scan classifier. LIVES is baselined against several existing exploration methods in simulation to verify its performance. Finally, it is validated in real-world experiments searching for single and multiple targets with a Spot robot in two unseen environments. Videos of experimental validation, implementation details and open source code can be found on our project website at //sites.google.com/view/lives-2024/home.
Deep reinforcement learning (DRL) algorithms require substantial samples and computational resources to achieve higher performance, which restricts their practical application and poses challenges for further development. Given the constraint of limited resources, it is essential to leverage existing computational work (e.g., learned policies, samples) to enhance sample efficiency and reduce the computational resource consumption of DRL algorithms. Previous works to leverage existing computational work require intrusive modifications to existing algorithms and models, designed specifically for specific algorithms, lacking flexibility and universality. In this paper, we present the Snapshot Reinforcement Learning (SnapshotRL) framework, which enhances sample efficiency by simply altering environments, without making any modifications to algorithms and models. By allowing student agents to choose states in teacher trajectories as the initial state to sample, SnapshotRL can effectively utilize teacher trajectories to assist student agents in training, allowing student agents to explore a larger state space at the early training phase. We propose a simple and effective SnapshotRL baseline algorithm, S3RL, which integrates well with existing DRL algorithms. Our experiments demonstrate that integrating S3RL with TD3, SAC, and PPO algorithms on the MuJoCo benchmark significantly improves sample efficiency and average return, without extra samples and additional computational resources.
Rapid advancements of large language models (LLMs) have enabled the processing, understanding, and generation of human-like text, with increasing integration into systems that touch our social sphere. Despite this success, these models can learn, perpetuate, and amplify harmful social biases. In this paper, we present a comprehensive survey of bias evaluation and mitigation techniques for LLMs. We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing, defining distinct facets of harm and introducing several desiderata to operationalize fairness for LLMs. We then unify the literature by proposing three intuitive taxonomies, two for bias evaluation, namely metrics and datasets, and one for mitigation. Our first taxonomy of metrics for bias evaluation disambiguates the relationship between metrics and evaluation datasets, and organizes metrics by the different levels at which they operate in a model: embeddings, probabilities, and generated text. Our second taxonomy of datasets for bias evaluation categorizes datasets by their structure as counterfactual inputs or prompts, and identifies the targeted harms and social groups; we also release a consolidation of publicly-available datasets for improved access. Our third taxonomy of techniques for bias mitigation classifies methods by their intervention during pre-processing, in-training, intra-processing, and post-processing, with granular subcategories that elucidate research trends. Finally, we identify open problems and challenges for future work. Synthesizing a wide range of recent research, we aim to provide a clear guide of the existing literature that empowers researchers and practitioners to better understand and prevent the propagation of bias in LLMs.
Quantum algorithms for tasks such as factorization, search, and simulation rely on control flow such as branching and iteration that depends on the value of data in superposition. High-level programming abstractions for control flow, such as switches, loops, and higher-order functions, are ubiquitous in classical languages. By contrast, many quantum languages do not provide high-level abstractions for control flow in superposition, and instead require the use of hardware-level logic gates to implement such control flow. The reason for this gap is that whereas a classical computer supports control flow using a program counter that can depend on data, the typical architecture of a quantum computer does not provide a program counter that can depend on data in superposition. As a result, the complete set of control flow abstractions that can be correctly realized on a quantum computer has not yet been established. In this work, we provide a complete characterization of the properties of control flow abstractions that are correctly realizable on a quantum computer. First, we prove that even on a quantum computer whose program counter exists in superposition, one cannot correctly realize control flow in quantum algorithms by lifting the classical conditional jump instruction to work in superposition. This theorem denies the ability to directly lift general abstractions for control flow such as the $\lambda$-calculus from classical to quantum programming. In response, we present the necessary and sufficient conditions for control flow to be correctly realizable on a quantum computer. We introduce the quantum control machine, an instruction set architecture featuring a conditional jump that is restricted to satisfy these conditions. We show how this design enables a developer to correctly express control flow in quantum algorithms using a program counter in place of logic gates.
We present a randomized, inverse-free algorithm for producing an approximate diagonalization of any $n \times n$ matrix pencil $(A,B)$. The bulk of the algorithm rests on a randomized divide-and-conquer eigensolver for the generalized eigenvalue problem originally proposed by Ballard, Demmel, and Dumitriu [Technical Report 2010]. We demonstrate that this divide-and-conquer approach can be formulated to succeed with high probability provided the input pencil is sufficiently well-behaved, which is accomplished by generalizing the recent pseudospectral shattering work of Banks, Garza-Vargas, Kulkarni, and Srivastava [Foundations of Computational Mathematics 2022]. In particular, we show that perturbing and scaling $(A,B)$ regularizes its pseudospectra, allowing divide-and-conquer to run over a simple random grid and in turn producing an accurate diagonalization of $(A,B)$ in the backward error sense. The main result of the paper states the existence of a randomized algorithm that with high probability (and in exact arithmetic) produces invertible $S,T$ and diagonal $D$ such that $||A - SDT^{-1}||_2 \leq \varepsilon$ and $||B - ST^{-1}||_2 \leq \varepsilon$ in at most $O \left(\log^2 \left( \frac{n}{\varepsilon} \right) T_{\text{MM}}(n) \right)$ operations, where $T_{\text{MM}}(n)$ is the asymptotic complexity of matrix multiplication. This not only provides a new set of guarantees for highly parallel generalized eigenvalue solvers but also establishes nearly matrix multiplication time as an upper bound on the complexity of inverse-free, exact arithmetic matrix pencil diagonalization.
A novel near-field transmission framework is proposed for dynamic metasurface antenna (DMA)-enabled non-orthogonal multiple access (NOMA) networks. The base station (BS) exploits the hybrid beamforming to communicate with multiple near users (NUs) and far users (FUs) using the NOMA principle. Based on this framework, two novel beamforming schemes are proposed. 1) For the case of the grouped users distributed in the same direction, a beam-steering scheme is developed. The metric of beam pattern error (BPE) is introduced for the characterization of the gap between the hybrid beamformers and the desired ideal beamformers, where a two-layer algorithm is proposed to minimize BPE by optimizing hybrid beamformers. Then, the optimal power allocation strategy is obtained to maximize the sum achievable rate of the network. 2) For the case of users randomly distributed, a beam-splitting scheme is proposed, where two sub-beamformers are extracted from the single beamformer to serve different users in the same group. An alternating optimization (AO) algorithm is proposed for hybrid beamformer optimization, and the optimal power allocation is also derived. Numerical results validate that: 1) the proposed beamforming schemes exhibit superior performance compared with the existing imperfect-resolution-based beamforming scheme; 2) the communication rate of the proposed transmission framework is sensitive to the imperfect distance knowledge of NUs but not to that of FUs.
Deep learning-based algorithms have seen a massive popularity in different areas of remote sensing image analysis over the past decade. Recently, transformers-based architectures, originally introduced in natural language processing, have pervaded computer vision field where the self-attention mechanism has been utilized as a replacement to the popular convolution operator for capturing long-range dependencies. Inspired by recent advances in computer vision, remote sensing community has also witnessed an increased exploration of vision transformers for a diverse set of tasks. Although a number of surveys have focused on transformers in computer vision in general, to the best of our knowledge we are the first to present a systematic review of recent advances based on transformers in remote sensing. Our survey covers more than 60 recent transformers-based methods for different remote sensing problems in sub-areas of remote sensing: very high-resolution (VHR), hyperspectral (HSI) and synthetic aperture radar (SAR) imagery. We conclude the survey by discussing different challenges and open issues of transformers in remote sensing. Additionally, we intend to frequently update and maintain the latest transformers in remote sensing papers with their respective code at: //github.com/VIROBO-15/Transformer-in-Remote-Sensing
Causal Machine Learning (CausalML) is an umbrella term for machine learning methods that formalize the data-generation process as a structural causal model (SCM). This allows one to reason about the effects of changes to this process (i.e., interventions) and what would have happened in hindsight (i.e., counterfactuals). We categorize work in \causalml into five groups according to the problems they tackle: (1) causal supervised learning, (2) causal generative modeling, (3) causal explanations, (4) causal fairness, (5) causal reinforcement learning. For each category, we systematically compare its methods and point out open problems. Further, we review modality-specific applications in computer vision, natural language processing, and graph representation learning. Finally, we provide an overview of causal benchmarks and a critical discussion of the state of this nascent field, including recommendations for future work.
The existence of representative datasets is a prerequisite of many successful artificial intelligence and machine learning models. However, the subsequent application of these models often involves scenarios that are inadequately represented in the data used for training. The reasons for this are manifold and range from time and cost constraints to ethical considerations. As a consequence, the reliable use of these models, especially in safety-critical applications, is a huge challenge. Leveraging additional, already existing sources of knowledge is key to overcome the limitations of purely data-driven approaches, and eventually to increase the generalization capability of these models. Furthermore, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-based models with existing knowledge. The identified approaches are structured according to the categories integration, extraction and conformity. Special attention is given to applications in the field of autonomous driving.
Geometric deep learning (GDL), which is based on neural network architectures that incorporate and process symmetry information, has emerged as a recent paradigm in artificial intelligence. GDL bears particular promise in molecular modeling applications, in which various molecular representations with different symmetry properties and levels of abstraction exist. This review provides a structured and harmonized overview of molecular GDL, highlighting its applications in drug discovery, chemical synthesis prediction, and quantum chemistry. Emphasis is placed on the relevance of the learned molecular features and their complementarity to well-established molecular descriptors. This review provides an overview of current challenges and opportunities, and presents a forecast of the future of GDL for molecular sciences.