Informative path planning algorithms are of paramount importance in applications like disaster management to efficiently gather information through a priori unknown environments. This is, however, a complex problem that involves finding a globally optimal path that gathers the maximum amount of information (e.g., the largest map with a minimum travelling distance) while using partial and uncertain local measurements. This paper addresses this problem by proposing a novel heuristic algorithm that continuously estimates the potential mapping gain for different sub-areas across the partially created map, and then uses these estimations to locally navigate the robot. Furthermore, this paper presents a novel algorithm to calculate a benchmark solution, where the map is a priori known to the planar, to evaluate the efficacy of the developed heuristic algorithm over different test scenarios. The findings indicate that the efficiency of the proposed algorithm, measured in terms of the mapped area per unit of travelling distance, ranges from 70% to 80% of the benchmark solution in various test scenarios. In essence, the algorithm demonstrates the capability to generate paths that come close to the globally optimal path provided by the benchmark solution.
We present a novel approach to cooperative aerial transportation through a team of drones, using optimal control theory and a hierarchical control strategy. We assume the drones are connected to the payload through rigid attachments, essentially transforming the whole system into a larger flying object with "thrust modules" at the attachment locations of the drones. We investigate the optimal arrangement of the thrust modules around the payload, so that the resulting system is robust to disturbances. We choose the $\mathcal{H}_2$ norm as a measure of robustness, and propose an iterative optimization routine to compute the optimal layout of the vehicles around the object. We experimentally validate our approach using four drones and comparing the disturbance rejection performances achieved by two different layouts (the optimal one and a sub-optimal one), and observe that the results match our predictions.
We propose a distributional framework for assessing socio-technical risks of foundation models with quantified statistical significance. Our approach hinges on a new statistical relative testing based on first and second order stochastic dominance of real random variables. We show that the second order statistics in this test are linked to mean-risk models commonly used in econometrics and mathematical finance to balance risk and utility when choosing between alternatives. Using this framework, we formally develop a risk-aware approach for foundation model selection given guardrails quantified by specified metrics. Inspired by portfolio optimization and selection theory in mathematical finance, we define a \emph{metrics portfolio} for each model as a means to aggregate a collection of metrics, and perform model selection based on the stochastic dominance of these portfolios. The statistical significance of our tests is backed theoretically by an asymptotic analysis via central limit theorems instantiated in practice via a bootstrap variance estimate. We use our framework to compare various large language models regarding risks related to drifting from instructions and outputting toxic content.
There are many evaluation strategies for term rewrite systems, but proving termination automatically is usually easiest for innermost rewriting. Several syntactic criteria exist when innermost termination implies full termination. We adapt these criteria to the probabilistic setting, e.g., we show when it suffices to analyze almost-sure termination (AST) w.r.t. innermost rewriting to prove full AST of probabilistic term rewrite systems (PTRSs). These criteria also apply to other notions of termination like positive AST. We implemented and evaluated our new contributions in the tool AProVE.
As large-scale training regimes have gained popularity, the use of pretrained models for downstream tasks has become common practice in machine learning. While pretraining has been shown to enhance the performance of models in practice, the transfer of robustness properties from pretraining to downstream tasks remains poorly understood. In this study, we demonstrate that the robustness of a linear predictor on downstream tasks can be constrained by the robustness of its underlying representation, regardless of the protocol used for pretraining. We prove (i) a bound on the loss that holds independent of any downstream task, as well as (ii) a criterion for robust classification in particular. We validate our theoretical results in practical applications, show how our results can be used for calibrating expectations of downstream robustness, and when our results are useful for optimal transfer learning. Taken together, our results offer an initial step towards characterizing the requirements of the representation function for reliable post-adaptation performance.
Image-based depth estimation has gained significant attention in recent research on computer vision for autonomous vehicles in intelligent transportation systems. This focus stems from its cost-effectiveness and wide range of potential applications. Unlike binocular depth estimation methods that require two fixed cameras, monocular depth estimation methods only rely on a single camera, making them highly versatile. While state-of-the-art approaches for this task leverage self-supervised learning of deep neural networks in conjunction with tasks like pose estimation and semantic segmentation, none of them have explored the combination of federated learning and self-supervision to train models using unlabeled and private data captured by autonomous vehicles. The utilization of federated learning offers notable benefits, including enhanced privacy protection, reduced network consumption, and improved resilience to connectivity issues. To address this gap, we propose FedSCDepth, a novel method that combines federated learning and deep self-supervision to enable the learning of monocular depth estimators with comparable effectiveness and superior efficiency compared to the current state-of-the-art methods. Our evaluation experiments conducted on Eigen's Split of the KITTI dataset demonstrate that our proposed method achieves near state-of-the-art performance, with a test loss below 0.13 and requiring, on average, only 1.5k training steps and up to 0.415 GB of weight data transfer per autonomous vehicle on each round.
Believable proxies of human behavior can empower interactive applications ranging from immersive environments to rehearsal spaces for interpersonal communication to prototyping tools. In this paper, we introduce generative agents--computational software agents that simulate believable human behavior. Generative agents wake up, cook breakfast, and head to work; artists paint, while authors write; they form opinions, notice each other, and initiate conversations; they remember and reflect on days past as they plan the next day. To enable generative agents, we describe an architecture that extends a large language model to store a complete record of the agent's experiences using natural language, synthesize those memories over time into higher-level reflections, and retrieve them dynamically to plan behavior. We instantiate generative agents to populate an interactive sandbox environment inspired by The Sims, where end users can interact with a small town of twenty five agents using natural language. In an evaluation, these generative agents produce believable individual and emergent social behaviors: for example, starting with only a single user-specified notion that one agent wants to throw a Valentine's Day party, the agents autonomously spread invitations to the party over the next two days, make new acquaintances, ask each other out on dates to the party, and coordinate to show up for the party together at the right time. We demonstrate through ablation that the components of our agent architecture--observation, planning, and reflection--each contribute critically to the believability of agent behavior. By fusing large language models with computational, interactive agents, this work introduces architectural and interaction patterns for enabling believable simulations of human behavior.
Graph neural networks (GNNs) have been demonstrated to be a powerful algorithmic model in broad application fields for their effectiveness in learning over graphs. To scale GNN training up for large-scale and ever-growing graphs, the most promising solution is distributed training which distributes the workload of training across multiple computing nodes. However, the workflows, computational patterns, communication patterns, and optimization techniques of distributed GNN training remain preliminarily understood. In this paper, we provide a comprehensive survey of distributed GNN training by investigating various optimization techniques used in distributed GNN training. First, distributed GNN training is classified into several categories according to their workflows. In addition, their computational patterns and communication patterns, as well as the optimization techniques proposed by recent work are introduced. Second, the software frameworks and hardware platforms of distributed GNN training are also introduced for a deeper understanding. Third, distributed GNN training is compared with distributed training of deep neural networks, emphasizing the uniqueness of distributed GNN training. Finally, interesting issues and opportunities in this field are discussed.
In pace with developments in the research field of artificial intelligence, knowledge graphs (KGs) have attracted a surge of interest from both academia and industry. As a representation of semantic relations between entities, KGs have proven to be particularly relevant for natural language processing (NLP), experiencing a rapid spread and wide adoption within recent years. Given the increasing amount of research work in this area, several KG-related approaches have been surveyed in the NLP research community. However, a comprehensive study that categorizes established topics and reviews the maturity of individual research streams remains absent to this day. Contributing to closing this gap, we systematically analyzed 507 papers from the literature on KGs in NLP. Our survey encompasses a multifaceted review of tasks, research types, and contributions. As a result, we present a structured overview of the research landscape, provide a taxonomy of tasks, summarize our findings, and highlight directions for future work.
Deep neural networks have revolutionized many machine learning tasks in power systems, ranging from pattern recognition to signal processing. The data in these tasks is typically represented in Euclidean domains. Nevertheless, there is an increasing number of applications in power systems, where data are collected from non-Euclidean domains and represented as the graph-structured data with high dimensional features and interdependency among nodes. The complexity of graph-structured data has brought significant challenges to the existing deep neural networks defined in Euclidean domains. Recently, many studies on extending deep neural networks for graph-structured data in power systems have emerged. In this paper, a comprehensive overview of graph neural networks (GNNs) in power systems is proposed. Specifically, several classical paradigms of GNNs structures (e.g., graph convolutional networks, graph recurrent neural networks, graph attention networks, graph generative networks, spatial-temporal graph convolutional networks, and hybrid forms of GNNs) are summarized, and key applications in power systems such as fault diagnosis, power prediction, power flow calculation, and data generation are reviewed in detail. Furthermore, main issues and some research trends about the applications of GNNs in power systems are discussed.
Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch will lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc, and 2) the instance-level shift, such as object appearance, size, etc. We build our approach based on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, on image level and instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory, and are implemented by learning a domain classifier in adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets including Cityscapes, KITTI, SIM10K, etc. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.