Soft robotics aims to develop robots able to adapt their behavior across a wide range of unstructured and unknown environments. A critical challenge of soft robotic control is that nonlinear dynamics often result in complex behaviors that are hard to model and predict. Typically, behaviors for mobile soft robots are discovered through empirical trial and error and hand-tuning. More recently, optimization algorithms such as Genetic Algorithms (GA) have been used to discover gaits, but these behaviors are often optimized for a single environment or terrain and can be brittle to unplanned changes in terrain. In this paper, we demonstrate how Quality Diversity algorithms, which search for a range of high-performing behaviors, can produce repertoires of gaits that are robust to changing terrains. This robustness significantly outperforms that of gaits produced by a single-objective optimization algorithm.
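As a rough illustration of how a Quality Diversity search differs from single-objective optimization, the sketch below implements a minimal MAP-Elites loop, one well-known Quality Diversity algorithm. The abstract does not say which algorithm the authors use, so the choice of MAP-Elites, the two-parameter genome, the `toy_gait` evaluation, and all numeric settings are assumptions made purely for illustration.

```python
import random


def map_elites(evaluate, n_bins=10, iterations=2000, n_random_init=100, seed=0):
    """Minimal MAP-Elites over 2-D genomes in [0, 1]^2.

    `evaluate(genome)` must return (fitness, descriptor) with descriptor in [0, 1].
    The archive keeps the best genome found so far in each descriptor bin.
    """
    rng = random.Random(seed)
    archive = {}  # bin index -> (fitness, genome)
    for i in range(iterations):
        if archive and i >= n_random_init:
            # Mutate a randomly chosen elite and clamp it back into [0, 1].
            _, parent = rng.choice(list(archive.values()))
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, 0.1))) for g in parent]
        else:
            # Bootstrap the archive with random genomes.
            child = [rng.random(), rng.random()]
        fitness, descriptor = evaluate(child)
        cell = min(n_bins - 1, int(descriptor * n_bins))
        # Keep the child only if its bin is empty or it beats the current elite.
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, child)
    return archive


def toy_gait(genome):
    """Hypothetical stand-in for a gait evaluation (not a robot model)."""
    amplitude, frequency = genome
    fitness = -((amplitude - 0.5) ** 2) - (frequency - 0.5) ** 2
    descriptor = amplitude  # the behavior dimension along which diversity is kept
    return fitness, descriptor


if __name__ == "__main__":
    repertoire = map_elites(toy_gait)
    print(f"{len(repertoire)} of 10 behavior bins filled")
```

The result is a repertoire with one elite per behavior bin rather than a single best solution, which is the property the abstract relies on when the terrain changes.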
We introduce HuTuMotion, an innovative approach for generating natural human motions that navigates latent motion diffusion models by leveraging few-shot human feedback. Unlike existing approaches that sample latent variables from a standard normal prior distribution, our method adapts the prior distribution to better suit the characteristics of the data, as indicated by human feedback, thus enhancing the quality of motion generation. Furthermore, our findings reveal that few-shot feedback can yield performance on par with that attained through extensive human feedback. This finding highlights the potential and efficiency of incorporating few-shot human-guided optimization within latent diffusion models for personalized and style-aware human motion generation applications. Experimental results show that our method significantly outperforms existing state-of-the-art approaches.
Cybersecurity challenges and the need for awareness are well recognized in developed countries, but they still need attention in less-developed countries. With the expansion of technology, security concerns are also becoming more prevalent worldwide. This paper presents a design and creation research study exploring which factors should be considered when designing cybersecurity awareness solutions for young people in developing countries. We developed prototypes of mini-cybersecurity awareness applications and conducted a pilot study with eight participants (aged 16-30) from Gambia, Eritrea, and Syria. Our findings show that four factors are essential to consider when designing and developing cybersecurity awareness solutions for target users in developing countries: the influence of culture and social constructs, literacy and language competence, the way cybersecurity terms and concepts are introduced, and the need for reflection. The findings of this study will guide future researchers in designing more inclusive cybersecurity awareness solutions for users in developing countries.
Multi-agent and multi-robot systems (MRS) often rely on direct communication for information sharing. This work explores an alternative approach inspired by eavesdropping mechanisms in nature that involves casual observation of agent interactions to enhance decentralized knowledge dissemination. We achieve this through a novel IKT-BT framework tailored for a behavior-based MRS, encapsulating knowledge and control actions in Behavior Trees (BT). We present two new BT-based modalities - eavesdrop-update (EU) and eavesdrop-buffer-update (EBU) - incorporating unique eavesdropping strategies and efficient episodic memory management suited for resource-limited swarm robots. We theoretically analyze the IKT-BT framework for an MRS and validate the performance of the proposed modalities through extensive experiments simulating a search and rescue mission. Our results reveal improvements in both global mission performance outcomes and agent-level knowledge dissemination with a reduced need for direct communication.
Robust locomotion control depends on accurate state estimation. However, the sensors of most legged robots provide only partial and noisy observations, making the estimation particularly challenging, especially for external states such as terrain friction and elevation maps. Inspired by the classical Internal Model Control principle, we consider these external states as disturbances and introduce the Hybrid Internal Model (HIM) to estimate them according to the response of the robot. The response, which we refer to as the hybrid internal embedding, contains the robot's explicit velocity and implicit stability representation, corresponding to two primary goals for locomotion tasks: explicitly tracking velocity and implicitly maintaining stability. We use contrastive learning to optimize the embedding to be close to the robot's successor state, in which the response is naturally embedded. HIM has several appealing benefits. It needs only the robot's proprioception, i.e., observations from joint encoders and the IMU. It maintains consistent observations between the simulation reference and reality, which avoids the information loss incurred in mimicking learning. It exploits batch-level information, which is more robust to noise and yields better sample efficiency. It requires only 1 hour of training on an RTX 4090 to enable a quadruped robot to traverse any terrain under any disturbance. A wealth of real-world experiments demonstrates its agility, even in high-difficulty tasks and in cases that never occurred during training, revealing remarkable open-world generalizability.
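To make the contrastive step in the HIM abstract more concrete, the sketch below computes a generic InfoNCE-style loss that pulls each embedding toward the encoding of its own successor state and pushes it away from the other successor states in the batch. This is an assumed, textbook form of contrastive learning and not necessarily the objective the authors use; the function name, the `temperature` value, and the array shapes are all hypothetical.

```python
import numpy as np


def info_nce_loss(embeddings, successor_targets, temperature=0.1):
    """InfoNCE loss over a batch.

    Row i of `embeddings` is the positive match for row i of
    `successor_targets`; every other row in the batch acts as a negative.
    Both arrays have shape (batch, dim).
    """
    # L2-normalize so the dot product is a cosine similarity.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    t = successor_targets / np.linalg.norm(successor_targets, axis=1, keepdims=True)
    logits = z @ t.T / temperature  # shape (batch, batch)
    # Subtract the row maximum for numerical stability before the softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Average negative log-probability of the matching (diagonal) pairs.
    return float(-np.mean(np.diag(log_probs)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    successors = rng.normal(size=(8, 4))
    print("aligned pairs:   ", info_nce_loss(successors, successors))
    print("mismatched pairs:", info_nce_loss(successors, successors[::-1]))
```

Because the loss uses every other sample in the batch as a negative, it is one simple example of an objective that exploits batch-level information.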
Recommender systems are expected to be assistants that help human users find relevant information automatically without explicit queries. As recommender systems evolve, increasingly sophisticated learning techniques are applied and have achieved better performance in terms of user engagement metrics such as clicks and browsing time. The increase in measured performance, however, has two possible explanations: a better understanding of user preferences, or a greater ability to exploit human bounded rationality to induce user over-consumption. A natural follow-up question is whether current recommendation algorithms are manipulating user preferences. If so, can we measure the level of manipulation? In this paper, we present a general framework for benchmarking the degree of manipulation of recommendation algorithms, in both slate recommendation and sequential recommendation scenarios. The framework consists of four stages: initial preference calculation, training data collection, algorithm training and interaction, and metrics calculation, which involves two proposed metrics. We benchmark several representative recommendation algorithms on both synthetic and real-world datasets under the proposed framework. We observe that a high online click-through rate does not necessarily reflect a better understanding of users' initial preferences; it can instead result from prompting users to choose more documents they did not initially favor. Moreover, we find that the training data have a notable impact on the degree of manipulation, and that algorithms with more powerful modeling abilities are more sensitive to this impact. The experiments also verify the usefulness of the proposed metrics for measuring the degree of manipulation. We advocate that future recommendation algorithm studies be treated as an optimization problem with constraints on user preference manipulation.
Nowadays, robots are applied in dynamic environments. For robust operation, the motion planning module must consider other tasks besides reaching a specified pose: (self) collision avoidance, joint limit avoidance, keeping an advantageous configuration, etc. Each task demands different joint control commands, which may counteract each other. We present a hierarchical control that, depending on the robot and environment state, determines online a suitable priority among those tasks. Thereby, the control command of a lower-prioritized task never hinders the control command of a higher-prioritized task. We ensure that control signals remain smooth even during priority rearrangement. Our hierarchical control computes reference joint velocities. However, the underlying concepts of hierarchical control differ when joint accelerations or joint torques are used as control signals instead. As a further contribution, we therefore provide a comprehensive discussion of how joint velocity control, joint acceleration control, and joint torque control differ in hierarchical task control. We validate our formulation in an experiment on hardware.
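The property that a lower-prioritized task never hinders a higher-prioritized one is commonly obtained at the joint-velocity level with recursive null-space projection. The sketch below shows that standard textbook scheme for a fixed priority order; it is an assumption about the general technique, not the authors' formulation, and it covers neither online priority selection nor smooth priority rearrangement. The function name and the example Jacobians are hypothetical.

```python
import numpy as np


def prioritized_joint_velocities(jacobians, task_velocities):
    """Strict task hierarchy via recursive null-space projection.

    jacobians[i] has shape (m_i, n_joints); task_velocities[i] has shape (m_i,).
    Index 0 is the highest priority.
    """
    n_joints = jacobians[0].shape[1]
    q_dot = np.zeros(n_joints)
    # Projector onto the null space of all higher-priority tasks handled so far.
    null_proj = np.eye(n_joints)
    for jac, x_dot in zip(jacobians, task_velocities):
        jac_proj = jac @ null_proj
        jac_proj_pinv = np.linalg.pinv(jac_proj)
        # Correct only the part of this task not already produced by q_dot,
        # using motion that leaves all higher-priority tasks untouched.
        q_dot = q_dot + jac_proj_pinv @ (x_dot - jac @ q_dot)
        null_proj = null_proj - jac_proj_pinv @ jac_proj
    return q_dot


if __name__ == "__main__":
    # Two conflicting scalar tasks on a 3-joint robot: the second task asks
    # joint 1 to move in the opposite direction and is therefore ignored.
    j_high = np.array([[1.0, 0.0, 0.0]])
    j_low = np.array([[1.0, 0.0, 0.0]])
    q_dot = prioritized_joint_velocities(
        [j_high, j_low], [np.array([1.0]), np.array([-1.0])]
    )
    print(q_dot)
```

When the tasks are compatible, both are fulfilled; when they conflict, the lower-priority task is satisfied only as far as the remaining null space allows.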
Open-loop stable limit cycles are foundational to the dynamics of legged robots. They impart a self-stabilizing character to the robot's gait, thus alleviating the need for compute-heavy feedback-based gait correction. This paper proposes a general approach to rapidly generate limit cycles with explicit stability constraints for a given dynamical system. In particular, we pose the problem of open-loop limit cycle stability as a single-stage constrained-optimization problem (COP), and use Direct Collocation to transcribe it into a nonlinear program (NLP) with closed-form expressions for constraints, objectives, and their gradients. The COP formulations of stability are based on (1) the spectral radius of a discrete return map and (2) the spectral radius of the system's monodromy matrix, where the spectral radius is bounded using different constraint-satisfaction formulations of the eigenvalue problem. We compare the performance and solution quality of each approach, and specifically highlight the Schur decomposition of the monodromy matrix as a formulation that offers wider applicability through weaker assumptions, together with attractive numerical convergence properties. Moreover, we present results from our experiments on a spring-loaded inverted pendulum model of a robot, where our method generated actuation trajectories for open-loop stable hopping in under 2 seconds (on an Intel Core i7-6700K) and produced energy-minimizing actuation trajectories even under tight stability constraints.
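The stability criterion behind both formulations is that a limit cycle is locally stable when the spectral radius of the linearized return map (or of the monodromy matrix) is below 1. The sketch below only checks that criterion after the fact, using a finite-difference Jacobian and a direct eigenvalue computation; it does not reproduce the paper's constrained-optimization or Schur-decomposition formulations. The `toy_return_map` and its coefficients are hypothetical.

```python
import numpy as np


def return_map_jacobian(return_map, fixed_point, eps=1e-6):
    """Central finite-difference Jacobian of a return map at a fixed point."""
    x0 = np.asarray(fixed_point, dtype=float)
    n = x0.size
    jac = np.zeros((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = eps
        jac[:, j] = (return_map(x0 + step) - return_map(x0 - step)) / (2 * eps)
    return jac


def spectral_radius(matrix):
    """Largest eigenvalue magnitude of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(matrix))))


def toy_return_map(x):
    """Hypothetical 2-D return map with a fixed point at the origin."""
    linear = np.array([[0.5, 0.2], [-0.1, 0.3]])
    return linear @ x + 0.1 * x**2


if __name__ == "__main__":
    jac = return_map_jacobian(toy_return_map, [0.0, 0.0])
    rho = spectral_radius(jac)
    print(f"spectral radius = {rho:.4f}, open-loop stable: {rho < 1.0}")
```

In an optimization setting, a bound such as `rho <= rho_max < 1` would appear as a constraint rather than as a check performed after the trajectory has been found.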
Autonomous robots need to handle uncertainties when deployed in the real world. To work robustly in such environments, a robot must be able to adapt both its architecture and its task plan. Architecture adaptation and task plan adaptation are mutually dependent, and therefore require the system to apply runtime architecture and task plan co-adaptation. This work presents Metaplan, which uses models of the robot and its environment, together with a PDDL planner, to apply runtime architecture and task plan co-adaptation. Metaplan is designed to be easily reusable across different domains. Metaplan is shown to successfully perform runtime architecture and task plan co-adaptation with a self-adaptive unmanned underwater vehicle exemplar, and its reusability is demonstrated by applying it to an unmanned ground vehicle.
Spiking neural networks (SNNs) are recurrent models that can leverage sparsity in input time series to efficiently carry out tasks such as classification. Additional efficiency gains can be obtained if decisions are taken as early as possible as a function of the complexity of the input time series. The decision on when to stop inference and produce a decision must rely on an estimate of the current accuracy of the decision. Prior work demonstrated the use of conformal prediction (CP) as a principled way to quantify uncertainty and support adaptive-latency decisions in SNNs. In this paper, we propose to enhance the uncertainty quantification capabilities of SNNs by implementing ensemble models for the purpose of improving the reliability of stopping decisions. Intuitively, an ensemble of multiple models can decide when to stop more reliably by selecting times at which most models agree that the current accuracy level is sufficient. The proposed method relies on different forms of information pooling from ensemble models, and offers theoretical reliability guarantees. We specifically show that variational inference-based ensembles with p-variable pooling significantly reduce the average latency of state-of-the-art methods, while maintaining reliability guarantees.
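The idea of pooling ensemble information before deciding when to stop can be sketched as follows. Each ensemble member is assumed to provide a conformal p-value per candidate label at every time step; the p-values are merged across members, a prediction set is formed from the labels whose pooled p-value exceeds a threshold, and inference stops once that set is small enough. The merging rule used here, twice the arithmetic mean capped at 1, is one standard rule that remains a valid p-value under arbitrary dependence; the abstract does not state which pooling rule the authors use, and the function names, `alpha`, `max_set_size`, and the example numbers are hypothetical. The sketch does not reproduce the paper's reliability guarantees.

```python
def pool_p_values(p_values):
    """Merge p-values as twice their arithmetic mean, capped at 1."""
    return min(1.0, 2.0 * sum(p_values) / len(p_values))


def prediction_set(per_model_p_values, alpha):
    """Labels whose pooled p-value exceeds alpha.

    per_model_p_values[k][c] is ensemble member k's p-value for label c.
    """
    n_labels = len(per_model_p_values[0])
    pooled = [
        pool_p_values([model[c] for model in per_model_p_values])
        for c in range(n_labels)
    ]
    return [c for c, p in enumerate(pooled) if p > alpha]


def stopping_time(p_values_over_time, alpha=0.1, max_set_size=1):
    """First time step whose pooled prediction set is small enough.

    Falls back to the last time step if the set never becomes small enough.
    """
    for t, per_model in enumerate(p_values_over_time):
        if len(prediction_set(per_model, alpha)) <= max_set_size:
            return t
    return len(p_values_over_time) - 1


if __name__ == "__main__":
    # Two ensemble members, three labels, two time steps (made-up numbers).
    history = [
        [[0.5, 0.4, 0.3], [0.6, 0.5, 0.2]],
        [[0.8, 0.04, 0.03], [0.7, 0.02, 0.05]],
    ]
    t_stop = stopping_time(history)
    print("stop at step", t_stop, "with set", prediction_set(history[t_stop], 0.1))
```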
Building general-purpose robots that can operate seamlessly in any environment, with any object, and using various skills to complete diverse tasks has been a long-standing goal in Artificial Intelligence. However, most existing robotic systems have been constrained: they are designed for specific tasks, trained on specific datasets, and deployed within specific environments. These systems usually require extensively labeled data, rely on task-specific models, have numerous generalization issues when deployed in real-world scenarios, and struggle to remain robust to distribution shifts. Motivated by the impressive open-set performance and content generation capabilities of web-scale, large-capacity pre-trained models (i.e., foundation models) in research fields such as Natural Language Processing (NLP) and Computer Vision (CV), we devote this survey to exploring (i) how existing foundation models from NLP and CV can be applied to robotics, and (ii) what a robotics-specific foundation model would look like. We begin by providing an overview of what constitutes a conventional robotic system and the fundamental barriers to making it universally applicable. Next, we establish a taxonomy to discuss current work that leverages existing foundation models for robotics and develops ones catered to robotics. Finally, we discuss key challenges and promising future directions in using foundation models to enable general-purpose robotic systems. We encourage readers to view our living GitHub repository of resources, including papers reviewed in this survey as well as related projects and repositories for developing foundation models for robotics.