We automate soft robotic hand design iteration by co-optimizing design and control policy for dexterous manipulation skills in simulation. Our design iteration pipeline combines genetic algorithms and policy transfer to learn control policies for nearly 400 hand designs, testing grasp quality under external force disturbances. We validate the optimized designs in the real world through teleoperation of pickup and reorient manipulation tasks. Our real world evaluation, from over 900 teleoperated tasks, shows that the trend in design performance in simulation resembles that of the real world. Furthermore, we show that optimized hand designs from our approach outperform existing soft robot hands from prior work in the real world. The results highlight the usefulness of simulation in guiding parameter choices for anthropomorphic soft robotic hand systems, and the effectiveness of our automated design iteration approach, despite the sim-to-real gap.
As autonomous driving technology progresses, the need for precise trajectory prediction models becomes paramount. This paper introduces an innovative model that infuses cognitive insights into trajectory prediction, focusing on perceived safety and dynamic decision-making. Distinct from traditional approaches, our model excels in analyzing interactions and behavior patterns in mixed autonomy traffic scenarios. It represents a significant leap forward, achieving marked performance improvements on several key datasets. Specifically, it surpasses existing benchmarks with gains of 16.2% on the Next Generation Simulation (NGSIM), 27.4% on the Highway Drone (HighD), and 19.8% on the Macao Connected Autonomous Driving (MoCAD) dataset. Our proposed model shows exceptional proficiency in handling corner cases, essential for real-world applications. Moreover, its robustness is evident in scenarios with missing or limited data, outperforming most of the state-of-the-art baselines. This adaptability and resilience position our model as a viable tool for real-world autonomous driving systems, heralding a new standard in vehicle trajectory prediction for enhanced safety and efficiency.
We propose hardware-oriented models of intrinsic plasticity (IP) and synaptic plasticity (SP) for spiking randomly connected recursive neural network (RNN). Although the potential of RNNs for temporal data processing has been demonstrated, randomness of the network architecture often causes performance degradation. Self-organization mechanism using IP and SP can mitigate the degradation, therefore, we compile these functions in a spiking neuronal model. To implement the function of IP, a variable firing threshold is introduced to each excitatory neuron in the RNN that changes stepwise in accordance with its activity. We also define other thresholds for SP that synchronize with the firing threshold, which determine the direction of stepwise synaptic update that is executed on receiving a pre-synaptic spike. We demonstrate the effectiveness of our model through simulations of temporal data learning and anomaly detection with a spiking RNN using publicly available electrocardiograms. Considering hardware implementation, we employ discretized thresholds and synaptic weights and show that these parameters can be reduced to binary if the RNN architecture is appropriately designed. This contributes to minimization of the circuit of the neuronal system having IP and SP.
This work initiates the study of a beyond-diagonal reconfigurable intelligent surface (BD-RIS)-aided transmitter architecture for integrated sensing and communication (ISAC) in the millimeter-wave (mmWave) frequency band. Deploying BD-RIS at the transmitter side not only alleviates the need for extensive fully digital radio frequency (RF) chains but also enhances both communication and sensing performance. These benefits are facilitated by the additional design flexibility introduced by the fully-connected scattering matrix of BD-RIS. To achieve the aforementioned benefits, in this work, we propose an efficient two-stage algorithm to design the digital beamforming of the transmitter and the scattering matrix of the BD-RIS with the aim of jointly maximizing the sum rate for multiple communication users and minimizing the largest eigenvalue of the Cramer-Rao bound (CRB) matrix for multiple sensing targets. Numerical results show that the transmitter-side BD-RIS-aided mmWave ISAC outperforms the conventional diagonal-RIS-aided ones in both communication and sensing performance.
Despite significant progress in robotic systems for operation within human-centric environments, existing models still heavily rely on explicit human commands to identify and manipulate specific objects. This limits their effectiveness in environments where understanding and acting on implicit human intentions are crucial. In this study, we introduce a novel task: reasoning grasping, where robots need to generate grasp poses based on indirect verbal instructions or intentions. To accomplish this, we propose an end-to-end reasoning grasping model that integrates a multi-modal Large Language Model (LLM) with a vision-based robotic grasping framework. In addition, we present the first reasoning grasping benchmark dataset generated from the GraspNet-1 billion, incorporating implicit instructions for object-level and part-level grasping, and this dataset will soon be available for public access. Our results show that directly integrating CLIP or LLaVA with the grasp detection model performs poorly on the challenging reasoning grasping tasks, while our proposed model demonstrates significantly enhanced performance both in the reasoning grasping benchmark and real-world experiments.
Multi-modal foundation models such as CLIP have showcased impressive zero-shot capabilities. However, their applicability in resource-constrained environments is limited due to their large number of parameters and high inference time. While existing approaches have scaled down the entire CLIP architecture, we focus on training smaller variants of the image encoder, which suffices for efficient zero-shot classification. The use of synthetic data has shown promise in distilling representations from larger teachers, resulting in strong few-shot and linear probe performance. However, we find that this approach surprisingly fails in true zero-shot settings when using contrastive losses. We identify the exploitation of spurious features as being responsible for poor generalization between synthetic and real data. However, by using the image feature-based L2 distillation loss, we mitigate these problems and train students that achieve zero-shot performance which on four domain-specific datasets is on-par with a ViT-B/32 teacher model trained on DataCompXL, while featuring up to 92% fewer parameters.
Computing systems face diverse and substantial cybersecurity threats. To mitigate these cybersecurity threats, software engineers need to be competent in the skill of threat modeling. In industry and academia, there are many frameworks for teaching threat modeling, but our analysis of these frameworks suggests that (1) these approaches tend to be focused on component-level analysis rather than educating students to reason holistically about a system's cybersecurity, and (2) there is no rubric for assessing a student's threat modeling competency. To address these concerns, we propose using systems thinking in conjunction with popular and industry-standard threat modeling frameworks like STRIDE for teaching and assessing threat modeling competency. Prior studies suggest a holistic approach, like systems thinking, can help understand and mitigate cybersecurity threats. Thus, we developed and piloted two novel rubrics - one for assessing STRIDE threat modeling performance and the other for assessing systems thinking performance while conducting STRIDE. To conduct this study, we piloted the two rubrics mentioned above to assess threat model artifacts of students enrolled in an upper-level software engineering course at Purdue University in Fall 2021, Spring 2023, and Fall 2023. Students who had both systems thinking and STRIDE instruction identified and attempted to mitigate component-level as well as systems-level threats. Students with only STRIDE instruction tended to focus on identifying and mitigating component-level threats and discounted system-level threats. We contribute to engineering education by: (1) describing a new rubric for assessing threat modeling based on systems thinking; (2) identifying trends and blindspots in students' threat modeling approach; and (3) envisioning the benefits of integrating systems thinking in threat modeling teaching and assessment.
Software engineers develop, fine-tune, and deploy deep learning (DL) models using a variety of development frameworks and runtime environments. DL model converters move models between frameworks and to runtime environments. Conversion errors compromise model quality and disrupt deployment. However, the failure characteristics of DL model converters are unknown, adding risk when using DL interoperability technologies. This paper analyzes failures in DL model converters. We survey software engineers about DL interoperability tools, use cases, and pain points (N=92). Then, we characterize failures in model converters associated with the main interoperability tool, ONNX (N=200 issues in PyTorch and TensorFlow). Finally, we formulate and test two hypotheses about structural causes for the failures we studied. We find that the node conversion stage of a model converter accounts for ~75% of the defects and 33% of reported failure are related to semantically incorrect models. The cause of semantically incorrect models is elusive, but models with behaviour inconsistencies share operator sequences. Our results motivate future research on making DL interoperability software simpler to maintain, extend, and validate. Research into behavioural tolerances and architectural coverage metrics could be fruitful.
As machine learning applications continue to evolve, the demand for efficient hardware accelerators, specifically tailored for deep neural networks (DNNs), becomes increasingly vital. In this paper, we propose a configurable memory hierarchy framework tailored for per layer adaptive memory access patterns of DNNs. The hierarchy requests data on-demand from the off-chip memory to provide it to the accelerator's compute units. The objective is to strike an optimized balance between minimizing the required memory capacity and maintaining high accelerator performance. The framework is characterized by its configurability, allowing the creation of a tailored memory hierarchy with up to five levels. Furthermore, the framework incorporates an optional shift register as final level to increase the flexibility of the memory management process. A comprehensive loop-nest analysis of DNN layers shows that the framework can efficiently execute the access patterns of most loop unrolls. Synthesis results and a case study of the DNN accelerator UltraTrail indicate a possible reduction in chip area of up to 62.2% as smaller memory modules can be used. At the same time, the performance loss can be minimized to 2.4%.
This paper reports a new hierarchical architecture for modeling autonomous multi-robot systems (MRSs): a non-linear dynamical opinion process is used to model high-level group choice, and multi-objective behavior optimization is used to model individual decisions. Using previously reported theoretical results, we show it is possible to design the behavior of the MRS by the selection of a relatively small set of parameters. The resulting behavior - both collective actions and individual actions - can be understood intuitively. The approach is entirely decentralized and the communication cost scales by the number of group options, not agents. We demonstrated the effectiveness of this approach using a hypothetical `explore-exploit-migrate' scenario in a two hour field demonstration with eight unmanned surface vessels (USVs). The results from our preliminary field experiment show the collective behavior is robust even with time-varying network topology and agent dropouts.
Intelligent transportation systems play a crucial role in modern traffic management and optimization, greatly improving traffic efficiency and safety. With the rapid development of generative artificial intelligence (Generative AI) technologies in the fields of image generation and natural language processing, generative AI has also played a crucial role in addressing key issues in intelligent transportation systems, such as data sparsity, difficulty in observing abnormal scenarios, and in modeling data uncertainty. In this review, we systematically investigate the relevant literature on generative AI techniques in addressing key issues in different types of tasks in intelligent transportation systems. First, we introduce the principles of different generative AI techniques, and their potential applications. Then, we classify tasks in intelligent transportation systems into four types: traffic perception, traffic prediction, traffic simulation, and traffic decision-making. We systematically illustrate how generative AI techniques addresses key issues in these four different types of tasks. Finally, we summarize the challenges faced in applying generative AI to intelligent transportation systems, and discuss future research directions based on different application scenarios.