Predicting multiple trajectories for road users is important for automated driving systems: ego-vehicle motion planning indeed requires a clear view of the possible motions of the surrounding agents. However, the generative models used for multiple-trajectory forecasting suffer from a lack of diversity in their proposals. To avoid this form of collapse, we propose a novel method for structured prediction of diverse trajectories. To this end, we complement an underlying pretrained generative model with a diversity component, based on a determinantal point process (DPP). We balance and structure this diversity with the inclusion of knowledge-based quality constraints, independent from the underlying generative model. We combine these two novel components with a gating operation, ensuring that the predictions are both diverse and within the drivable area. We demonstrate on the nuScenes driving dataset the relevance of our compound approach, which yields significant improvements in the diversity and the quality of the generated trajectories.
Standard semantic segmentation models owe their success to curated datasets with a fixed set of semantic categories, without contemplating the possibility of identifying unknown objects from novel categories. Existing methods in outlier detection suffer from a lack of smoothness and objectness in their predictions, due to limitations of the per-pixel classification paradigm. Furthermore, additional training for detecting outliers harms the performance of known classes. In this paper, we explore another paradigm with region-level classification to better segment unknown objects. We show that the object queries in mask classification tend to behave like one \vs all classifiers. Based on this finding, we propose a novel outlier scoring function called RbA by defining the event of being an outlier as being rejected by all known classes. Our extensive experiments show that mask classification improves the performance of the existing outlier detection methods, and the best results are achieved with the proposed RbA. We also propose an objective to optimize RbA using minimal outlier supervision. Further fine-tuning with outliers improves the unknown performance, and unlike previous methods, it does not degrade the inlier performance.
Trajectory optimization is a powerful tool for robot motion planning and control. State-of-the-art general-purpose nonlinear programming solvers are versatile, handle constraints in an effective way and provide a high numerical robustness, but they are slow because they do not fully exploit the optimal control problem structure at hand. Existing structure-exploiting solvers are fast but they often lack techniques to deal with nonlinearity or rely on penalty methods to enforce (equality or inequality) path constraints. This works presents FATROP: a trajectory optimization solver that is fast and benefits from the salient features of general-purpose nonlinear optimization solvers. The speed-up is mainly achieved through the use of a specialized linear solver, based on a Riccati recursion that is generalized to also support stagewise equality constraints. To demonstrate the algorithm's potential, it is benchmarked on a set of robot problems that are challenging from a numerical perspective, including problems with a minimum-time objective and no-collision constraints. The solver is shown to solve problems for trajectory generation of a quadrotor, a robot manipulator and a truck-trailer problem in a few tens of milliseconds. The algorithm's C++-code implementation accompanies this work as open source software, released under the GNU Lesser General Public License (LGPL). This software framework may encourage and enable the robotics community to use trajectory optimization in more challenging applications.
In recent years, researchers proposed a variety of deep learning models for wind power forecasting. These models predict the wind power generation of wind farms or entire regions more accurately than traditional machine learning algorithms or physical models. However, latest research has shown that deep learning models can often be manipulated by adversarial attacks. Since wind power forecasts are essential for the stability of modern power systems, it is important to protect them from this threat. In this work, we investigate the vulnerability of two different forecasting models to targeted, semitargeted, and untargeted adversarial attacks. We consider a Long Short-Term Memory (LSTM) network for predicting the power generation of a wind farm and a Convolutional Neural Network (CNN) for forecasting the wind power generation throughout Germany. Moreover, we propose the Total Adversarial Robustness Score (TARS), an evaluation metric for quantifying the robustness of regression models to targeted and semi-targeted adversarial attacks. It assesses the impact of attacks on the model's performance, as well as the extent to which the attacker's goal was achieved, by assigning a score between 0 (very vulnerable) and 1 (very robust). In our experiments, the LSTM forecasting model was fairly robust and achieved a TARS value of over 0.81 for all adversarial attacks investigated. The CNN forecasting model only achieved TARS values below 0.06 when trained ordinarily, and was thus very vulnerable. Yet, its robustness could be significantly improved by adversarial training, which always resulted in a TARS above 0.46.
Predicting the behaviour (i.e. manoeuvre/trajectory) of other road users, including vehicles, is critical for the safe and efficient operation of autonomous vehicles (AVs), a.k.a. automated driving systems (ADSs). Due to the uncertain future behaviour of vehicles, multiple future behaviour modes are often plausible for a vehicle in a given driving scene. Therefore, multimodal prediction can provide richer information than single-mode prediction enabling AVs to perform a better risk assessment. To this end, we propose a novel multimodal prediction framework that can predict multiple plausible behaviour modes and their likelihoods. The proposed framework includes a bespoke problem formulation for manoeuvre prediction, a novel transformer-based prediction model, and a tailored training method for multimodal manoeuvre and trajectory prediction. The performance of the framework is evaluated using two public benchmark highway driving datasets, namely NGSIM and highD. The results show that the proposed framework outperforms the state-of-the-art multimodal methods in the literature in terms of prediction error and is capable of predicting plausible manoeuvre and trajectory modes.
Deep learning-based human activity recognition (HAR) methods have shown great promise in the applications of smart healthcare systems and wireless body sensor network (BSN). Despite their demonstrated performance in laboratory settings, the real-world implementation of such methods is still hindered by the cross-subject issue when adapting to new users. To solve this issue, we propose ActiveSelfHAR, a framework that combines active learning's benefit of sparsely acquiring data with actual labels and self- training's benefit of effectively utilizing unlabeled data to enable the deep model to adapt to the target domain, i.e., the new users. In this framework, the model trained in the last iteration or the source domain is first utilized to generate pseudo labels of the target-domain samples and construct a self-training set based on the confidence score. Second, we propose to use the spatio-temporal relationships among the samples in the non-self-training set to augment the core set selected by active learning. Finally, we combine the self-training set and the augmented core set to fine-tune the model. We demonstrate our method by comparing it with state-of-the-art methods on two IMU-based datasets and an EMG-based dataset. Our method presents similar HAR accuracies with the upper bound, i.e. fully supervised fine-tuning with less than 1\% labeled data of the target dataset and significantly improves data efficiency and time cost. Our work highlights the potential of implementing user-independent HAR methods into smart healthcare systems and BSN.
We present the InterviewBot that dynamically integrates conversation history and customized topics into a coherent embedding space to conduct 10 mins hybrid-domain (open and closed) conversations with foreign students applying to U.S. colleges for assessing their academic and cultural readiness. To build a neural-based end-to-end dialogue model, 7,361 audio recordings of human-to-human interviews are automatically transcribed, where 440 are manually corrected for finetuning and evaluation. To overcome the input/output size limit of a transformer-based encoder-decoder model, two new methods are proposed, context attention and topic storing, allowing the model to make relevant and consistent interactions. Our final model is tested both statistically by comparing its responses to the interview data and dynamically by inviting professional interviewers and various students to interact with it in real-time, finding it highly satisfactory in fluency and context awareness.
This paper proposes a computational framework for the design optimization of stable structures under large deformations by incorporating nonlinear buckling constraints. A novel strategy for suppressing spurious buckling modes related to low-density elements is proposed. The strategy depends on constructing a pseudo-mass matrix that assigns small pseudo masses for DOFs surrounded by only low-density elements and degenerates to an identity matrix for the solid region. A novel optimization procedure is developed that can handle both simple and multiple eigenvalues wherein consistent sensitivities of simple eigenvalues and directional derivatives of multiple eigenvalues are derived and utilized in a gradient-based optimization algorithm - the method of moving asymptotes. An adaptive linear energy interpolation method is also incorporated in nonlinear analyses to handle the low-density elements distortion under large deformations. The numerical results demonstrate that, for systems with either low or high symmetries, the nonlinear stability constraints can ensure structural stability at the target load under large deformations. Post-analysis on the B-spline fitted designs shows that the safety margin, i.e., the gap between the target load and the 1st critical load, of the optimized structures can be well controlled by selecting different stability constraint values. Interesting structural behaviors such as mode switching and multiple bifurcations are also demonstrated.
In this paper, a sampling-based trajectory planning algorithm for a laboratory-scale 3D gantry crane in an environment with static obstacles and subject to bounds on the velocity and acceleration of the gantry crane system is presented. The focus is on developing a fast motion planning algorithm for differentially flat systems, where intermediate results can be stored and reused for further tasks, such as replanning. The proposed approach is based on the informed optimal rapidly exploring random tree algorithm (informed RRT*), which is utilized to build trajectory trees that are reused for replanning when the start and/or target states change. In contrast to state-of-the-art approaches, the proposed motion planning algorithm incorporates a linear quadratic minimum time (LQTM) local planner. Thus, dynamic properties such as time optimality and the smoothness of the trajectory are directly considered in the proposed algorithm. Moreover, by integrating the branch-and-bound method to perform the pruning process on the trajectory tree, the proposed algorithm can eliminate points in the tree that do not contribute to finding better solutions. This helps to curb memory consumption and reduce the computational complexity during motion (re)planning. Simulation results for a validated mathematical model of a 3D gantry crane show the feasibility of the proposed approach.
Matrix Lie groups are an important class of manifolds commonly used in control and robotics, and the optimization of control policies on these manifolds is a fundamental problem. In this work, we propose a novel approach for trajectory optimization on matrix Lie groups using an augmented Lagrangian-based constrained discrete Differential Dynamic Programming. The method involves lifting the optimization problem to the Lie algebra in the backward pass and retracting back to the manifold in the forward pass. In contrast to previous approaches which only addressed constraint handling for specific classes of matrix Lie groups, the proposed method provides a general approach for nonlinear constraint handling for generic matrix Lie groups. We also demonstrate the effectiveness of the method in handling external disturbances through its application as a Lie-algebraic feedback control policy on SE(3). Experiments show that the approach is able to effectively handle configuration, velocity and input constraints and maintain stability in the presence of external disturbances.
Accurate and robust trajectory prediction of neighboring agents is critical for autonomous vehicles traversing in complex scenes. Most methods proposed in recent years are deep learning-based due to their strength in encoding complex interactions. However, unplausible predictions are often generated since they rely heavily on past observations and cannot effectively capture the transient and contingency interactions from sparse samples. In this paper, we propose a hierarchical hybrid framework of deep learning (DL) and reinforcement learning (RL) for multi-agent trajectory prediction, to cope with the challenge of predicting motions shaped by multi-scale interactions. In the DL stage, the traffic scene is divided into multiple intermediate-scale heterogenous graphs based on which Transformer-style GNNs are adopted to encode heterogenous interactions at intermediate and global levels. In the RL stage, we divide the traffic scene into local sub-scenes utilizing the key future points predicted in the DL stage. To emulate the motion planning procedure so as to produce trajectory predictions, a Transformer-based Proximal Policy Optimization (PPO) incorporated with a vehicle kinematics model is devised to plan motions under the dominant influence of microscopic interactions. A multi-objective reward is designed to balance between agent-centric accuracy and scene-wise compatibility. Experimental results show that our proposal matches the state-of-the-arts on the Argoverse forecasting benchmark. It's also revealed by the visualized results that the hierarchical learning framework captures the multi-scale interactions and improves the feasibility and compliance of the predicted trajectories.