Ground reaction force sensing is one of the key components of gait analysis in legged locomotion research. To measure continuous force data during locomotion, we present a novel compound instrumented treadmill design. The treadmill is 1.7m long, with a natural frequency of 170Hz and an adjustable range that can be used for humans and small robots alike. Here, we present the treadmill's design methodology and characterize it in its natural frequency, noise behavior and real-life performance. Additionally, we apply an ISO 376 norm conform calibration procedure for all spatial force directions and center of pressure position. We achieve a force accuracy of $\leq$5.6N for the ground reaction forces and $\leq$13mm in center of pressure position.
Accurate prediction of aerodynamic forces in real-time is crucial for autonomous navigation of unmanned aerial vehicles (UAVs). This paper presents a data-driven aerodynamic force prediction model based on a small number of pressure sensors located on the surface of UAV. The model is built on a linear term that can make a reasonably accurate prediction and a nonlinear correction for accuracy improvement. The linear term is based on a reduced basis reconstruction of the surface pressure distribution, where the basis is extracted from numerical simulation data and the basis coefficients are determined by solving linear pressure reconstruction equations at a set of sensor locations. Sensor placement is optimized using the discrete empirical interpolation method (DEIM). Aerodynamic forces are computed by integrating the reconstructed surface pressure distribution. The nonlinear term is an artificial neural network (NN) that is trained to bridge the gap between the ground truth and the DEIM prediction, especially in the scenario where the DEIM model is constructed from simulation data with limited fidelity. A large network is not necessary for accurate correction as the linear model already captures the main dynamics of the surface pressure field, thus yielding an efficient DEIM+NN aerodynamic force prediction model. The model is tested on numerical and experimental dynamic stall data of a 2D NACA0015 airfoil, and numerical simulation data of dynamic stall of a 3D drone. Numerical results demonstrate that the machine learning enhanced model can make fast and accurate predictions of aerodynamic forces using only a few pressure sensors, even for the NACA0015 case in which the simulations do not agree well with the wind tunnel experiments. Furthermore, the model is robust to noise.
Range-only (RO) localization involves determining the position of a mobile robot by measuring the distance to specific anchors. RO localization is challenging since the measurements are low-dimensional and a single range sensor does not have enough information to estimate the full pose of the robot. As such, range sensors are typically coupled with other sensing modalities such as wheel encoders or inertial measurement units (IMUs) to estimate the full pose. In this work, we propose a continuous-time Gaussian process (GP)- based trajectory estimation method to estimate the full pose of a robot using only range measurements from multiple range sensors. Results from simulation and real experiments show that our proposed method, using off-the-shelf range sensors, is able to achieve comparable performance and in some cases outperform alternative state-of-the-art sensor-fusion methods that use additional sensing modalities.
Scaling to arbitrarily large bundle adjustment problems requires data and compute to be distributed across multiple devices. Centralized methods in prior works are only able to solve small or medium size problems due to overhead in computation and communication. In this paper, we present a fully decentralized method that alleviates computation and communication bottlenecks to solve arbitrarily large bundle adjustment problems. We achieve this by reformulating the reprojection error and deriving a novel surrogate function that decouples optimization variables from different devices. This function makes it possible to use majorization minimization techniques and reduces bundle adjustment to independent optimization subproblems that can be solved in parallel. We further apply Nesterov's acceleration and adaptive restart to improve convergence while maintaining its theoretical guarantees. Despite limited peer-to-peer communication, our method has provable convergence to first-order critical points under mild conditions. On extensive benchmarks with public datasets, our method converges much faster than decentralized baselines with similar memory usage and communication load. Compared to centralized baselines using a single device, our method, while being decentralized, yields more accurate solutions with significant speedups of up to 953.7x over Ceres and 174.6x over DeepLM. Code: //github.com/facebookresearch/DABA.
Simultaneous Localization and Mapping (SLAM) is being deployed in real-world applications, however many state-of-the-art solutions still struggle in many common scenarios. A key necessity in progressing SLAM research is the availability of high-quality datasets and fair and transparent benchmarking. To this end, we have created the Hilti-Oxford Dataset, to push state-of-the-art SLAM systems to their limits. The dataset has a variety of challenges ranging from sparse and regular construction sites to a 17th century neoclassical building with fine details and curved surfaces. To encourage multi-modal SLAM approaches, we designed a data collection platform featuring a lidar, five cameras, and an IMU (Inertial Measurement Unit). With the goal of benchmarking SLAM algorithms for tasks where accuracy and robustness are paramount, we implemented a novel ground truth collection method that enables our dataset to accurately measure SLAM pose errors with millimeter accuracy. To further ensure accuracy, the extrinsics of our platform were verified with a micrometer-accurate scanner, and temporal calibration was managed online using hardware time synchronization. The multi-modality and diversity of our dataset attracted a large field of academic and industrial researchers to enter the second edition of the Hilti SLAM challenge, which concluded in June 2022. The results of the challenge show that while the top three teams could achieve an accuracy of 2cm or better for some sequences, the performance dropped off in more difficult sequences.
We present GPS++, a hybrid Message Passing Neural Network / Graph Transformer model for molecular property prediction. Our model integrates a well-tuned local message passing component and biased global attention with other key ideas from prior literature to achieve state-of-the-art results on large-scale molecular dataset PCQM4Mv2. Through a thorough ablation study we highlight the impact of individual components and find that nearly all of the model's performance can be maintained without any use of global self-attention, showing that message passing is still a competitive approach for 3D molecular property prediction despite the recent dominance of graph transformers. We also find that our approach is significantly more accurate than prior art when 3D positional information is not available.
Object SLAM is considered increasingly significant for robot high-level perception and decision-making. Existing studies fall short in terms of data association, object representation, and semantic mapping and frequently rely on additional assumptions, limiting their performance. In this paper, we present a comprehensive object SLAM framework that focuses on object-based perception and object-oriented robot tasks. First, we propose an ensemble data association approach for associating objects in complicated conditions by incorporating parametric and nonparametric statistic testing. In addition, we suggest an outlier-robust centroid and scale estimation algorithm for modeling objects based on the iForest and line alignment. Then a lightweight and object-oriented map is represented by estimated general object models. Taking into consideration the semantic invariance of objects, we convert the object map to a topological map to provide semantic descriptors to enable multi-map matching. Finally, we suggest an object-driven active exploration strategy to achieve autonomous mapping in the grasping scenario. A range of public datasets and real-world results in mapping, augmented reality, scene matching, relocalization, and robotic manipulation have been used to evaluate the proposed object SLAM framework for its efficient performance.
3D spatial perception is the problem of building and maintaining an actionable and persistent representation of the environment in real-time using sensor data and prior knowledge. Despite the fast-paced progress in robot perception, most existing methods either build purely geometric maps (as in traditional SLAM) or flat metric-semantic maps that do not scale to large environments or large dictionaries of semantic labels. The first part of this paper is concerned with representations: we show that scalable representations for spatial perception need to be hierarchical in nature. Hierarchical representations are efficient to store, and lead to layered graphs with small treewidth, which enable provably efficient inference. We then introduce an example of hierarchical representation for indoor environments, namely a 3D scene graph, and discuss its structure and properties. The second part of the paper focuses on algorithms to incrementally construct a 3D scene graph as the robot explores the environment. Our algorithms combine 3D geometry, topology (to cluster the places into rooms), and geometric deep learning (e.g., to classify the type of rooms the robot is moving across). The third part of the paper focuses on algorithms to maintain and correct 3D scene graphs during long-term operation. We propose hierarchical descriptors for loop closure detection and describe how to correct a scene graph in response to loop closures, by solving a 3D scene graph optimization problem. We conclude the paper by combining the proposed perception algorithms into Hydra, a real-time spatial perception system that builds a 3D scene graph from visual-inertial data in real-time. We showcase Hydra's performance in photo-realistic simulations and real data collected by a Clearpath Jackal robots and a Unitree A1 robot. We release an open-source implementation of Hydra at //github.com/MIT-SPARK/Hydra.
Patankar schemes have attracted increasing interest in recent years because they preserve the positivity of the analytical solution of a production-destruction system (PDS) irrespective of the chosen time step size. Although they are now of great interest, for a long time it was not clear what stability properties such schemes have. Recently a new stability approach based on Lyapunov stability with an extension of the center manifold theorem has been proposed to study the stability properties of positivity-preserving time integrators. In this work, we study the stability properties of the classical modified Patankar--Runge--Kutta schemes (MPRK) and the modified Patankar Deferred Correction (MPDeC) approaches. We prove that most of the considered MPRK schemes are stable for any time step size and compute the stability function of MPDeC. We investigate its properties numerically revealing that also most MPDeC are stable irrespective of the chosen time step size. Finally, we verify our theoretical results with numerical simulations.
This paper focuses on statistical modelling using additive Gaussian process (GP) models and their efficient implementation for large-scale spatio-temporal data with a multi-dimensional grid structure. To achieve this, we exploit the Kronecker product structures of the covariance kernel. While this method has gained popularity in the GP literature, the existing approach is limited to covariance kernels with a tensor product structure and does not allow flexible modelling and selection of interaction effects. This is considered an important component in spatio-temporal analysis. We extend the method to a more general class of additive GP models that accounts for main effects and selected interaction effects. Our approach allows for easy identification and interpretation of interaction effects. The proposed model is applied to the analysis of NO$_2$ concentrations during the COVID-19 lockdown in London. Our scalable method enables analysis of large-scale, hourly-recorded data collected from 59 different stations across the city, providing additional insights to findings from previous research using daily or weekly averaged data.
With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.