Time-optimal path planning in high winds for a turning-rate constrained UAV is a challenging problem that is important for deployment and field operations. Previous works have used trochoidal path segments, comprising straight and maximum-rate turn segments, as optimal extremal paths in uniform wind conditions. Current methods iterate over all candidate trochoidal trajectory types and select the time-optimal one; however, this exhaustive search can be computationally slow. In this paper, we introduce a method to decrease the computation time. We achieve this by framing the problem in the air-relative frame and bounding the solution within a subset of candidate trajectories, thereby reducing the number of candidate trochoidal trajectory types. Our method reduces overall computation time by 37.4% compared to pre-existing methods for Bang-Straight-Bang trajectories, freeing up computation for other onboard processes, and the savings compound when solving many trochoidal paths. When used within the framework of a global path planner, faster state expansions help find solutions faster or compute higher-quality paths. We also release our open-source codebase as a C++ package. The website and demo can be found at https://bradymoon.com/trochoids, the codebase at https://github.com/castacks/trochoids, and a video at https://youtu.be/qOU5gI7JshI.
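As a concrete illustration of the geometry involved, the following minimal sketch (not the paper's released C++ package; the function name and straight-segment tolerance are our own) parameterizes the two trochoidal primitives, a straight line and a maximum-rate turn, in a uniform wind:

```python
import numpy as np

def trochoid_segment(x0, y0, psi0, v_air, omega, wind, t):
    """Inertial-frame position of a UAV flying at constant airspeed
    v_air with constant turn rate omega (rad/s) in uniform wind.
    omega != 0 traces a trochoid; omega ~ 0 is a wind-advected
    straight segment."""
    wx, wy = wind
    t = np.asarray(t, dtype=float)
    if abs(omega) < 1e-9:  # straight segment
        x = x0 + (v_air * np.cos(psi0) + wx) * t
        y = y0 + (v_air * np.sin(psi0) + wy) * t
    else:                  # maximum-rate turn (bang) segment
        x = x0 + (v_air / omega) * (np.sin(psi0 + omega * t) - np.sin(psi0)) + wx * t
        y = y0 - (v_air / omega) * (np.cos(psi0 + omega * t) - np.cos(psi0)) + wy * t
    return x, y

# Example: a full left turn at 0.3 rad/s in a 5 m/s easterly wind
xs, ys = trochoid_segment(0.0, 0.0, 0.0, 15.0, 0.3,
                          wind=(5.0, 0.0), t=np.linspace(0, 21, 100))
```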
Targetless IMU-LiDAR extrinsic calibration methods are gaining significant attention as the importance of IMU-LiDAR fusion systems increases. Notably, existing calibration methods derive the calibration parameters under the assumption of full motion excitation in all axes. When the IMU and LiDAR are mounted on a ground robot, whose motion is restricted to the plane, existing calibration methods are likely to exhibit degraded performance. To address this issue, we present GRIL-Calib: a novel targetless Ground Robot IMU-LiDAR Calibration method. Our proposed method leverages ground information to compensate for the lack of unrestricted full motion. First, we propose LiDAR Odometry (LO) using ground plane residuals to enhance calibration accuracy. Second, we propose the Ground Plane Motion (GPM) constraint and incorporate it into the optimization for calibration, enabling the determination of the full 6-DoF extrinsic parameters, including the theoretically unobservable direction. Finally, unlike baseline methods, we formulate the calibration not as two sequential optimizations but as a single optimization (SO) problem, solving all calibration parameters simultaneously and improving accuracy. We validate GRIL-Calib by applying it to three public real-world datasets and comparing its performance with that of existing state-of-the-art methods in terms of accuracy and robustness. Our code is available at https://github.com/Taeyoung96/GRIL-Calib.
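To make the role of the ground information concrete, here is a minimal sketch of how a ground-plane observation can constrain the otherwise unobservable roll, pitch, and z-offset; the residual form, parameter names, and the assumption of a known IMU mounting height are ours, not GRIL-Calib's actual formulation:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def ground_plane_residual(rot_il, tz_il, n_lidar, d_lidar, g_imu, h_imu):
    """Illustrative ground-plane residual for planar-motion calibration.
    rot_il : Rotation, candidate LiDAR-to-IMU extrinsic rotation
    tz_il  : candidate z-offset of the LiDAR in the IMU frame (m)
    n_lidar: unit ground-plane normal estimated in the LiDAR frame
    d_lidar: LiDAR height above the ground from the plane fit (m)
    g_imu  : unit gravity direction measured in the IMU frame
    h_imu  : assumed IMU height above the ground (m)"""
    n_imu = rot_il.apply(n_lidar)
    angular = 1.0 - np.dot(n_imu, -g_imu)   # 0 when normal opposes gravity
    height = (d_lidar + tz_il) - h_imu      # consistency of sensor heights
    return np.array([angular, height])

r = ground_plane_residual(Rotation.from_euler("xyz", [0.01, -0.02, 0.0]),
                          -0.35, np.array([0.0, 0.0, 1.0]), 0.8,
                          np.array([0.0, 0.0, -1.0]), 0.45)
```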
We propose a two-stage estimation procedure for a copula-based model with semi-competing risks data, where the non-terminal event is subject to dependent censoring by the terminal event, and both events are subject to independent censoring. Under the copula-based model, the marginal survival functions of the individual event times are specified by semiparametric transformation models, and the dependence between the bivariate event times is specified by a parametric copula function. In the first stage of the estimation procedure, the parameters associated with the margin of the terminal event are estimated using only the corresponding observed outcomes; in the second stage, the marginal parameters for the non-terminal event time and the copula parameter are estimated by maximizing a pseudo-likelihood function based on the joint distribution of the bivariate event times. We derive the asymptotic properties of the proposed estimator and provide an analytic variance estimator for inference. Through simulation studies, we show that our approach yields consistent estimates with lower computational cost and more robustness than the one-stage procedure developed in Chen (2012), where all parameters are estimated simultaneously. In addition, our approach demonstrates better finite-sample performance than another existing two-stage estimation method proposed in Zhu et al. (2021).
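The two-stage logic can be illustrated with a toy parametric version: exponential margins in place of the paper's semiparametric transformation models, a Clayton copula, and censoring contributions omitted from stage 2 for brevity; all function names are ours:

```python
import numpy as np
from scipy.optimize import minimize

def stage1_terminal_rate(times_d, event_d):
    """Stage 1: exponential MLE for the terminal-event margin using
    only its own (possibly censored) observations."""
    return event_d.sum() / times_d.sum()

def stage2_neg_pseudo_loglik(params, t1, t2, lam_d):
    """Stage 2: negative pseudo-log-likelihood for uncensored pairs,
    with the stage-1 terminal margin held fixed."""
    lam1, theta = params
    if lam1 <= 0 or theta <= 0:
        return np.inf
    u = np.exp(-lam1 * t1)            # non-terminal survival
    v = np.exp(-lam_d * t2)           # terminal survival (fixed)
    # Clayton copula density c(u, v; theta)
    c = ((1 + theta) * (u * v) ** (-theta - 1.0)
         * (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta - 2.0))
    return -np.sum(np.log(c) + np.log(lam1 * u) + np.log(lam_d * v))

# Usage (with observed pairs t1, t2 and terminal-event data):
# lam_d = stage1_terminal_rate(times_d, event_d)
# fit = minimize(stage2_neg_pseudo_loglik, x0=[1.0, 1.0],
#                args=(t1, t2, lam_d), method="Nelder-Mead")
```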
Speech emotion recognition (SER) performance deteriorates significantly in the presence of noise, making it challenging to achieve competitive performance in noisy conditions. To this end, we propose a multi-level knowledge distillation (MLKD) method that transfers knowledge from a teacher model trained on clean speech to a simpler student model trained on noisy speech. Specifically, we use clean-speech features extracted by wav2vec-2.0 as the learning target and train a distil wav2vec-2.0 model to approximate the feature extraction ability of the original wav2vec-2.0 under noisy conditions. Furthermore, we leverage the multi-level knowledge of the original wav2vec-2.0 to supervise the single-level output of the distil wav2vec-2.0. We evaluate the effectiveness of the proposed method through extensive experiments using five types of noise-contaminated speech on the IEMOCAP dataset, showing promising results compared to state-of-the-art models.
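A minimal sketch of the distillation objective described above (the L1 loss form and uniform layer weighting are our assumptions; the paper's exact losses may differ), matching the student's single-level features to several teacher layers computed on clean speech:

```python
import torch
import torch.nn.functional as F

def multi_level_distill_loss(student_feat, teacher_feats, weights=None):
    """student_feat : (B, T, D) features from the distil model on NOISY audio
    teacher_feats: list of (B, T, D) features from several layers of the
                   frozen wav2vec-2.0 teacher run on the CLEAN audio."""
    if weights is None:
        weights = [1.0 / len(teacher_feats)] * len(teacher_feats)
    loss = 0.0
    for w, t in zip(weights, teacher_feats):
        loss = loss + w * F.l1_loss(student_feat, t.detach())
    return loss

# toy shapes: batch 2, 50 frames, 768-dim features, 3 teacher levels
s = torch.randn(2, 50, 768, requires_grad=True)
loss = multi_level_distill_loss(s, [torch.randn(2, 50, 768) for _ in range(3)])
loss.backward()
```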
We perform a set of flow and reactive transport simulations within three-dimensional fracture networks to learn the factors controlling mineral reactions. CO$_2$ mineralization requires CO$_2$-laden water and the dissolution of a host mineral, which then leads to the precipitation of a CO$_2$-bearing mineral. Our discrete fracture networks (DFNs) are partially filled with quartz that gradually dissolves until it reaches a quasi-steady state. At the end of the simulation, we measure the quartz remaining in each fracture within the domain. We observe that a small backbone of fractures exists where the quartz is fully dissolved, which leads to increased flow and transport. However, depending on the DFN topology and the rate of dissolution, we observe large variability in these changes, indicating an interplay between the fracture network structure and the impact of geochemical dissolution. In this work, we develop a machine learning framework to extract the important features that support mineralization in the form of dissolution, and we use structural and topological features of the fracture network to predict the remaining quartz volume under quasi-steady-state conditions. As a first step toward characterizing carbon mineralization, we study dissolution with this framework across a variety of reaction and fracture parameters. We find that the dissolution reaction rate constant of quartz and the distance to the flowing backbone in the fracture network are the two most important features controlling the amount of quartz left in the system. For the first time, we use a combination of a finite-volume reservoir model and a graph-based approach to study reactive transport in a complex fracture network and determine the key features that control dissolution.
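The kind of feature-importance analysis used to reach such a conclusion can be sketched as follows; the feature names, synthetic data, and model choice here are illustrative stand-ins, not the paper's actual pipeline:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 500
# per-fracture features (hypothetical names and distributions)
rate = rng.lognormal(-12, 1, n)        # dissolution rate constant
dist = rng.integers(0, 6, n)           # graph distance to flowing backbone
radius = rng.uniform(1, 50, n)         # fracture radius
fill = rng.uniform(0, 1, n)            # initial quartz fill fraction
X = np.column_stack([rate, dist, radius, fill])
# toy target: remaining quartz shrinks with rate and backbone proximity
y = fill * np.exp(-1e5 * rate) / (1.0 + dist)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
for name, imp in zip(["rate_constant", "dist_to_backbone",
                      "radius", "fill_fraction"],
                     model.feature_importances_):
    print(f"{name:>16s}: {imp:.3f}")
```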
In rapidly evolving domains such as autonomous driving, the use of multiple sensors with different modalities is crucial to ensure high operational precision and stability. To correctly exploit the information provided by each sensor in a single common frame, it is essential for these sensors to be accurately calibrated. In this paper, we leverage the ability of Neural Radiance Fields (NeRF) to represent different sensor modalities in a common volumetric representation to achieve robust and accurate spatio-temporal sensor calibration. By designing a partitioning approach based on the visible part of the scene for each sensor, we formulate the calibration problem using only the overlapping areas. This strategy results in a more robust and accurate calibration that is less prone to failure. We demonstrate that our approach works on outdoor urban scenes by validating it on multiple established driving datasets. Results show that our method achieves better accuracy and robustness compared to existing methods.
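A minimal sketch of the overlap idea (the voxel-hashing test and its parameters are ours, not the paper's actual partitioning): restrict the calibration objective to scene regions observed by both sensors:

```python
import numpy as np

def overlap_mask(points_a, points_b, voxel=0.5):
    """Keep only points of sensor A that fall in voxels also observed
    by sensor B, so a calibration loss evaluated on points_a[mask]
    uses the shared part of the scene only."""
    keys_b = set(map(tuple, np.floor(points_b / voxel).astype(int)))
    keys_a = np.floor(points_a / voxel).astype(int)
    return np.array([tuple(k) in keys_b for k in keys_a])

a = np.random.rand(1000, 3) * 10.0          # e.g., LiDAR-visible points
b = np.random.rand(1000, 3) * 10.0 + 2.0    # e.g., camera-visible points
mask = overlap_mask(a, b)                   # calibrate on a[mask] only
```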
The problem of identifying the k-shortest paths (KSPs for short) in a dynamic road network is essential to many location-based services. Road networks are dynamic in the sense that the weights of the edges in the corresponding graph constantly change over time, representing evolving traffic conditions. Very often such services have to process numerous KSP queries over large road networks at the same time, so there is a pressing need for distributed solutions to this problem. However, most existing approaches are designed to identify KSPs on a static graph in a sequential manner, restricting their scalability and applicability in a distributed setting. We therefore propose KSP-DG, a distributed algorithm for identifying k-shortest paths in a dynamic graph. It partitions the entire graph into smaller subgraphs and reduces the problem of determining KSPs to the computation of partial KSPs in relevant subgraphs, which can execute in parallel on a cluster of servers. A distributed two-level index called DTLP is developed to facilitate the efficient identification of relevant subgraphs. A salient feature of DTLP is that it indexes a set of virtual paths that are insensitive to varying traffic conditions in an efficient and compact fashion, leading to very low maintenance cost in dynamic road networks. To the best of our knowledge, this is the first treatment of the problem of processing KSP queries over dynamic road networks. Extensive experiments conducted on real road networks confirm the superiority of our proposal over baseline methods.
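For reference, the sequential baseline semantics that KSP-DG distributes can be reproduced in a few lines with networkx's Yen-style path enumerator (this is the classical single-machine formulation, not the distributed algorithm itself):

```python
from itertools import islice
import networkx as nx

def k_shortest_paths(G, source, target, k, weight="weight"):
    """Loopless k-shortest paths in increasing order of total weight."""
    return list(islice(
        nx.shortest_simple_paths(G, source, target, weight=weight), k))

G = nx.DiGraph()
G.add_weighted_edges_from([("a", "b", 4), ("a", "c", 2), ("b", "d", 5),
                           ("c", "b", 1), ("c", "d", 8), ("b", "c", 3)])
print(k_shortest_paths(G, "a", "d", k=3))
# [['a', 'c', 'b', 'd'], ['a', 'b', 'd'], ['a', 'c', 'd']]
```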
For robotic decision-making under uncertainty, the balance between exploitation and exploration of available options must be carefully taken into account. In this study, we introduce a new variant of contextual multi-armed bandits (CMABs) called observation-augmented CMABs (OA-CMABs), wherein a decision-making agent can utilize extra outcome observations from an external information source. CMABs model the expected option outcomes as a function of context features and hidden parameters, which are inferred from previous option outcomes. In OA-CMABs, external observations are also a function of context features and thus provide additional evidence about the hidden parameters. Yet, if an external information source is error-prone, the resulting posterior updates can harm decision-making performance unless the presence of errors is accounted for. To this end, we propose a robust Bayesian inference process for OA-CMABs based on the concept of probabilistic data validation. Our approach handles complex mixture-model parameter priors and hybrid observation likelihoods for semantic data sources, allowing us to develop validation algorithms based on recently developed probabilistic semantic data association techniques. Furthermore, to more effectively cope with the combined sources of uncertainty in OA-CMABs, we derive a new active inference algorithm for option selection based on expected free energy minimization. This generalizes previous work on active inference for bandit-based robotic decision-making by accounting for faulty observations and non-Gaussian inference. Our approaches are demonstrated on a simulated asynchronous search-site selection problem for space exploration. The results show that even when incorrect observations are provided by external information sources, efficient decision-making and robust parameter inference are still achieved in a wide variety of experimental conditions.
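To make the validation idea concrete, here is a toy Beta-Bernoulli version (a deliberate simplification: the paper handles mixture priors and hybrid semantic likelihoods, and its validation and active-inference rules are more sophisticated than this simple gate):

```python
import numpy as np

rng = np.random.default_rng(1)

class OABandit:
    """Toy observation-augmented bandit with Beta-Bernoulli arms."""
    def __init__(self, n_arms):
        self.a = np.ones(n_arms)
        self.b = np.ones(n_arms)

    def select(self):
        # Thompson sampling over the current posteriors
        return int(np.argmax(rng.beta(self.a, self.b)))

    def update(self, arm, outcome):
        self.a[arm] += outcome
        self.b[arm] += 1 - outcome

    def update_external(self, arm, outcome, p_error=0.2):
        """Gate an error-prone external report with a probabilistic
        validity check before letting it touch the posterior."""
        p1 = self.a[arm] / (self.a[arm] + self.b[arm])
        pred = p1 if outcome == 1 else 1.0 - p1   # posterior predictive
        p_valid = ((1 - p_error) * pred
                   / ((1 - p_error) * pred + p_error * 0.5))
        if rng.random() < p_valid:                # accept likely-valid data
            self.update(arm, outcome)

bandit = OABandit(n_arms=3)
arm = bandit.select()
bandit.update(arm, outcome=1)            # the agent's own pull
bandit.update_external(arm, outcome=0)   # possibly erroneous report
```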
Graph Convolutional Networks (GCNs) have been widely applied in various fields due to their power in processing graph-structured data. Typical GCNs and their variants work under a homophily assumption (i.e., nodes with the same class are prone to connect to each other), while ignoring the heterophily that exists in many real-world networks (i.e., nodes with different classes tend to form edges). Existing methods deal with heterophily mainly by aggregating higher-order neighborhoods or combining intermediate representations, which introduces noise and irrelevant information into the result. These methods do not change the propagation mechanism itself, which works under the homophily assumption and is a fundamental part of GCNs, making it difficult to distinguish the representations of nodes from different classes. To address this problem, in this paper we design a novel propagation mechanism that can automatically change the propagation and aggregation process according to the homophily or heterophily between node pairs. To adaptively learn the propagation process, we introduce two measurements of homophily degree between node pairs, learned from topological and attribute information, respectively. We then incorporate the learnable homophily degree into the graph convolution framework, which is trained in an end-to-end fashion, enabling it to go beyond the assumption of homophily. More importantly, we theoretically prove that our model can constrain the similarity of representations between nodes according to their homophily degree. Experiments on seven real-world datasets demonstrate that this new approach outperforms state-of-the-art methods under heterophily or low homophily, and achieves competitive performance under homophily.
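A sketch of the mechanism (not the paper's exact layer: the score here uses attributes only, whereas the paper learns both topology- and attribute-based homophily degrees): a learned per-edge homophily degree in [-1, 1] scales each message, so heterophilous neighbors can contribute with negative weight:

```python
import torch
import torch.nn as nn

class AdaptivePropagation(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)
        self.score = nn.Linear(2 * in_dim, 1)

    def forward(self, x, edge_index):
        src, dst = edge_index                    # (2, E) long tensor
        h = self.lin(x)
        # learned homophily degree in [-1, 1] for each edge
        g = torch.tanh(self.score(torch.cat([x[src], x[dst]], dim=-1)))
        out = torch.zeros_like(h)
        out.index_add_(0, dst, g * h[src])       # signed aggregation
        deg = torch.zeros(x.size(0), 1).index_add_(
            0, dst, torch.ones(dst.size(0), 1))
        return h + out / deg.clamp(min=1.0)

x = torch.randn(5, 8)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])
out = AdaptivePropagation(8, 16)(x, edge_index)  # (5, 16)
```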
Sampling methods (e.g., node-wise, layer-wise, or subgraph sampling) have become an indispensable strategy for speeding up the training of large-scale Graph Neural Networks (GNNs). However, existing sampling methods are mostly based on graph structural information and ignore the dynamics of optimization, which leads to high variance in estimating the stochastic gradients. The high-variance issue can be very pronounced in extremely large graphs, where it results in slow convergence and poor generalization. In this paper, we theoretically analyze the variance of sampling methods and show that, due to the composite structure of the empirical risk, the variance of any sampling method can be decomposed into \textit{embedding approximation variance} in the forward stage and \textit{stochastic gradient variance} in the backward stage, and that mitigating both types of variance is necessary to obtain a faster convergence rate. We propose a decoupled variance reduction strategy that employs (approximate) gradient information to adaptively sample nodes with minimal variance, and explicitly reduces the variance introduced by embedding approximation. We show theoretically and empirically that the proposed method, even with smaller mini-batch sizes, enjoys a faster convergence rate and achieves better generalization compared to existing methods.
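The backward-stage idea can be sketched as importance sampling with probabilities proportional to (approximate) per-node gradient norms, which is the minimum-variance choice for an unbiased estimator; the names and reweighting here are illustrative, and the paper additionally corrects the forward-stage embedding approximation:

```python
import numpy as np

def adaptive_sample(grad_norms, batch_size, rng):
    """Draw nodes with probability proportional to their gradient norm
    and return importance weights that keep the estimator unbiased."""
    p = grad_norms / grad_norms.sum()
    idx = rng.choice(len(p), size=batch_size, replace=True, p=p)
    weights = 1.0 / (len(p) * p[idx])
    return idx, weights

rng = np.random.default_rng(0)
g = rng.exponential(1.0, size=1000)      # stand-in per-node grad norms
idx, w = adaptive_sample(g, batch_size=64, rng=rng)
# an unbiased estimate of the full-batch mean (compare to g.mean()):
print((w * g[idx]).mean(), g.mean())
```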
Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch leads to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style and illumination, and 2) the instance-level shift, such as object appearance and size. We build our approach on the recent state-of-the-art Faster R-CNN model and design two domain adaptation components, at the image level and the instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory and are implemented by learning a domain classifier in an adversarial training manner. The domain classifiers at different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets including Cityscapes, KITTI, and SIM10K. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.
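The adversarial component is typically realized with a gradient reversal layer; a minimal sketch of an image-level domain classifier in that style follows (the instance-level branch and consistency regularizer are omitted, and the classifier architecture here is an assumption):

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negates (and scales) the gradient
    on the backward pass, so the backbone learns features that fool
    the domain classifier."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class ImageLevelDomainClassifier(nn.Module):
    def __init__(self, in_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, 1))

    def forward(self, feat, lam=1.0):
        return self.net(GradReverse.apply(feat, lam))

feat = torch.randn(8, 1024, requires_grad=True)   # pooled backbone feats
logits = ImageLevelDomainClassifier(1024)(feat)   # domain logits
loss = nn.functional.binary_cross_entropy_with_logits(
    logits, torch.zeros(8, 1))                    # label 0 = source domain
loss.backward()                                   # reversed grad into feat
```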