Path planners based on basic rapidly-exploring random trees (RRTs) are quick and efficient, and thus favourable for real-time robot path planning, but are almost-surely suboptimal. In contrast, the optimal RRT (RRT*) converges to the optimal solution, but may be expensive in practice. Recent work has focused on accelerating the RRT*'s convergence rate. The most successful strategies are informed sampling, path optimisation, and a combination thereof. However, informed sampling and its combination with path optimisation have not been applied to the basic RRT. Moreover, while a number of path optimisers can be used to accelerate the convergence rate, a comparison of their effectiveness is lacking. This paper investigates the use of informed sampling and path optimisation to accelerate planners based on both the basic RRT and the RRT*, resulting in a family of algorithms known as optimised informed RRTs. We apply different path optimisers and compare their effectiveness. The goal is to ascertain if applying informed sampling and path optimisation can help the quick, though almost-surely suboptimal, path planners based on the basic RRT attain comparable or better performance than RRT*-based planners. Analyses show that RRT-based optimised informed RRTs do attain better performance than their RRT*-based counterparts, both when planning time is limited and when there is more planning time.
Sparse arrays enable resolving more direction of arrivals (DoAs) than antenna elements using non-uniform arrays. This is typically achieved by reconstructing the covariance of a virtual large uniform linear array (ULA), which is then processed by subspace DoA estimators. However, these method assume that the signals are non-coherent and the array is calibrated; the latter often challenging to achieve in sparse arrays, where one cannot access the virtual array elements. In this work, we propose Sparse-SubspaceNet, which leverages deep learning to enable subspace-based DoA recovery from sparse miscallibrated arrays with coherent sources. Sparse- SubspaceNet utilizes a dedicated deep network to learn from data how to compute a surrogate virtual array covariance that is divisible into distinguishable subspaces. By doing so, we learn to cope with coherent sources and miscalibrated sparse arrays, while preserving the interpretability and the suitability of model-based subspace DoA estimators.
Deducing the 3D structure of endoscopic scenes from images is exceedingly challenging. In addition to deformation and view-dependent lighting, tubular structures like the colon present problems stemming from their self-occluding and repetitive anatomical structure. In this paper, we propose SimCol, a synthetic dataset for camera pose estimation in colonoscopy, and a novel method that explicitly learns a bimodal distribution to predict the endoscope pose. Our dataset replicates real colonoscope motion and highlights the drawbacks of existing methods. We publish 18k RGB images from simulated colonoscopy with corresponding depth and camera poses and make our data generation environment in Unity publicly available. We evaluate different camera pose prediction methods and demonstrate that, when trained on our data, they generalize to real colonoscopy sequences, and our bimodal approach outperforms prior unimodal work.
This paper presents a motion planning algorithm for quadruped locomotion based on density functions. We decompose the locomotion problem into a high-level density planner and a model predictive controller (MPC). Due to density functions having a physical interpretation through the notion of occupancy, it is intuitive to represent the environment with safety constraints. Hence, there is an ease of use to constructing the planning problem with density. The proposed method uses a simplified model of the robot into an integrator system, where the high-level plan is in a feedback form formulated through an analytically constructed density function. We then use the MPC to optimize the reference trajectory, in which a low-level PID controller is used to obtain the torque level control. The overall framework is implemented in simulation, demonstrating our feedback density planner for legged locomotion. The implementation of work is available at \url{//github.com/AndrewZheng-1011/legged_planner}
Dataset distillation is the technique of synthesizing smaller condensed datasets from large original datasets while retaining necessary information to persist the effect. In this paper, we approach the dataset distillation problem from a novel perspective: we regard minimizing the prediction discrepancy on the real data distribution between models, which are respectively trained on the large original dataset and on the small distilled dataset, as a conduit for condensing information from the raw data into the distilled version. An adversarial framework is proposed to solve the problem efficiently. In contrast to existing distillation methods involving nested optimization or long-range gradient unrolling, our approach hinges on single-level optimization. This ensures the memory efficiency of our method and provides a flexible tradeoff between time and memory budgets, allowing us to distil ImageNet-1K using a minimum of only 6.5GB of GPU memory. Under the optimal tradeoff strategy, it requires only 2.5$\times$ less memory and 5$\times$ less runtime compared to the state-of-the-art. Empirically, our method can produce synthetic datasets just 10% the size of the original, yet achieve, on average, 94% of the test accuracy of models trained on the full original datasets including ImageNet-1K, significantly surpassing state-of-the-art. Additionally, extensive tests reveal that our distilled datasets excel in cross-architecture generalization capabilities.
L1-norm regularized logistic regression models are widely used for analyzing data with binary response. In those analyses, fusing regression coefficients is useful for detecting groups of variables. This paper proposes a binomial logistic regression model with Bayesian fused lasso. Assuming a Laplace prior on regression coefficients and differences between adjacent regression coefficients enables us to perform variable selection and variable fusion simultaneously in the Bayesian framework. We also propose assuming a horseshoe prior on the differences to improve the flexibility of variable fusion. The Gibbs sampler is derived to estimate the parameters by a hierarchical expression of priors and a data-augmentation method. Using simulation studies and real data analysis, we compare the proposed methods with the existing method.
Geometric deep learning (GDL), which is based on neural network architectures that incorporate and process symmetry information, has emerged as a recent paradigm in artificial intelligence. GDL bears particular promise in molecular modeling applications, in which various molecular representations with different symmetry properties and levels of abstraction exist. This review provides a structured and harmonized overview of molecular GDL, highlighting its applications in drug discovery, chemical synthesis prediction, and quantum chemistry. Emphasis is placed on the relevance of the learned molecular features and their complementarity to well-established molecular descriptors. This review provides an overview of current challenges and opportunities, and presents a forecast of the future of GDL for molecular sciences.
Residual networks (ResNets) have displayed impressive results in pattern recognition and, recently, have garnered considerable theoretical interest due to a perceived link with neural ordinary differential equations (neural ODEs). This link relies on the convergence of network weights to a smooth function as the number of layers increases. We investigate the properties of weights trained by stochastic gradient descent and their scaling with network depth through detailed numerical experiments. We observe the existence of scaling regimes markedly different from those assumed in neural ODE literature. Depending on certain features of the network architecture, such as the smoothness of the activation function, one may obtain an alternative ODE limit, a stochastic differential equation or neither of these. These findings cast doubts on the validity of the neural ODE model as an adequate asymptotic description of deep ResNets and point to an alternative class of differential equations as a better description of the deep network limit.
We advocate the use of implicit fields for learning generative models of shapes and introduce an implicit field decoder for shape generation, aimed at improving the visual quality of the generated shapes. An implicit field assigns a value to each point in 3D space, so that a shape can be extracted as an iso-surface. Our implicit field decoder is trained to perform this assignment by means of a binary classifier. Specifically, it takes a point coordinate, along with a feature vector encoding a shape, and outputs a value which indicates whether the point is outside the shape or not. By replacing conventional decoders by our decoder for representation learning and generative modeling of shapes, we demonstrate superior results for tasks such as shape autoencoding, generation, interpolation, and single-view 3D reconstruction, particularly in terms of visual quality.
Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows to relax both the optimal value and solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.
Deep neural networks (DNNs) have been found to be vulnerable to adversarial examples resulting from adding small-magnitude perturbations to inputs. Such adversarial examples can mislead DNNs to produce adversary-selected results. Different attack strategies have been proposed to generate adversarial examples, but how to produce them with high perceptual quality and more efficiently requires more research efforts. In this paper, we propose AdvGAN to generate adversarial examples with generative adversarial networks (GANs), which can learn and approximate the distribution of original instances. For AdvGAN, once the generator is trained, it can generate adversarial perturbations efficiently for any instance, so as to potentially accelerate adversarial training as defenses. We apply AdvGAN in both semi-whitebox and black-box attack settings. In semi-whitebox attacks, there is no need to access the original target model after the generator is trained, in contrast to traditional white-box attacks. In black-box attacks, we dynamically train a distilled model for the black-box model and optimize the generator accordingly. Adversarial examples generated by AdvGAN on different target models have high attack success rate under state-of-the-art defenses compared to other attacks. Our attack has placed the first with 92.76% accuracy on a public MNIST black-box attack challenge.