We consider the motion planning problem under uncertainty and address it using probabilistic inference. A collision-free motion plan with linear stochastic dynamics is modeled by a posterior distribution. Gaussian variational inference is an optimization over the path distributions to infer this posterior within the scope of Gaussian distributions. We propose Gaussian Variational Inference Motion Planner (GVI-MP) algorithm to solve this Gaussian inference, where a natural gradient paradigm is used to iteratively update the Gaussian distribution, and the factorized structure of the joint distribution is leveraged. We show that the direct optimization over the state distributions in GVI-MP is equivalent to solving a stochastic control that has a closed-form solution. Starting from this observation, we propose our second algorithm, Proximal Gradient Covariance Steering Motion Planner (PGCS-MP), to solve the same inference problem in its stochastic control form with terminal constraints. We use a proximal gradient paradigm to solve the linear stochastic control with nonlinear collision cost, where the nonlinear cost is iteratively approximated using quadratic functions and a closed-form solution can be obtained by solving a linear covariance steering at each iteration. We evaluate the effectiveness and the performance of the proposed approaches through extensive experiments on various robot models. The code for this paper can be found in //github.com/hzyu17/VIMP.
Trade restrictions, the COVID-19 pandemic, and geopolitical conflicts has significantly exposed vulnerabilities within traditional global supply chains. These events underscore the need for organisations to establish more resilient and flexible supply chains. To address these challenges, the concept of the autonomous supply chain (ASC), characterised by predictive and self-decision-making capabilities, has recently emerged as promising solution. However, research on ASCs is relatively limited, with no existing studies on their implementations. This paper aims to address this gap by presenting an implementation of ASC using a multi-agent approach. It proposes a methodology for the analysis and design of such an agent-based ASC system (A2SC). This paper provides a concrete case study, the autonomous meat supply chain, which showcases the practical implementation of the A2SC system using the proposed methodology. Additionally, a system architecture and a toolkit for developing A2SC systems are presented. Despite with limitations, this paper demonstrates a promising approach for implementing an effective ASC system.
Overparametrization often helps improve the generalization performance. This paper presents a dual view of overparametrization suggesting that downsampling may also help generalize. Focusing on the proportional regime $m\asymp n \asymp p$, where $m$ represents the sketching size, $n$ is the sample size, and $p$ is the feature dimensionality, we investigate two out-of-sample prediction risks of the sketched ridgeless least square estimator. Our findings challenge conventional beliefs by showing that downsampling does not always harm generalization but can actually improve it in certain cases. We identify the optimal sketching size that minimizes out-of-sample prediction risks and demonstrate that the optimally sketched estimator exhibits stabler risk curves, eliminating the peaks of those for the full-sample estimator. To facilitate practical implementation, we propose an empirical procedure to determine the optimal sketching size. Finally, we extend our analysis to cover central limit theorems and misspecified models. Numerical studies strongly support our theory.
In the field of robotics, robot teleoperation for remote or hazardous environments has become increasingly vital. A major challenge is the lag between command and action, negatively affecting operator awareness, performance, and mental strain. Even with advanced technology, mitigating these delays, especially in long-distance operations, remains challenging. Current solutions largely focus on machine-based adjustments. Yet, there's a gap in using human perceptions to improve the teleoperation experience. This paper presents a unique method of sensory manipulation to help humans adapt to such delays. Drawing from motor learning principles, it suggests that modifying sensory stimuli can lessen the perception of these delays. Instead of introducing new skills, the approach uses existing motor coordination knowledge. The aim is to minimize the need for extensive training or complex automation. A study with 41 participants explored the effects of altered haptic cues in delayed teleoperations. These cues were sourced from advanced physics engines and robot sensors. Results highlighted benefits like reduced task time and improved perceptions of visual delays. Real-time haptic feedback significantly contributed to reduced mental strain and increased confidence. This research emphasizes human adaptation as a key element in robot teleoperation, advocating for improved teleoperation efficiency via swift human adaptation, rather than solely optimizing robots for delay adjustment.
Discovery of new materials has a documented history of propelling human progress for centuries and more. The behaviour of a material is a function of its composition, structure, and properties, which further depend on its processing and testing conditions. Recent developments in deep learning and natural language processing have enabled information extraction at scale from published literature such as peer-reviewed publications, books, and patents. However, this information is spread in multiple formats, such as tables, text, and images, and with little or no uniformity in reporting style giving rise to several machine learning challenges. Here, we discuss, quantify, and document these outstanding challenges in automated information extraction (IE) from materials science literature towards the creation of a large materials science knowledge base. Specifically, we focus on IE from text and tables and outline several challenges with examples. We hope the present work inspires researchers to address the challenges in a coherent fashion, providing to fillip to IE for the materials knowledge base.
Spatial attention has been widely used to improve the performance of convolutional neural networks. However, it has certain limitations. In this paper, we propose a new perspective on the effectiveness of spatial attention, which is that the spatial attention mechanism essentially solves the problem of convolutional kernel parameter sharing. However, the information contained in the attention map generated by spatial attention is not sufficient for large-size convolutional kernels. Therefore, we propose a novel attention mechanism called Receptive-Field Attention (RFA). Existing spatial attention, such as Convolutional Block Attention Module (CBAM) and Coordinated Attention (CA) focus only on spatial features, which does not fully address the problem of convolutional kernel parameter sharing. In contrast, RFA not only focuses on the receptive-field spatial feature but also provides effective attention weights for large-size convolutional kernels. The Receptive-Field Attention convolutional operation (RFAConv), developed by RFA, represents a new approach to replace the standard convolution operation. It offers nearly negligible increment of computational cost and parameters, while significantly improving network performance. We conducted a series of experiments on ImageNet-1k, COCO, and VOC datasets to demonstrate the superiority of our approach. Of particular importance, we believe that it is time to shift focus from spatial features to receptive-field spatial features for current spatial attention mechanisms. In this way, we can further improve network performance and achieve even better results. The code and pre-trained models for the relevant tasks can be found at //github.com/Liuchen1997/RFAConv.
These are self-contained lecture notes for spectral independence. For an $n$-vertex graph, the spectral independence condition is a bound on the maximum eigenvalue of the $n\times n$ influence matrix whose entries capture the influence between pairs of vertices, it is closely related to the covariance matrix. We will present recent results showing that spectral independence implies the mixing time of the Glauber dynamics is polynomial (where the degree of the polynomial depends on certain parameters). The proof utilizes local-to-global theorems which we will detail in these notes. Finally, we will present more recent results showing that spectral independence implies an optimal bound on the relaxation time (inverse spectral gap) and with some additional conditions implies an optimal mixing time bound of $O(n\log{n})$ for the Glauber dynamics. We also present the results of Anari, Liu, Oveis Gharan, and Vinzant (2019) for generating a random basis of a matroid. The analysis of the associated bases-exchange walk utilizes the local-to-global theorems used for spectral independence with the Trickle-Down Theorem of Oppenheim (2018) to analyze the local walks. Our focus in these notes is on the analysis of the spectral gap of the associated Markov chains from a functional analysis perspective, and we present proofs of the associated local-to-global theorems from this same Markov chain perspective.
High-dimensional motion generation requires numerical precision for smooth, collision-free solutions. Typically, double-precision or single-precision floating-point (FP) formats are utilized. Using these for big tensors imposes a strain on the memory bandwidth provided by the devices and alters the memory footprint, hence limiting their applicability to low-power edge devices needed for mobile robots. The uniform application of reduced precision can be advantageous but severely degrades solutions. Using decreased precision data types for important tensors, we propose to accelerate motion generation by removing memory bottlenecks. We propose variable-precision (VaPr) search optimization to determine the appropriate precision for large tensors from a vast search space of approximately 4 million unique combinations for FP data types across the tensors. To obtain the efficiency gains, we exploit existing platform support for an out-of-the-box GPU speedup and evaluate prospective precision converter units for GPU types that are not currently supported. Our experimental results on 800 planning problems for the Franka Panda robot on the MotionBenchmaker dataset across 8 environments show that a 4-bit FP format is sufficient for the largest set of tensors in the motion generation stack. With the software-only solution, VaPr achieves 6.3% and 6.3% speedups on average for a significant portion of motion generation over the SOTA solution (CuRobo) on Jetson Orin and RTX2080 Ti GPU, respectively, and 9.9%, 17.7% speedups with the FP converter.
Defensive deception is a promising approach for cyberdefense. Although defensive deception is increasingly popular in the research community, there has not been a systematic investigation of its key components, the underlying principles, and its tradeoffs in various problem settings. This survey paper focuses on defensive deception research centered on game theory and machine learning, since these are prominent families of artificial intelligence approaches that are widely employed in defensive deception. This paper brings forth insights, lessons, and limitations from prior work. It closes with an outline of some research directions to tackle major gaps in current defensive deception research.
Generative adversarial networks (GANs) have been extensively studied in the past few years. Arguably their most significant impact has been in the area of computer vision where great advances have been made in challenges such as plausible image generation, image-to-image translation, facial attribute manipulation and similar domains. Despite the significant successes achieved to date, applying GANs to real-world problems still poses significant challenges, three of which we focus on here. These are: (1) the generation of high quality images, (2) diversity of image generation, and (3) stable training. Focusing on the degree to which popular GAN technologies have made progress against these challenges, we provide a detailed review of the state of the art in GAN-related research in the published scientific literature. We further structure this review through a convenient taxonomy we have adopted based on variations in GAN architectures and loss functions. While several reviews for GANs have been presented to date, none have considered the status of this field based on their progress towards addressing practical challenges relevant to computer vision. Accordingly, we review and critically discuss the most popular architecture-variant, and loss-variant GANs, for tackling these challenges. Our objective is to provide an overview as well as a critical analysis of the status of GAN research in terms of relevant progress towards important computer vision application requirements. As we do this we also discuss the most compelling applications in computer vision in which GANs have demonstrated considerable success along with some suggestions for future research directions. Code related to GAN-variants studied in this work is summarized on //github.com/sheqi/GAN_Review.
Generative adversarial networks (GANs) have been extensively studied in the past few years. Arguably the revolutionary techniques are in the area of computer vision such as plausible image generation, image to image translation, facial attribute manipulation and similar domains. Despite the significant success achieved in computer vision field, applying GANs over real-world problems still have three main challenges: (1) High quality image generation; (2) Diverse image generation; and (3) Stable training. Considering numerous GAN-related research in the literature, we provide a study on the architecture-variants and loss-variants, which are proposed to handle these three challenges from two perspectives. We propose loss and architecture-variants for classifying most popular GANs, and discuss the potential improvements with focusing on these two aspects. While several reviews for GANs have been presented, there is no work focusing on the review of GAN-variants based on handling challenges mentioned above. In this paper, we review and critically discuss 7 architecture-variant GANs and 9 loss-variant GANs for remedying those three challenges. The objective of this review is to provide an insight on the footprint that current GANs research focuses on the performance improvement. Code related to GAN-variants studied in this work is summarized on //github.com/sheqi/GAN_Review.