Target detection is a basic task to divide the object types in the orchard point cloud global map, which is used to count the overall situation of the orchard. And provide necessary information for unmanned navigation planning of agricultural vehicles. In order to divide the fruit trees and the ground in the point cloud global map of the standardized orchard, and provide the orchard overall information for the path planning of autonomous vehicles in the natural orchard environment. A fruit tree detection method based on the Yolo-V7 network is proposed, which can effectively detect fruit tree targets from multi-sensor fused radar point cloud, reduce the 3D point cloud information of the point cloud map to 2D for the fruit tree point cloud in the Yolo-V7 network detection map, and project the prediction results into the point cloud map. Generally, the target detection network based on PointNet has the problem of low speed and large computational load. The method proposed in this paper is fast and low computational load and is suitable for deployment in mobile robots. From the experimental results, the recall rate and accuracy rate of the proposed method in orchard fruit tree detection are 0.4 and 0.696 respectively, and its weight and reasoning time are 7.4 M and 28 ms respectively. The experimental results show that this method can achieve the robustness and efficiency of real-time detection of orchard fruit trees.
Identifying defect patterns in a wafer map during manufacturing is crucial to find the root cause of the underlying issue and provides valuable insights on improving yield in the foundry. Currently used methods use deep neural networks to identify the defects. These methods are generally very huge and have significant inference time. They also require GPU support to efficiently operate. All these issues make these models not fit for on-line prediction in the manufacturing foundry. In this paper, we propose an extremely simple yet effective technique to extract features from wafer images. The proposed method is extremely fast, intuitive, and non-parametric while being explainable. The experiment results show that the proposed pipeline outperforms conventional deep learning models. Our feature extraction requires no training or fine-tuning while preserving the relative shape and location of data points as revealed by our interpretability analysis.
We propose a data structure in $d$-dimensional hyperbolic space that can be considered a natural counterpart to quadtrees in Euclidean spaces. Based on this data structure we propose a so-called L-order for hyperbolic point sets, which is an extension of the Z-order defined in Euclidean spaces. Using these quadtrees and the L-order we build geometric spanners. Near-linear size $(1+\epsilon)$-spanners do not exist in hyperbolic spaces, but we are able to create a Steiner spanner that achieves a spanning ratio of $1+\epsilon$ with $\mathcal O_{d,\epsilon}(n)$ edges, using a simple construction that can be maintained dynamically. As a corollary we also get a $(2+\epsilon)$-spanner (in the classical sense) of the same size, where the spanning ratio $2+\epsilon$ is almost optimal among spanners of subquadratic size. Finally, we show that our Steiner spanner directly provides a solution to the approximate nearest neighbour problem: given a point set $P$ in $d$-dimensional hyperbolic space we build the data structure in $\mathcal O_{d,\epsilon}(n\log n)$ time, using $\mathcal O_{d,\epsilon}(n)$ space. Then for any query point $q$ we can find a point $p\in P$ that is at most $1+\epsilon$ times farther from $q$ than its nearest neighbour in $P$ in $\mathcal O_{d,\epsilon}(\log n)$ time. Moreover, the data structure is dynamic and can handle point insertions and deletions with update time $\mathcal O_{d,\epsilon}(\log n)$.
Interacting with the actual environment to acquire data is often costly and time-consuming in robotic tasks. Model-based offline reinforcement learning (RL) provides a feasible solution. On the one hand, it eliminates the requirements of interaction with the actual environment. On the other hand, it learns the transition dynamics and reward function from the offline datasets and generates simulated rollouts to accelerate training. Previous model-based offline RL methods adopt probabilistic ensemble neural networks (NN) to model aleatoric uncertainty and epistemic uncertainty. However, this results in an exponential increase in training time and computing resource requirements. Furthermore, these methods are easily disturbed by the accumulative errors of the environment dynamics models when simulating long-term rollouts. To solve the above problems, we propose an uncertainty-aware sequence modeling architecture called Environment Transformer. It models the probability distribution of the environment dynamics and reward function to capture aleatoric uncertainty and treats epistemic uncertainty as a learnable noise parameter. Benefiting from the accurate modeling of the transition dynamics and reward function, Environment Transformer can be combined with arbitrary planning, dynamics programming, or policy optimization algorithms for offline RL. In this case, we perform Conservative Q-Learning (CQL) to learn a conservative Q-function. Through simulation experiments, we demonstrate that our method achieves or exceeds state-of-the-art performance in widely studied offline RL benchmarks. Moreover, we show that Environment Transformer's simulated rollout quality, sample efficiency, and long-term rollout simulation capability are superior to those of previous model-based offline RL methods.
We consider dynamic pricing strategies in a streamed longitudinal data set-up where the objective is to maximize, over time, the cumulative profit across a large number of customer segments. We consider a dynamic model with the consumers' preferences as well as price sensitivity varying over time. Building on the well-known finding that consumers sharing similar characteristics act in similar ways, we consider a global shrinkage structure, which assumes that the consumers' preferences across the different segments can be well approximated by a spatial autoregressive (SAR) model. In such a streamed longitudinal set-up, we measure the performance of a dynamic pricing policy via regret, which is the expected revenue loss compared to a clairvoyant that knows the sequence of model parameters in advance. We propose a pricing policy based on penalized stochastic gradient descent (PSGD) and explicitly characterize its regret as functions of time, the temporal variability in the model parameters as well as the strength of the auto-correlation network structure spanning the varied customer segments. Our regret analysis results not only demonstrate asymptotic optimality of the proposed policy but also show that for policy planning it is essential to incorporate available structural information as policies based on unshrunken models are highly sub-optimal in the aforementioned set-up. We conduct simulation experiments across a wide range of regimes as well as real-world networks based studies and report encouraging performance for our proposed method.
Coverage path planning (CPP) is the problem of finding a path that covers the entire free space of a confined area, with applications ranging from robotic lawn mowing and vacuum cleaning, to demining and search-and-rescue tasks. While offline methods can find provably complete, and in some cases optimal, paths for known environments, their value is limited in online scenarios where the environment is not known beforehand. In this case, the path needs to be planned online while mapping the environment. We investigate how suitable reinforcement learning is for this challenging problem, and analyze the involved components required to efficiently learn coverage paths, such as action space, input feature representation, neural network architecture, and reward function. Compared to existing classical methods, this approach allows for a flexible path space, and enables the agent to adapt to specific environment dynamics. In addition to local sensory inputs for acting on short-term obstacle detections, we propose to use egocentric maps in multiple scales based on frontiers. This allows the agent to plan a long-term path in large-scale environments with feasible computational and memory complexity. Furthermore, we propose a novel total variation reward term for guiding the agent not to leave small holes of non-covered free space. To validate the effectiveness of our approach, we perform extensive experiments in simulation with a 2D ranging sensor on different variations of the CPP problem, surpassing the performance of both previous RL-based approaches and highly specialized methods.
Anomaly detection is an important field that aims to identify unexpected patterns or data points, and it is closely related to many real-world problems, particularly to applications in finance, manufacturing, cyber security, and so on. While anomaly detection has been studied extensively in various fields, detecting future anomalies before they occur remains an unexplored territory. In this paper, we present a novel type of anomaly detection, called Precursor-of-Anomaly (PoA) detection. Unlike conventional anomaly detection, which focuses on determining whether a given time series observation is an anomaly or not, PoA detection aims to detect future anomalies before they happen. To solve both problems at the same time, we present a neural controlled differential equation-based neural network and its multi-task learning algorithm. We conduct experiments using 17 baselines and 3 datasets, including regular and irregular time series, and demonstrate that our presented method outperforms the baselines in almost all cases. Our ablation studies also indicate that the multitasking training method significantly enhances the overall performance for both anomaly and PoA detection.
In color spaces where the chromatic term is given in polar coordinates, the shortest distance between colors of the same value is circular. By converting such a space into a complex polar form with a real-valued value axis, a color algebra for combining colors is immediately available. In this work, we introduce two complex space operations utilizing this observation: circular average filtering and circular linear interpolation. These operations produce Archimedean Spirals, thus guaranteeing that they operate along the shortest paths. We demonstrate that these operations provide an intuitive way to work in certain color spaces and that they are particularly useful for obtaining better filtering and interpolation results. We present a set of examples based on the perceptually uniform color space CIELAB or L*a*b* with its polar form CIEHLC. We conclude that representing colors in a complex space with circular operations can provide better visual results by exploitation of the strong algebraic properties of complex space C.
Graphs are used widely to model complex systems, and detecting anomalies in a graph is an important task in the analysis of complex systems. Graph anomalies are patterns in a graph that do not conform to normal patterns expected of the attributes and/or structures of the graph. In recent years, graph neural networks (GNNs) have been studied extensively and have successfully performed difficult machine learning tasks in node classification, link prediction, and graph classification thanks to the highly expressive capability via message passing in effectively learning graph representations. To solve the graph anomaly detection problem, GNN-based methods leverage information about the graph attributes (or features) and/or structures to learn to score anomalies appropriately. In this survey, we review the recent advances made in detecting graph anomalies using GNN models. Specifically, we summarize GNN-based methods according to the graph type (i.e., static and dynamic), the anomaly type (i.e., node, edge, subgraph, and whole graph), and the network architecture (e.g., graph autoencoder, graph convolutional network). To the best of our knowledge, this survey is the first comprehensive review of graph anomaly detection methods based on GNNs.
Object detection is a fundamental task in computer vision and image processing. Current deep learning based object detectors have been highly successful with abundant labeled data. But in real life, it is not guaranteed that each object category has enough labeled samples for training. These large object detectors are easy to overfit when the training data is limited. Therefore, it is necessary to introduce few-shot learning and zero-shot learning into object detection, which can be named low-shot object detection together. Low-Shot Object Detection (LSOD) aims to detect objects from a few or even zero labeled data, which can be categorized into few-shot object detection (FSOD) and zero-shot object detection (ZSD), respectively. This paper conducts a comprehensive survey for deep learning based FSOD and ZSD. First, this survey classifies methods for FSOD and ZSD into different categories and discusses the pros and cons of them. Second, this survey reviews dataset settings and evaluation metrics for FSOD and ZSD, then analyzes the performance of different methods on these benchmarks. Finally, this survey discusses future challenges and promising directions for FSOD and ZSD.
Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch will lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc, and 2) the instance-level shift, such as object appearance, size, etc. We build our approach based on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, on image level and instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory, and are implemented by learning a domain classifier in adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets including Cityscapes, KITTI, SIM10K, etc. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.