亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

The design of a tiny machine learning model, which can be deployed in mobile and edge devices, for point cloud object classification is investigated in this work. To achieve this objective, we replace the multi-scale representation of a point cloud object with a single-scale representation for complexity reduction, and exploit rich 3D geometric information of a point cloud object for performance improvement. The proposed solution is named Green-PointHop due to its low computational complexity. We evaluate the performance of Green-PointHop on ModelNet40 and ScanObjectNN two datasets. Green-PointHop has a model size of 64K parameters. It demands 2.3M floating-point operations (FLOPs) to classify a ModelNet40 object of 1024 down-sampled points. Its classification performance gaps against the state-of-the-art DGCNN method are 3% and 7% for ModelNet40 and ScanObjectNN, respectively. On the other hand, the model size and inference complexity of DGCNN are 42X and 1203X of those of Green-PointHop, respectively.

相關內容

Lifelong graph learning deals with the problem of continually adapting graph neural network (GNN) models to changes in evolving graphs. We address two critical challenges of lifelong graph learning in this work: dealing with new classes and tackling imbalanced class distributions. The combination of these two challenges is particularly relevant since newly emerging classes typically resemble only a tiny fraction of the data, adding to the already skewed class distribution. We make several contributions: First, we show that the amount of unlabeled data does not influence the results, which is an essential prerequisite for lifelong learning on a sequence of tasks. Second, we experiment with different label rates and show that our methods can perform well with only a tiny fraction of annotated nodes. Third, we propose the gDOC method to detect new classes under the constraint of having an imbalanced class distribution. The critical ingredient is a weighted binary cross-entropy loss function to account for the class imbalance. Moreover, we demonstrate combinations of gDOC with various base GNN models such as GraphSAGE, Simplified Graph Convolution, and Graph Attention Networks. Lastly, our k-neighborhood time difference measure provably normalizes the temporal changes across different graph datasets. With extensive experimentation, we find that the proposed gDOC method is consistently better than a naive adaption of DOC to graphs. Specifically, in experiments using the smallest history size, the out-of-distribution detection score of gDOC is 0.09 compared to 0.01 for DOC. Furthermore, gDOC achieves an Open-F1 score, a combined measure of in-distribution classification and out-of-distribution detection, of 0.33 compared to 0.25 of DOC (32% increase).

In order to deal with the sparse and unstructured raw point clouds, LiDAR based 3D object detection research mostly focuses on designing dedicated local point aggregators for fine-grained geometrical modeling. In this paper, we revisit the local point aggregators from the perspective of allocating computational resources. We find that the simplest pillar based models perform surprisingly well considering both accuracy and latency. Additionally, we show that minimal adaptions from the success of 2D object detection, such as enlarging receptive field, significantly boost the performance. Extensive experiments reveal that our pillar based networks with modernized designs in terms of architecture and training render the state-of-the-art performance on the two popular benchmarks: Waymo Open Dataset and nuScenes. Our results challenge the common intuition that the detailed geometry modeling is essential to achieve high performance for 3D object detection.

Infrastructure managers must maintain high standards to ensure user satisfaction during the lifecycle of infrastructures. Surveillance cameras and visual inspections have enabled progress in automating the detection of anomalous features and assessing the occurrence of deterioration. However, collecting damage data is typically time consuming and requires repeated inspections. The one-class damage detection approach has an advantage in that normal images can be used to optimize model parameters. Additionally, visual evaluation of heatmaps enables us to understand localized anomalous features. The authors highlight damage vision applications utilized in the robust property and localized damage explainability. First, we propose a civil-purpose application for automating one-class damage detection reproducing a fully convolutional data description (FCDD) as a baseline model. We have obtained accurate and explainable results demonstrating experimental studies on concrete damage and steel corrosion in civil engineering. Additionally, to develop a more robust application, we applied our method to another outdoor domain that contains complex and noisy backgrounds using natural disaster datasets collected using various devices. Furthermore, we propose a valuable solution of deeper FCDDs focusing on other powerful backbones to improve the performance of damage detection and implement ablation studies on disaster datasets. The key results indicate that the deeper FCDDs outperformed the baseline FCDD on datasets representing natural disaster damage caused by hurricanes, typhoons, earthquakes, and four-event disasters.

Efficient inference for object detection networks is a major challenge on edge devices. Post-Training Quantization (PTQ), which transforms a full-precision model into low bit-width directly, is an effective and convenient approach to reduce model inference complexity. But it suffers severe accuracy drop when applied to complex tasks such as object detection. PTQ optimizes the quantization parameters by different metrics to minimize the perturbation of quantization. The p-norm distance of feature maps before and after quantization, Lp, is widely used as the metric to evaluate perturbation. For the specialty of object detection network, we observe that the parameter p in Lp metric will significantly influence its quantization performance. We indicate that using a fixed hyper-parameter p does not achieve optimal quantization performance. To mitigate this problem, we propose a framework, DetPTQ, to assign different p values for quantizing different layers using an Object Detection Output Loss (ODOL), which represents the task loss of object detection. DetPTQ employs the ODOL-based adaptive Lp metric to select the optimal quantization parameters. Experiments show that our DetPTQ outperforms the state-of-the-art PTQ methods by a significant margin on both 2D and 3D object detectors. For example, we achieve 31.1/31.7(quantization/full-precision) mAP on RetinaNet-ResNet18 with 4-bit weight and 4-bit activation.

Graph Transformer is gaining increasing attention in the field of machine learning and has demonstrated state-of-the-art performance on benchmarks for graph representation learning. However, as current implementations of Graph Transformer primarily focus on learning representations of small-scale graphs, the quadratic complexity of the global self-attention mechanism presents a challenge for full-batch training when applied to larger graphs. Additionally, conventional sampling-based methods fail to capture necessary high-level contextual information, resulting in a significant loss of performance. In this paper, we introduce the Hierarchical Scalable Graph Transformer (HSGT) as a solution to these challenges. HSGT successfully scales the Transformer architecture to node representation learning tasks on large-scale graphs, while maintaining high performance. By utilizing graph hierarchies constructed through coarsening techniques, HSGT efficiently updates and stores multi-scale information in node embeddings at different levels. Together with sampling-based training methods, HSGT effectively captures and aggregates multi-level information on the hierarchical graph using only Transformer blocks. Empirical evaluations demonstrate that HSGT achieves state-of-the-art performance on large-scale benchmarks with graphs containing millions of nodes with high efficiency.

The rapid advancements in machine learning, graphics processing technologies and availability of medical imaging data has led to a rapid increase in use of machine learning models in the medical domain. This was exacerbated by the rapid advancements in convolutional neural network (CNN) based architectures, which were adopted by the medical imaging community to assist clinicians in disease diagnosis. Since the grand success of AlexNet in 2012, CNNs have been increasingly used in medical image analysis to improve the efficiency of human clinicians. In recent years, three-dimensional (3D) CNNs have been employed for analysis of medical images. In this paper, we trace the history of how the 3D CNN was developed from its machine learning roots, brief mathematical description of 3D CNN and the preprocessing steps required for medical images before feeding them to 3D CNNs. We review the significant research in the field of 3D medical imaging analysis using 3D CNNs (and its variants) in different medical areas such as classification, segmentation, detection, and localization. We conclude by discussing the challenges associated with the use of 3D CNNs in the medical imaging domain (and the use of deep learning models, in general) and possible future trends in the field.

In Multi-Label Text Classification (MLTC), one sample can belong to more than one class. It is observed that most MLTC tasks, there are dependencies or correlations among labels. Existing methods tend to ignore the relationship among labels. In this paper, a graph attention network-based model is proposed to capture the attentive dependency structure among the labels. The graph attention network uses a feature matrix and a correlation matrix to capture and explore the crucial dependencies between the labels and generate classifiers for the task. The generated classifiers are applied to sentence feature vectors obtained from the text feature extraction network (BiLSTM) to enable end-to-end training. Attention allows the system to assign different weights to neighbor nodes per label, thus allowing it to learn the dependencies among labels implicitly. The results of the proposed model are validated on five real-world MLTC datasets. The proposed model achieves similar or better performance compared to the previous state-of-the-art models.

Graph classification aims to perform accurate information extraction and classification over graphstructured data. In the past few years, Graph Neural Networks (GNNs) have achieved satisfactory performance on graph classification tasks. However, most GNNs based methods focus on designing graph convolutional operations and graph pooling operations, overlooking that collecting or labeling graph-structured data is more difficult than grid-based data. We utilize meta-learning for fewshot graph classification to alleviate the scarce of labeled graph samples when training new tasks.More specifically, to boost the learning of graph classification tasks, we leverage GNNs as graph embedding backbone and meta-learning as training paradigm to capture task-specific knowledge rapidly in graph classification tasks and transfer them to new tasks. To enhance the robustness of meta-learner, we designed a novel step controller driven by Reinforcement Learning. The experiments demonstrate that our framework works well compared to baselines.

While deep learning strategies achieve outstanding results in computer vision tasks, one issue remains. The current strategies rely heavily on a huge amount of labeled data. In many real-world problems it is not feasible to create such an amount of labeled training data. Therefore, researchers try to incorporate unlabeled data into the training process to reach equal results with fewer labels. Due to a lot of concurrent research, it is difficult to keep track of recent developments. In this survey we provide an overview of often used techniques and methods in image classification with fewer labels. We compare 21 methods. In our analysis we identify three major trends. 1. State-of-the-art methods are scaleable to real world applications based on their accuracy. 2. The degree of supervision which is needed to achieve comparable results to the usage of all labels is decreasing. 3. All methods share common techniques while only few methods combine these techniques to achieve better performance. Based on all of these three trends we discover future research opportunities.

Advanced methods of applying deep learning to structured data such as graphs have been proposed in recent years. In particular, studies have focused on generalizing convolutional neural networks to graph data, which includes redefining the convolution and the downsampling (pooling) operations for graphs. The method of generalizing the convolution operation to graphs has been proven to improve performance and is widely used. However, the method of applying downsampling to graphs is still difficult to perform and has room for improvement. In this paper, we propose a graph pooling method based on self-attention. Self-attention using graph convolution allows our pooling method to consider both node features and graph topology. To ensure a fair comparison, the same training procedures and model architectures were used for the existing pooling methods and our method. The experimental results demonstrate that our method achieves superior graph classification performance on the benchmark datasets using a reasonable number of parameters.

北京阿比特科技有限公司