
High-precision point cloud anomaly detection is the gold standard for identifying defects in advanced machining and precision manufacturing. Despite some methodological advances in this area, the scarcity of datasets and the lack of a systematic benchmark hinder its development. We introduce Real3D-AD, a challenging high-precision point cloud anomaly detection dataset that addresses these limitations. With 1,254 high-resolution 3D items, each containing from forty thousand to millions of points, Real3D-AD is the largest dataset for high-precision 3D industrial anomaly detection to date. Real3D-AD surpasses existing 3D anomaly detection datasets in point cloud resolution (0.0010 mm to 0.0015 mm), 360-degree coverage, and perfect prototypes. Additionally, we present a comprehensive benchmark for Real3D-AD, revealing the absence of baseline methods for high-precision point cloud anomaly detection. To address this, we propose Reg3D-AD, a registration-based 3D anomaly detection method incorporating a novel feature memory bank that preserves local and global representations. Extensive experiments on the Real3D-AD dataset highlight the effectiveness of Reg3D-AD. For reproducibility and accessibility, we provide the Real3D-AD dataset, benchmark source code, and Reg3D-AD on our website: //github.com/M-3LAB/Real3D-AD.

Related content

In data mining, anomaly detection is the identification of items, events, or observations that do not conform to an expected pattern or to the other items in a dataset. Anomalous items typically translate into problems such as bank fraud, structural defects, medical problems, or errors in text. Anomalies are also referred to as outliers, novelties, noise, deviations, and exceptions. In particular, when detecting abuse and network intrusion, the interesting objects are often not rare objects but unexpected bursts of activity. This pattern does not follow the usual statistical definition of an outlier as a rare object, so many anomaly detection methods (especially unsupervised ones) fail on such data unless it has been aggregated appropriately. Conversely, cluster analysis algorithms may be able to detect the micro-clusters formed by these patterns. There are three broad categories of anomaly detection methods. [1]
Under the assumption that the majority of instances in the dataset are normal, unsupervised anomaly detection methods detect anomalies in unlabeled test data by looking for the instances that fit least well with the rest of the data. Supervised anomaly detection methods require a dataset that has been labeled as "normal" and "abnormal", and involve training a classifier (the key difference from many other statistical classification problems is the inherently unbalanced nature of anomaly detection). Semi-supervised anomaly detection methods build a model representing normal behavior from a given normal training dataset, and then test the likelihood that a test instance was generated by the learned model.
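The semi-supervised approach described above can be sketched in a few lines. This is a minimal illustration and not a method from any of the papers listed here: it assumes one-dimensional data, a Gaussian model of normal behavior, and a threshold set to the lowest log-likelihood seen on the training data. The function names and sample values are hypothetical.

```python
import math
import statistics


def fit_normal_model(train):
    """Fit a 1-D Gaussian to training values that are assumed to be all normal."""
    mu = statistics.fmean(train)
    sigma = statistics.stdev(train)
    return mu, sigma


def log_likelihood(x, model):
    """Log-density of x under the fitted Gaussian."""
    mu, sigma = model
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def is_anomaly(x, model, threshold):
    """Flag x when the model considers it less likely than the threshold."""
    return log_likelihood(x, model) < threshold


# Hypothetical measurements of a part dimension, all from normal items.
train = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
model = fit_normal_model(train)

# Assumption: use the least likely training value as the decision threshold.
threshold = min(log_likelihood(x, model) for x in train)

for value in (10.05, 14.0):
    print(value, "anomaly" if is_anomaly(value, model, threshold) else "normal")
```

A value close to the training mean scores above the threshold, while a value far outside the training range scores below it and is flagged. Real detectors use richer models and a threshold chosen on held-out data.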

Recent studies have shown that attackers can catastrophically reduce the performance of GNNs by maliciously modifying the graph structure or node features on the graph. Adversarial training, which has been shown to be one of the most effective defense mechanisms against adversarial attacks in computer vision, holds great promise for enhancing the robustness of GNNs. There is limited research on defending against attacks by performing adversarial training on graphs, and it is crucial to delve deeper into this approach to optimize its effectiveness. Therefore, based on robust adversarial training on graphs, we propose a hierarchical constraint refinement framework (HC-Ref) that enhances the anti-perturbation capabilities of GNNs and downstream classifiers separately, ultimately leading to improved robustness. We propose corresponding adversarial regularization terms that are conducive to adaptively narrowing the domain gap between the normal part and the perturbation part according to the characteristics of different layers, promoting the smoothness of the predicted distribution of both parts. Moreover, existing research on graph robust adversarial training primarily concentrates on training from the standpoint of node feature perturbations and seldom takes into account alterations in the graph structure. This limitation makes it challenging to prevent attacks based on topological changes in the graph. This paper generates adversarial examples by utilizing graph structure perturbations, offering an effective approach to defend against attack methods that are based on topological changes. Extensive experiments on two real-world graph benchmarks show that HC-Ref successfully resists various attacks and has better node classification performance compared to several baseline methods.

While several long-form VideoQA datasets have been introduced, the lengths of both the videos used to curate questions and the sub-clips of clues leveraged to answer those questions have not yet reached the criteria for genuine long-form video understanding. Moreover, their QAs are unduly narrow and modality-biased, lacking a wider view of understanding long-term video content with rich dynamics and complex narratives. To remedy this, we introduce MoVQA, a long-form movie question-answering dataset and benchmark that assesses the diverse cognitive capabilities of multimodal systems across multi-level temporal lengths, considering both video length and clue length. Additionally, to take a step towards human-level understanding of long-form video, versatile and multimodal question-answering is designed from the moviegoer's perspective to assess model capabilities on various perceptual and cognitive axes. Analysis involving various baselines reveals a consistent trend: the performance of all methods deteriorates significantly with increasing video and clue length. Meanwhile, our established baseline method shows some improvements, but there is still ample scope for enhancement on our challenging MoVQA dataset. We expect MoVQA to provide a new perspective and encourage inspiring work on long-form video understanding.

Model-based control requires an accurate model of the system dynamics for precisely and safely controlling the robot in complex and dynamic environments. Moreover, in the presence of variations in the operating conditions, the model should be continuously refined to compensate for dynamics changes. In this paper, we present a self-supervised learning approach that actively models the dynamics of nonlinear robotic systems. We combine offline learning from past experience and online learning from current robot interaction with the unknown environment. These two ingredients enable a highly sample-efficient and adaptive learning process, capable of accurately inferring model dynamics in real-time even in operating regimes that greatly differ from the training distribution. Moreover, we design an uncertainty-aware model predictive controller that is heuristically conditioned to the aleatoric (data) uncertainty of the learned dynamics. This controller actively chooses the optimal control actions that (i) optimize the control performance and (ii) improve the efficiency of online learning sample collection. We demonstrate the effectiveness of our method through a series of challenging real-world experiments using a quadrotor system. Our approach showcases high resilience and generalization capabilities by consistently adapting to unseen flight conditions, while it significantly outperforms classical and adaptive control baselines.

We present LaMPilot, a novel framework for planning in the field of autonomous driving, rethinking the task as a code-generation process that leverages established behavioral primitives. This approach aims to address the challenge of interpreting and executing spontaneous user instructions such as "overtake the car ahead," which have typically posed difficulties for existing frameworks. We introduce the LaMPilot benchmark specifically designed to quantitatively evaluate the efficacy of Large Language Models (LLMs) in translating human directives into actionable driving policies. We then evaluate a wide range of state-of-the-art code generation language models on tasks from the LaMPilot Benchmark. The results of the experiments showed that GPT-4, with human feedback, achieved an impressive task completion rate of 92.7% and a minimal collision rate of 0.9%. To encourage further investigation in this area, our code and dataset will be made available.

Fully homomorphic encryption (FHE) is a promising cryptographic primitive for realizing private neural network inference (PI) services by allowing a client to fully offload the inference task to a cloud server while keeping the client data oblivious to the server. This work proposes NeuJeans, an FHE-based solution for the PI of deep convolutional neural networks (CNNs). NeuJeans tackles the critical problem of the enormous computational cost for the FHE evaluation of convolutional layers (conv2d), mainly due to the high cost of data reordering and bootstrapping. We first propose an encoding method introducing nested structures inside encoded vectors for FHE, which enables us to develop efficient conv2d algorithms with reduced data reordering costs. However, the new encoding method also introduces additional computations for conversion between encoding methods, which could negate its advantages. We discover that fusing conv2d with bootstrapping eliminates such computations while reducing the cost of bootstrapping. Then, we devise optimized execution flows for various types of conv2d and apply them to end-to-end implementation of CNNs. NeuJeans accelerates conv2d by up to 5.68 times compared to state-of-the-art FHE-based PI work and performs the PI of a CNN at the scale of ImageNet (ResNet18) within a mere few seconds.

With the exponential surge in diverse multi-modal data, traditional uni-modal retrieval methods struggle to meet the needs of users demanding access to data from various modalities. To address this, cross-modal retrieval has emerged, enabling interaction across modalities, facilitating semantic matching, and leveraging complementarity and consistency between different modal data. Although prior literature undertook a review of the cross-modal retrieval field, it exhibits numerous deficiencies pertaining to timeliness, taxonomy, and comprehensiveness. This paper conducts a comprehensive review of cross-modal retrieval's evolution, spanning from shallow statistical analysis techniques to vision-language pre-training models. Commencing with a comprehensive taxonomy grounded in machine learning paradigms, mechanisms, and models, the paper then delves deeply into the principles and architectures underpinning existing cross-modal retrieval methods. Furthermore, it offers an overview of widely used benchmarks, metrics, and performances. Lastly, the paper probes the prospects and challenges that confront contemporary cross-modal retrieval, while engaging in a discourse on potential directions for further progress in the field. To facilitate the research on cross-modal retrieval, we develop an open-source code repository at //github.com/BMC-SDNU/Cross-Modal-Retrieval.

Knowledge graph embedding models learn representations of the entities and relations in knowledge graphs in order to predict missing links (relations) between entities. Their effectiveness is deeply affected by their ability to model and infer different relation patterns such as symmetry, asymmetry, inversion, composition and transitivity. Although existing models are already able to model many of these relation patterns, transitivity, a very common relation pattern, is still not fully supported. In this paper, we first theoretically show that transitive relations can be modeled with projections. We then propose the Rot-Pro model, which combines projection and relational rotation. We prove that Rot-Pro can infer all the above relation patterns. Experimental results show that the proposed Rot-Pro model effectively learns the transitivity pattern and achieves state-of-the-art results on the link prediction task in datasets containing transitive relations.

We study the problem of efficient semantic segmentation for large-scale 3D point clouds. By relying on expensive sampling techniques or computationally heavy pre/post-processing steps, most existing approaches can only be trained on and operate over small-scale point clouds. In this paper, we introduce RandLA-Net, an efficient and lightweight neural architecture to directly infer per-point semantics for large-scale point clouds. The key to our approach is to use random point sampling instead of more complex point selection approaches. Although remarkably computation and memory efficient, random sampling can discard key features by chance. To overcome this, we introduce a novel local feature aggregation module to progressively increase the receptive field for each 3D point, thereby effectively preserving geometric details. Extensive experiments show that our RandLA-Net can process 1 million points in a single pass, up to 200X faster than existing approaches. Moreover, our RandLA-Net clearly surpasses state-of-the-art approaches for semantic segmentation on two large-scale benchmarks, Semantic3D and SemanticKITTI.
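To make the random point sampling idea concrete, the following is a minimal sketch of uniformly downsampling a point cloud. It is not the RandLA-Net implementation: the function name, the downsampling ratio of 4, and the toy point cloud are assumptions for illustration, and the local feature aggregation module that compensates for discarded points is not shown.

```python
import random


def random_downsample(points, ratio, rng):
    """Keep roughly len(points) / ratio points, chosen uniformly at random.

    The selection cost does not depend on the geometry of the cloud,
    unlike distance-based schemes that compare points with each other.
    """
    keep = max(1, len(points) // ratio)
    indices = rng.sample(range(len(points)), keep)
    return [points[i] for i in indices]


# Hypothetical toy cloud of 1,000 points in the unit cube.
rng = random.Random(0)
cloud = [(rng.random(), rng.random(), rng.random()) for _ in range(1000)]

# Assumed ratio of 4; a network could apply this step once per layer.
sampled = random_downsample(cloud, 4, rng)
print(len(cloud), "->", len(sampled))
```

Because the kept points are picked blindly, any given point, including one on an important edge or corner, may be dropped. That is the weakness the abstract says the aggregation module is designed to offset.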

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
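The formula above can be turned into per-class loss weights with a short sketch. The effective number follows the expression given in the abstract. Taking each weight as the inverse of the effective number and rescaling the weights to sum to the number of classes are assumptions made for illustration, as are the function names and the example class counts.

```python
def effective_number(n, beta):
    """Effective number of samples: (1 - beta**n) / (1 - beta), with 0 <= beta < 1."""
    return (1.0 - beta ** n) / (1.0 - beta)


def class_balanced_weights(counts, beta):
    """Weight each class by the inverse of its effective number of samples.

    Assumption: weights are rescaled so that they sum to the number of classes.
    """
    raw = [1.0 / effective_number(n, beta) for n in counts]
    scale = len(counts) / sum(raw)
    return [w * scale for w in raw]


# Hypothetical long-tailed class counts: one head class, two tail classes.
counts = [5000, 500, 50]

for beta in (0.0, 0.99, 0.999):
    weights = class_balanced_weights(counts, beta)
    print(beta, [round(w, 3) for w in weights])
```

With `beta = 0` every class has an effective number of 1, so all classes are weighted equally. As `beta` approaches 1 the effective number approaches the raw count `n`, and the weights approach inverse-frequency re-weighting.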

Recent years have witnessed the enormous success of low-dimensional vector space representations of knowledge graphs to predict missing facts or find erroneous ones. Currently, however, it is not yet well-understood how ontological knowledge, e.g. given as a set of (existential) rules, can be embedded in a principled way. To address this shortcoming, in this paper we introduce a framework based on convex regions, which can faithfully incorporate ontological knowledge into the vector space embedding. Our technical contribution is two-fold. First, we show that some of the most popular existing embedding approaches are not capable of modelling even very simple types of rules. Second, we show that our framework can represent ontologies that are expressed using so-called quasi-chained existential rules in an exact way, such that any set of facts which is induced using that vector space embedding is logically consistent and deductively closed with respect to the input ontology.
