Purely vision-based 3D object detection is a key perception technology in current autonomous driving systems: it supplies downstream modules with the positions and categories of surrounding objects, and has attracted wide attention in both academia and industry. The proposed method substantially improves detection performance and promises to advance the practical deployment of vision-only approaches.
This work proposes a pseudo-stereo 3D object detection framework that generates a pseudo-stereo view in two ways: (1) at the image level, a virtual right image is synthesized from the left image and an estimated disparity map; (2) at the feature level, virtual right features are generated from left-image features and estimated disparity features, which is dramatically faster than image-level generation (0.0017 s vs. 1.8454 s). As of the CVPR 2022 submission deadline, the method ranked first in every category on the widely used KITTI monocular 3D object detection public leaderboard.
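The core of pseudo-stereo generation is shifting left-view content horizontally by the estimated disparity to fabricate a right view. A minimal NumPy sketch of this idea is below; it uses simple nearest-neighbour inverse warping with an assumed per-pixel disparity map, whereas the paper's feature-level generation is a learned module, so function and parameter names here are illustrative only.

```python
import numpy as np

def warp_left_to_right(feat_left, disparity):
    """Generate a virtual right-view feature map by horizontally shifting
    left-view features according to a disparity map.

    feat_left : (C, H, W) array of left-view features (or image channels).
    disparity : (H, W) array of horizontal pixel shifts.
    Nearest-neighbour inverse warping; out-of-bounds samples are zero-filled.
    """
    C, H, W = feat_left.shape
    feat_right = np.zeros_like(feat_left)
    xs = np.arange(W)
    for y in range(H):
        # a right-view pixel at x samples the left view at x + d(x)
        src = np.round(xs + disparity[y]).astype(int)
        valid = (src >= 0) & (src < W)
        feat_right[:, y, valid] = feat_left[:, y, src[valid]]
    return feat_right
```

Because this warp operates directly on backbone feature maps rather than full-resolution images, it avoids rendering a full virtual right image, which is consistent with the large speed gap the abstract reports.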
//www.zhuanzhi.ai/paper/3e54d98cd3799503389c0876bae65b11
Detecting 3D objects from a single perspective camera is a challenging problem. Anchor-free and keypoint-based models have received increasing attention recently owing to their effectiveness and simplicity. However, most of these methods are vulnerable to occluded and truncated objects. In this paper, a single-stage monocular 3D object detection model is proposed. An instance-segmentation head is integrated into model training, making the model aware of the visible shape of a target object; detection thus largely avoids interference from irrelevant regions surrounding the targets. In addition, we show that the popular IoU-based evaluation metrics, originally designed for evaluating stereo- or LiDAR-based detection methods, are insensitive to improvements in monocular 3D object detection algorithms. A novel evaluation metric, average depth similarity (ADS), is proposed for monocular 3D object detection models. Our method outperforms the baseline on both the popular and the proposed evaluation metrics while maintaining real-time efficiency.
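The abstract names the ADS metric but does not give its formula. The sketch below is a hypothetical depth-similarity score of the kind the name suggests (1 for a perfect depth match, decaying with relative error, averaged over matched objects); the paper's actual ADS definition may differ in every detail.

```python
import numpy as np

def depth_similarity(z_pred, z_gt):
    """Hypothetical per-object depth similarity: 1.0 when predicted and
    ground-truth depths match, decaying linearly with the relative depth
    error and clipped at 0. Not the paper's exact ADS formula."""
    return max(0.0, 1.0 - abs(z_pred - z_gt) / z_gt)

def average_depth_similarity(pred_depths, gt_depths):
    """Mean depth similarity over matched prediction/ground-truth pairs."""
    pairs = zip(pred_depths, gt_depths)
    return float(np.mean([depth_similarity(p, g) for p, g in pairs]))
```

A depth-centred score like this would reward the improvements that matter most for monocular detectors, whereas 3D IoU saturates near zero once depth is even slightly off, which is the insensitivity the abstract criticizes.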
Nowadays, there are outstanding strides towards a future with autonomous vehicles on our roads. While the perception systems of autonomous vehicles perform well under closed-set conditions, they still struggle to handle the unexpected. This survey provides an extensive overview of anomaly detection techniques based on camera, lidar, radar, multimodal and abstract object-level data. We provide a systematization covering the detection approach, corner-case level, suitability for online application, and further attributes. We outline the state of the art and point out current research gaps.
3D object detection is one of the core technologies in applications such as autonomous driving. Although point-cloud-based 3D object detection has made remarkable progress, it still falls short of many practical requirements. In recent years, researchers have turned to 3D object detection based on multimodal fusion, which not only improves detection accuracy but also reduces dependence on LiDAR imaging quality. In this talk, I will first introduce the fundamentals and basic methods of 3D object detection; then focus on the current state of research on multimodal-fusion 3D object detection; and finally share practical applications of multimodal-fusion 3D object detection.
Detecting and localizing objects in real 3D space, which plays a crucial role in scene understanding, is particularly difficult given only a monocular image, because geometric information is lost during image projection. We propose MonoGRNet for amodal 3D object detection from a monocular image via geometric reasoning in both the observed 2D projection and the unobserved depth dimension. MonoGRNet decomposes monocular 3D object detection into four subtasks: 2D object detection, instance-level depth estimation, projected 3D center estimation, and local corner regression. This task decomposition greatly facilitates monocular 3D object detection, allowing target 3D bounding boxes to be predicted efficiently in a single forward pass, without the object proposals, post-processing, or computationally expensive pixel-level depth estimation used by prior methods. Moreover, MonoGRNet flexibly adapts to both fully and weakly supervised learning, improving the feasibility of our framework in diverse settings. Experiments are conducted on the KITTI, Cityscapes, and MS COCO datasets. The results demonstrate the promising performance of our framework in various scenarios.
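The geometry that ties the four subtasks together is standard pinhole back-projection: once the projected 3D center and an instance-level depth are predicted, the 3D center is recovered in closed form, and locally regressed corner offsets are placed around it. A minimal sketch under that assumption follows; the intrinsics and variable names are illustrative, and in MonoGRNet these quantities come from learned network heads, not hand-set values.

```python
import numpy as np

def backproject_center(u, v, z, fx, fy, cx, cy):
    """Recover a 3D object center from its projected 2D location (u, v)
    and an instance-level depth estimate z, using the pinhole model with
    focal lengths (fx, fy) and principal point (cx, cy)."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

def assemble_box(center_3d, local_corners):
    """Place locally regressed corner offsets (8, 3) around the recovered
    center to obtain 3D bounding-box corners in camera coordinates."""
    return center_3d + local_corners
```

Because every step above is a cheap closed-form operation on per-instance predictions, the full 3D box falls out of a single forward pass, matching the abstract's claim that no pixel-level depth map is needed.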
Topic: Exploring Categorical Regularization for Domain Adaptive Object Detection
Abstract: In this paper, we tackle domain adaptive object detection, whose main challenge lies in the significant domain gap between source and target domains. Previous works attempt to explicitly align image-level and instance-level distributions to ultimately minimize the domain discrepancy. However, they still overlook matching crucial image regions and important instances across domains, which severely affects domain-shift mitigation. In this work, we propose a simple but effective categorical regularization framework to alleviate this issue. It can be applied as a plug-and-play component to a series of Domain Adaptive Faster R-CNN methods that are prominent in domain adaptive detection. Specifically, by integrating an image-level multi-label classifier into the detection backbone, we can obtain sparse but crucial image regions corresponding to categorical information, thanks to the weak localization ability of the classification manner. Meanwhile, at the instance level, we exploit the categorical consistency between image-level predictions (from the classifier) and instance-level predictions (from the detection head) as a regularization factor to automatically mine hard-aligned instances in the target domain. Extensive experiments on various domain-shift scenarios show that our method achieves significant performance gains over the original Domain Adaptive Faster R-CNN detectors. Furthermore, qualitative visualizations and analyses demonstrate the ability of our method to attend to the key regions/instances for domain adaptation.
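The instance-level part of the framework scores how well each detected instance agrees with the image-level multi-label prediction, and treats low-agreement instances as hard examples for alignment. The sketch below is one plausible way to compute such an agreement score; the scoring function and names are assumptions, not the paper's exact regularization term.

```python
import numpy as np

def category_agreement(image_probs, instance_probs):
    """Hypothetical consistency score between an image-level multi-label
    prediction (C,) and per-instance class probabilities (N, C).

    For each instance, multiply its top class confidence by the
    image-level probability of that same class. A low product signals
    disagreement, i.e. a hard-aligned instance worth extra attention
    during domain alignment."""
    inst_cls = instance_probs.argmax(axis=1)          # predicted class per instance
    agreement = image_probs[inst_cls] * instance_probs.max(axis=1)
    return agreement  # shape (N,); low value => hard-aligned instance
```

In the paper's setting the two predictions come from separate heads on a shared backbone, so a mismatch cannot be explained by differing features and is a useful signal of an instance the detector handles poorly on the target domain.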
We present a monocular Simultaneous Localization and Mapping (SLAM) system using high-level object and plane landmarks in addition to points. The resulting map is denser, more compact and more meaningful compared to point-only SLAM. We first propose a high-order graphical model to jointly infer 3D objects and layout planes from a single image, considering occlusions and semantic constraints. The extracted cuboid objects and layout planes are further optimized in a unified SLAM framework. Objects and planes provide more semantic constraints, such as Manhattan and object-supporting relationships, than points can. Experiments on various public and self-collected datasets, including ICL-NUIM and TUM mono, show that our algorithm improves camera localization accuracy compared to state-of-the-art SLAM and also generates dense maps in many structured environments.