亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

<dir id='IXGvH'><del id='H4DjM'><del id='d6AJm'></del><pre id='gtVMt'><pre id='rT1la'><option id='BybbX'><address id='YRr77'></address><bdo id='q8EoC'><tr id='9npFX'><acronym id='p3kjv'><pre id='GL5Q1'></pre></acronym><div id='Z8uvf'></div></tr></bdo></option></pre><small id='i3CvY'><address id='vNDM6'><u id='jYGju'><legend id='dDuxU'><option id='Jn3Pr'><abbr id='mXKgh'></abbr><li id='tNjiu'><pre id='gV4RB'></pre></li></option></legend><select id='MzHnm'></select></u></address></small></pre></del><sup id='ReUvX'></sup><blockquote id='vN7KG'><dt id='50XeL'></dt></blockquote><blockquote id='IcxFM'></blockquote></dir><tt id='FOyZo'></tt><u id='rKM7i'><tt id='jQOCJ'><form id='hcptW'></form></tt><td id='PQILz'><dt id='h9Zg4'></dt></td></u>

<code id='C50wi'><i id='NiIkf'><q id='NHBmi'><legend id='inpzD'><pre id='lUP1R'><style id='ae0Nm'><acronym id='cG8uG'><i id='B3mmp'><form id='LzhIf'><option id='3aHLi'><center id='ha1jM'></center></option></form></i></acronym></style><tt id='RpiOi'></tt></pre></legend></q></i></code><center id='Ji69d'></center>

<dd id='mIO58'></dd>

<style id='Rni1U'></style><sub id='KbdpA'><dfn id='vVK7D'><abbr id='DMXD1'><big id='ZTytX'><bdo id='PQvdj'></bdo></big></abbr></dfn></sub>_{<dir id='5PafL'></dir>}

·

3D · 目標檢測 · Attention · 稀疏 · 查全率/召回率 ·

2023 年 6 月 5 日

Object as Query: Lifting any 2D Object Detector to 3D Detection

Zitian Wang,Zehao Huang,Jiahui Fu,Naiyan Wang,Si Liu

from arxiv, technical report

3D object detection from multi-view images has drawn much attention over the past few years. Existing methods mainly establish 3D representations from multi-view images and adopt a dense detection head for object detection, or employ object queries distributed in 3D space to localize objects. In this paper, we design Multi-View 2D Objects guided 3D Object Detector (MV2D), which can lift any 2D object detector to multi-view 3D object detection. Since 2D detections can provide valuable priors for object existence, MV2D exploits 2D detectors to generate object queries conditioned on the rich image semantics. These dynamically generated queries help MV2D to recall objects in the field of view and show a strong capability of localizing 3D objects. For the generated queries, we design a sparse cross attention module to force them to focus on the features of specific objects, which suppresses interference from noises. The evaluation results on the nuScenes dataset demonstrate the dynamic object queries and sparse feature aggregation can promote 3D detection capability. MV2D also exhibits a state-of-the-art performance among existing methods. We hope MV2D can serve as a new baseline for future research.

相關內容

3D是英文“Three Dimensions”的簡(jian)稱，中(zhong)文是指(zhi)三(san)維(wei)、三(san)個(ge)(ge)維(wei)度(du)、三(san)個(ge)(ge)坐標，即有長、有寬、有高，換(huan)句話說，就是立體的，是相(xiang)對于只有長和寬的平面（2D）而言(yan)。

3D · 目標檢測 · 可約的 · 傳感器 · MoDELS ·

2023 年 7 月 27 日

RCM-Fusion: Radar-Camera Multi-Level Fusion for 3D Object Detection

Jisong Kim,Minjae Seong,Geonho Bang,Dongsuk Kum,Jun Won Choi

from arxiv, 10 pages, 5 figures

While LiDAR sensors have been succesfully applied to 3D object detection, the affordability of radar and camera sensors has led to a growing interest in fusiong radars and cameras for 3D object detection. However, previous radar-camera fusion models have not been able to fully utilize radar information in that initial 3D proposals were generated based on the camera features only and the instance-level fusion is subsequently conducted. In this paper, we propose radar-camera multi-level fusion (RCM-Fusion), which fuses radar and camera modalities at both the feature-level and instance-level to fully utilize radar information. At the feature-level, we propose a Radar Guided BEV Encoder which utilizes radar Bird's-Eye-View (BEV) features to transform image features into precise BEV representations and then adaptively combines the radar and camera BEV features. At the instance-level, we propose a Radar Grid Point Refinement module that reduces localization error by considering the characteristics of the radar point clouds. The experiments conducted on the public nuScenes dataset demonstrate that our proposed RCM-Fusion offers 11.8% performance gain in nuScenes detection score (NDS) over the camera-only baseline model and achieves state-of-the-art performaces among radar-camera fusion methods in the nuScenes 3D object detection benchmark. Code will be made publicly available.

稀疏 · 3D · MoDELS · NeRF · 控制器 ·

2023 年 7 月 26 日

Points-to-3D: Bridging the Gap between Sparse Points and Shape-Controllable Text-to-3D Generation

Chaohui Yu,Qiang Zhou,Jingliang Li,Zhe Zhang,Zhibin Wang,Fan Wang

from arxiv, Accepted by ACMMM 2023

Text-to-3D generation has recently garnered significant attention, fueled by 2D diffusion models trained on billions of image-text pairs. Existing methods primarily rely on score distillation to leverage the 2D diffusion priors to supervise the generation of 3D models, e.g., NeRF. However, score distillation is prone to suffer the view inconsistency problem, and implicit NeRF modeling can also lead to an arbitrary shape, thus leading to less realistic and uncontrollable 3D generation. In this work, we propose a flexible framework of Points-to-3D to bridge the gap between sparse yet freely available 3D points and realistic shape-controllable 3D generation by distilling the knowledge from both 2D and 3D diffusion models. The core idea of Points-to-3D is to introduce controllable sparse 3D points to guide the text-to-3D generation. Specifically, we use the sparse point cloud generated from the 3D diffusion model, Point-E, as the geometric prior, conditioned on a single reference image. To better utilize the sparse 3D points, we propose an efficient point cloud guidance loss to adaptively drive the NeRF's geometry to align with the shape of the sparse 3D points. In addition to controlling the geometry, we propose to optimize the NeRF for a more view-consistent appearance. To be specific, we perform score distillation to the publicly available 2D image diffusion model ControlNet, conditioned on text as well as depth map of the learned compact geometry. Qualitative and quantitative comparisons demonstrate that Points-to-3D improves view consistency and achieves good shape controllability for text-to-3D generation. Points-to-3D provides users with a new way to improve and control text-to-3D generation.

MoDELS · 目標檢測 · LIDAR · Performer · 3D ·

2023 年 7 月 25 日

HeightFormer: Explicit Height Modeling without Extra Data for Camera-only 3D Object Detection in Bird's Eye View

Yiming Wu,Ruixiang Li,Zequn Qin,Xinhai Zhao,Xi Li

Vision-based Bird's Eye View (BEV) representation is an emerging perception formulation for autonomous driving. The core challenge is to construct BEV space with multi-camera features, which is a one-to-many ill-posed problem. Diving into all previous BEV representation generation methods, we found that most of them fall into two types: modeling depths in image views or modeling heights in the BEV space, mostly in an implicit way. In this work, we propose to explicitly model heights in the BEV space, which needs no extra data like LiDAR and can fit arbitrary camera rigs and types compared to modeling depths. Theoretically, we give proof of the equivalence between height-based methods and depth-based methods. Considering the equivalence and some advantages of modeling heights, we propose HeightFormer, which models heights and uncertainties in a self-recursive way. Without any extra data, the proposed HeightFormer could estimate heights in BEV accurately. Benchmark results show that the performance of HeightFormer achieves SOTA compared with those camera-only methods.

EDTER · 變換 · 邊 · Extensibility · INFORMS ·

2022 年 3 月 16 日

EDTER: Edge Detection with Transformer

Mengyang Pu,Yaping Huang,Yuming Liu,Qingji Guan,Haibin Ling

from arxiv, Accepted by CVPR2022

Convolutional neural networks have made significant progresses in edge detection by progressively exploring the context and semantic features. However, local details are gradually suppressed with the enlarging of receptive fields. Recently, vision transformer has shown excellent capability in capturing long-range dependencies. Inspired by this, we propose a novel transformer-based edge detector, \emph{Edge Detection TransformER (EDTER)}, to extract clear and crisp object boundaries and meaningful edges by exploiting the full image context information and detailed local cues simultaneously. EDTER works in two stages. In Stage I, a global transformer encoder is used to capture long-range global context on coarse-grained image patches. Then in Stage II, a local transformer encoder works on fine-grained patches to excavate the short-range local cues. Each transformer encoder is followed by an elaborately designed Bi-directional Multi-Level Aggregation decoder to achieve high-resolution features. Finally, the global context and local cues are combined by a Feature Fusion Module and fed into a decision head for edge prediction. Extensive experiments on BSDS500, NYUDv2, and Multicue demonstrate the superiority of EDTER in comparison with state-of-the-arts.

目標檢測 · 學成 · Performer · state-of-the-art · 深度學習 ·

2021 年 10 月 25 日

Deep Learning for UAV-based Object Detection and Tracking: A Survey

Xin Wu,Wei Li,Danfeng Hong,Ran Tao,Qian Du

Owing to effective and flexible data acquisition, unmanned aerial vehicle (UAV) has recently become a hotspot across the fields of computer vision (CV) and remote sensing (RS). Inspired by recent success of deep learning (DL), many advanced object detection and tracking approaches have been widely applied to various UAV-related tasks, such as environmental monitoring, precision agriculture, traffic management. This paper provides a comprehensive survey on the research progress and prospects of DL-based UAV object detection and tracking methods. More specifically, we first outline the challenges, statistics of existing methods, and provide solutions from the perspectives of DL-based models in three research topics: object detection from the image, object detection from the video, and object tracking from the video. Open datasets related to UAV-dominated object detection and tracking are exhausted, and four benchmark datasets are employed for performance evaluation using some state-of-the-art methods. Finally, prospects and considerations for the future work are discussed and summarized. It is expected that this survey can facilitate those researchers who come from remote sensing field with an overview of DL-based UAV object detection and tracking methods, along with some thoughts on their further developments.

Extensibility · 學成 · 噪聲分布 · Networking · 表征學習 ·

2021 年 7 月 25 日

Image Manipulation Detection by Multi-View Multi-Scale Supervision

Xinru Chen,Chengbo Dong,Jiaqi Ji,Juan Cao,Xirong Li

from arxiv, Accepted by ICCV 2021

The key challenge of image manipulation detection is how to learn generalizable features that are sensitive to manipulations in novel data, whilst specific to prevent false alarms on authentic images. Current research emphasizes the sensitivity, with the specificity overlooked. In this paper we address both aspects by multi-view feature learning and multi-scale supervision. By exploiting noise distribution and boundary artifact surrounding tampered regions, the former aims to learn semantic-agnostic and thus more generalizable features. The latter allows us to learn from authentic images which are nontrivial to be taken into account by current semantic segmentation network based methods. Our thoughts are realized by a new network which we term MVSS-Net. Extensive experiments on five benchmark sets justify the viability of MVSS-Net for both pixel-level and image-level manipulation detection.

目標檢測 · 點云 · 3D · 可辨認的 · INFORMS ·

2021 年 6 月 21 日

3D Object Detection for Autonomous Driving: A Survey

Rui Qian,Xin Lai,Xirong Li

from arxiv, 3D object detection, Autonomous driving, Point clouds

Autonomous driving is regarded as one of the most promising remedies to shield human beings from severe crashes. To this end, 3D object detection serves as the core basis of such perception system especially for the sake of path planning, motion prediction, collision avoidance, etc. Generally, stereo or monocular images with corresponding 3D point clouds are already standard layout for 3D object detection, out of which point clouds are increasingly prevalent with accurate depth information being provided. Despite existing efforts, 3D object detection on point clouds is still in its infancy due to high sparseness and irregularity of point clouds by nature, misalignment view between camera view and LiDAR bird's eye of view for modality synergies, occlusions and scale variations at long distances, etc. Recently, profound progress has been made in 3D object detection, with a large body of literature being investigated to address this vision task. As such, we present a comprehensive review of the latest progress in this field covering all the main topics including sensors, fundamentals, and the recent state-of-the-art detection methods with their pros and cons. Furthermore, we introduce metrics and provide quantitative comparisons on popular public datasets. The avenues for future work are going to be judiciously identified after an in-deep analysis of the surveyed works. Finally, we conclude this paper.

稀疏 · Performer · Siamese · 孿生網絡 · Branch ·

2020 年 12 月 3 日

Co-mining: Self-Supervised Learning for Sparsely Annotated Object Detection

Tiancai Wang,Tong Yang,Jiale Cao,Xiangyu Zhang

from arxiv, Accepted to AAAI 2021

Object detectors usually achieve promising results with the supervision of complete instance annotations. However, their performance is far from satisfactory with sparse instance annotations. Most existing methods for sparsely annotated object detection either re-weight the loss of hard negative samples or convert the unlabeled instances into ignored regions to reduce the interference of false negatives. We argue that these strategies are insufficient since they can at most alleviate the negative effect caused by missing annotations. In this paper, we propose a simple but effective mechanism, called Co-mining, for sparsely annotated object detection. In our Co-mining, two branches of a Siamese network predict the pseudo-label sets for each other. To enhance multi-view learning and better mine unlabeled instances, the original image and corresponding augmented image are used as the inputs of two branches of the Siamese network, respectively. Co-mining can serve as a general training mechanism applied to most of modern object detectors. Experiments are performed on MS COCO dataset with three different sparsely annotated settings using two typical frameworks: anchor-based detector RetinaNet and anchor-free detector FCOS. Experimental results show that our Co-mining with RetinaNet achieves 1.4%~2.1% improvements compared with different baselines and surpasses existing methods under the same sparsely annotated setting.

Backbone · 3D · 目標檢測 · Networking · 點云 ·

2019 年 1 月 24 日

3D Backbone Network for 3D Object Detection

Xuesong Li,Jose E Guivant,Ngaiming Kwok,Yongzhi Xu

The task of detecting 3D objects in point cloud has a pivotal role in many real-world applications. However, 3D object detection performance is behind that of 2D object detection due to the lack of powerful 3D feature extraction methods. In order to address this issue, we propose to build a 3D backbone network to learn rich 3D feature maps by using sparse 3D CNN operations for 3D object detection in point cloud. The 3D backbone network can inherently learn 3D features from almost raw data without compressing point cloud into multiple 2D images and generate rich feature maps for object detection. The sparse 3D CNN takes full advantages of the sparsity in the 3D point cloud to accelerate computation and save memory, which makes the 3D backbone network achievable. Empirical experiments are conducted on the KITTI benchmark and results show that the proposed method can achieve state-of-the-art performance for 3D object detection.

目標檢測 · Vision · 地球 · 數據集 · state-of-the-art ·

2018 年 1 月 27 日

DOTA: A Large-scale Dataset for Object Detection in Aerial Images

Gui-Song Xia,Xiang Bai,Jian Ding,Zhen Zhu,Serge Belongie,Jiebo Luo,Mihai Datcu,Marcello Pelillo,Liangpei Zhang

Object detection is an important and challenging problem in computer vision. Although the past decade has witnessed major advances in object detection in natural scenes, such successes have been slow to aerial imagery, not only because of the huge variation in the scale, orientation and shape of the object instances on the earth's surface, but also due to the scarcity of well-annotated datasets of objects in aerial scenes. To advance object detection research in Earth Vision, also known as Earth Observation and Remote Sensing, we introduce a large-scale Dataset for Object deTection in Aerial images (DOTA). To this end, we collect $2806$ aerial images from different sensors and platforms. Each image is of the size about 4000-by-4000 pixels and contains objects exhibiting a wide variety of scales, orientations, and shapes. These DOTA images are then annotated by experts in aerial image interpretation using $15$ common object categories. The fully annotated DOTA images contains $188,282$ instances, each of which is labeled by an arbitrary (8 d.o.f.) quadrilateral To build a baseline for object detection in Earth Vision, we evaluate state-of-the-art object detection algorithms on DOTA. Experiments demonstrate that DOTA well represents real Earth Vision applications and are quite challenging.

閱讀: 0 點贊: 0

小貼士

登錄享

相關主題

查全率/召回率

北京阿比特科技有限公司

注冊地址：北京市海淀區羊坊店路18號2幢3層301-191

<tfoot id='6qxe4'></tfoot>

<legend id='6qxe4'><style id='6qxe4'><dir id='6qxe4'><q id='6qxe4'></q></dir></style></legend>

<i id='6qxe4'><tr id='6qxe4'><dt id='6qxe4'><q id='6qxe4'><span id='6qxe4'><b id='6qxe4'><form id='6qxe4'><ins id='6qxe4'></ins><ul id='6qxe4'></ul><sub id='6qxe4'></sub></form><legend id='6qxe4'></legend><bdo id='6qxe4'><pre id='6qxe4'><center id='6qxe4'></center></pre></bdo></b><th id='6qxe4'></th></span></q></dt></tr></i><div id='6qxe4'><tfoot id='6qxe4'></tfoot><dl id='6qxe4'><fieldset id='6qxe4'></fieldset></dl></div>