
Salient object detection is a fundamental problem in computer vision and has received a great deal of attention. Recently, deep learning models have become powerful tools for image feature extraction. In this paper, we propose a multi-scale deep neural network (MSDNN) for salient object detection. The proposed model first extracts global high-level features and context information over the whole source image with a recurrent convolutional neural network (RCNN). Then several stacked deconvolutional layers are adopted to obtain a multi-scale feature representation and a series of saliency maps. Finally, we investigate a fusion convolution module (FCM) to build the final pixel-level saliency map. The proposed model is extensively evaluated on four salient object detection benchmark datasets. Results show that our deep model significantly outperforms 12 other state-of-the-art approaches.
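As a concrete illustration of the fusion step, here is a minimal PyTorch sketch of how a fusion convolution module might combine per-scale saliency maps into one pixel-level map; the layer widths and the `FusionConvModule` name are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class FusionConvModule(nn.Module):
    """Hypothetical fusion module: concatenates per-scale saliency maps
    and reduces them to a single pixel-level saliency map."""
    def __init__(self, num_scales: int = 4):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(num_scales, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, kernel_size=1),  # 1-channel saliency logits
        )

    def forward(self, saliency_maps):
        # saliency_maps: list of (B, 1, H, W) maps, upsampled to a common size
        x = torch.cat(saliency_maps, dim=1)
        return torch.sigmoid(self.fuse(x))

maps = [torch.rand(2, 1, 64, 64) for _ in range(4)]
print(FusionConvModule()(maps).shape)  # torch.Size([2, 1, 64, 64])
```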

Related Content

Neural Networks is the archival journal of the world's three oldest neural modeling societies: the International Neural Network Society (INNS), the European Neural Network Society (ENNS), and the Japanese Neural Network Society (JNNS). Neural Networks provides a forum for developing and nurturing an international community of scholars and practitioners interested in all aspects of neural networks and related approaches to computational intelligence. Neural Networks welcomes submissions of high-quality papers that contribute to the full range of neural networks research, from behavioral and brain modeling and learning algorithms, through mathematical and computational analyses, to systems engineering and technological applications that make substantial use of neural network concepts and techniques. This uniquely broad scope facilitates the exchange of ideas between biological and technological studies, and helps foster the development of an interdisciplinary community interested in biologically inspired computational intelligence. Accordingly, the Neural Networks editorial board represents expertise in psychology, neurobiology, computer science, engineering, mathematics, and physics. The journal publishes articles, letters, and reviews, as well as letters to the editor, editorials, current events, software surveys, and patent information. Articles appear in one of five sections: cognitive science, neuroscience, learning systems, mathematical and computational analysis, and engineering and applications. Official website:

Benefiting from the rapid development of deep learning techniques, salient object detection has achieved remarkable progress recently. However, two major challenges still hinder its application in embedded devices: low-resolution output and heavy model weight. To this end, this paper presents an accurate yet compact deep network for efficient salient object detection. More specifically, given a coarse saliency prediction in the deepest layer, we first employ residual learning to learn side-output residual features for saliency refinement, which can be achieved with very limited convolutional parameters while keeping accuracy. Secondly, we further propose reverse attention to guide such side-output residual learning in a top-down manner. By erasing the currently predicted salient regions from the side-output features, the network can eventually explore the missing object parts and details, resulting in high resolution and accuracy. Experiments on six benchmark datasets demonstrate that the proposed approach compares favorably against state-of-the-art methods, with advantages in simplicity, efficiency (45 FPS) and model size (81 MB).
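The reverse-attention idea can be sketched directly: the coarse prediction is upsampled, inverted into an attention map, and used to erase already-detected regions before the side branch learns a residual refinement. A minimal PyTorch sketch, with assumed channel sizes:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReverseAttention(nn.Module):
    """Sketch of reverse attention: erase currently predicted salient
    regions so the side branch focuses on missing parts and details."""
    def __init__(self, in_channels: int = 64):
        super().__init__()
        self.residual = nn.Sequential(
            nn.Conv2d(in_channels, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, 3, padding=1),
        )

    def forward(self, side_feat, coarse_pred):
        # coarse_pred: (B, 1, h, w) saliency logits from the deeper stage
        up = F.interpolate(coarse_pred, size=side_feat.shape[2:],
                           mode='bilinear', align_corners=False)
        rev = 1.0 - torch.sigmoid(up)             # reverse attention weights
        refined = self.residual(side_feat * rev)  # residual on erased features
        return up + refined                       # refine the coarse prediction

feat, pred = torch.rand(1, 64, 56, 56), torch.rand(1, 1, 28, 28)
print(ReverseAttention()(feat, pred).shape)  # torch.Size([1, 1, 56, 56])
```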

Although the YOLOv2 approach is extremely fast at object detection, its backbone network has limited feature extraction ability and fails to make full use of multi-scale local region features, which restricts improvements in detection accuracy. Therefore, this paper proposes DC-SPP-YOLO (Dense Connection and Spatial Pyramid Pooling based YOLO), an approach for improving the object detection accuracy of YOLOv2. Specifically, dense connections between convolution layers are employed in the backbone network of YOLOv2 to strengthen feature extraction and alleviate the vanishing-gradient problem. Moreover, an improved spatial pyramid pooling is introduced to pool and concatenate multi-scale local region features, so that the network can learn object features more comprehensively. The DC-SPP-YOLO model is built and trained with a new loss function composed of mean squared error and cross-entropy. Experiments demonstrate that the mAP (mean Average Precision) of DC-SPP-YOLO on the PASCAL VOC and UA-DETRAC datasets is higher than that of YOLOv2; by strengthening feature extraction and exploiting multi-scale local region features, DC-SPP-YOLO achieves detection accuracy superior to YOLOv2.
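The spatial pyramid pooling step can be illustrated with a short sketch; the pooling window sizes below follow the common YOLOv3-SPP configuration and are an assumption, since the paper's exact settings are not given here.

```python
import torch
import torch.nn as nn

class SpatialPyramidPooling(nn.Module):
    """SPP block in the YOLO style: parallel max-pools with different
    kernel sizes (stride 1, so spatial size is preserved), concatenated
    with the input features along the channel dimension."""
    def __init__(self, pool_sizes=(5, 9, 13)):
        super().__init__()
        self.pools = nn.ModuleList(
            nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)
            for k in pool_sizes
        )

    def forward(self, x):
        # Output channels = in_channels * (1 + len(pool_sizes))
        return torch.cat([x] + [p(x) for p in self.pools], dim=1)

x = torch.rand(1, 512, 13, 13)
print(SpatialPyramidPooling()(x).shape)  # torch.Size([1, 2048, 13, 13])
```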

The task of detecting 3D objects in point clouds plays a pivotal role in many real-world applications. However, 3D object detection performance lags behind that of 2D object detection due to the lack of powerful 3D feature extraction methods. To address this issue, we propose to build a 3D backbone network that learns rich 3D feature maps using sparse 3D CNN operations for 3D object detection in point clouds. The 3D backbone network can inherently learn 3D features from almost raw data, without compressing the point cloud into multiple 2D images, and generates rich feature maps for object detection. The sparse 3D CNN takes full advantage of the sparsity of the 3D point cloud to accelerate computation and save memory, which makes the 3D backbone network feasible. Experiments conducted on the KITTI benchmark show that the proposed method achieves state-of-the-art performance for 3D object detection.
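For illustration only, the sketch below shows the layer layout of such a 3D backbone using dense `nn.Conv3d`; an actual implementation would use a sparse convolution library (e.g. spconv) so that computation touches only occupied voxels, which is the point of the paper's design.

```python
import torch
import torch.nn as nn

class Dense3DBackboneStub(nn.Module):
    """Dense stand-in for a sparse 3D backbone. Real systems replace
    nn.Conv3d with sparse 3D convolutions that skip empty voxels;
    the channel widths here are assumptions."""
    def __init__(self, in_channels: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(in_channels, 16, 3, stride=1, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(16, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, voxels):
        # voxels: (B, C, D, H, W) voxelized point cloud features
        return self.net(voxels)

print(Dense3DBackboneStub()(torch.rand(1, 4, 16, 64, 64)).shape)
# torch.Size([1, 64, 4, 16, 16])
```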

Lane mark detection is an important element of road scene analysis for Advanced Driver Assistance Systems (ADAS). Limited by onboard computing power, it remains a challenge to reduce system complexity while maintaining high accuracy. In this paper, we propose a Lane Marking Detector (LMD) that uses a deep convolutional neural network to extract robust lane marking features. To improve performance with a target of lower complexity, dilated convolution is adopted, and a shallower, thinner structure is designed to decrease the computational cost. Moreover, we design post-processing algorithms that construct 3rd-order polynomial models to fit curved lanes. Our system shows promising results on captured road scenes.
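The polynomial post-processing step is straightforward to sketch: because lanes run roughly vertically in the image, x is fitted as a 3rd-order polynomial in y. The point coordinates below are hypothetical.

```python
import numpy as np

def fit_lane_polynomial(points, degree=3):
    """Fit x = f(y) with a 3rd-order polynomial; the lane points would
    come from the network's predicted marking mask."""
    ys = np.array([p[1] for p in points], dtype=np.float64)
    xs = np.array([p[0] for p in points], dtype=np.float64)
    coeffs = np.polyfit(ys, xs, degree)  # coefficients, highest power first
    return np.poly1d(coeffs)             # callable model: x = lane(y)

# Hypothetical detected lane pixels (x, y) in image coordinates
pts = [(100, 720), (120, 600), (150, 480), (190, 360), (240, 240)]
lane = fit_lane_polynomial(pts)
print(lane(300))  # interpolated lane x-position at y = 300
```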

Generic object detection, which aims to locate object instances from a large number of predefined categories in natural images, is one of the most fundamental and challenging problems in computer vision. Deep learning techniques have emerged in recent years as powerful methods for learning feature representations directly from data, and have led to remarkable breakthroughs in generic object detection. Given this time of rapid evolution, the goal of this paper is to provide a comprehensive survey of the recent achievements in this field brought about by deep learning techniques. More than 250 key contributions are included in this survey, covering many aspects of generic object detection research: leading detection frameworks and fundamental subproblems, including object feature representation, object proposal generation, context information modeling and training strategies; and evaluation issues, specifically benchmark datasets, evaluation metrics, and state-of-the-art performance. We finish by identifying promising directions for future research.

Feature maps in deep neural networks generally carry different semantics. Existing methods often ignore these characteristics, which may lead to sub-optimal results. In this paper, we propose a novel end-to-end deep saliency network that effectively utilizes multi-scale feature maps according to their characteristics. Shallow layers often contain more local information, while deep layers have advantages in global semantics. Therefore, the network generates elaborate saliency maps by enhancing the local and global information of feature maps in different layers. On one hand, the local information of shallow layers is enhanced by a recurrent structure that shares convolution kernels across time steps. On the other hand, the global information of deep layers is exploited by a self-attention module, which generates different attention weights for salient objects and backgrounds, thus achieving better performance. Experimental results on four widely used datasets demonstrate that our method outperforms existing algorithms.
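A minimal sketch of the deep-layer self-attention module, in the non-local style: each spatial position attends over all others, so salient objects and background receive different aggregation weights. The channel sizes are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttention2d(nn.Module):
    """Self-attention over a feature map: every position aggregates
    global context weighted by query-key similarity."""
    def __init__(self, channels: int = 256, reduced: int = 32):
        super().__init__()
        self.query = nn.Conv2d(channels, reduced, 1)
        self.key = nn.Conv2d(channels, reduced, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual scale

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)  # (B, HW, r)
        k = self.key(x).flatten(2)                    # (B, r, HW)
        attn = F.softmax(q @ k, dim=-1)               # (B, HW, HW)
        v = self.value(x).flatten(2)                  # (B, C, HW)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                   # residual connection

print(SelfAttention2d()(torch.rand(1, 256, 14, 14)).shape)
```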

We'd like to share a simple tweak to the Single Shot Multibox Detector (SSD) family of detectors that is effective in reducing model size while maintaining the same quality. We share box predictors across all scales and replace the convolutions between scales with max pooling. This has two advantages over vanilla SSD: (1) it avoids score miscalibration across scales; (2) the shared predictor sees the training data over all scales. Since we reduce the number of predictors to one and trim all convolutions between them, the model size is significantly smaller. We empirically show that these changes do not hurt model quality compared to vanilla SSD.
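A sketch of the shared-predictor idea: a single classification/box head is applied to every pyramid level, so the head sees training data from all scales and scores stay calibrated across them. The anchor and class counts below are assumptions.

```python
import torch
import torch.nn as nn

class SharedBoxPredictor(nn.Module):
    """One predictor applied to every scale of the feature pyramid,
    instead of a separate head per scale as in vanilla SSD."""
    def __init__(self, channels=256, num_anchors=6, num_classes=21):
        super().__init__()
        self.cls = nn.Conv2d(channels, num_anchors * num_classes, 3, padding=1)
        self.box = nn.Conv2d(channels, num_anchors * 4, 3, padding=1)

    def forward(self, pyramid):
        # pyramid: list of (B, channels, Hi, Wi) maps; max pooling between
        # scales (not shown) replaces the per-scale convolutions.
        return [(self.cls(f), self.box(f)) for f in pyramid]

feats = [torch.rand(1, 256, s, s) for s in (38, 19, 10, 5)]
for cls_out, box_out in SharedBoxPredictor()(feats):
    print(cls_out.shape, box_out.shape)
```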

Recent CNN-based object detectors, whether one-stage methods such as YOLO, SSD, and RetinaNet or two-stage detectors such as Faster R-CNN, R-FCN, and FPN, usually fine-tune directly from ImageNet pre-trained models designed for image classification. There has been little work discussing backbone feature extractors designed specifically for object detection. More importantly, there are several differences between the tasks of image classification and object detection: (1) recent object detectors such as FPN and RetinaNet usually involve extra stages, compared with image classification networks, to handle objects at various scales; (2) object detection needs not only to recognize the category of object instances but also to spatially locate their positions. A large downsampling factor yields a large valid receptive field, which is good for image classification but compromises object localization. Given this gap between image classification and object detection, we propose DetNet, a novel backbone network specifically designed for object detection. DetNet includes the extra stages absent from traditional classification backbones while maintaining high spatial resolution in deeper layers. Without any bells and whistles, state-of-the-art results are obtained for both object detection and instance segmentation on the MSCOCO benchmark with our DetNet (4.8 GFLOPs) backbone. The code will be released for reproduction.
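The key DetNet ingredient, keeping deep stages at high spatial resolution while still enlarging the receptive field, can be sketched as a dilated residual bottleneck with stride 1; the exact widths here are assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class DilatedBottleneck(nn.Module):
    """DetNet-style bottleneck: dilation enlarges the receptive field
    while the spatial stride stays 1, so deep stages keep resolution."""
    def __init__(self, channels: int = 256, dilation: int = 2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels // 4, 1, bias=False),
            nn.BatchNorm2d(channels // 4), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels // 4, 3,
                      padding=dilation, dilation=dilation, bias=False),
            nn.BatchNorm2d(channels // 4), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(x + self.body(x))  # residual, stride 1 throughout

print(DilatedBottleneck()(torch.rand(1, 256, 50, 50)).shape)  # H, W unchanged
```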

Although Faster R-CNN and its variants have shown promising performance in object detection, they exploit only a simple first-order representation of object proposals for final classification and regression. Recent classification methods demonstrate that integrating high-order statistics into deep convolutional neural networks can achieve impressive improvements, but their goal is to model whole images by discarding location information, so they cannot be directly adopted for object detection. In this paper, we attempt to exploit high-order statistics in object detection, aiming to generate more discriminative representations for proposals and thereby enhance detector performance. To this end, we propose a novel Multi-scale Location-aware Kernel Representation (MLKP) to capture high-order statistics of deep features in proposals. Our MLKP can be efficiently computed on a modified multi-scale feature map using a low-dimensional polynomial kernel approximation. Moreover, unlike existing orderless global representations based on high-order statistics, the proposed MLKP is location retentive and sensitive, so it can be flexibly adopted for object detection. Integrated into the Faster R-CNN schema, the proposed MLKP achieves very competitive performance compared with state-of-the-art methods, improving Faster R-CNN by 4.9% (mAP), 4.7% (mAP) and 5.0% (AP at IoU=[0.5:0.05:0.95]) on the PASCAL VOC 2007, VOC 2012 and MS COCO benchmarks, respectively. Code is available at: //github.com/Hwang64/MLKP.
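One way to read the polynomial-kernel approximation is that r-th order statistics are built from elementwise products of low-dimensional linear projections, computed per spatial location so that position information is retained. The sketch below illustrates that reading with assumed dimensions; it is not the authors' exact MLKP formulation.

```python
import torch
import torch.nn as nn

class LocationAwareKernelRep(nn.Module):
    """Sketch of a location-retentive high-order representation:
    order-r statistics approximated by elementwise products of
    1x1-conv projections, keeping the spatial layout intact."""
    def __init__(self, in_channels=512, proj_dim=128, order=3):
        super().__init__()
        self.projs = nn.ModuleList(
            nn.Conv2d(in_channels, proj_dim, 1) for _ in range(order)
        )

    def forward(self, x):
        feats, prod = [], None
        for proj in self.projs:
            z = proj(x)
            prod = z if prod is None else prod * z  # raises the order by one
            feats.append(prod)
        # concatenate 1st..r-th order maps; spatial positions are preserved
        return torch.cat(feats, dim=1)

print(LocationAwareKernelRep()(torch.rand(1, 512, 14, 14)).shape)
# torch.Size([1, 384, 14, 14])
```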

Deep convolutional neural networks have become a key element in the recent breakthroughs in salient object detection. However, existing CNN-based methods rely on either patch-wise (region-wise) training and inference or fully convolutional networks. Methods in the former category are generally time-consuming due to severe storage and computational redundancies among overlapping patches. To overcome this deficiency, methods in the second category attempt to directly map a raw input image to a predicted dense saliency map in a single network forward pass. Though very efficient, it is difficult for these methods to detect salient objects of different scales or salient regions with weak semantic information. In this paper, we develop hybrid contrast-oriented deep neural networks to overcome these limitations. Each of our deep networks is composed of two complementary components: a fully convolutional stream for dense prediction and a segment-level spatial pooling stream for sparse saliency inference. We further propose an attentional module that learns weight maps for fusing the two saliency predictions from these streams. A tailored alternating scheme is designed to train these deep networks by fine-tuning pre-trained baseline models. Finally, a customized fully connected CRF model incorporating a salient contour feature embedding can optionally be applied as a post-processing step to improve spatial coherence and contour positioning in the fused result. Extensive experiments on six benchmark datasets demonstrate that our proposed model significantly outperforms the state of the art on all popular evaluation metrics.
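The attentional fusion of the two streams can be sketched as a small head that predicts a per-pixel weight map and blends the dense (FCN) and sparse (segment-level) saliency predictions; the head architecture below is an assumption for illustration.

```python
import torch
import torch.nn as nn

class AttentionalFusion(nn.Module):
    """Sketch of attentional fusion: a learned per-pixel weight map
    blends two saliency predictions into one."""
    def __init__(self):
        super().__init__()
        self.weight_head = nn.Sequential(
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 1, 1),
        )

    def forward(self, dense_pred, sparse_pred):
        # both predictions: (B, 1, H, W) saliency probabilities
        w = torch.sigmoid(self.weight_head(
            torch.cat([dense_pred, sparse_pred], dim=1)))
        return w * dense_pred + (1.0 - w) * sparse_pred

a, b = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
print(AttentionalFusion()(a, b).shape)  # torch.Size([1, 1, 64, 64])
```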
