
Research on damage detection of road surfaces using image processing techniques has been actively conducted, achieving considerably high detection accuracies. However, many studies focus only on detecting the presence or absence of damage; in a real-world scenario, road managers in a governing body need to clearly understand the type of damage in order to take effective repair action. In addition, in many of these previous studies, the researchers acquired their own data using different methods, so no uniform road damage dataset is openly available, and consequently there is no benchmark for road damage detection. This study makes three contributions to address these issues. First, to the best of our knowledge, we prepare the first large-scale road damage dataset, comprising 9,053 road images captured with a smartphone installed on a car and containing 15,435 instances of road surface damage. To generate this dataset, we cooperated with 7 municipalities in Japan and acquired road images for more than 40 hours, in a wide variety of weather and illuminance conditions. In each image, we annotated a bounding box representing the location and type of damage. Next, we used a state-of-the-art object detection method based on convolutional neural networks to train a damage detection model on our dataset, and compared its accuracy and runtime speed on both a GPU server and a smartphone. Finally, we demonstrate that the damage can be classified into eight types with high accuracy by applying the proposed object detection method. The road damage dataset, our experimental results, and the smartphone application developed in this study are publicly available (//github.com/sekilab/RoadDamageDetector/).
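As a concrete illustration of working with such annotations, the sketch below parses one bounding-box annotation file, assuming a PASCAL VOC-style XML layout; the file path and the damage-class label shown are illustrative assumptions, not guaranteed to match the released dataset exactly.

```python
# Minimal sketch: reading one bounding-box annotation in PASCAL VOC-style XML.
# The path and damage-class name below are illustrative assumptions.
import xml.etree.ElementTree as ET

def load_annotations(xml_path):
    """Return a list of (damage_type, xmin, ymin, xmax, ymax) tuples."""
    root = ET.parse(xml_path).getroot()
    boxes = []
    for obj in root.findall("object"):
        name = obj.find("name").text  # damage-type label, e.g. "D00" (assumed)
        bb = obj.find("bndbox")
        box = tuple(int(bb.find(k).text) for k in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((name, *box))
    return boxes

# Example with a hypothetical file path:
# print(load_annotations("Annotations/road_00001.xml"))
```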

Related Content

Neural Networks is the archival journal of the world's three oldest neural modeling societies: the International Neural Network Society (INNS), the European Neural Network Society (ENNS), and the Japanese Neural Network Society (JNNS). Neural Networks provides a forum for developing and nurturing an international community of scholars and practitioners interested in all aspects of neural networks and related approaches to computational intelligence. Neural Networks welcomes submissions of high-quality papers that contribute to the full range of neural network research, from behavioral and brain modeling and learning algorithms, through mathematical and computational analyses, to systems engineering and technological applications that make substantial use of neural network concepts and techniques. This unique and broad scope promotes the exchange of ideas between biological and technological studies, and helps foster the development of an interdisciplinary community interested in biologically inspired computational intelligence. Accordingly, the fields of expertise represented on the Neural Networks editorial board include psychology, neurobiology, computer science, engineering, mathematics, and physics. The journal publishes articles, letters, and reviews, as well as letters to the editor, editorials, current events, software surveys, and patent information. Articles are published in one of five sections: cognitive science, neuroscience, learning systems, mathematical and computational analysis, and engineering and applications.

In recent years, the biggest advances in major Computer Vision tasks, such as object recognition, handwritten-digit identification, facial recognition, and many others, have come through the use of Convolutional Neural Networks (CNNs). Similarly, in the domain of Natural Language Processing, Recurrent Neural Networks (RNNs), and Long Short-Term Memory networks (LSTMs) in particular, have been crucial to some of the biggest breakthroughs in performance for tasks such as machine translation, part-of-speech tagging, and sentiment analysis. These individual advances have greatly benefited tasks even at the intersection of NLP and Computer Vision. Inspired by this success, we study some existing neural image captioning models that achieve near state-of-the-art performance and try to enhance one such model. We also present a simple image captioning model that makes use of a CNN, an LSTM, and the beam search algorithm, and study its performance based on various qualitative and quantitative metrics.
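For readers unfamiliar with the decoding step, here is a minimal beam search sketch over an abstract decoder. The `step_fn` callable is a stand-in for the LSTM: given a partial token sequence, it returns a mapping from next token to log-probability. All names and the toy decoder are illustrative assumptions, not the authors' implementation.

```python
# Minimal beam search sketch for caption decoding.
import heapq
import math

def beam_search(step_fn, start_token, end_token, beam_width=3, max_len=20):
    beams = [(0.0, [start_token])]            # (cumulative log-prob, tokens)
    finished = []
    for _ in range(max_len):
        candidates = []
        for logp, seq in beams:
            if seq[-1] == end_token:          # completed caption: set it aside
                finished.append((logp, seq))
                continue
            for tok, tok_logp in step_fn(seq).items():
                candidates.append((logp + tok_logp, seq + [tok]))
        if not candidates:
            break
        beams = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    finished.extend(beams)
    return max(finished, key=lambda c: c[0])[1]  # highest-scoring caption

# Toy decoder for illustration: tends to emit 1 -> 2 -> 3 -> <end> (token 0).
def toy_step(seq):
    table = {1: {2: math.log(0.9), 0: math.log(0.1)},
             2: {3: math.log(0.8), 0: math.log(0.2)},
             3: {0: 0.0}}
    return table[seq[-1]]

print(beam_search(toy_step, start_token=1, end_token=0))  # -> [1, 2, 3, 0]
```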

Despite continuously improving performance, contemporary image captioning models are prone to "hallucinating" objects that are not actually in a scene. One problem is that standard metrics only measure similarity to ground truth captions and may not fully capture image relevance. In this work, we propose a new image relevance metric to evaluate current models with veridical visual labels and assess their rate of object hallucination. We analyze how captioning model architectures and learning objectives contribute to object hallucination, explore when hallucination is likely due to image misclassification or language priors, and assess how well current sentence metrics capture object hallucination. We investigate these questions on the standard image captioning benchmark, MSCOCO, using a diverse set of models. Our analysis yields several interesting findings, including that models which score best on standard sentence metrics do not always have lower hallucination and that models which hallucinate more tend to make errors driven by language priors.
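As a simplified illustration in the spirit of such an evaluation (not the authors' exact formulation), an object word in a generated caption can be flagged as hallucinated when it does not appear among the image's ground-truth object labels:

```python
# Simplified object-hallucination check: object words mentioned in the caption
# but absent from the image's ground-truth labels count as hallucinated.
def hallucinated_objects(caption, gt_objects, object_vocab):
    words = set(caption.lower().split())
    mentioned = words & object_vocab          # object words in the caption
    return mentioned - set(gt_objects)        # mentioned but not in the image

# Hypothetical example:
caption = "a dog sits on a bench next to a bicycle"
gt = {"dog", "bench"}
vocab = {"dog", "bench", "bicycle", "car", "person"}
print(hallucinated_objects(caption, gt, vocab))   # -> {'bicycle'}
```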

Generative adversarial networks (GANs) have shown promise for many computer vision problems owing to their powerful ability to enhance data for training and testing. In this paper, we leverage GANs and propose a new architecture with a cascaded Single Shot Detector (SSD) for detecting pedestrians at a distance, which remains challenging due to the varied sizes of distant pedestrians in videos. To overcome the low-resolution issues in pedestrian detection at a distance, a DCGAN is first employed to improve the resolution, reconstructing more discriminative features for an SSD to detect objects in images or videos. A crucial advantage of our method is that it learns a multi-scale metric to distinguish multiple objects at different distances within one image, while the DCGAN serves as an encoder-decoder platform to generate parts of an image that contain better discriminative information. To measure the effectiveness of the proposed method, experiments were carried out on the Canadian Institute for Advanced Research (CIFAR) dataset; the results demonstrate that the proposed architecture achieves a much better detection rate, particularly on vehicles and pedestrians at a distance, making it highly suitable for smart city applications that need to discover key objects or pedestrians at a distance.
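A minimal sketch of the cascade idea, with the two networks as stand-in callables (`upscale` for the DCGAN generator, `detect` for the SSD); the function names and the size threshold are illustrative assumptions:

```python
# Sketch of a cascaded GAN + SSD pipeline for distant, low-resolution targets.
# `detect(image)` yields (box, label, score); `upscale(crop)` plays the role
# of the DCGAN generator producing a higher-resolution crop. Both are assumed.
def detect_distant_pedestrians(image, detect, upscale, min_size=32):
    final = []
    for box, label, score in detect(image):
        w, h = box[2] - box[0], box[3] - box[1]
        if label == "pedestrian" and min(w, h) < min_size:
            # Low-resolution candidate: re-score on the GAN-enhanced crop,
            # keeping the original box coordinates.
            crop = image[box[1]:box[3], box[0]:box[2]]
            refined = detect(upscale(crop))
            final.extend((box, l, s) for _, l, s in refined)
        else:
            final.append((box, label, score))
    return final
```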

Biomedical image segmentation is an important task in many medical applications. Segmentation methods based on convolutional neural networks attain state-of-the-art accuracy; however, they typically rely on supervised training with large labeled datasets. Labeling datasets of medical images requires significant expertise and time, and is infeasible at large scales. To tackle the lack of labeled data, researchers use techniques such as hand-engineered preprocessing steps, hand-tuned architectures, and data augmentation. However, these techniques involve costly engineering efforts, and are typically dataset-specific. We present an automated data augmentation method for medical images. We demonstrate our method on the task of segmenting magnetic resonance imaging (MRI) brain scans, focusing on the one-shot segmentation scenario -- a practical challenge in many medical applications. Our method requires only a single segmented scan, and leverages other unlabeled scans in a semi-supervised approach. We learn a model of transforms from the images, and use the model along with the labeled example to synthesize additional labeled training examples for supervised segmentation. Each transform comprises a spatial deformation field and an intensity change, enabling the synthesis of complex effects such as variations in anatomy and image acquisition procedures. Augmenting the training of a supervised segmenter with these new examples provides significant improvements over state-of-the-art methods for one-shot biomedical image segmentation. Our code is available at //github.com/xamyzhao/brainstorm.
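To make the transform model concrete, the sketch below synthesizes a new labeled example from one segmented image by applying a spatial deformation field and an intensity change. It is 2-D for brevity and uses a random smooth field where the paper learns the transforms from unlabeled scans, so it is an assumption-laden illustration, not the authors' code.

```python
# Synthesize a labeled example by warping an image/label pair with a smooth
# displacement field and perturbing intensities (random stand-ins for the
# learned spatial and intensity transform models).
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def synthesize_example(image, labels, alpha=8.0, sigma=4.0, rng=None):
    rng = rng or np.random.default_rng()
    # Smooth random displacement field over the image grid.
    dx = gaussian_filter(rng.standard_normal(image.shape), sigma) * alpha
    dy = gaussian_filter(rng.standard_normal(image.shape), sigma) * alpha
    ys, xs = np.meshgrid(np.arange(image.shape[0]),
                         np.arange(image.shape[1]), indexing="ij")
    coords = [ys + dy, xs + dx]
    warped_img = map_coordinates(image, coords, order=1)   # bilinear
    warped_lbl = map_coordinates(labels, coords, order=0)  # nearest for labels
    # Simple global intensity change (stand-in for the learned intensity model).
    warped_img = warped_img * rng.uniform(0.9, 1.1) + rng.uniform(-0.05, 0.05)
    return warped_img, warped_lbl
```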

To optimize fruit production, a portion of the flowers and fruitlets of apple trees must be removed early in the growing season. The proportion to be removed is determined by the bloom intensity, i.e., the number of flowers present in the orchard. Several automated computer vision systems have been proposed to estimate bloom intensity, but their overall performance is still far from satisfactory even in relatively controlled environments. With the goal of devising a technique for flower identification which is robust to clutter and to changes in illumination, this paper presents a method in which a pre-trained convolutional neural network is fine-tuned to become especially sensitive to flowers. Experimental results on a challenging dataset demonstrate that our method significantly outperforms three approaches that represent the state of the art in flower detection, with recall and precision rates higher than $90\%$. Moreover, a performance assessment on three additional datasets previously unseen by the network, which consist of different flower species and were acquired under different conditions, reveals that the proposed method substantially surpasses baseline approaches in terms of generalization capability.
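As an illustration of such fine-tuning, the sketch below replaces the classification head of a pretrained network with a two-class flower/background head and trains it at a small learning rate; the backbone, framework, and hyperparameters are illustrative assumptions rather than the paper's exact setup.

```python
# Fine-tuning a pretrained CNN to be sensitive to flowers (illustrative setup;
# the paper's backbone and hyperparameters may differ).
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)   # flower vs. background head

optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
criterion = nn.CrossEntropyLoss()

def train_step(images, targets):
    """One fine-tuning step on a batch of image patches."""
    optimizer.zero_grad()
    loss = criterion(model(images), targets)
    loss.backward()
    optimizer.step()
    return loss.item()
```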

Generic object detection, aiming at locating object instances from a large number of predefined categories in natural images, is one of the most fundamental and challenging problems in computer vision. Deep learning techniques have emerged in recent years as powerful methods for learning feature representations directly from data, and have led to remarkable breakthroughs in the field of generic object detection. Given this time of rapid evolution, the goal of this paper is to provide a comprehensive survey of the recent achievements in this field brought by deep learning techniques. More than 250 key contributions are included in this survey, covering many aspects of generic object detection research: leading detection frameworks and fundamental subproblems including object feature representation, object proposal generation, context information modeling and training strategies; evaluation issues, specifically benchmark datasets, evaluation metrics, and state-of-the-art performance. We finish by identifying promising directions for future research.
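Among the evaluation metrics covered by such surveys, the most basic building block is intersection-over-union (IoU) between a predicted and a ground-truth box; a detection typically counts as correct when the IoU exceeds a threshold such as 0.5. A minimal helper:

```python
# Intersection-over-union for two axis-aligned boxes (xmin, ymin, xmax, ymax).
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # -> 0.142857...
```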

The low resolution of objects of interest in aerial images makes pedestrian detection and action detection extremely challenging tasks. Furthermore, using deep convolutional neural networks to process large images can be demanding in terms of computational requirements. To alleviate these challenges, we propose a two-step, yes-or-no question-answering framework to find specific individuals performing one or multiple specific actions in aerial images. First, a deep object detector, the Single Shot Multibox Detector (SSD), is used to generate object proposals from small aerial images. Second, another deep network is used to learn a latent common sub-space that associates the high-resolution aerial imagery with the pedestrian action labels provided by human-based sources.
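A minimal sketch of this two-step pipeline, with both networks as stand-in callables (`propose` for the SSD proposal stage, `classify_action` for the second network answering the yes-or-no action query); all names and the threshold are illustrative assumptions:

```python
# Two-step search: object proposals first, then a yes/no action check per crop.
def find_actors(image, propose, classify_action, action, threshold=0.5):
    """Return proposal boxes judged to contain a person doing `action`."""
    hits = []
    for box in propose(image):                          # step 1: proposals
        crop = image[box[1]:box[3], box[0]:box[2]]
        if classify_action(crop, action) >= threshold:  # step 2: yes/no answer
            hits.append(box)
    return hits
```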

Accelerated by the tremendous increase in Internet bandwidth and storage space, video data has been generated, published and spread explosively, becoming an indispensable part of today's big data. In this paper, we focus on reviewing two lines of research aiming to stimulate the comprehension of videos with deep learning: video classification and video captioning. While video classification concentrates on automatically labeling video clips based on their semantic contents like human actions or complex events, video captioning attempts to generate a complete and natural sentence, enriching the single label as in video classification, to capture the most informative dynamics in videos. In addition, we also provide a review of popular benchmarks and competitions, which are critical for evaluating the technical progress of this vibrant field.

Object detection is an important and challenging problem in computer vision. Although the past decade has witnessed major advances in object detection in natural scenes, such successes have been slow to carry over to aerial imagery, not only because of the huge variation in the scale, orientation, and shape of the object instances on the earth's surface, but also due to the scarcity of well-annotated datasets of objects in aerial scenes. To advance object detection research in Earth Vision, also known as Earth Observation and Remote Sensing, we introduce a large-scale Dataset for Object deTection in Aerial images (DOTA). To this end, we collected $2806$ aerial images from different sensors and platforms. Each image is about 4000-by-4000 pixels in size and contains objects exhibiting a wide variety of scales, orientations, and shapes. These DOTA images were then annotated by experts in aerial image interpretation using $15$ common object categories. The fully annotated DOTA images contain $188,282$ instances, each of which is labeled with an arbitrary (8 d.o.f.) quadrilateral. To build a baseline for object detection in Earth Vision, we evaluate state-of-the-art object detection algorithms on DOTA. Experiments demonstrate that DOTA well represents real Earth Vision applications and is quite challenging.
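Since each instance is labeled with an arbitrary quadrilateral (eight values: four x, y corner pairs), a common preprocessing step is converting it to an axis-aligned box for detectors that expect rectangles; the shoelace formula gives the quadrilateral's area. A minimal sketch, assuming a consistent corner ordering:

```python
# Handling 8-d.o.f. quadrilateral labels: axis-aligned bounding box and area.
def quad_to_bbox(quad):
    """quad: [x1, y1, x2, y2, x3, y3, x4, y4] -> (xmin, ymin, xmax, ymax)."""
    xs, ys = quad[0::2], quad[1::2]
    return (min(xs), min(ys), max(xs), max(ys))

def quad_area(quad):
    """Shoelace formula; corners must be in consistent (e.g. clockwise) order."""
    xs, ys = quad[0::2], quad[1::2]
    return 0.5 * abs(sum(xs[i] * ys[(i + 1) % 4] - xs[(i + 1) % 4] * ys[i]
                         for i in range(4)))

q = [10, 10, 50, 20, 45, 60, 5, 50]
print(quad_to_bbox(q), quad_area(q))  # -> (5, 10, 50, 60) 1650.0
```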

Automatically describing an image with a natural-language sentence, e.g., in English, is a very challenging task. It requires expertise in both image processing and natural language processing. This paper discusses the different models available for the image captioning task. We also discuss how advances in object recognition and machine translation have greatly improved the performance of image captioning models in recent years. In addition, we discuss how such a model can be implemented. Finally, we evaluate the performance of the model using standard evaluation metrics.
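The standard evaluation metrics mentioned above include BLEU, METEOR, and CIDEr. As a minimal illustration, the sketch below computes unsmoothed modified unigram precision, the basic building block of BLEU; real evaluations use a library implementation such as the COCO caption toolkit.

```python
# Modified unigram precision: candidate token counts clipped by the reference.
from collections import Counter

def unigram_precision(candidate, reference):
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    clipped = sum(min(n, ref[w]) for w, n in cand.items())
    return clipped / max(1, sum(cand.values()))

print(unigram_precision("a dog on a bench",
                        "the dog sits on a bench"))  # -> 0.8
```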
