
Research on damage detection of road surfaces using image processing techniques has been actively conducted, achieving considerably high detection accuracies. However, many studies focus only on detecting the presence or absence of damage; in a real-world scenario, road managers in a governing body need to clearly understand the type of damage in order to take effective repair action. In addition, in many of these previous studies, the researchers acquired their own data using different methods, so no uniform road damage dataset is openly available, and consequently there is no benchmark for road damage detection. This study makes three contributions to address these issues. First, to the best of our knowledge, we prepare the first large-scale road damage dataset, comprising 9,053 road images captured with a smartphone installed on a car and containing 15,435 instances of road surface damage. To generate this dataset, we cooperated with 7 municipalities in Japan and acquired road images for more than 40 hours, in a wide variety of weather and illuminance conditions. In each image, we annotated a bounding box representing the location and type of damage. Next, we used a state-of-the-art object detection method based on convolutional neural networks to train a damage detection model on our dataset, and compared its accuracy and runtime speed on both a GPU server and a smartphone. Finally, we demonstrate that the damage can be classified into eight types with high accuracy by applying the proposed object detection method. The road damage dataset, our experimental results, and the smartphone application developed in this study are publicly available (//github.com/sekilab/RoadDamageDetector/).
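As a concrete illustration of working with such annotations, the sketch below parses one bounding-box annotation file, assuming a PASCAL VOC-style XML layout; the file path and the damage-class label shown are illustrative assumptions, not guaranteed to match the released dataset exactly.

```python
# Minimal sketch: reading one bounding-box annotation in PASCAL VOC-style XML.
# The path and damage-class name below are illustrative assumptions.
import xml.etree.ElementTree as ET

def load_annotations(xml_path):
    """Return a list of (damage_type, xmin, ymin, xmax, ymax) tuples."""
    root = ET.parse(xml_path).getroot()
    boxes = []
    for obj in root.findall("object"):
        name = obj.find("name").text  # damage-type label, e.g. "D00" (assumed)
        bb = obj.find("bndbox")
        box = tuple(int(bb.find(k).text) for k in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((name, *box))
    return boxes

# Example with a hypothetical file path:
# print(load_annotations("Annotations/road_00001.xml"))
```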

Related Content

Neural Networks is the archival journal of the world's three oldest neural modeling societies: the International Neural Network Society (INNS), the European Neural Network Society (ENNS), and the Japanese Neural Network Society (JNNS). Neural Networks provides a forum for developing and nurturing an international community of scholars and practitioners interested in all aspects of neural networks and related approaches to computational intelligence. Neural Networks welcomes submissions of high-quality papers that contribute to the full range of neural network research, from behavioral and brain modeling and learning algorithms, through mathematical and computational analyses, to systems engineering and technological applications that make substantial use of neural network concepts and techniques. This unique and broad scope promotes the exchange of ideas between biological and technological studies, and helps foster the development of an interdisciplinary community interested in biologically inspired computational intelligence. Accordingly, the fields of expertise represented on the Neural Networks editorial board include psychology, neurobiology, computer science, engineering, mathematics, and physics. The journal publishes articles, letters, and reviews, as well as letters to the editor, editorials, current events, software surveys, and patent information. Articles are published in one of five sections: cognitive science, neuroscience, learning systems, mathematical and computational analysis, and engineering and applications.

In recent years, the biggest advances in major Computer Vision tasks, such as object recognition, handwritten-digit identification, facial recognition, and many others, have come through the use of Convolutional Neural Networks (CNNs). Similarly, in the domain of Natural Language Processing, Recurrent Neural Networks (RNNs), and Long Short-Term Memory networks (LSTMs) in particular, have been crucial to some of the biggest breakthroughs in performance for tasks such as machine translation, part-of-speech tagging, and sentiment analysis. These individual advances have greatly benefited tasks even at the intersection of NLP and Computer Vision. Inspired by this success, we study some existing neural image captioning models that achieve near state-of-the-art performance and try to enhance one such model. We also present a simple image captioning model that makes use of a CNN, an LSTM, and the beam search algorithm, and study its performance based on various qualitative and quantitative metrics.
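For readers unfamiliar with the decoding step, here is a minimal beam search sketch over an abstract decoder. The `step_fn` callable is a stand-in for the LSTM: given a partial token sequence, it returns a mapping from next token to log-probability. All names and the toy decoder are illustrative assumptions, not the authors' implementation.

```python
# Minimal beam search sketch for caption decoding.
import heapq
import math

def beam_search(step_fn, start_token, end_token, beam_width=3, max_len=20):
    beams = [(0.0, [start_token])]            # (cumulative log-prob, tokens)
    finished = []
    for _ in range(max_len):
        candidates = []
        for logp, seq in beams:
            if seq[-1] == end_token:          # completed caption: set it aside
                finished.append((logp, seq))
                continue
            for tok, tok_logp in step_fn(seq).items():
                candidates.append((logp + tok_logp, seq + [tok]))
        if not candidates:
            break
        beams = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    finished.extend(beams)
    return max(finished, key=lambda c: c[0])[1]  # highest-scoring caption

# Toy decoder for illustration: tends to emit 1 -> 2 -> 3 -> <end> (token 0).
def toy_step(seq):
    table = {1: {2: math.log(0.9), 0: math.log(0.1)},
             2: {3: math.log(0.8), 0: math.log(0.2)},
             3: {0: 0.0}}
    return table[seq[-1]]

print(beam_search(toy_step, start_token=1, end_token=0))  # -> [1, 2, 3, 0]
```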

Despite continuously improving performance, contemporary image captioning models are prone to "hallucinating" objects that are not actually in a scene. One problem is that standard metrics only measure similarity to ground truth captions and may not fully capture image relevance. In this work, we propose a new image relevance metric to evaluate current models with veridical visual labels and assess their rate of object hallucination. We analyze how captioning model architectures and learning objectives contribute to object hallucination, explore when hallucination is likely due to image misclassification or language priors, and assess how well current sentence metrics capture object hallucination. We investigate these questions on the standard image captioning benchmark, MSCOCO, using a diverse set of models. Our analysis yields several interesting findings, including that models which score best on standard sentence metrics do not always have lower hallucination and that models which hallucinate more tend to make errors driven by language priors.
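As a simplified illustration in the spirit of such an evaluation (not the authors' exact formulation), an object word in a generated caption can be flagged as hallucinated when it does not appear among the image's ground-truth object labels:

```python
# Simplified object-hallucination check: object words mentioned in the caption
# but absent from the image's ground-truth labels count as hallucinated.
def hallucinated_objects(caption, gt_objects, object_vocab):
    words = set(caption.lower().split())
    mentioned = words & object_vocab          # object words in the caption
    return mentioned - set(gt_objects)        # mentioned but not in the image

# Hypothetical example:
caption = "a dog sits on a bench next to a bicycle"
gt = {"dog", "bench"}
vocab = {"dog", "bench", "bicycle", "car", "person"}
print(hallucinated_objects(caption, gt, vocab))   # -> {'bicycle'}
```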

Generative adversarial networks (GANs) have shown promise for many computer vision problems owing to their powerful ability to enhance data for training and testing. In this paper, we leverage GANs and propose a new architecture with a cascaded Single Shot Detector (SSD) for detecting pedestrians at a distance, which remains challenging due to the varied sizes of distant pedestrians in videos. To overcome the low-resolution issues in pedestrian detection at a distance, a DCGAN is first employed to improve the resolution, reconstructing more discriminative features for an SSD to detect objects in images or videos. A crucial advantage of our method is that it learns a multi-scale metric to distinguish multiple objects at different distances within one image, while the DCGAN serves as an encoder-decoder platform to generate parts of an image that contain better discriminative information. To measure the effectiveness of the proposed method, experiments were carried out on the Canadian Institute for Advanced Research (CIFAR) dataset; the results demonstrate that the proposed architecture achieves a much better detection rate, particularly on vehicles and pedestrians at a distance, making it highly suitable for smart city applications that need to discover key objects or pedestrians at a distance.
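A minimal sketch of the cascade idea, with the two networks as stand-in callables (`upscale` for the DCGAN generator, `detect` for the SSD); the function names and the size threshold are illustrative assumptions:

```python
# Sketch of a cascaded GAN + SSD pipeline for distant, low-resolution targets.
# `detect(image)` yields (box, label, score); `upscale(crop)` plays the role
# of the DCGAN generator producing a higher-resolution crop. Both are assumed.
def detect_distant_pedestrians(image, detect, upscale, min_size=32):
    final = []
    for box, label, score in detect(image):
        w, h = box[2] - box[0], box[3] - box[1]
        if label == "pedestrian" and min(w, h) < min_size:
            # Low-resolution candidate: re-score on the GAN-enhanced crop,
            # keeping the original box coordinates.
            crop = image[box[1]:box[3], box[0]:box[2]]
            refined = detect(upscale(crop))
            final.extend((box, l, s) for _, l, s in refined)
        else:
            final.append((box, label, score))
    return final
```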

Biomedical image segmentation is an important task in many medical applications. Segmentation methods based on convolutional neural networks attain state-of-the-art accuracy; however, they typically rely on supervised training with large labeled datasets. Labeling datasets of medical images requires significant expertise and time, and is infeasible at large scales. To tackle the lack of labeled data, researchers use techniques such as hand-engineered preprocessing steps, hand-tuned architectures, and data augmentation. However, these techniques involve costly engineering efforts, and are typically dataset-specific. We present an automated data augmentation method for medical images. We demonstrate our method on the task of segmenting magnetic resonance imaging (MRI) brain scans, focusing on the one-shot segmentation scenario -- a practical challenge in many medical applications. Our method requires only a single segmented scan, and leverages other unlabeled scans in a semi-supervised approach. We learn a model of transforms from the images, and use the model along with the labeled example to synthesize additional labeled training examples for supervised segmentation. Each transform comprises a spatial deformation field and an intensity change, enabling the synthesis of complex effects such as variations in anatomy and image acquisition procedures. Augmenting the training of a supervised segmenter with these new examples provides significant improvements over state-of-the-art methods for one-shot biomedical image segmentation. Our code is available at //github.com/xamyzhao/brainstorm.
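To make the transform model concrete, the sketch below synthesizes a new labeled example from one segmented image by applying a spatial deformation field and an intensity change. It is 2-D for brevity and uses a random smooth field where the paper learns the transforms from unlabeled scans, so it is an assumption-laden illustration, not the authors' code.

```python
# Synthesize a labeled example by warping an image/label pair with a smooth
# displacement field and perturbing intensities (random stand-ins for the
# learned spatial and intensity transform models).
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def synthesize_example(image, labels, alpha=8.0, sigma=4.0, rng=None):
    rng = rng or np.random.default_rng()
    # Smooth random displacement field over the image grid.
    dx = gaussian_filter(rng.standard_normal(image.shape), sigma) * alpha
    dy = gaussian_filter(rng.standard_normal(image.shape), sigma) * alpha
    ys, xs = np.meshgrid(np.arange(image.shape[0]),
                         np.arange(image.shape[1]), indexing="ij")
    coords = [ys + dy, xs + dx]
    warped_img = map_coordinates(image, coords, order=1)   # bilinear
    warped_lbl = map_coordinates(labels, coords, order=0)  # nearest for labels
    # Simple global intensity change (stand-in for the learned intensity model).
    warped_img = warped_img * rng.uniform(0.9, 1.1) + rng.uniform(-0.05, 0.05)
    return warped_img, warped_lbl
```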

To optimize fruit production, a portion of the flowers and fruitlets of apple trees must be removed early in the growing season. The proportion to be removed is determined by the bloom intensity, i.e., the number of flowers present in the orchard. Several automated computer vision systems have been proposed to estimate bloom intensity, but their overall performance is still far from satisfactory even in relatively controlled environments. With the goal of devising a technique for flower identification which is robust to clutter and to changes in illumination, this paper presents a method in which a pre-trained convolutional neural network is fine-tuned to become especially sensitive to flowers. Experimental results on a challenging dataset demonstrate that our method significantly outperforms three approaches that represent the state of the art in flower detection, with recall and precision rates higher than $90\%$. Moreover, a performance assessment on three additional datasets previously unseen by the network, which consist of different flower species and were acquired under different conditions, reveals that the proposed method substantially surpasses baseline approaches in terms of generalization capability.
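As an illustration of such fine-tuning, the sketch below replaces the classification head of a pretrained network with a two-class flower/background head and trains it at a small learning rate; the backbone, framework, and hyperparameters are illustrative assumptions rather than the paper's exact setup.

```python
# Fine-tuning a pretrained CNN to be sensitive to flowers (illustrative setup;
# the paper's backbone and hyperparameters may differ).
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)   # flower vs. background head

optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
criterion = nn.CrossEntropyLoss()

def train_step(images, targets):
    """One fine-tuning step on a batch of image patches."""
    optimizer.zero_grad()
    loss = criterion(model(images), targets)
    loss.backward()
    optimizer.step()
    return loss.item()
```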

Generic object detection, aiming at locating object instances from a large number of predefined categories in natural images, is one of the most fundamental and challenging problems in computer vision. Deep learning techniques have emerged in recent years as powerful methods for learning feature representations directly from data, and have led to remarkable breakthroughs in the field of generic object detection. Given this time of rapid evolution, the goal of this paper is to provide a comprehensive survey of the recent achievements in this field brought by deep learning techniques. More than 250 key contributions are included in this survey, covering many aspects of generic object detection research: leading detection frameworks and fundamental subproblems including object feature representation, object proposal generation, context information modeling and training strategies; evaluation issues, specifically benchmark datasets, evaluation metrics, and state-of-the-art performance. We finish by identifying promising directions for future research.
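Among the evaluation metrics covered by such surveys, the most basic building block is intersection-over-union (IoU) between a predicted and a ground-truth box; a detection typically counts as correct when the IoU exceeds a threshold such as 0.5. A minimal helper:

```python
# Intersection-over-union for two axis-aligned boxes (xmin, ymin, xmax, ymax).
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # -> 0.142857...
```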

The low resolution of objects of interest in aerial images makes pedestrian detection and action detection extremely challenging tasks. Furthermore, using deep convolutional neural networks to process large images can be demanding in terms of computational requirements. To alleviate these challenges, we propose a two-step, yes-or-no question-answering framework to find specific individuals performing one or multiple specific actions in aerial images. First, a deep object detector, the Single Shot Multibox Detector (SSD), is used to generate object proposals from small aerial images. Second, another deep network is used to learn a latent common sub-space that associates the high-resolution aerial imagery with the pedestrian action labels provided by human-based sources.
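A minimal sketch of this two-step pipeline, with both networks as stand-in callables (`propose` for the SSD proposal stage, `classify_action` for the second network answering the yes-or-no action query); all names and the threshold are illustrative assumptions:

```python
# Two-step search: object proposals first, then a yes/no action check per crop.
def find_actors(image, propose, classify_action, action, threshold=0.5):
    """Return proposal boxes judged to contain a person doing `action`."""
    hits = []
    for box in propose(image):                          # step 1: proposals
        crop = image[box[1]:box[3], box[0]:box[2]]
        if classify_action(crop, action) >= threshold:  # step 2: yes/no answer
            hits.append(box)
    return hits
```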

Accelerated by the tremendous increase in Internet bandwidth and storage space, video data has been generated, published and spread explosively, becoming an indispensable part of today's big data. In this paper, we focus on reviewing two lines of research aiming to stimulate the comprehension of videos with deep learning: video classification and video captioning. While video classification concentrates on automatically labeling video clips based on their semantic contents like human actions or complex events, video captioning attempts to generate a complete and natural sentence, enriching the single label as in video classification, to capture the most informative dynamics in videos. In addition, we also provide a review of popular benchmarks and competitions, which are critical for evaluating the technical progress of this vibrant field.

Object detection is an important and challenging problem in computer vision. Although the past decade has witnessed major advances in object detection in natural scenes, such successes have been slow to carry over to aerial imagery, not only because of the huge variation in the scale, orientation, and shape of the object instances on the earth's surface, but also due to the scarcity of well-annotated datasets of objects in aerial scenes. To advance object detection research in Earth Vision, also known as Earth Observation and Remote Sensing, we introduce a large-scale Dataset for Object deTection in Aerial images (DOTA). To this end, we collected $2806$ aerial images from different sensors and platforms. Each image is about 4000-by-4000 pixels in size and contains objects exhibiting a wide variety of scales, orientations, and shapes. These DOTA images were then annotated by experts in aerial image interpretation using $15$ common object categories. The fully annotated DOTA images contain $188,282$ instances, each of which is labeled with an arbitrary (8 d.o.f.) quadrilateral. To build a baseline for object detection in Earth Vision, we evaluate state-of-the-art object detection algorithms on DOTA. Experiments demonstrate that DOTA well represents real Earth Vision applications and is quite challenging.
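Since each instance is labeled with an arbitrary quadrilateral (eight values: four x, y corner pairs), a common preprocessing step is converting it to an axis-aligned box for detectors that expect rectangles; the shoelace formula gives the quadrilateral's area. A minimal sketch, assuming a consistent corner ordering:

```python
# Handling 8-d.o.f. quadrilateral labels: axis-aligned bounding box and area.
def quad_to_bbox(quad):
    """quad: [x1, y1, x2, y2, x3, y3, x4, y4] -> (xmin, ymin, xmax, ymax)."""
    xs, ys = quad[0::2], quad[1::2]
    return (min(xs), min(ys), max(xs), max(ys))

def quad_area(quad):
    """Shoelace formula; corners must be in consistent (e.g. clockwise) order."""
    xs, ys = quad[0::2], quad[1::2]
    return 0.5 * abs(sum(xs[i] * ys[(i + 1) % 4] - xs[(i + 1) % 4] * ys[i]
                         for i in range(4)))

q = [10, 10, 50, 20, 45, 60, 5, 50]
print(quad_to_bbox(q), quad_area(q))  # -> (5, 10, 50, 60) 1650.0
```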

Automatically describing an image with a natural-language sentence, e.g., in English, is a very challenging task. It requires expertise in both image processing and natural language processing. This paper discusses the different models available for the image captioning task. We also discuss how advances in object recognition and machine translation have greatly improved the performance of image captioning models in recent years. In addition, we discuss how such a model can be implemented. Finally, we evaluate the performance of the model using standard evaluation metrics.
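The standard evaluation metrics mentioned above include BLEU, METEOR, and CIDEr. As a minimal illustration, the sketch below computes unsmoothed modified unigram precision, the basic building block of BLEU; real evaluations use a library implementation such as the COCO caption toolkit.

```python
# Modified unigram precision: candidate token counts clipped by the reference.
from collections import Counter

def unigram_precision(candidate, reference):
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    clipped = sum(min(n, ref[w]) for w, n in cand.items())
    return clipped / max(1, sum(cand.values()))

print(unigram_precision("a dog on a bench",
                        "the dog sits on a bench"))  # -> 0.8
```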
