
Text line segmentation is one of the preprocessing stages of modern optical character recognition systems. The algorithmic approach proposed in this paper was designed for exactly this purpose. Its main characteristic is the combination of two different techniques: morphological image operations and horizontal histogram projections. The method was developed for a historic data collection that commonly exhibits quality issues such as degraded paper, blurred text, or noise. For that reason, the segmenter could be of particular interest to cultural institutions that want robust line bounding boxes for a given historic document. Because it combines promising segmentation results with low computational cost, the algorithm was incorporated into the OCR pipeline of the National Library of Luxembourg as part of its initiative to reprocess its historic newspaper collection. The general contribution of this paper is to outline the approach and to evaluate the gains in accuracy and speed relative to the segmentation algorithm bundled with the open source OCR software in use.
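The abstract names the two ingredients but not their details, so the following is only a minimal sketch of how a morphological step and a horizontal projection profile might be combined, not the paper's actual algorithm. The function names, the purely horizontal dilation, and the `smear_width` and `min_fraction` parameters are all assumptions made for illustration.

```python
import numpy as np

def dilate_horizontal(binary, width):
    """Smear ink horizontally: a pixel becomes ink if any pixel
    within `width` columns of it (same row) is ink."""
    out = binary.copy()
    for shift in range(1, width + 1):
        out[:, shift:] |= binary[:, :-shift]   # smear to the right
        out[:, :-shift] |= binary[:, shift:]   # smear to the left
    return out

def segment_lines(binary, smear_width=3, min_fraction=0.2):
    """Return (top, bottom) row ranges of text lines, bottom exclusive.

    binary: 2-D boolean array, True where a pixel is ink.
    min_fraction: hypothetical threshold; a row counts as text when
    at least this fraction of its smeared pixels is ink.
    """
    smeared = dilate_horizontal(binary, smear_width)
    profile = smeared.sum(axis=1)              # horizontal projection: ink per row
    is_text = profile >= min_fraction * binary.shape[1]

    lines, start = [], None
    for row, flag in enumerate(is_text):
        if flag and start is None:
            start = row                        # a text line begins
        elif not flag and start is not None:
            lines.append((start, row))         # the text line ends
            start = None
    if start is not None:
        lines.append((start, len(is_text)))
    return lines

# Synthetic page: two dotted "text lines" and one isolated noise pixel.
page = np.zeros((12, 40), dtype=bool)
page[2:4, 5:35:2] = True
page[7:10, 5:35:2] = True
page[5, 20] = True
lines = segment_lines(page, smear_width=2, min_fraction=0.2)
```

In this toy setup the dilation closes the gaps between neighbouring strokes, so genuine text rows gain far more ink than an isolated speckle does, and a simple threshold on the row profile can separate the two.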

Related content

Optical Character Recognition


OCR (Optical Character Recognition) is the process by which an electronic device (for example, a scanner or digital camera) examines characters printed on paper, determines their shapes by detecting patterns of dark and light, and then uses character recognition methods to translate those shapes into computer text. That is, for printed characters, it is the technology of optically converting the text in a paper document into a black-and-white bitmap image file, and then using recognition software to convert the text in the image into a text format that word processing software can further edit.

Rank · MoDELS · INFORMS · Learned · Scenario

November 2, 2021

Neural ranking models for document retrieval

Mohamed Trabelsi,Zhiyu Chen,Brian D. Davison,Jeff Heflin

from arxiv, Published in the Information Retrieval Journal (2021)

Ranking models are the main components of information retrieval systems. Several approaches to ranking are based on traditional machine learning algorithms using a set of hand-crafted features. Recently, researchers have leveraged deep learning models in information retrieval. These models are trained end-to-end to extract features from the raw data for ranking tasks, so that they overcome the limitations of hand-crafted features. A variety of deep learning models have been proposed, and each model presents a set of neural network components to extract features that are used for ranking. In this paper, we compare the proposed models in the literature along different dimensions in order to understand the major contributions and limitations of each model. In our discussion of the literature, we analyze the promising neural components, and propose future research directions. We also show the analogy between document retrieval and other retrieval tasks where the items to be ranked are structured documents, answers, images and videos.

November 1, 2021

Comparing Machine Learning based Segmentation Models on Jet Fire Radiation Zones

Carmina Pérez-Guerrero,Adriana Palacios,Gilberto Ochoa-Ruiz,Christian Mata,Miguel Gonzalez-Mendoza,Luis Eduardo Falcón-Morales

Risk assessment is relevant in any workplace; however, there is a degree of unpredictability when dealing with flammable or hazardous materials, so detection of fire accidents by itself may not be enough. An example of this is the impingement of jet fires, where the heat fluxes of the flame could reach nearby equipment and dramatically increase the probability of a domino effect with catastrophic results. Because of this, the characterization of such fire accidents is important from a risk management point of view. One such characterization would be the segmentation of different radiation zones within the flame, so this paper presents exploratory research on several traditional computer vision and Deep Learning segmentation approaches to this specific problem. A data set of propane jet fires is used to train and evaluate the different approaches and, given the difference in the distribution of the zones and background of the images, different loss functions that seek to alleviate data imbalance are also explored. Additionally, different metrics are correlated with a manual ranking performed by experts to obtain an evaluation that closely resembles the experts' criteria. The Hausdorff Distance and Adjusted Rand Index were the metrics with the highest correlation, and the best results were obtained from the UNet architecture with a Weighted Cross-Entropy Loss. These results can be used in future research to extract more geometric information from the segmentation masks or could even be applied to other types of fire accidents.
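To make the idea of a class-weighted cross-entropy concrete, here is a small NumPy sketch. The abstract does not say how the weights were chosen, so the inverse-frequency weighting below is an assumption, and the function names are hypothetical.

```python
import numpy as np

def inverse_frequency_weights(labels, num_classes):
    """Assumed weighting scheme: rarer classes get larger weights."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    return counts.sum() / (num_classes * counts)

def weighted_cross_entropy(probs, labels, class_weights, eps=1e-12):
    """Weighted mean of per-pixel cross-entropy.

    probs:  (N, C) predicted class probabilities, one row per pixel.
    labels: (N,) integer class ids.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    w = np.asarray(class_weights, dtype=float)[labels]    # weight of each pixel
    picked = probs[np.arange(len(labels)), labels]        # probability of the true class
    return float(np.sum(-w * np.log(picked + eps)) / np.sum(w))

# Three background pixels predicted well, one rare-class pixel predicted badly.
labels = np.array([0, 0, 0, 1])
probs = np.array([[0.9, 0.1], [0.9, 0.1], [0.9, 0.1], [0.9, 0.1]])
plain = weighted_cross_entropy(probs, labels, np.ones(2))
weighted = weighted_cross_entropy(probs, labels, inverse_frequency_weights(labels, 2))
```

With these inputs the weighted loss is larger than the plain one, because the single misclassified rare-class pixel contributes as much weight as the three background pixels combined.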

Loss function (machine learning) · SOFT · Image segmentation · Ground truth · Functional

October 31, 2021

Incorporating Boundary Uncertainty into loss functions for biomedical image segmentation

Michael Yeung,Guang Yang,Evis Sala,Carola-Bibiane Schönlieb,Leonardo Rundo

Manual segmentation is used as the gold standard for evaluating neural networks on automated image segmentation tasks. Due to considerable heterogeneity in shapes, colours and textures, demarcating object boundaries is particularly difficult in biomedical images, resulting in significant inter- and intra-rater variability. Approaches such as soft labelling and distance penalty terms apply a global transformation to the ground truth, redefining the loss function with respect to uncertainty. However, global operations are computationally expensive, and neither approach accurately reflects the uncertainty underlying manual annotation. In this paper, we propose Boundary Uncertainty, which uses morphological operations to restrict soft labelling to object boundaries, providing an appropriate representation of uncertainty in ground truth labels, and which may be adapted to enable robust model training where systematic manual segmentation errors are present. We incorporate Boundary Uncertainty with the Dice loss, achieving consistently improved performance across three well-validated biomedical imaging datasets compared to soft labelling and distance-weighted penalty. Boundary Uncertainty not only more accurately reflects the segmentation process, but is also efficient, robust to segmentation errors, and exhibits better generalisation.
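One way to picture soft labels that are restricted to object boundaries is sketched below. This is an illustration under stated assumptions rather than the authors' formulation: the 3x3 structuring element, the `alpha` and `beta` values, and the function names are hypothetical, and only the general recipe (erosion and dilation to find a boundary band, then a Dice loss) comes from the abstract.

```python
import numpy as np

def _neighbourhood_stack(mask):
    """Stack the 3x3 neighbourhood of every pixel (borders padded by replication)."""
    padded = np.pad(mask, 1, mode="edge")
    h, w = mask.shape
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(3) for dx in range(3)])

def soft_boundary_labels(mask, alpha=0.7, beta=0.3):
    """Soften a binary mask only in a one-pixel band around its boundary.

    alpha: assumed label for foreground pixels on the boundary.
    beta:  assumed label for background pixels touching the object.
    """
    mask = mask.astype(bool)
    neigh = _neighbourhood_stack(mask)
    eroded = neigh.all(axis=0)        # morphological erosion with a 3x3 square
    dilated = neigh.any(axis=0)       # morphological dilation with a 3x3 square
    soft = mask.astype(float)
    soft[mask & ~eroded] = alpha      # inner boundary ring
    soft[dilated & ~mask] = beta      # outer boundary ring
    return soft

def dice_loss(pred, target, eps=1e-7):
    """Soft Dice loss between a prediction map and a (possibly soft) target."""
    inter = np.sum(pred * target)
    return 1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

# A 3x3 object in a 7x7 image.
mask = np.zeros((7, 7), dtype=bool)
mask[2:5, 2:5] = True
soft = soft_boundary_labels(mask)
loss = dice_loss(mask.astype(float), soft)
```

Pixels deep inside the object or far from it keep their hard labels of 1 and 0, so the extra work is confined to the boundary band, in contrast to a transformation applied to the whole ground truth.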

Shaping · Inference · Pattern recognition · Computer vision

December 10, 2020

Amodal Segmentation Based on Visible Region Segmentation and Shape Prior

Yuting Xiao,Yanyu Xu,Ziming Zhong,Weixin Luo,Jiawei Li,Shenghua Gao

from arxiv, Accepted by AAAI 2021

Almost all existing amodal segmentation methods make the inferences of occluded regions by using features corresponding to the whole image.

contrastive · Learned · Contrastive learning · Extensibility · SSL

June 18, 2020

Contrastive learning of global and local features for medical image segmentation with limited annotations

Krishna Chaitanya,Ertunc Erdil,Neerav Karani,Ender Konukoglu

from arxiv, 16 pages, 2 figures, 7 tables. This article is a pre-print and is currently under review at a conference

A key requirement for the success of supervised deep learning is a large labeled dataset - a condition that is difficult to meet in medical image analysis. Self-supervised learning (SSL) can help in this regard by providing a strategy to pre-train a neural network with unlabeled data, followed by fine-tuning for a downstream task with limited annotations. Contrastive learning, a particular variant of SSL, is a powerful technique for learning image-level representations. In this work, we propose strategies for extending the contrastive learning framework for segmentation of volumetric medical images in the semi-supervised setting with limited annotations, by leveraging domain-specific and problem-specific cues. Specifically, we propose (1) novel contrasting strategies that leverage structural similarity across volumetric medical images (domain-specific cue) and (2) a local version of the contrastive loss to learn distinctive representations of local regions that are useful for per-pixel segmentation (problem-specific cue). We carry out an extensive evaluation on three Magnetic Resonance Imaging (MRI) datasets. In the limited annotation setting, the proposed method yields substantial improvements compared to other self-supervision and semi-supervised learning techniques. When combined with a simple data augmentation technique, the proposed method reaches within 8% of benchmark performance using only two labeled MRI volumes for training, corresponding to only 4% (for ACDC) of the training data used to train the benchmark.

Branch · Networking · Example · Better · Understandability

April 10, 2019

S4Net: Single Stage Salient-Instance Segmentation

Ruochen Fan,Ming-Ming Cheng,Qibin Hou,Tai-Jiang Mu,Jingdong Wang,Shi-Min Hu

In this paper, we consider an interesting problem: salient instance segmentation. Besides producing bounding boxes, our network also outputs high-quality instance-level segments. Taking into account the category-independent property of each target, we design a single-stage salient instance segmentation framework with a novel segmentation branch. Our new branch considers not only the local context inside each detection window but also its surrounding context, enabling us to distinguish instances in the same scope even under occlusion. Our network is end-to-end trainable and runs fast (40 fps when processing an image with resolution 320x320). We evaluate our approach on a publicly available benchmark and show that it outperforms alternative solutions. We also provide a thorough analysis of the design choices to help readers better understand the function of each part of our network. The source code can be found at \url{//github.com/RuochenFan/S4Net}.

MoDELS · Performer · Decoding · SOFT · Generative model

April 9, 2019

Text Generation with Exemplar-based Adaptive Decoding

Hao Peng,Ankur P. Parikh,Manaal Faruqui,Bhuwan Dhingra,Dipanjan Das

from arxiv, NAACL 2019

We propose a novel conditioned text generation model. It draws inspiration from traditional template-based text generation techniques, where the source provides the content (i.e., what to say), and the template influences how to say it. Building on the successful encoder-decoder paradigm, it first encodes the content representation from the given input text; to produce the output, it retrieves exemplar text from the training data as "soft templates," which are then used to construct an exemplar-specific decoder. We evaluate the proposed model on abstractive text summarization and data-to-text generation. Empirical results show that this model achieves strong performance and outperforms comparable baselines.

Image classifier · Test sample · Model evaluation · Networking · ImageNet (dataset)

August 15, 2018

Building medical image classifiers with very limited data using segmentation networks

Ken C. L. Wong,Tanveer Syeda-Mahmood,Mehdi Moradi

from arxiv, This paper was accepted by Medical Image Analysis

Deep learning has shown promising results in medical image analysis; however, the lack of very large annotated datasets limits its full potential. Although transfer learning with ImageNet pre-trained classification models can alleviate the problem, constrained image sizes and model complexities can lead to an unnecessary increase in computational cost and a decrease in performance. As many common morphological features are usually shared by different classification tasks of an organ, it is greatly beneficial if we can extract such features to improve classification with limited samples. Therefore, inspired by the idea of curriculum learning, we propose a strategy for building medical image classifiers using features from segmentation networks. By using a segmentation network pre-trained on data similar to that of the classification task, the machine can first learn the simpler shape and structural concepts before tackling the actual classification problem, which usually involves more complicated concepts. Using our proposed framework on a 3D three-class brain tumor type classification problem, we achieved 82% accuracy on 191 testing samples with 91 training samples. When applied to a 2D nine-class cardiac semantic level classification problem, it achieved 86% accuracy on 263 testing samples with 108 training samples. Comparisons with ImageNet pre-trained classifiers and classifiers trained from scratch are presented.

Topic model · Word sense disambiguation · Synset · CC · Latent Dirichlet allocation

January 5, 2018

Knowledge-based Word Sense Disambiguation using Topic Models

Devendra Singh Chaplot,Ruslan Salakhutdinov

from arxiv, To appear in AAAI-18

Word Sense Disambiguation is an open problem in Natural Language Processing which is particularly challenging and useful in the unsupervised setting, where all the words in any given text need to be disambiguated without using any labeled data. Typically, WSD systems use the sentence or a small window of words around the target word as the context for disambiguation because their computational complexity scales exponentially with the size of the context. In this paper, we leverage the formalism of topic models to design a WSD system that scales linearly with the number of words in the context. As a result, our system is able to utilize the whole document as the context for a word to be disambiguated. The proposed method is a variant of Latent Dirichlet Allocation in which the topic proportions for a document are replaced by synset proportions. We further utilize the information in WordNet by assigning a non-uniform prior to the synset distribution over words and a logistic-normal prior to the document distribution over synsets. We evaluate the proposed method on the Senseval-2, Senseval-3, SemEval-2007, SemEval-2013 and SemEval-2015 English All-Word WSD datasets and show that it outperforms the state-of-the-art unsupervised knowledge-based WSD system by a significant margin.

Discriminator · Color · Oracle · Multimedia · Pattern recognition

November 20, 2012

Content based video retrieval

B. V. Patel,B. B. Meshram

Content-based video retrieval is an approach for facilitating the searching and browsing of large image collections over the World Wide Web. In this approach, video analysis is conducted on low-level visual properties extracted from video frames. We believed that in order to create an effective video retrieval system, visual perception must be taken into account. We conjectured that a technique which employs multiple features for indexing and retrieval would be more effective in the discrimination and search tasks for videos. In order to validate this claim, content-based indexing and retrieval systems were implemented using color histograms, various texture features, and other approaches. Videos were stored in an Oracle 9i Database, and a user study measured correctness of response.
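As a rough illustration of color-histogram indexing, the sketch below ranks frames by histogram intersection with a query frame. The abstract does not specify the bin count, color space, or similarity measure, so the quantized RGB histogram, the intersection score, and all function names here are assumptions.

```python
import numpy as np

def color_histogram(frame, bins=4):
    """Normalized joint RGB histogram with bins**3 entries.

    frame: (H, W, 3) uint8 array.
    """
    q = (frame.astype(np.int64) * bins) // 256                  # quantize each channel to [0, bins)
    codes = (q[..., 0] * bins + q[..., 1]) * bins + q[..., 2]   # one bin index per pixel
    hist = np.bincount(codes.ravel(), minlength=bins ** 3).astype(float)
    return hist / hist.sum()

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return float(np.minimum(h1, h2).sum())

def rank_frames(query, frames, bins=4):
    """Return frame indices ordered from most to least similar to the query."""
    qh = color_histogram(query, bins)
    scores = [histogram_intersection(qh, color_histogram(f, bins)) for f in frames]
    return sorted(range(len(frames)), key=lambda i: scores[i], reverse=True)

# Toy frames: pure red, pure blue, and a slightly darker red query.
red = np.zeros((4, 4, 3), dtype=np.uint8)
red[..., 0] = 255
blue = np.zeros((4, 4, 3), dtype=np.uint8)
blue[..., 2] = 255
query = np.zeros((4, 4, 3), dtype=np.uint8)
query[..., 0] = 250
order = rank_frames(query, [blue, red])
```

A system that combines multiple features, as the abstract conjectures, could compute a score like this per feature and merge the scores; how that merging was done is not described in the source.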