The fingerprint classification is an important and effective method to quicken the process and improve the accuracy in the fingerprint matching process. Conventional supervised methods need a large amount of pre-labeled data and thus consume immense human resources. In this paper, we propose a new and efficient unsupervised deep learning method that can extract fingerprint features and classify fingerprint patterns automatically. In this approach, a new model named constraint convolutional auto-encoder (CCAE) is used to extract fingerprint features and a hybrid clustering strategy is applied to obtain the final clusters. A set of experiments in the NIST-DB4 dataset shows that the proposed unsupervised method exhibits the efficient performance on fingerprint classification. For example, the CCAE achieves an accuracy of 97.3% on only 1000 unlabeled fingerprints in the NIST-DB4.
It is known that unsupervised nonlinear dimensionality reduction and clustering is sensitive to the selection of hyperparameters, particularly for deep learning based methods, which hinder its practical use. How to select a proper network structure that may be dramatically different in different applications is a hard issue for deep models, given little prior knowledge of data. In this paper, we explore ensemble learning and selection techniques for automatically determining the optimal network structure of a deep model, named multilayer bootstrap networks (MBN). Specifically, we first propose an MBN ensemble (MBN-E) algorithm which concatenates the sparse outputs of a set of MBN base models with different network structures into a new representation. Because training an ensemble of MBN is expensive, we propose a fast version of MBN-E (fMBN-E), which replaces the step of random data resampling in MBN-E by the resampling of random similarity scores. Theoretically, fMBN-E is even faster than a single standard MBN. Then, we take the new representation produced by MBN-E as a reference for selecting the optimal MBN base models. Two kinds of ensemble selection criteria, named optimization-like selection criteria and distribution divergence criteria, are applied. Importantly, MBN-E and its ensemble selection techniques maintain the simple formulation of MBN that is based on one-nearest-neighbor learning, and reach the state-of-the-art performance without manual hyperparameter tuning. fMBN-E is empirically even hundreds of times faster than MBN-E without suffering performance degradation. The source code is available at //www.xiaolei-zhang.net/mbn-e.htm.
Work in deep clustering focuses on finding a single partition of data. However, high-dimensional data, such as images, typically feature multiple interesting characteristics one could cluster over. For example, images of objects against a background could be clustered over the shape of the object and separately by the colour of the background. In this paper, we introduce Multi-Facet Clustering Variational Autoencoders (MFCVAE), a novel class of variational autoencoders with a hierarchy of latent variables, each with a Mixture-of-Gaussians prior, that learns multiple clusterings simultaneously, and is trained fully unsupervised and end-to-end. MFCVAE uses a progressively-trained ladder architecture which leads to highly stable performance. We provide novel theoretical results for optimising the ELBO analytically with respect to the categorical variational posterior distribution, correcting earlier influential theoretical work. On image benchmarks, we demonstrate that our approach separates out and clusters over different aspects of the data in a disentangled manner. We also show other advantages of our model: the compositionality of its latent space and that it provides controlled generation of samples.
Network pruning reduces the size of neural networks by removing (pruning) neurons such that the performance drop is minimal. Traditional pruning approaches focus on designing metrics to quantify the usefulness of a neuron which is often quite tedious and sub-optimal. More recent approaches have instead focused on training auxiliary networks to automatically learn how useful each neuron is however, they often do not take computational limitations into account. In this work, we propose a general methodology for pruning neural networks. Our proposed methodology can prune neural networks to respect pre-defined computational budgets on arbitrary, possibly non-differentiable, functions. Furthermore, we only assume the ability to be able to evaluate these functions for different inputs, and hence they do not need to be fully specified beforehand. We achieve this by proposing a novel pruning strategy via constrained reinforcement learning algorithms. We prove the effectiveness of our approach via comparison with state-of-the-art methods on standard image classification datasets. Specifically, we reduce 83-92.90 of total parameters on various variants of VGG while achieving comparable or better performance than that of original networks. We also achieved 75.09 reduction in parameters on ResNet18 without incurring any loss in accuracy.
Clustering is a fundamental task in data analysis. Recently, deep clustering, which derives inspiration primarily from deep learning approaches, achieves state-of-the-art performance and has attracted considerable attention. Current deep clustering methods usually boost the clustering results by means of the powerful representation ability of deep learning, e.g., autoencoder, suggesting that learning an effective representation for clustering is a crucial requirement. The strength of deep clustering methods is to extract the useful representations from the data itself, rather than the structure of data, which receives scarce attention in representation learning. Motivated by the great success of Graph Convolutional Network (GCN) in encoding the graph structure, we propose a Structural Deep Clustering Network (SDCN) to integrate the structural information into deep clustering. Specifically, we design a delivery operator to transfer the representations learned by autoencoder to the corresponding GCN layer, and a dual self-supervised mechanism to unify these two different deep neural architectures and guide the update of the whole model. In this way, the multiple structures of data, from low-order to high-order, are naturally combined with the multiple representations learned by autoencoder. Furthermore, we theoretically analyze the delivery operator, i.e., with the delivery operator, GCN improves the autoencoder-specific representation as a high-order graph regularization constraint and autoencoder helps alleviate the over-smoothing problem in GCN. Through comprehensive experiments, we demonstrate that our propose model can consistently perform better over the state-of-the-art techniques.
We study the impact of neural networks in text classification. Our focus is on training deep neural networks with proper weight initialization and greedy layer-wise pretraining. Results are compared with 1-layer neural networks and Support Vector Machines. We work with a dataset of labeled messages from the Twitter microblogging service and aim to predict weather conditions. A feature extraction procedure specific for the task is proposed, which applies dimensionality reduction using Latent Semantic Analysis. Our results show that neural networks outperform Support Vector Machines with Gaussian kernels, noticing performance gains from introducing additional hidden layers with nonlinearities. The impact of using Nesterov's Accelerated Gradient in backpropagation is also studied. We conclude that deep neural networks are a reasonable approach for text classification and propose further ideas to improve performance.
In recent years, a specific machine learning method called deep learning has gained huge attraction, as it has obtained astonishing results in broad applications such as pattern recognition, speech recognition, computer vision, and natural language processing. Recent research has also been shown that deep learning techniques can be combined with reinforcement learning methods to learn useful representations for the problems with high dimensional raw data input. This chapter reviews the recent advances in deep reinforcement learning with a focus on the most used deep architectures such as autoencoders, convolutional neural networks and recurrent neural networks which have successfully been come together with the reinforcement learning framework.
Embedding models for entities and relations are extremely useful for recovering missing facts in a knowledge base. Intuitively, a relation can be modeled by a matrix mapping entity vectors. However, relations reside on low dimension sub-manifolds in the parameter space of arbitrary matrices---for one reason, composition of two relations $\boldsymbol{M}_1,\boldsymbol{M}_2$ may match a third $\boldsymbol{M}_3$ (e.g. composition of relations currency_of_country and country_of_film usually matches currency_of_film_budget), which imposes compositional constraints to be satisfied by the parameters (i.e. $\boldsymbol{M}_1\cdot \boldsymbol{M}_2\approx \boldsymbol{M}_3$). In this paper we investigate a dimension reduction technique by training relations jointly with an autoencoder, which is expected to better capture compositional constraints. We achieve state-of-the-art on Knowledge Base Completion tasks with strongly improved Mean Rank, and show that joint training with an autoencoder leads to interpretable sparse codings of relations, helps discovering compositional constraints and benefits from compositional training. Our source code is released at github.com/tianran/glimvec.
Classifying large scale networks into several categories and distinguishing them according to their fine structures is of great importance with several applications in real life. However, most studies of complex networks focus on properties of a single network but seldom on classification, clustering, and comparison between different networks, in which the network is treated as a whole. Due to the non-Euclidean properties of the data, conventional methods can hardly be applied on networks directly. In this paper, we propose a novel framework of complex network classifier (CNC) by integrating network embedding and convolutional neural network to tackle the problem of network classification. By training the classifiers on synthetic complex network data and real international trade network data, we show CNC can not only classify networks in a high accuracy and robustness, it can also extract the features of the networks automatically.
Surgical data science is a new research field that aims to observe all aspects and factors of the patient treatment process in order to provide the right assistance to the right person at the right time. Due to the breakthrough successes of deep learning-based solutions for automatic image annotation, the availability of reference annotations for algorithm training is becoming a major bottleneck in the field. The purpose of this paper was to investigate the concept of self-supervised learning to address this issue. Our approach is guided by the hypothesis that unlabeled video data can be used to learn a representation of the target domain that boosts the performance of state-of-the-art machine learning algorithms when used for pre-training. Essentially, this method involves an auxiliary task that requires training with unlabeled endoscopic video data from the target domain to initialize a convolutional neural network (CNN) for the target task. In this paper, we propose to undertake a re-colorization of medical images with generative adversarial network (GAN)-based architecture as an auxiliary task. A variant of the method involves a second pre-training step based on labeled data for the target task from a related domain. We have validated both variants using medical instrument segmentation as the target task. The proposed approach can be used to radically reduce the manual annotation effort involved in training CNNs. Compared to the baseline approach of generating annotated data from scratch, our method decreases exploratively the number of labeled images by up to 60% without sacrificing performance. Our method also outperforms alternative methods for CNN pre-training, such as pre-training on publicly available non-medical (COCO) or medical data (MICCAI endoscopic vision challenge 2017) using the target task (in this instance: segmentation).
Unsupervised learning is of growing interest because it unlocks the potential held in vast amounts of unlabelled data to learn useful representations for inference. Autoencoders, a form of generative model, may be trained by learning to reconstruct unlabelled input data from a latent representation space. More robust representations may be produced by an autoencoder if it learns to recover clean input samples from corrupted ones. Representations may be further improved by introducing regularisation during training to shape the distribution of the encoded data in latent space. We suggest denoising adversarial autoencoders, which combine denoising and regularisation, shaping the distribution of latent space using adversarial training. We introduce a novel analysis that shows how denoising may be incorporated into the training and sampling of adversarial autoencoders. Experiments are performed to assess the contributions that denoising makes to the learning of representations for classification and sample synthesis. Our results suggest that autoencoders trained using a denoising criterion achieve higher classification performance, and can synthesise samples that are more consistent with the input data than those trained without a corruption process.