亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

A well-known perceptual consequence of categorization in humans and other animals, called categorical perception, is notably characterized by a within-category compression and a between-category separation: two items, close in input space, are perceived closer if they belong to the same category than if they belong to different categories. Elaborating on experimental and theoretical results in cognitive science, here we study categorical effects in artificial neural networks. We combine a theoretical analysis that makes use of mutual and Fisher information quantities, and a series of numerical simulations on networks of increasing complexity. These formal and numerical analyses provide insights into the geometry of the neural representation in deep layers, with expansion of space near category boundaries and contraction far from category boundaries. We investigate categorical representation by using two complementary approaches: one mimics experiments in psychophysics and cognitive neuroscience by means of morphed continua between stimuli of different categories, while the other introduces a categoricality index that, for each layer in the network, quantifies the separability of the categories at the neural population level. We show on both shallow and deep neural networks that category learning automatically induces categorical perception. We further show that the deeper a layer, the stronger the categorical effects. As an outcome of our study, we propose a coherent view of the efficacy of different heuristic practices of the dropout regularization technique. More generally, our view, which finds echoes in the neuroscience literature, insists on the differential impact of noise in any given layer depending on the geometry of the neural representation that is being learned, i.e. on how this geometry reflects the structure of the categories.

相關內容

In recent times, a large number of people have been involved in establishing their own businesses. Unlike humans, chatbots can serve multiple customers at a time, are available 24/7 and reply in less than a fraction of a second. Though chatbots perform well in task-oriented activities, in most cases they fail to understand personalized opinions, statements or even queries which later impact the organization for poor service management. Lack of understanding capabilities in bots disinterest humans to continue conversations with them. Usually, chatbots give absurd responses when they are unable to interpret a user's text accurately. Extracting the client reviews from conversations by using chatbots, organizations can reduce the major gap of understanding between the users and the chatbot and improve their quality of products and services.Thus, in our research we incorporated all the key elements that are necessary for a chatbot to analyse and understand an input text precisely and accurately. We performed sentiment analysis, emotion detection, intent classification and named-entity recognition using deep learning to develop chatbots with humanistic understanding and intelligence. The efficiency of our approach can be demonstrated accordingly by the detailed analysis.

Despite pre-trained language models have proven useful for learning high-quality semantic representations, these models are still vulnerable to simple perturbations. Recent works aimed to improve the robustness of pre-trained models mainly focus on adversarial training from perturbed examples with similar semantics, neglecting the utilization of different or even opposite semantics. Different from the image processing field, the text is discrete and few word substitutions can cause significant semantic changes. To study the impact of semantics caused by small perturbations, we conduct a series of pilot experiments and surprisingly find that adversarial training is useless or even harmful for the model to detect these semantic changes. To address this problem, we propose Contrastive Learning with semantIc Negative Examples (CLINE), which constructs semantic negative examples unsupervised to improve the robustness under semantically adversarial attacking. By comparing with similar and opposite semantic examples, the model can effectively perceive the semantic changes caused by small perturbations. Empirical results show that our approach yields substantial improvements on a range of sentiment analysis, reasoning, and reading comprehension tasks. And CLINE also ensures the compactness within the same semantics and separability across different semantics in sentence-level.

A community reveals the features and connections of its members that are different from those in other communities in a network. Detecting communities is of great significance in network analysis. Despite the classical spectral clustering and statistical inference methods, we notice a significant development of deep learning techniques for community detection in recent years with their advantages in handling high dimensional network data. Hence, a comprehensive overview of community detection's latest progress through deep learning is timely to both academics and practitioners. This survey devises and proposes a new taxonomy covering different categories of the state-of-the-art methods, including deep learning-based models upon deep neural networks, deep nonnegative matrix factorization and deep sparse filtering. The main category, i.e., deep neural networks, is further divided into convolutional networks, graph attention networks, generative adversarial networks and autoencoders. The survey also summarizes the popular benchmark data sets, model evaluation metrics, and open-source implementations to address experimentation settings. We then discuss the practical applications of community detection in various domains and point to implementation scenarios. Finally, we outline future directions by suggesting challenging topics in this fast-growing deep learning field.

In humans, Attention is a core property of all perceptual and cognitive operations. Given our limited ability to process competing sources, attention mechanisms select, modulate, and focus on the information most relevant to behavior. For decades, concepts and functions of attention have been studied in philosophy, psychology, neuroscience, and computing. For the last six years, this property has been widely explored in deep neural networks. Currently, the state-of-the-art in Deep Learning is represented by neural attention models in several application domains. This survey provides a comprehensive overview and analysis of developments in neural attention models. We systematically reviewed hundreds of architectures in the area, identifying and discussing those in which attention has shown a significant impact. We also developed and made public an automated methodology to facilitate the development of reviews in the area. By critically analyzing 650 works, we describe the primary uses of attention in convolutional, recurrent networks and generative models, identifying common subgroups of uses and applications. Furthermore, we describe the impact of attention in different application domains and their impact on neural networks' interpretability. Finally, we list possible trends and opportunities for further research, hoping that this review will provide a succinct overview of the main attentional models in the area and guide researchers in developing future approaches that will drive further improvements.

This paper serves as a survey of recent advances in large margin training and its theoretical foundations, mostly for (nonlinear) deep neural networks (DNNs) that are probably the most prominent machine learning models for large-scale data in the community over the past decade. We generalize the formulation of classification margins from classical research to latest DNNs, summarize theoretical connections between the margin, network generalization, and robustness, and introduce recent efforts in enlarging the margins for DNNs comprehensively. Since the viewpoint of different methods is discrepant, we categorize them into groups for ease of comparison and discussion in the paper. Hopefully, our discussions and overview inspire new research work in the community that aim to improve the performance of DNNs, and we also point to directions where the large margin principle can be verified to provide theoretical evidence why certain regularizations for DNNs function well in practice. We managed to shorten the paper such that the crucial spirit of large margin learning and related methods are better emphasized.

The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting. We survey recent theoretical progress that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behavior of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favorable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.

Deep learning has been shown successful in a number of domains, ranging from acoustics, images to natural language processing. However, applying deep learning to the ubiquitous graph data is non-trivial because of the unique characteristics of graphs. Recently, a significant amount of research efforts have been devoted to this area, greatly advancing graph analyzing techniques. In this survey, we comprehensively review different kinds of deep learning methods applied to graphs. We divide existing methods into three main categories: semi-supervised methods including Graph Neural Networks and Graph Convolutional Networks, unsupervised methods including Graph Autoencoders, and recent advancements including Graph Recurrent Neural Networks and Graph Reinforcement Learning. We then provide a comprehensive overview of these methods in a systematic manner following their history of developments. We also analyze the differences of these methods and how to composite different architectures. Finally, we briefly outline their applications and discuss potential future directions.

Transfer learning is one of the subjects undergoing intense study in the area of machine learning. In object recognition and object detection there are known experiments for the transferability of parameters, but not for neural networks which are suitable for object-detection in real time embedded applications, such as the SqueezeDet neural network. We use transfer learning to accelerate the training of SqueezeDet to a new group of classes. Also, experiments are conducted to study the transferability and co-adaptation phenomena introduced by the transfer learning process. To accelerate training, we propose a new implementation of the SqueezeDet training which provides a faster pipeline for data processing and achieves $1.8$ times speedup compared to the initial implementation. Finally, we created a mechanism for automatic hyperparamer optimization using an empirical method.

Deep learning (DL) is a high dimensional data reduction technique for constructing high-dimensional predictors in input-output models. DL is a form of machine learning that uses hierarchical layers of latent features. In this article, we review the state-of-the-art of deep learning from a modeling and algorithmic perspective. We provide a list of successful areas of applications in Artificial Intelligence (AI), Image Processing, Robotics and Automation. Deep learning is predictive in its nature rather then inferential and can be viewed as a black-box methodology for high-dimensional function estimation.

Deep Convolutional Neural Networks have pushed the state-of-the art for semantic segmentation provided that a large amount of images together with pixel-wise annotations is available. Data collection is expensive and a solution to alleviate it is to use transfer learning. This reduces the amount of annotated data required for the network training but it does not get rid of this heavy processing step. We propose a method of transfer learning without annotations on the target task for datasets with redundant content and distinct pixel distributions. Our method takes advantage of the approximate content alignment of the images between two datasets when the approximation error prevents the reuse of annotation from one dataset to another. Given the annotations for only one dataset, we train a first network in a supervised manner. This network autonomously learns to generate deep data representations relevant to the semantic segmentation. Then the images in the new dataset, we train a new network to generate a deep data representation that matches the one from the first network on the previous dataset. The training consists in a regression between feature maps and does not require any annotations on the new dataset. We show that this method reaches performances similar to a classic transfer learning on the PASCAL VOC dataset with synthetic transformations.

北京阿比特科技有限公司