Optimising deep neural networks is a challenging task due to complex training dynamics, high computational requirements, and long training times. To address this difficulty, we propose the framework of Generalisable Agents for Neural Network Optimisation (GANNO) -- a multi-agent reinforcement learning (MARL) approach that learns to improve neural network optimisation by dynamically and responsively scheduling hyperparameters during training. GANNO utilises an agent per layer that observes localised network dynamics and accordingly takes actions to adjust these dynamics at a layerwise level to collectively improve global performance. In this paper, we use GANNO to control the layerwise learning rate and show that the framework can yield useful and responsive schedules that are competitive with handcrafted heuristics. Furthermore, GANNO is shown to perform robustly across a wide variety of unseen initial conditions, and can successfully generalise to harder problems than it was trained on. Our work presents an overview of the opportunities that this paradigm offers for training neural networks, along with key challenges that remain to be overcome.
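
To make the layerwise control concrete, the following is a minimal sketch of per-layer learning-rate adjustment in the spirit of GANNO, assuming a PyTorch training loop. The agent policy here is a random placeholder rather than the paper's learned MARL policy, and all names (`layer_observation`, `agent_action`) are illustrative, not from the authors' code.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
layers = [m for m in model if isinstance(m, nn.Linear)]

# One optimizer parameter group per layer, so each "agent" owns one learning rate.
optimizer = torch.optim.SGD(
    [{"params": l.parameters(), "lr": 1e-2} for l in layers]
)

def layer_observation(layer):
    # Localised dynamics an agent might observe, e.g. weight and gradient norms.
    w = layer.weight
    g = w.grad if w.grad is not None else torch.zeros_like(w)
    return torch.stack([w.norm(), g.norm()])

def agent_action(obs):
    # Placeholder policy: a multiplicative adjustment to the layer's learning rate.
    return float(torch.empty(1).uniform_(0.9, 1.1))

for step in range(100):
    x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    # Each per-layer agent adjusts its own parameter group's learning rate.
    for group, layer in zip(optimizer.param_groups, layers):
        group["lr"] *= agent_action(layer_observation(layer))
    optimizer.step()
```

The key design point this illustrates is that per-layer parameter groups give each agent an independent action channel while the shared loss provides the global signal.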

Related content

Neural Networks is the archival journal of the world's three oldest neural modeling societies: the International Neural Network Society (INNS), the European Neural Network Society (ENNS), and the Japanese Neural Network Society (JNNS). Neural Networks provides a forum for developing and nurturing an international community of scholars and practitioners interested in all aspects of neural networks and related approaches to computational intelligence. Neural Networks welcomes high-quality submissions that contribute to the full range of neural network research, from behavioral and brain modeling and learning algorithms, through mathematical and computational analysis, to systems engineering and technological applications that make substantial use of neural network concepts and techniques. This uniquely broad scope fosters the exchange of ideas between biological and technological research and helps cultivate an interdisciplinary community interested in biologically inspired computational intelligence. Accordingly, the Neural Networks editorial board represents expertise in psychology, neurobiology, computer science, engineering, mathematics, and physics. The journal publishes articles, letters, and reviews, as well as letters to the editor, editorials, current events, software surveys, and patent information. Articles appear in one of five sections: cognitive science, neuroscience, learning systems, mathematical and computational analysis, and engineering and applications. Official website:

Deep neural networks (DNNs) have succeeded in many different perception tasks, e.g., computer vision, natural language processing, and reinforcement learning. High-performing DNNs, however, rely on intensive resource consumption. For example, training a DNN requires high dynamic memory, a large-scale dataset, and a large number of computations (a long training time); even inference with a DNN demands a large amount of static storage, computations (a long inference time), and energy. Therefore, state-of-the-art DNNs are often deployed on cloud servers with large numbers of supercomputers, a high-bandwidth communication bus, a shared storage infrastructure, and a high-power supply. Recently, newly emerging intelligent applications, e.g., AR/VR, mobile assistants, and the Internet of Things, require us to deploy DNNs on resource-constrained edge devices. Compared to a cloud server, edge devices often have a rather small amount of resources. To deploy DNNs on edge devices, we need to reduce their size, i.e., we target a better trade-off between resource consumption and model accuracy. In this dissertation, we study four edge intelligence scenarios, i.e., Inference on Edge Devices, Adaptation on Edge Devices, Learning on Edge Devices, and Edge-Server Systems, and develop different methodologies to enable deep learning in each scenario. Since current DNNs are often over-parameterized, our goal is to find and reduce the redundancy of the DNNs in each scenario.
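
As one concrete illustration of removing redundancy for edge deployment, here is a minimal sketch of unstructured magnitude pruning with PyTorch's built-in pruning utilities. This is a standard technique chosen for illustration, not necessarily the dissertation's specific method.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
linears = [m for m in model if isinstance(m, nn.Linear)]

# Prune the 50% smallest-magnitude weights in each Linear layer.
for module in linears:
    prune.l1_unstructured(module, name="weight", amount=0.5)
    prune.remove(module, "weight")  # bake the pruning mask into the weights

zeros = sum((m.weight == 0).sum().item() for m in linears)
total = sum(m.weight.numel() for m in linears)
print(f"global weight sparsity: {zeros / total:.2%}")
```

In practice the pruning ratio is the knob that trades resource consumption against model accuracy, which is exactly the trade-off the abstract targets.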

Despite the recent progress in deep learning, most approaches still go for a silo-like solution, focusing on learning each task in isolation: training a separate neural network for each individual task. Many real-world problems, however, call for a multi-modal approach and, therefore, for multi-tasking models. Multi-task learning (MTL) aims to leverage useful information across tasks to improve the generalization capability of a model. This thesis is concerned with multi-task learning in the context of computer vision. First, we review existing approaches for MTL. Next, we propose several methods that tackle important aspects of multi-task learning. The proposed methods are evaluated on various benchmarks. The results show several advances in the state-of-the-art of multi-task learning. Finally, we discuss several possibilities for future work.
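
To contrast with the silo-like one-network-per-task setup, the following is a minimal sketch of hard parameter sharing, the simplest multi-task architecture: a shared trunk with one lightweight head per task. The thesis's proposed methods are more involved; this only illustrates the basic MTL structure, and the class and task names are hypothetical.

```python
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, num_classes_a, num_classes_b):
        super().__init__()
        self.trunk = nn.Sequential(          # shared across all tasks
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head_a = nn.Linear(32, num_classes_a)  # task-specific heads
        self.head_b = nn.Linear(32, num_classes_b)

    def forward(self, x):
        z = self.trunk(x)
        return self.head_a(z), self.head_b(z)

model = MultiTaskNet(num_classes_a=10, num_classes_b=5)
x = torch.randn(8, 3, 32, 32)
out_a, out_b = model(x)
# A (possibly weighted) sum of per-task losses trains both heads and the shared trunk.
loss = nn.functional.cross_entropy(out_a, torch.randint(0, 10, (8,))) \
     + nn.functional.cross_entropy(out_b, torch.randint(0, 5, (8,)))
loss.backward()
```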

With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks has become a popular paradigm. Researchers have achieved various outcomes in the construction of BMs and in BM applications across many fields. At present, there is a lack of research work that sorts out the overall progress of BMs and guides follow-up research. In this paper, we cover not only the BM technologies themselves but also the prerequisites for BM training and applications with BMs, dividing the BM review into four parts: Resource, Models, Key Technologies, and Application. We introduce 16 specific BM-related topics across those four parts: Data, Knowledge, Computing System, Parallel Training System, Language Model, Vision Model, Multi-modal Model, Theory & Interpretability, Commonsense Reasoning, Reliability & Security, Governance, Evaluation, Machine Translation, Text Generation, Dialogue, and Protein Research. For each topic, we summarize the current studies and propose some future research directions. At the end of this paper, we conclude with a more general view of the further development of BMs.

The adaptive processing of structured data is a long-standing research topic in machine learning that investigates how to automatically learn a mapping from a structured input to outputs of various nature. Recently, there has been an increasing interest in the adaptive processing of graphs, which led to the development of different neural network-based methodologies. In this thesis, we take a different route and develop a Bayesian Deep Learning framework for graph learning. The dissertation begins with a review of the principles over which most of the methods in the field are built, followed by a study on graph classification reproducibility issues. We then proceed to bridge the basic ideas of deep learning for graphs with the Bayesian world, by building our deep architectures in an incremental fashion. This framework allows us to consider graphs with discrete and continuous edge features, producing unsupervised embeddings rich enough to reach the state of the art on several classification tasks. Our approach is also amenable to a Bayesian nonparametric extension that automates the choice of almost all of the model's hyper-parameters. Two real-world applications demonstrate the efficacy of deep learning for graphs. The first concerns the prediction of information-theoretic quantities for molecular simulations with supervised neural models. After that, we exploit our Bayesian models to solve a malware-classification task while being robust to intra-procedural code obfuscation techniques. We conclude the dissertation with an attempt to blend the best of the neural and Bayesian worlds together. The resulting hybrid model is able to predict multimodal distributions conditioned on input graphs, with the consequent ability to model stochasticity and uncertainty better than most works. Overall, we aim to provide a Bayesian perspective into the articulated research field of deep learning for graphs.

Geometric deep learning (GDL), which is based on neural network architectures that incorporate and process symmetry information, has emerged as a recent paradigm in artificial intelligence. GDL bears particular promise in molecular modeling applications, in which various molecular representations with different symmetry properties and levels of abstraction exist. This review provides a structured and harmonized overview of molecular GDL, highlighting its applications in drug discovery, chemical synthesis prediction, and quantum chemistry. Emphasis is placed on the relevance of the learned molecular features and their complementarity to well-established molecular descriptors. This review provides an overview of current challenges and opportunities, and presents a forecast of the future of GDL for molecular sciences.

Ensembles over neural network weights trained from different random initialization, known as deep ensembles, achieve state-of-the-art accuracy and calibration. The recently introduced batch ensembles provide a drop-in replacement that is more parameter efficient. In this paper, we design ensembles not only over weights, but over hyperparameters to improve the state of the art in both settings. For best performance independent of budget, we propose hyper-deep ensembles, a simple procedure that involves a random search over different hyperparameters, themselves stratified across multiple random initializations. Its strong performance highlights the benefit of combining models with both weight and hyperparameter diversity. We further propose a parameter efficient version, hyper-batch ensembles, which builds on the layer structure of batch ensembles and self-tuning networks. The computational and memory costs of our method are notably lower than typical ensembles. On image classification tasks, with MLP, LeNet, and Wide ResNet 28-10 architectures, our methodology improves upon both deep and batch ensembles.
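
The core procedure described here (random hyperparameter search, each candidate trained from its own random initialization, predictions averaged at test time) can be sketched compactly. The hyperparameter choices and helper names below are illustrative assumptions; the paper's member-selection step is more careful than this skeleton.

```python
import torch
import torch.nn as nn

def make_model(dropout):
    return nn.Sequential(nn.Linear(20, 64), nn.ReLU(),
                         nn.Dropout(dropout), nn.Linear(64, 2))

def train(model, weight_decay, steps=200):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=weight_decay)
    for _ in range(steps):
        x, y = torch.randn(64, 20), torch.randint(0, 2, (64,))
        loss = nn.functional.cross_entropy(model(x), y)
        opt.zero_grad(); loss.backward(); opt.step()
    return model

ensemble = []
for _ in range(4):  # each member: fresh random init + randomly sampled hyperparameters
    dropout = float(torch.empty(1).uniform_(0.0, 0.5))
    weight_decay = float(10 ** torch.empty(1).uniform_(-5, -2))
    ensemble.append(train(make_model(dropout), weight_decay))

x_test = torch.randn(16, 20)
with torch.no_grad():
    for m in ensemble:
        m.eval()  # disable dropout at test time
    probs = torch.stack([m(x_test).softmax(-1) for m in ensemble]).mean(0)
```

Averaging the members' predictive distributions is what yields the combined weight and hyperparameter diversity the paper credits for its gains.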

Edge intelligence refers to a set of connected systems and devices for data collection, caching, processing, and analysis, based on artificial intelligence and located close to where the data is captured. The aim of edge intelligence is to enhance the quality and speed of data processing and to protect the privacy and security of the data. Although it emerged only recently, spanning the period from 2011 to the present, this field of research has shown explosive growth over the past five years. In this paper, we present a thorough and comprehensive survey of the literature surrounding edge intelligence. We first identify four fundamental components of edge intelligence, namely edge caching, edge training, edge inference, and edge offloading, based on theoretical and practical results pertaining to proposed and deployed systems. We then systematically classify the state of the solutions by examining research results and observations for each of the four components, and present a taxonomy that includes practical problems, adopted techniques, and application goals. For each category, we elaborate on, compare, and analyse the literature from the perspectives of adopted techniques, objectives, performance, advantages and drawbacks, etc. This survey provides a comprehensive introduction to edge intelligence and its application areas. In addition, we summarise the development of this emerging research field and the current state of the art, and discuss important open issues and possible theoretical and technical solutions.

Embedding entities and relations into a continuous multi-dimensional vector space has become the dominant method for knowledge graph embedding in representation learning. However, most existing models fail to represent hierarchical knowledge, such as the similarities and dissimilarities of entities in one domain. We propose to learn domain representations on top of existing knowledge graph embedding models, such that entities that have similar attributes are organized into the same domain. Such hierarchical knowledge of domains can provide further evidence for link prediction. Experimental results show that domain embeddings yield a significant improvement over the most recent state-of-the-art baseline knowledge graph embedding models.
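
As a rough illustration of the idea, here is a minimal TransE-style scorer augmented with a hypothetical domain term. The paper's exact formulation is not reproduced; the soft entity-to-domain assignments and the form of the domain term below are assumptions made purely to sketch "domain representations on top of a base embedding model".

```python
import torch
import torch.nn as nn

class DomainTransE(nn.Module):
    def __init__(self, n_entities, n_relations, n_domains, dim=100):
        super().__init__()
        self.ent = nn.Embedding(n_entities, dim)
        self.rel = nn.Embedding(n_relations, dim)
        self.dom = nn.Embedding(n_domains, dim)            # domain representations
        self.ent2dom = nn.Embedding(n_entities, n_domains) # soft domain assignments

    def score(self, h, r, t):
        # Base TransE score: heads plus relations should land near tails.
        base = -(self.ent(h) + self.rel(r) - self.ent(t)).norm(p=1, dim=-1)
        # Hypothetical domain term: entities assigned to similar domains
        # receive an extra compatibility bonus.
        dom_h = self.ent2dom(h).softmax(-1) @ self.dom.weight
        dom_t = self.ent2dom(t).softmax(-1) @ self.dom.weight
        return base - (dom_h - dom_t).norm(p=1, dim=-1)

model = DomainTransE(n_entities=1000, n_relations=50, n_domains=20)
s = model.score(torch.tensor([1]), torch.tensor([3]), torch.tensor([7]))
```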

Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph structure is available. In practice, however, real-world graphs are often noisy and incomplete, or might not be available at all. In this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.
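
The following is a minimal sketch of jointly learning an edge distribution and GCN weights. A relaxed Bernoulli distribution over each candidate edge keeps the sampled adjacency differentiable; note that the paper's bilevel formulation (structure in the outer problem, GCN weights in the inner) is collapsed into a single joint objective here for brevity, so this is an assumption-laden simplification rather than the authors' algorithm.

```python
import torch
import torch.nn as nn
from torch.distributions import RelaxedBernoulli

n, d, classes = 50, 16, 3
x, y = torch.randn(n, d), torch.randint(0, classes, (n,))

edge_logits = nn.Parameter(torch.zeros(n, n))     # one logit per candidate edge
gcn1, gcn2 = nn.Linear(d, 32), nn.Linear(32, classes)
opt = torch.optim.Adam([edge_logits, *gcn1.parameters(), *gcn2.parameters()], lr=1e-2)

for step in range(200):
    # Differentiable sample of a soft adjacency matrix from the edge distribution.
    adj = RelaxedBernoulli(torch.tensor(0.5), logits=edge_logits).rsample()
    adj = adj + torch.eye(n)                       # add self-loops
    deg_inv = adj.sum(-1).clamp(min=1e-6).pow(-1)  # row-normalise the adjacency
    a_norm = deg_inv.unsqueeze(-1) * adj
    h = torch.relu(a_norm @ gcn1(x))               # GCN layer 1
    logits = a_norm @ gcn2(h)                      # GCN layer 2
    loss = nn.functional.cross_entropy(logits, y)
    opt.zero_grad(); loss.backward(); opt.step()
```

The learned `edge_logits` play the role of the discrete edge distribution: gradients from the node-classification loss shape both the structure and the GCN parameters.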

There is a large and growing interest in generative adversarial networks (GANs), which offer powerful features for generative modeling, density estimation, and energy function learning. GANs are difficult to train and evaluate but are capable of creating amazingly realistic, though synthetic, image data. Ideas stemming from GANs, such as adversarial losses, are creating research opportunities for other challenges such as domain adaptation. In this paper, we look at the field of GANs with emphasis on these areas of emerging research. To provide background for adversarial techniques, we survey the field of GANs, looking at the original formulation, training variants, evaluation methods, and extensions. Then we survey recent work on transfer learning, focusing on comparing different adversarial domain adaptation methods. Finally, we look ahead to identify open research directions for GANs and domain adaptation, including some promising applications such as sensor-based human behavior modeling.
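
As background for the adversarial domain adaptation methods the survey compares, here is a minimal sketch in the style of DANN (Ganin et al.): a domain discriminator learns to tell source from target features, while a gradient-reversal layer pushes the feature extractor to make the two domains indistinguishable. The architecture sizes and data are placeholders.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -grad  # flip gradients flowing back into the feature extractor

features = nn.Sequential(nn.Linear(20, 64), nn.ReLU())
classifier = nn.Linear(64, 10)       # label predictor (source labels only)
discriminator = nn.Linear(64, 2)     # source-vs-target domain classifier
opt = torch.optim.Adam([*features.parameters(), *classifier.parameters(),
                        *discriminator.parameters()], lr=1e-3)

for step in range(100):
    xs, ys = torch.randn(32, 20), torch.randint(0, 10, (32,))  # labelled source batch
    xt = torch.randn(32, 20)                                   # unlabelled target batch
    zs, zt = features(xs), features(xt)
    cls_loss = nn.functional.cross_entropy(classifier(zs), ys)
    z = torch.cat([zs, zt])
    dom_labels = torch.cat([torch.zeros(32), torch.ones(32)]).long()
    # The reversed gradient trains the discriminator normally while pushing
    # the features toward domain invariance.
    dom_loss = nn.functional.cross_entropy(
        discriminator(GradReverse.apply(z)), dom_labels)
    opt.zero_grad()
    (cls_loss + dom_loss).backward()
    opt.step()
```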
