Optimising deep neural networks is a challenging task due to complex training dynamics, high computational requirements, and long training times. To address this difficulty, we propose the framework of Generalisable Agents for Neural Network Optimisation (GANNO) -- a multi-agent reinforcement learning (MARL) approach that learns to improve neural network optimisation by dynamically and responsively scheduling hyperparameters during training. GANNO utilises an agent per layer that observes localised network dynamics and accordingly takes actions to adjust these dynamics at a layerwise level to collectively improve global performance. In this paper, we use GANNO to control the layerwise learning rate and show that the framework can yield useful and responsive schedules that are competitive with handcrafted heuristics. Furthermore, GANNO is shown to perform robustly across a wide variety of unseen initial conditions, and can successfully generalise to harder problems than it was trained on. Our work presents an overview of the opportunities that this paradigm offers for training neural networks, along with key challenges that remain to be overcome.
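
To make the layerwise control concrete, the following is a minimal sketch of per-layer learning-rate adjustment in the spirit of GANNO, assuming a PyTorch training loop. The agent policy here is a random placeholder rather than the paper's learned MARL policy, and all names (`layer_observation`, `agent_action`) are illustrative, not from the authors' code.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
layers = [m for m in model if isinstance(m, nn.Linear)]

# One optimizer parameter group per layer, so each "agent" owns one learning rate.
optimizer = torch.optim.SGD(
    [{"params": l.parameters(), "lr": 1e-2} for l in layers]
)

def layer_observation(layer):
    # Localised dynamics an agent might observe, e.g. weight and gradient norms.
    w = layer.weight
    g = w.grad if w.grad is not None else torch.zeros_like(w)
    return torch.stack([w.norm(), g.norm()])

def agent_action(obs):
    # Placeholder policy: a multiplicative adjustment to the layer's learning rate.
    return float(torch.empty(1).uniform_(0.9, 1.1))

for step in range(100):
    x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    # Each per-layer agent adjusts its own parameter group's learning rate.
    for group, layer in zip(optimizer.param_groups, layers):
        group["lr"] *= agent_action(layer_observation(layer))
    optimizer.step()
```

The key design point this illustrates is that per-layer parameter groups give each agent an independent action channel while the shared loss provides the global signal.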

Related content

Neural Networks is the archival journal of the world's three oldest neural modeling societies: the International Neural Network Society (INNS), the European Neural Network Society (ENNS), and the Japanese Neural Network Society (JNNS). Neural Networks provides a forum for developing and nurturing an international community of scholars and practitioners interested in all aspects of neural networks and related approaches to computational intelligence. Neural Networks welcomes high-quality submissions that contribute to the full range of neural network research, from behavioral and brain modeling and learning algorithms, through mathematical and computational analysis, to systems engineering and technological applications that make substantial use of neural network concepts and techniques. This uniquely broad scope fosters the exchange of ideas between biological and technological research and helps cultivate an interdisciplinary community interested in biologically inspired computational intelligence. Accordingly, the Neural Networks editorial board represents expertise in psychology, neurobiology, computer science, engineering, mathematics, and physics. The journal publishes articles, letters, and reviews, as well as letters to the editor, editorials, current events, software surveys, and patent information. Articles appear in one of five sections: cognitive science, neuroscience, learning systems, mathematical and computational analysis, and engineering and applications. Official website:

Deep neural networks (DNNs) have succeeded in many different perception tasks, e.g., computer vision, natural language processing, and reinforcement learning. High-performing DNNs, however, rely on intensive resource consumption. For example, training a DNN requires high dynamic memory, a large-scale dataset, and a large number of computations (a long training time); even inference with a DNN demands a large amount of static storage, computations (a long inference time), and energy. Therefore, state-of-the-art DNNs are often deployed on cloud servers with large numbers of supercomputers, a high-bandwidth communication bus, a shared storage infrastructure, and a high-power supply. Recently, newly emerging intelligent applications, e.g., AR/VR, mobile assistants, and the Internet of Things, require us to deploy DNNs on resource-constrained edge devices. Compared to a cloud server, edge devices often have a rather small amount of resources. To deploy DNNs on edge devices, we need to reduce their size, i.e., we target a better trade-off between resource consumption and model accuracy. In this dissertation, we study four edge intelligence scenarios, i.e., Inference on Edge Devices, Adaptation on Edge Devices, Learning on Edge Devices, and Edge-Server Systems, and develop different methodologies to enable deep learning in each scenario. Since current DNNs are often over-parameterized, our goal is to find and reduce the redundancy of the DNNs in each scenario.
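
As one concrete illustration of removing redundancy for edge deployment, here is a minimal sketch of unstructured magnitude pruning with PyTorch's built-in pruning utilities. This is a standard technique chosen for illustration, not necessarily the dissertation's specific method.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
linears = [m for m in model if isinstance(m, nn.Linear)]

# Prune the 50% smallest-magnitude weights in each Linear layer.
for module in linears:
    prune.l1_unstructured(module, name="weight", amount=0.5)
    prune.remove(module, "weight")  # bake the pruning mask into the weights

zeros = sum((m.weight == 0).sum().item() for m in linears)
total = sum(m.weight.numel() for m in linears)
print(f"global weight sparsity: {zeros / total:.2%}")
```

In practice the pruning ratio is the knob that trades resource consumption against model accuracy, which is exactly the trade-off the abstract targets.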

Despite the recent progress in deep learning, most approaches still go for a silo-like solution, focusing on learning each task in isolation: training a separate neural network for each individual task. Many real-world problems, however, call for a multi-modal approach and, therefore, for multi-tasking models. Multi-task learning (MTL) aims to leverage useful information across tasks to improve the generalization capability of a model. This thesis is concerned with multi-task learning in the context of computer vision. First, we review existing approaches for MTL. Next, we propose several methods that tackle important aspects of multi-task learning. The proposed methods are evaluated on various benchmarks. The results show several advances in the state-of-the-art of multi-task learning. Finally, we discuss several possibilities for future work.
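
To contrast with the silo-like one-network-per-task setup, the following is a minimal sketch of hard parameter sharing, the simplest multi-task architecture: a shared trunk with one lightweight head per task. The thesis's proposed methods are more involved; this only illustrates the basic MTL structure, and the class and task names are hypothetical.

```python
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, num_classes_a, num_classes_b):
        super().__init__()
        self.trunk = nn.Sequential(          # shared across all tasks
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head_a = nn.Linear(32, num_classes_a)  # task-specific heads
        self.head_b = nn.Linear(32, num_classes_b)

    def forward(self, x):
        z = self.trunk(x)
        return self.head_a(z), self.head_b(z)

model = MultiTaskNet(num_classes_a=10, num_classes_b=5)
x = torch.randn(8, 3, 32, 32)
out_a, out_b = model(x)
# A (possibly weighted) sum of per-task losses trains both heads and the shared trunk.
loss = nn.functional.cross_entropy(out_a, torch.randint(0, 10, (8,))) \
     + nn.functional.cross_entropy(out_b, torch.randint(0, 5, (8,)))
loss.backward()
```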

With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks has become a popular paradigm. Researchers have achieved various outcomes in the construction of BMs and in BM applications across many fields. At present, there is a lack of research work that sorts out the overall progress of BMs and guides follow-up research. In this paper, we cover not only the BM technologies themselves but also the prerequisites for BM training and applications with BMs, dividing the BM review into four parts: Resource, Models, Key Technologies, and Application. We introduce 16 specific BM-related topics across those four parts: Data, Knowledge, Computing System, Parallel Training System, Language Model, Vision Model, Multi-modal Model, Theory & Interpretability, Commonsense Reasoning, Reliability & Security, Governance, Evaluation, Machine Translation, Text Generation, Dialogue, and Protein Research. For each topic, we summarize the current studies and propose some future research directions. At the end of this paper, we conclude with a more general view of the further development of BMs.

The adaptive processing of structured data is a long-standing research topic in machine learning that investigates how to automatically learn a mapping from a structured input to outputs of various nature. Recently, there has been an increasing interest in the adaptive processing of graphs, which led to the development of different neural network-based methodologies. In this thesis, we take a different route and develop a Bayesian Deep Learning framework for graph learning. The dissertation begins with a review of the principles over which most of the methods in the field are built, followed by a study on graph classification reproducibility issues. We then proceed to bridge the basic ideas of deep learning for graphs with the Bayesian world, by building our deep architectures in an incremental fashion. This framework allows us to consider graphs with discrete and continuous edge features, producing unsupervised embeddings rich enough to reach the state of the art on several classification tasks. Our approach is also amenable to a Bayesian nonparametric extension that automates the choice of almost all of the model's hyper-parameters. Two real-world applications demonstrate the efficacy of deep learning for graphs. The first concerns the prediction of information-theoretic quantities for molecular simulations with supervised neural models. After that, we exploit our Bayesian models to solve a malware-classification task while being robust to intra-procedural code obfuscation techniques. We conclude the dissertation with an attempt to blend the best of the neural and Bayesian worlds together. The resulting hybrid model is able to predict multimodal distributions conditioned on input graphs, with the consequent ability to model stochasticity and uncertainty better than most works. Overall, we aim to provide a Bayesian perspective into the articulated research field of deep learning for graphs.

Geometric deep learning (GDL), which is based on neural network architectures that incorporate and process symmetry information, has emerged as a recent paradigm in artificial intelligence. GDL bears particular promise in molecular modeling applications, in which various molecular representations with different symmetry properties and levels of abstraction exist. This review provides a structured and harmonized overview of molecular GDL, highlighting its applications in drug discovery, chemical synthesis prediction, and quantum chemistry. Emphasis is placed on the relevance of the learned molecular features and their complementarity to well-established molecular descriptors. This review provides an overview of current challenges and opportunities, and presents a forecast of the future of GDL for molecular sciences.

Ensembles over neural network weights trained from different random initialization, known as deep ensembles, achieve state-of-the-art accuracy and calibration. The recently introduced batch ensembles provide a drop-in replacement that is more parameter efficient. In this paper, we design ensembles not only over weights, but over hyperparameters to improve the state of the art in both settings. For best performance independent of budget, we propose hyper-deep ensembles, a simple procedure that involves a random search over different hyperparameters, themselves stratified across multiple random initializations. Its strong performance highlights the benefit of combining models with both weight and hyperparameter diversity. We further propose a parameter efficient version, hyper-batch ensembles, which builds on the layer structure of batch ensembles and self-tuning networks. The computational and memory costs of our method are notably lower than typical ensembles. On image classification tasks, with MLP, LeNet, and Wide ResNet 28-10 architectures, our methodology improves upon both deep and batch ensembles.
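
The core procedure described here (random hyperparameter search, each candidate trained from its own random initialization, predictions averaged at test time) can be sketched compactly. The hyperparameter choices and helper names below are illustrative assumptions; the paper's member-selection step is more careful than this skeleton.

```python
import torch
import torch.nn as nn

def make_model(dropout):
    return nn.Sequential(nn.Linear(20, 64), nn.ReLU(),
                         nn.Dropout(dropout), nn.Linear(64, 2))

def train(model, weight_decay, steps=200):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=weight_decay)
    for _ in range(steps):
        x, y = torch.randn(64, 20), torch.randint(0, 2, (64,))
        loss = nn.functional.cross_entropy(model(x), y)
        opt.zero_grad(); loss.backward(); opt.step()
    return model

ensemble = []
for _ in range(4):  # each member: fresh random init + randomly sampled hyperparameters
    dropout = float(torch.empty(1).uniform_(0.0, 0.5))
    weight_decay = float(10 ** torch.empty(1).uniform_(-5, -2))
    ensemble.append(train(make_model(dropout), weight_decay))

x_test = torch.randn(16, 20)
with torch.no_grad():
    for m in ensemble:
        m.eval()  # disable dropout at test time
    probs = torch.stack([m(x_test).softmax(-1) for m in ensemble]).mean(0)
```

Averaging the members' predictive distributions is what yields the combined weight and hyperparameter diversity the paper credits for its gains.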

Edge intelligence refers to a set of connected systems and devices for data collection, caching, processing, and analysis, based on artificial intelligence and located close to where the data is captured. The aim of edge intelligence is to enhance the quality and speed of data processing and to protect the privacy and security of the data. Although it emerged only recently, spanning the period from 2011 to the present, this field of research has shown explosive growth over the past five years. In this paper, we present a thorough and comprehensive survey of the literature surrounding edge intelligence. We first identify four fundamental components of edge intelligence, namely edge caching, edge training, edge inference, and edge offloading, based on theoretical and practical results pertaining to proposed and deployed systems. We then systematically classify the state of the solutions by examining research results and observations for each of the four components, and present a taxonomy that includes practical problems, adopted techniques, and application goals. For each category, we elaborate on, compare, and analyse the literature from the perspectives of adopted techniques, objectives, performance, advantages and drawbacks, etc. This survey provides a comprehensive introduction to edge intelligence and its application areas. In addition, we summarise the development of this emerging research field and the current state of the art, and discuss important open issues and possible theoretical and technical solutions.

Embedding entities and relations into a continuous multi-dimensional vector space has become the dominant method for knowledge graph embedding in representation learning. However, most existing models fail to represent hierarchical knowledge, such as the similarities and dissimilarities of entities in one domain. We propose to learn domain representations on top of existing knowledge graph embedding models, such that entities that have similar attributes are organized into the same domain. Such hierarchical knowledge of domains can provide further evidence for link prediction. Experimental results show that domain embeddings yield a significant improvement over the most recent state-of-the-art baseline knowledge graph embedding models.
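
As a rough illustration of the idea, here is a minimal TransE-style scorer augmented with a hypothetical domain term. The paper's exact formulation is not reproduced; the soft entity-to-domain assignments and the form of the domain term below are assumptions made purely to sketch "domain representations on top of a base embedding model".

```python
import torch
import torch.nn as nn

class DomainTransE(nn.Module):
    def __init__(self, n_entities, n_relations, n_domains, dim=100):
        super().__init__()
        self.ent = nn.Embedding(n_entities, dim)
        self.rel = nn.Embedding(n_relations, dim)
        self.dom = nn.Embedding(n_domains, dim)            # domain representations
        self.ent2dom = nn.Embedding(n_entities, n_domains) # soft domain assignments

    def score(self, h, r, t):
        # Base TransE score: heads plus relations should land near tails.
        base = -(self.ent(h) + self.rel(r) - self.ent(t)).norm(p=1, dim=-1)
        # Hypothetical domain term: entities assigned to similar domains
        # receive an extra compatibility bonus.
        dom_h = self.ent2dom(h).softmax(-1) @ self.dom.weight
        dom_t = self.ent2dom(t).softmax(-1) @ self.dom.weight
        return base - (dom_h - dom_t).norm(p=1, dim=-1)

model = DomainTransE(n_entities=1000, n_relations=50, n_domains=20)
s = model.score(torch.tensor([1]), torch.tensor([3]), torch.tensor([7]))
```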

Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph structure is available. In practice, however, real-world graphs are often noisy and incomplete, or might not be available at all. In this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.
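
The following is a minimal sketch of jointly learning an edge distribution and GCN weights. A relaxed Bernoulli distribution over each candidate edge keeps the sampled adjacency differentiable; note that the paper's bilevel formulation (structure in the outer problem, GCN weights in the inner) is collapsed into a single joint objective here for brevity, so this is an assumption-laden simplification rather than the authors' algorithm.

```python
import torch
import torch.nn as nn
from torch.distributions import RelaxedBernoulli

n, d, classes = 50, 16, 3
x, y = torch.randn(n, d), torch.randint(0, classes, (n,))

edge_logits = nn.Parameter(torch.zeros(n, n))     # one logit per candidate edge
gcn1, gcn2 = nn.Linear(d, 32), nn.Linear(32, classes)
opt = torch.optim.Adam([edge_logits, *gcn1.parameters(), *gcn2.parameters()], lr=1e-2)

for step in range(200):
    # Differentiable sample of a soft adjacency matrix from the edge distribution.
    adj = RelaxedBernoulli(torch.tensor(0.5), logits=edge_logits).rsample()
    adj = adj + torch.eye(n)                       # add self-loops
    deg_inv = adj.sum(-1).clamp(min=1e-6).pow(-1)  # row-normalise the adjacency
    a_norm = deg_inv.unsqueeze(-1) * adj
    h = torch.relu(a_norm @ gcn1(x))               # GCN layer 1
    logits = a_norm @ gcn2(h)                      # GCN layer 2
    loss = nn.functional.cross_entropy(logits, y)
    opt.zero_grad(); loss.backward(); opt.step()
```

The learned `edge_logits` play the role of the discrete edge distribution: gradients from the node-classification loss shape both the structure and the GCN parameters.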

There is a large and growing interest in generative adversarial networks (GANs), which offer powerful features for generative modeling, density estimation, and energy function learning. GANs are difficult to train and evaluate but are capable of creating amazingly realistic, though synthetic, image data. Ideas stemming from GANs, such as adversarial losses, are creating research opportunities for other challenges such as domain adaptation. In this paper, we look at the field of GANs with emphasis on these areas of emerging research. To provide background for adversarial techniques, we survey the field of GANs, looking at the original formulation, training variants, evaluation methods, and extensions. Then we survey recent work on transfer learning, focusing on comparing different adversarial domain adaptation methods. Finally, we look ahead to identify open research directions for GANs and domain adaptation, including some promising applications such as sensor-based human behavior modeling.
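
As background for the adversarial domain adaptation methods the survey compares, here is a minimal sketch in the style of DANN (Ganin et al.): a domain discriminator learns to tell source from target features, while a gradient-reversal layer pushes the feature extractor to make the two domains indistinguishable. The architecture sizes and data are placeholders.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -grad  # flip gradients flowing back into the feature extractor

features = nn.Sequential(nn.Linear(20, 64), nn.ReLU())
classifier = nn.Linear(64, 10)       # label predictor (source labels only)
discriminator = nn.Linear(64, 2)     # source-vs-target domain classifier
opt = torch.optim.Adam([*features.parameters(), *classifier.parameters(),
                        *discriminator.parameters()], lr=1e-3)

for step in range(100):
    xs, ys = torch.randn(32, 20), torch.randint(0, 10, (32,))  # labelled source batch
    xt = torch.randn(32, 20)                                   # unlabelled target batch
    zs, zt = features(xs), features(xt)
    cls_loss = nn.functional.cross_entropy(classifier(zs), ys)
    z = torch.cat([zs, zt])
    dom_labels = torch.cat([torch.zeros(32), torch.ones(32)]).long()
    # The reversed gradient trains the discriminator normally while pushing
    # the features toward domain invariance.
    dom_loss = nn.functional.cross_entropy(
        discriminator(GradReverse.apply(z)), dom_labels)
    opt.zero_grad()
    (cls_loss + dom_loss).backward()
    opt.step()
```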
