欧美狂野视频一区国产精品-国产一级一区二区三区四区

Over the past few years, there has been growing interest in developing larger and deeper neural networks, including deep generative models like generative adversarial networks (GANs). However, GANs typically come with high computational complexity, leading researchers to explore methods for reducing the training and inference costs. One such approach gaining popularity in supervised learning is dynamic sparse training (DST), which maintains good performance while enjoying excellent training efficiency. Despite its potential benefits, applying DST to GANs presents challenges due to the adversarial nature of the training process. In this paper, we propose a novel metric called the balance ratio (BR) to study the balance between the sparse generator and discriminator. We also introduce a new method called balanced dynamic sparse training (ADAPT), which seeks to control the BR during GAN training to achieve a good trade-off between performance and computational cost. Our proposed method shows promising results on multiple datasets, demonstrating its effectiveness.

相關內容

GANs

關注 5

MoDELS · 語言模型化 · 任務對話系統 · 話題 · Vision ·

2022 年 3 月 26 日

A Roadmap for Big Model

Sha Yuan,Hanyu Zhao,Shuai Zhao,Jiahong Leng,Yangxiao Liang,Xiaozhi Wang,Jifan Yu,Xin Lv,Zhou Shao,Jiaao He,Yankai Lin,Xu Han,Zhenghao Liu,Ning Ding,Yongming Rao,Yizhao Gao,Liang Zhang,Ming Ding,Cong Fang,Yisen Wang,Mingsheng Long,Jing Zhang,Yinpeng Dong,Tianyu Pang,Peng Cui,Lingxiao Huang,Zheng Liang,Huawei Shen,Hui Zhang,Quanshi Zhang,Qingxiu Dong,Zhixing Tan,Mingxuan Wang,Shuo Wang,Long Zhou,Haoran Li,Junwei Bao,Yingwei Pan,Weinan Zhang,Zhou Yu,Rui Yan,Chence Shi,Minghao Xu,Zuobai Zhang,Guoqiang Wang,Xiang Pan,Mengjie Li,Xiaoyu Chu,Zijun Yao,Fangwei Zhu,Shulin Cao,Weicheng Xue,Zixuan Ma,Zhengyan Zhang,Shengding Hu,Yujia Qin,Chaojun Xiao,Zheni Zeng,Ganqu Cui,Weize Chen,Weilin Zhao,Yuan Yao,Peng Li,Wenzhao Zheng,Wenliang Zhao,Ziyi Wang,Borui Zhang,Nanyi Fei,Anwen Hu,Zenan Ling,Haoyang Li,Boxi Cao,Xianpei Han,Weidong Zhan,Baobao Chang,Hao Sun,Jiawen Deng,Juanzi Li,Lei Hou,Xigang Cao,Jidong Zhai,Zhiyuan Liu,Maosong Sun,Jiwen Lu,Zhiwu Lu,Qin Jin,Ruihua Song,Ji-Rong Wen,Zhouchen Lin,Liwei Wang,Hang Su,Jun Zhu,Zhifang Sui,Jiajun Zhang,Yang Liu,Xiaodong He,Minlie Huang,Jian Tang,Jie Tang

With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks becomes a popular paradigm. Researchers have achieved various outcomes in the construction of BMs and the BM application in many fields. At present, there is a lack of research work that sorts out the overall progress of BMs and guides the follow-up research. In this paper, we cover not only the BM technologies themselves but also the prerequisites for BM training and applications with BMs, dividing the BM review into four parts: Resource, Models, Key Technologies and Application. We introduce 16 specific BM-related topics in those four parts, they are Data, Knowledge, Computing System, Parallel Training System, Language Model, Vision Model, Multi-modal Model, Theory&Interpretability, Commonsense Reasoning, Reliability&Security, Governance, Evaluation, Machine Translation, Text Generation, Dialogue and Protein Research. In each topic, we summarize clearly the current studies and propose some future research directions. At the end of this paper, we conclude the further development of BMs in a more general view.

學成 · 分子建模 · 深度學習 · Neural Networks · Processing（編程語言） ·

2021 年 7 月 26 日

Geometric Deep Learning on Molecular Representations

Kenneth Atz,Francesca Grisoni,Gisbert Schneider

Geometric deep learning (GDL), which is based on neural network architectures that incorporate and process symmetry information, has emerged as a recent paradigm in artificial intelligence. GDL bears particular promise in molecular modeling applications, in which various molecular representations with different symmetry properties and levels of abstraction exist. This review provides a structured and harmonized overview of molecular GDL, highlighting its applications in drug discovery, chemical synthesis prediction, and quantum chemistry. Emphasis is placed on the relevance of the learned molecular features and their complementarity to well-established molecular descriptors. This review provides an overview of current challenges and opportunities, and presents a forecast of the future of GDL for molecular sciences.

多峰值 · 模態 · INFORMS · MoDELS · 可約的 ·

2021 年 6 月 30 日

Attention Bottlenecks for Multimodal Fusion

Arsha Nagrani,Shan Yang,Anurag Arnab,Aren Jansen,Cordelia Schmid,Chen Sun

Humans perceive the world by concurrently processing and fusing high-dimensional inputs from multiple modalities such as vision and audio. Machine perception models, in stark contrast, are typically modality-specific and optimised for unimodal benchmarks, and hence late-stage fusion of final representations or predictions from each modality (`late-fusion') is still a dominant paradigm for multimodal video classification. Instead, we introduce a novel transformer based architecture that uses `fusion bottlenecks' for modality fusion at multiple layers. Compared to traditional pairwise self-attention, our model forces information between different modalities to pass through a small number of bottleneck latents, requiring the model to collate and condense the most relevant information in each modality and only share what is necessary. We find that such a strategy improves fusion performance, at the same time reducing computational cost. We conduct thorough ablation studies, and achieve state-of-the-art results on multiple audio-visual classification benchmarks including Audioset, Epic-Kitchens and VGGSound. All code and models will be released.

Weight · 推斷 · 近似 · 貝葉斯推斷 · Performer ·

2021 年 2 月 18 日

Bayesian Deep Learning via Subnetwork Inference

Erik Daxberger,Eric Nalisnick,James Urquhart Allingham,Javier Antorán,José Miguel Hernández-Lobato

from arxiv, 21 pages, extended version with supplementary material

The Bayesian paradigm has the potential to solve core issues of deep neural networks such as poor calibration and data inefficiency. Alas, scaling Bayesian inference to large weight spaces often requires restrictive approximations. In this work, we show that it suffices to perform inference over a small subset of model weights in order to obtain accurate predictive posteriors. The other weights are kept as point estimates. This subnetwork inference framework enables us to use expressive, otherwise intractable, posterior approximations over such subsets. In particular, we implement subnetwork linearized Laplace: We first obtain a MAP estimate of all weights and then infer a full-covariance Gaussian posterior over a subnetwork. We propose a subnetwork selection strategy that aims to maximally preserve the model's predictive uncertainty. Empirically, our approach is effective compared to ensembles and less expressive posterior approximations over full networks.

變換 · Vision · Networking · Transformer · Processing（編程語言） ·

2020 年 12 月 23 日

A Survey on Visual Transformer

Kai Han,Yunhe Wang,Hanting Chen,Xinghao Chen,Jianyuan Guo,Zhenhua Liu,Yehui Tang,An Xiao,Chunjing Xu,Yixing Xu,Zhaohui Yang,Yiman Zhang,Dacheng Tao

Transformer is a type of deep neural network mainly based on self-attention mechanism which is originally applied in natural language processing field. Inspired by the strong representation ability of transformer, researchers propose to extend transformer for computer vision tasks. Transformer-based models show competitive and even better performance on various visual benchmarks compared to other network types such as convolutional networks and recurrent networks. In this paper we provide a literature review of these visual transformer models by categorizing them in different tasks and analyze the advantages and disadvantages of these methods. In particular, the main categories include the basic image classification, high-level vision, low-level vision and video processing. Self-attention in computer vision is also briefly revisited as self-attention is the base component in transformer. Efficient transformer methods are included for pushing transformer into real applications. Finally, we give a discussion about the further research directions for visual transformer.

控制器 · INTERACT · state-of-the-art · 模型評估 · Next ·

2020 年 8 月 3 日

Controllable Multi-Interest Framework for Recommendation

Yukuo Cen,Jianwei Zhang,Xu Zou,Chang Zhou,Hongxia Yang,Jie Tang

from arxiv, Accepted to KDD 2020

Recently, neural networks have been widely used in e-commerce recommender systems, owing to the rapid development of deep learning. We formalize the recommender system as a sequential recommendation problem, intending to predict the next items that the user might be interacted with. Recent works usually give an overall embedding from a user's behavior sequence. However, a unified user embedding cannot reflect the user's multiple interests during a period. In this paper, we propose a novel controllable multi-interest framework for the sequential recommendation, called ComiRec. Our multi-interest module captures multiple interests from user behavior sequences, which can be exploited for retrieving candidate items from the large-scale item pool. These items are then fed into an aggregation module to obtain the overall recommendation. The aggregation module leverages a controllable factor to balance the recommendation accuracy and diversity. We conduct experiments for the sequential recommendation on two real-world datasets, Amazon and Taobao. Experimental results demonstrate that our framework achieves significant improvements over state-of-the-art models. Our framework has also been successfully deployed on the offline Alibaba distributed cloud platform.

Machine Learning · 學成 · Judea Pearl · 可理解性 · AI ·

2019 年 11 月 24 日

Causality for Machine Learning

Bernhard Sch?lkopf

Graphical causal inference as pioneered by Judea Pearl arose from research on artificial intelligence (AI), and for a long time had little connection to the field of machine learning. This article discusses where links have been and should be established, introducing key concepts along the way. It argues that the hard open problems of machine learning and AI are intrinsically related to causality, and explains how the field is beginning to understand them.

entity · 圖 · 知識圖譜 · MoDELS · 相似度 ·

2019 年 9 月 11 日

Domain Representation for Knowledge Graph Embedding

Cunxiang Wang,Feiliang Ren,Zhichao Lin,Chenxv Zhao,Tian Xie,Yue Zhang

from arxiv, Acceptted by NLPCC2019

Embedding entities and relations into a continuous multi-dimensional vector space have become the dominant method for knowledge graph embedding in representation learning. However, most existing models ignore to represent hierarchical knowledge, such as the similarities and dissimilarities of entities in one domain. We proposed to learn a Domain Representations over existing knowledge graph embedding models, such that entities that have similar attributes are organized into the same domain. Such hierarchical knowledge of domains can give further evidence in link prediction. Experimental results show that domain embeddings give a significant improvement over the most recent state-of-art baseline knowledge graph embedding models.

目標檢測 · 模型評估 · 學成 · 注意力機制 · Networking ·

2019 年 4 月 15 日

Reverse Attention for Salient Object Detection

Shuhan Chen,Xiuli Tan,Ben Wang,Xuelong Hu

from arxiv, ECCV 2018

Benefit from the quick development of deep learning techniques, salient object detection has achieved remarkable progresses recently. However, there still exists following two major challenges that hinder its application in embedded devices, low resolution output and heavy model weight. To this end, this paper presents an accurate yet compact deep network for efficient salient object detection. More specifically, given a coarse saliency prediction in the deepest layer, we first employ residual learning to learn side-output residual features for saliency refinement, which can be achieved with very limited convolutional parameters while keep accuracy. Secondly, we further propose reverse attention to guide such side-output residual learning in a top-down manner. By erasing the current predicted salient regions from side-output features, the network can eventually explore the missing object parts and details which results in high resolution and accuracy. Experiments on six benchmark datasets demonstrate that the proposed approach compares favorably against state-of-the-art methods, and with advantages in terms of simplicity, efficiency (45 FPS) and model size (81 MB).

離散化 · 圖 · 圖形處理器 · Neural Networks · Networking ·

2019 年 3 月 28 日

Learning Discrete Structures for Graph Neural Networks

Luca Franceschi,Mathias Niepert,Massimiliano Pontil,Xiao He

from arxiv, 18 pages

Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph-structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.