国产综合欧美日韩激情在线,一级无码免费视频,国产视频999免费在线观看

Recent network pruning methods focus on pruning models early-on in training. To estimate the impact of removing a parameter, these methods use importance measures that were originally designed to prune trained models. Despite lacking justification for their use early-on in training, such measures result in surprisingly low accuracy loss. To better explain this behavior, we develop a general framework that uses gradient flow to unify state-of-the-art importance measures through the norm of model parameters. We use this framework to determine the relationship between pruning measures and evolution of model parameters, establishing several results related to pruning models early-on in training: (i) magnitude-based pruning removes parameters that contribute least to reduction in loss, resulting in models that converge faster than magnitude-agnostic methods; (ii) loss-preservation based pruning preserves first-order model evolution dynamics and is therefore appropriate for pruning minimally trained models; and (iii) gradient-norm based pruning affects second-order model evolution dynamics, such that increasing gradient norm via pruning can produce poorly performing models. We validate our claims on several VGG-13, MobileNet-V1, and ResNet-56 models trained on CIFAR-10/CIFAR-100. Code available at //github.com/EkdeepSLubana/flowandprune.

估計/估計量 · Processing（編程語言） · 級聯 · 統計量 · 過采樣 ·

2021 年 9 月 22 日

Temporal Scale Estimation for Oversampled Network Cascades: Theory, Algorithms, and Experiment

Abram Magner,Carolyn Kaminski,Petko Bogdanov

from arxiv, 31 pages

Spreading processes on graphs arise in a host of application domains, from the study of online social networks to viral marketing to epidemiology. Various discrete-time probabilistic models for spreading processes have been proposed. These are used for downstream statistical estimation and prediction problems, often involving messages or other information that is transmitted along with infections caused by the process. It is thus important to design models of cascade observation that take into account phenomena that lead to uncertainty about the process state at any given time. We highlight one such phenomenon -- temporal distortion -- caused by a misalignment between the rate at which observations of a cascade process are made and the rate at which the process itself operates, and argue that failure to correct for it results in degradation of performance on downstream statistical tasks. To address these issues, we formulate the clock estimation problem in terms of a natural distortion measure. We give a clock estimation algorithm, which we call FastClock, that runs in linear time in the size of its input and is provably statistically accurate for a broad range of model parameters when cascades are generated from the independent cascade process with known parameters and when the underlying graph is Erd\H{o}s-R\'enyi. We further give empirical results on the performance of our algorithm in comparison to the state of the art estimator, a likelihood proxy maximization-based estimator implemented via dynamic programming. We find that, in a broad parameter regime, our algorithm substantially outperforms the dynamic programming algorithm in terms of both running time and accuracy.

Networking · 圖卷積神經網絡/圖卷積網絡 · 圖 · 圖卷積 · MoDELS ·

2021 年 2 月 28 日

Network of Tensor Time Series

Baoyu Jing,Hanghang Tong,Yada Zhu

from arxiv, Accepted by WWW'2021

Co-evolving time series appears in a multitude of applications such as environmental monitoring, financial analysis, and smart transportation. This paper aims to address the following challenges, including (C1) how to incorporate explicit relationship networks of the time series; (C2) how to model the implicit relationship of the temporal dynamics. We propose a novel model called Network of Tensor Time Series, which is comprised of two modules, including Tensor Graph Convolutional Network (TGCN) and Tensor Recurrent Neural Network (TRNN). TGCN tackles the first challenge by generalizing Graph Convolutional Network (GCN) for flat graphs to tensor graphs, which captures the synergy between multiple graphs associated with the tensors. TRNN leverages tensor decomposition to model the implicit relationships among co-evolving time series. The experimental results on five real-world datasets demonstrate the efficacy of the proposed method.

泛化理論 · 圖形處理器 · Boosting（一種模型訓練加速方式） · Neural Networks · 優化器 ·

2020 年 6 月 15 日

Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks

Kenta Oono,Taiji Suzuki

from arxiv, 9 pages, Reference 5 pages, Supplemental material 18 pages

It is known that the current graph neural networks (GNNs) are difficult to make themselves deep due to the problem known as \textit{over-smoothing}. Multi-scale GNNs are a promising approach for mitigating the over-smoothing problem. However, there is little explanation of why it works empirically from the viewpoint of learning theory. In this study, we derive the optimization and generalization guarantees of transductive learning algorithms that include multi-scale GNNs. Using the boosting theory, we prove the convergence of the training error under weak learning-type conditions. By combining it with generalization gap bounds in terms of transductive Rademacher complexity, we show that a test error bound of a specific type of multi-scale GNNs that decreases corresponding to the depth under the conditions. Our results offer theoretical explanations for the effectiveness of the multi-scale structure against the over-smoothing problem. We apply boosting algorithms to the training of multi-scale GNNs for real-world node prediction tasks. We confirm that its performance is comparable to existing GNNs, and the practical behaviors are consistent with theoretical observations. Code is available at //github.com/delta2323/GB-GNN

圖卷積神經網絡/圖卷積網絡 · 圖 · 有向 · 圖卷積 · Performer ·

2020 年 4 月 29 日

Directed Graph Convolutional Network

Zekun Tong,Yuxuan Liang,Changsheng Sun,David S. Rosenblum,Andrew Lim

Graph Convolutional Networks (GCNs) have been widely used due to their outstanding performance in processing graph-structured data. However, the undirected graphs limit their application scope. In this paper, we extend spectral-based graph convolution to directed graphs by using first- and second-order proximity, which can not only retain the connection properties of the directed graph, but also expand the receptive field of the convolution operation. A new GCN model, called DGCN, is then designed to learn representations on the directed graph, leveraging both the first- and second-order proximity information. We empirically show the fact that GCNs working only with DGCNs can encode more useful information from graph and help achieve better performance when generalized to other models. Moreover, extensive experiments on citation networks and co-purchase datasets demonstrate the superiority of our model against the state-of-the-art methods.

網絡表示學習 · Networking · 表示學習 · 學成 · 無監督 ·

2020 年 3 月 11 日

A Comparative Study for Unsupervised Network Representation Learning

Megha Khosla,Vinay Setty,Avishek Anand

from arxiv, Accepted for publication in IEEE TKDE

There has been appreciable progress in unsupervised network representation learning (UNRL) approaches over graphs recently with flexible random-walk approaches, new optimization objectives and deep architectures. However, there is no common ground for systematic comparison of embeddings to understand their behavior for different graphs and tasks. In this paper we theoretically group different approaches under a unifying framework and empirically investigate the effectiveness of different network representation methods. In particular, we argue that most of the UNRL approaches either explicitly or implicit model and exploit context information of a node. Consequently, we propose a framework that casts a variety of approaches -- random walk based, matrix factorization and deep learning based -- into a unified context-based optimization function. We systematically group the methods based on their similarities and differences. We study the differences among these methods in detail which we later use to explain their performance differences (on downstream tasks). We conduct a large-scale empirical study considering 9 popular and recent UNRL techniques and 11 real-world datasets with varying structural properties and two common tasks -- node classification and link prediction. We find that there is no single method that is a clear winner and that the choice of a suitable method is dictated by certain properties of the embedding methods, task and structural properties of the underlying graph. In addition we also report the common pitfalls in evaluation of UNRL methods and come up with suggestions for experimental design and interpretation of results.

圖注意力網絡 · 圖 · 注意力機制 · GNN · Performer ·

2019 年 12 月 6 日

Hyperbolic Graph Attention Network

Yiding Zhang,Xiao Wang,Xunqiang Jiang,Chuan Shi,Yanfang Ye

from arxiv, 8 pages, 4 figures

Graph neural network (GNN) has shown superior performance in dealing with graphs, which has attracted considerable research attention recently. However, most of the existing GNN models are primarily designed for graphs in Euclidean spaces. Recent research has proven that the graph data exhibits non-Euclidean latent anatomy. Unfortunately, there was rarely study of GNN in non-Euclidean settings so far. To bridge this gap, in this paper, we study the GNN with attention mechanism in hyperbolic spaces at the first attempt. The research of hyperbolic GNN has some unique challenges: since the hyperbolic spaces are not vector spaces, the vector operations (e.g., vector addition, subtraction, and scalar multiplication) cannot be carried. To tackle this problem, we employ the gyrovector spaces, which provide an elegant algebraic formalism for hyperbolic geometry, to transform the features in a graph; and then we propose the hyperbolic proximity based attention mechanism to aggregate the features. Moreover, as mathematical operations in hyperbolic spaces could be more complicated than those in Euclidean spaces, we further devise a novel acceleration strategy using logarithmic and exponential mappings to improve the efficiency of our proposed model. The comprehensive experimental results on four real-world datasets demonstrate the performance of our proposed hyperbolic graph attention network model, by comparisons with other state-of-the-art baseline methods.

Neural Networks · Networking · 文本分類 · 支持向量 · 向量化 ·

2019 年 10 月 28 日

A Comparison of Neural Network Training Methods for Text Classification

Anderson de Andrade

We study the impact of neural networks in text classification. Our focus is on training deep neural networks with proper weight initialization and greedy layer-wise pretraining. Results are compared with 1-layer neural networks and Support Vector Machines. We work with a dataset of labeled messages from the Twitter microblogging service and aim to predict weather conditions. A feature extraction procedure specific for the task is proposed, which applies dimensionality reduction using Latent Semantic Analysis. Our results show that neural networks outperform Support Vector Machines with Gaussian kernels, noticing performance gains from introducing additional hidden layers with nonlinearities. The impact of using Nesterov's Accelerated Gradient in backpropagation is also studied. We conclude that deep neural networks are a reasonable approach for text classification and propose further ideas to improve performance.

網絡表示學習 · Networking · 表示學習 · 學成 · 無監督 ·

2019 年 3 月 19 日

A Comprehensive Comparison of Unsupervised Network Representation Learning Methods

Megha Khosla,Avishek Anand,Vinay Setty

There has been appreciable progress in unsupervised network representation learning (UNRL) approaches over graphs recently with flexible random-walk approaches, new optimization objectives and deep architectures. However, there is no common ground for systematic comparison of embeddings to understand their behavior for different graphs and tasks. In this paper we theoretically group different approaches under a unifying framework and empirically investigate the effectiveness of different network representation methods. In particular, we argue that most of the UNRL approaches either explicitly or implicit model and exploit context information of a node. Consequently, we propose a framework that casts a variety of approaches -- random walk based, matrix factorization and deep learning based -- into a unified context-based optimization function. We systematically group the methods based on their similarities and differences. We study the differences among these methods in detail which we later use to explain their performance differences (on downstream tasks). We conduct a large-scale empirical study considering 9 popular and recent UNRL techniques and 11 real-world datasets with varying structural properties and two common tasks -- node classification and link prediction. We find that there is no single method that is a clear winner and that the choice of a suitable method is dictated by certain properties of the embedding methods, task and structural properties of the underlying graph. In addition we also report the common pitfalls in evaluation of UNRL methods and come up with suggestions for experimental design and interpretation of results.

估計/估計量 · 穩健性 · 容差 · 樣本復雜度 · TCS ·

2017 年 12 月 14 日

Being Robust (in High Dimensions) Can Be Practical

Ilias Diakonikolas,Gautam Kamath,Daniel M. Kane,Jerry Li,Ankur Moitra,Alistair Stewart

from arxiv, Appeared in ICML 2017

Robust estimation is much more challenging in high dimensions than it is in one dimension: Most techniques either lead to intractable optimization problems or estimators that can tolerate only a tiny fraction of errors. Recent work in theoretical computer science has shown that, in appropriate distributional models, it is possible to robustly estimate the mean and covariance with polynomial time algorithms that can tolerate a constant fraction of corruptions, independent of the dimension. However, the sample and time complexity of these algorithms is prohibitively large for high-dimensional applications. In this work, we address both of these issues by establishing sample complexity bounds that are optimal, up to logarithmic factors, as well as giving various refinements that allow the algorithms to tolerate a much larger fraction of corruptions. Finally, we show on both synthetic and real data that our algorithms have state-of-the-art performance and suddenly make high-dimensional robust estimation a realistic possibility.

亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

相關內容