Accurate channel modeling is the foundation of communication system design. However, the traditional measurement-based modeling approach faces growing challenges in scenarios with insufficient measurement data. To obtain enough data for channel modeling, this paper uses Artificial Neural Networks (ANNs) to predict channel data. The high-mobility railway channel is considered, a typical scenario in which it is challenging to obtain enough data for modeling within a short sampling interval. Three types of ANN, the Back Propagation Network, the Radial Basis Function Neural Network, and the Extreme Learning Machine, are used to predict channel path loss and shadow fading. The Root-Mean-Square Error (RMSE) is used to evaluate prediction accuracy. The factors that may influence prediction accuracy, including the type of network, the number of neurons, and the proportion of training data, are compared and discussed. It is found that a larger number of neurons can significantly reduce prediction error, whereas the influence of the proportion of training data is relatively small. The results can be used to improve the modeling accuracy of path loss and shadow fading when measurement data is limited.
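To make the setup concrete, here is a minimal sketch of the kind of experiment the abstract describes: an MLP regressor (standing in for the Back Propagation network) trained to predict path loss plus shadow fading from distance, evaluated by RMSE. The log-distance model, noise level, and training proportion below are illustrative placeholders, not the paper's measured railway data.

```python
# Minimal sketch: predict path loss + shadow fading with an MLP (a stand-in
# for the paper's Back Propagation network), evaluated by RMSE.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
d = rng.uniform(10, 2000, size=5000)          # Tx-Rx distance in meters
path_loss = 32.4 + 35 * np.log10(d)           # illustrative log-distance model (dB)
shadowing = rng.normal(0, 6, size=d.shape)    # log-normal shadow fading (dB)
y = path_loss + shadowing
X = np.log10(d).reshape(-1, 1)

# Vary train_size and hidden_layer_sizes to study their effect on accuracy,
# as the paper does for training proportion and number of neurons.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.7, random_state=0)
model = MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000, random_state=0)
model.fit(X_tr, y_tr)
rmse = np.sqrt(mean_squared_error(y_te, model.predict(X_te)))
print(f"test RMSE: {rmse:.2f} dB")
```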
Predicting information diffusion on social networks has great practical significance for marketing and public opinion control. The task is to identify the individuals who will potentially repost a message on the social network. One class of methods draws on demographics, complex-network theory, and other prior knowledge to build an interpretable model that simulates and predicts the propagation process, while the other class is entirely data-driven and maps nodes into a latent space for propagation prediction. Existing latent-space designs and embedding methods, however, fail to account for interactions among users. In this paper, we propose an independent asymmetric embedding method that embeds each individual into one latent influence space and multiple latent susceptibility spaces. Motivated by the similarity between information diffusion and the heat diffusion phenomenon, our model exploits the heat diffusion kernel to establish the embedding rules. Furthermore, our method captures the co-occurrence patterns of user combinations in cascades to improve computational effectiveness. The results of extensive experiments conducted on real-world datasets verify both the predictive accuracy and the cost-effectiveness of our approach.
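For reference, the heat diffusion kernel the abstract invokes is K_t = exp(-tL), where L is a graph Laplacian; a minimal sketch on a toy graph follows. The graph and diffusion time are illustrative, and the paper's embedding rules built on this kernel are not reproduced here.

```python
# Minimal sketch of the heat diffusion kernel K_t = exp(-t L) on a toy graph,
# the quantity the paper borrows to define its embedding rules.
import numpy as np
from scipy.linalg import expm

A = np.array([[0, 1, 1, 0],       # adjacency of a 4-node toy network
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A    # combinatorial graph Laplacian
t = 0.5                           # diffusion time
K = expm(-t * L)                  # K[i, j] = heat at j from a unit source at i

print(K.round(3))                 # each row sums to 1: heat is conserved
```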
In modern wireless communication systems, radio propagation modeling to estimate pathloss has always been a fundamental task in system design and optimization. The state-of-the-art empirical propagation models are based on measurements in specific environments and are limited in their ability to capture idiosyncrasies of various propagation environments. To cope with this problem, ray-tracing based solutions are used in commercial planning tools, but they tend to be extremely time-consuming and expensive. We propose a Machine Learning (ML)-based model that leverages novel key predictors for estimating pathloss. By quantitatively evaluating various ML algorithms in terms of predictive, generalization, and computational performance, we show that the Light Gradient Boosting Machine (LightGBM) algorithm overall outperforms the others, even with sparse training data, providing a 65% increase in prediction accuracy compared to empirical models and a 13x decrease in prediction time compared to ray-tracing. To address the interpretability challenge that thwarts the adoption of most ML-based models, we perform an extensive secondary analysis using the SHapley Additive exPlanations (SHAP) method, yielding many practically useful insights that can be leveraged for intelligently tuning the network configuration, selectively enriching training data in real networks, and building lighter ML-based propagation models to enable low-latency use cases.
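A minimal sketch of the pipeline the abstract describes, assuming generic tabular propagation features: fit a LightGBM regressor on pathloss, then attribute its predictions with SHAP. The synthetic features and target are placeholders, not the paper's novel predictors; lightgbm.LGBMRegressor and shap.TreeExplainer are the standard public APIs of those libraries.

```python
# Minimal sketch: LightGBM pathloss regression with a SHAP attribution pass.
import numpy as np
import lightgbm as lgb
import shap

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.uniform(10, 2000, n),     # link distance (m)
    rng.uniform(1e9, 6e9, n),     # carrier frequency (Hz)
    rng.uniform(10, 60, n),       # antenna height (m)
])
y = (32.4 + 20 * np.log10(X[:, 1] / 1e9) + 30 * np.log10(X[:, 0])
     - 0.1 * X[:, 2] + rng.normal(0, 5, n))   # illustrative pathloss (dB)

model = lgb.LGBMRegressor(n_estimators=300, learning_rate=0.05)
model.fit(X, y)

# SHAP attributes each prediction to the input features, the kind of
# interpretability analysis the paper performs.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:200])
print(np.abs(shap_values).mean(axis=0))       # mean |SHAP| per feature
```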
As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have given an intelligent organization to ease the evaluation of future research in this area.
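As a concrete instance of the problem the survey frames, here is a minimal sketch of uniform affine quantization, one of the most common schemes it covers: map real values to b-bit integers via a scale and zero-point, then dequantize. The min/max calibration used here is one simple choice among several discussed in the literature.

```python
# Minimal sketch of uniform affine quantization to b bits:
#   q = clip(round(x / s) + z, 0, 2^b - 1),  x_hat = s * (q - z).
import numpy as np

def quantize(x, bits=4):
    qmin, qmax = 0, 2 ** bits - 1
    s = (x.max() - x.min()) / (qmax - qmin)   # scale (quantization step)
    z = int(np.round(-x.min() / s))           # integer zero-point
    q = np.clip(np.round(x / s) + z, qmin, qmax).astype(np.uint8)
    return q, s, z

def dequantize(q, s, z):
    return s * (q.astype(np.float32) - z)

x = np.random.randn(8).astype(np.float32)
q, s, z = quantize(x, bits=4)
print("original:     ", x.round(3))
print("reconstructed:", dequantize(q, s, z).round(3))
print("max abs error:", np.abs(x - dequantize(q, s, z)).max(), "(step s =", s, ")")
```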
The time and effort involved in hand-designing deep neural networks is immense. This has prompted the development of Neural Architecture Search (NAS) techniques to automate this design. However, NAS algorithms tend to be slow and expensive; they need to train vast numbers of candidate networks to inform the search process. This could be alleviated if we could partially predict a network's trained accuracy from its initial state. In this work, we examine the overlap of activations between datapoints in untrained networks and motivate how this can give a measure which is usefully indicative of a network's trained performance. We incorporate this measure into a simple algorithm that allows us to search for powerful networks without any training in a matter of seconds on a single GPU, and verify its effectiveness on NAS-Bench-101, NAS-Bench-201, NATS-Bench, and Network Design Spaces. Our approach can be readily combined with more expensive search methods; we examine a simple adaptation of regularised evolutionary search. Code for reproducing our experiments is available at //github.com/BayesWatch/nas-without-training.
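A minimal sketch of the activation-overlap idea, assuming a plain ReLU MLP standing in for a candidate architecture: record each input's binary pattern of active ReLU units in the untrained network, form the Hamming-similarity kernel over a minibatch, and score the architecture by the kernel's log-determinant. Widths, batch size, and initialization are illustrative.

```python
# Minimal sketch of a training-free architecture score based on the overlap
# of ReLU activation patterns between datapoints in an untrained network.
import numpy as np

rng = np.random.default_rng(0)

def activation_codes(x, layer_widths):
    """Binary ReLU on/off patterns of an untrained network for a batch x."""
    codes, h = [], x
    for w in layer_widths:
        h = h @ rng.normal(0, h.shape[1] ** -0.5, size=(h.shape[1], w))
        codes.append(h > 0)        # which units fire for each input
        h = np.maximum(h, 0)       # ReLU
    return np.concatenate(codes, axis=1)

x = rng.normal(size=(32, 16))      # minibatch of 32 inputs
c = activation_codes(x, layer_widths=[64, 64, 64]).astype(float)
n_units = c.shape[1]
hamming = c @ (1 - c).T + (1 - c) @ c.T   # pairwise Hamming distances
K = n_units - hamming                      # similarity kernel
score = np.linalg.slogdet(K)[1]            # higher => inputs better separated
print(f"training-free score: {score:.2f}")
```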
Social relations are often used to improve recommendation quality when user-item interaction data is sparse in recommender systems. Most existing social recommendation models exploit pairwise relations to mine potential user preferences. However, real-life interactions among users are very complicated, and user relations can be high-order. Hypergraphs provide a natural way to model complex high-order relations, but their potential for improving social recommendation is under-explored. In this paper, we fill this gap and propose a multi-channel hypergraph convolutional network to enhance social recommendation by leveraging high-order user relations. Technically, each channel in the network encodes a hypergraph that depicts a common high-order user relation pattern via hypergraph convolution. By aggregating the embeddings learned through multiple channels, we obtain comprehensive user representations to generate recommendation results. However, the aggregation operation might also obscure the inherent characteristics of different types of high-order connectivity information. To compensate for this aggregation loss, we integrate self-supervised learning into the training of the hypergraph convolutional network to regain the connectivity information via hierarchical mutual information maximization. Experimental results on multiple real-world datasets show that the proposed model outperforms state-of-the-art methods, and an ablation study verifies the effectiveness of the multi-channel setting and the self-supervised task. The implementation of our model is available via //github.com/Coder-Yu/RecQ.
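For concreteness, a minimal sketch of a single hypergraph-convolution channel, using the standard propagation rule X' = D_v^{-1} H D_e^{-1} H^T X W over a toy incidence matrix. The paper's multi-channel aggregation and self-supervised objective are not reproduced here, and its exact normalization may differ.

```python
# Minimal sketch of one hypergraph convolution channel over user embeddings.
import numpy as np

rng = np.random.default_rng(0)
H = np.array([[1, 0],              # user 0 in hyperedge 0
              [1, 1],              # user 1 in both hyperedges
              [0, 1],
              [1, 1]], dtype=float)
X = rng.normal(size=(4, 8))        # user embeddings
W = rng.normal(size=(8, 8))        # learnable channel weights

Dv_inv = np.diag(1.0 / H.sum(axis=1))   # inverse vertex degrees
De_inv = np.diag(1.0 / H.sum(axis=0))   # inverse hyperedge degrees
X_out = Dv_inv @ H @ De_inv @ H.T @ X @ W

print(X_out.shape)   # (4, 8): embeddings mixed along high-order relations
```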
To improve the search efficiency of Neural Architecture Search (NAS), one-shot NAS trains a single super-net to approximate the performance of candidate architectures during search via weight sharing. While this greatly reduces the computation cost, the performance predicted by a single super-net is, due to approximation error, less accurate than that obtained by training each candidate architecture from scratch, leading to search inefficiency. In this work, we propose few-shot NAS, which explores the use of multiple super-nets: each super-net is pre-trained to be in charge of a sub-region of the search space, which reduces its prediction error. Moreover, these super-nets can be trained jointly via sequential fine-tuning. A natural choice of sub-region is to follow the splitting of the search space in NAS. We empirically evaluate our approach on three different tasks in NAS-Bench-201. Extensive results demonstrate that few-shot NAS, using only 5 super-nets, significantly improves the performance of many search methods with only a slight increase in search time. The architectures found by DARTS and ENAS with few-shot models achieve 88.53% and 86.50% test accuracy on CIFAR-10 in NAS-Bench-201, significantly outperforming their one-shot counterparts (with 54.30% and 54.30% test accuracy, respectively). Moreover, on AutoGAN and DARTS, few-shot NAS also outperforms previous state-of-the-art models.
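A minimal sketch of the splitting idea, with placeholder names (OPS, make_supernet) that are not the paper's code: fix the operation on one edge, give each resulting sub-region its own super-net, and route every candidate architecture to the super-net responsible for it.

```python
# Minimal sketch: partition a one-shot search space into sub-regions, each
# served by its own super-net, to reduce weight-sharing approximation error.
OPS = ["conv3x3", "conv1x1", "skip", "avgpool", "zero"]

def make_supernet(fixed_op):
    # In a real system this would build a weight-sharing super-net whose
    # first edge is fixed to `fixed_op`; a dict stands in for it here.
    return {"fixed_edge_op": fixed_op, "weights": None}

# One super-net per sub-region (the paper uses as few as 5).
supernets = {op: make_supernet(op) for op in OPS}

def supernet_for(architecture):
    # Route an architecture (a list of per-edge operations) to the super-net
    # in charge of its sub-region of the search space.
    return supernets[architecture[0]]

print(supernet_for(["skip", "conv3x3", "conv1x1"])["fixed_edge_op"])
```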
Recurrent neural networks (RNNs) provide state-of-the-art performance in processing sequential data but are memory-intensive to train, limiting the flexibility of the RNN models that can be trained. Reversible RNNs---RNNs for which the hidden-to-hidden transition can be reversed---offer a path to reduce the memory requirements of training, as hidden states need not be stored and can instead be recomputed during backpropagation. We first show that perfectly reversible RNNs, which require no storage of the hidden activations, are fundamentally limited because they cannot forget information from their hidden state. We then provide a scheme for storing a small number of bits in order to allow perfect reversal with forgetting. Our method achieves comparable performance to traditional models while reducing the activation memory cost by a factor of 10--15. We extend our technique to attention-based sequence-to-sequence models, where it maintains performance while reducing the activation memory cost by a factor of 5--10 in the encoder and a factor of 10--15 in the decoder.
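A minimal sketch of one way to obtain a perfectly reversible hidden-to-hidden transition, via additive coupling over a split hidden state; this illustrates the exact-reconstruction property the abstract discusses, not the paper's specific reversible GRU/LSTM constructions or its scheme for storing bits to permit forgetting.

```python
# Minimal sketch of a reversible transition via additive coupling:
#   h1' = h1 + f(h2, x),  h2' = h2 + g(h1', x),
# which can be inverted exactly, so hidden states need not be stored.
import numpy as np

rng = np.random.default_rng(0)
Wf, Wg = rng.normal(size=(8, 8)), rng.normal(size=(8, 8))

f = lambda h, x: np.tanh(h @ Wf + x)
g = lambda h, x: np.tanh(h @ Wg + x)

def forward(h1, h2, x):
    h1 = h1 + f(h2, x)
    h2 = h2 + g(h1, x)
    return h1, h2

def reverse(h1, h2, x):
    h2 = h2 - g(h1, x)   # undo the second update first
    h1 = h1 - f(h2, x)
    return h1, h2

h1, h2 = rng.normal(size=8), rng.normal(size=8)
x = rng.normal(size=8)
r1, r2 = reverse(*forward(h1, h2, x), x)
print(np.allclose(h1, r1) and np.allclose(h2, r2))   # True: exact recovery
```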
Knowledge graphs are large graph-structured databases of facts, which typically suffer from incompleteness. Link prediction is the task of inferring missing relations (links) between entities (nodes) in a knowledge graph. We propose to solve this task by using a hypernetwork architecture to generate convolutional layer filters specific to each relation and applying those filters to the subject entity embeddings. This architecture enables a trade-off between non-linear expressiveness and the number of parameters to learn. Our model simplifies the entity and relation embedding interactions introduced by the predecessor convolutional model, while outperforming all previous approaches to link prediction across all standard link prediction datasets.
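A minimal sketch of the hypernetwork idea, with illustrative dimensions: a linear hypernetwork maps a relation embedding to a bank of 1D convolutional filters, which are applied to the subject entity embedding and projected back to score candidate object entities. The scoring details are simplified relative to the actual model.

```python
# Minimal sketch: relation-specific conv filters generated by a hypernetwork.
import numpy as np

rng = np.random.default_rng(0)
d_e, d_r = 20, 10                  # entity / relation embedding sizes
n_f, l_f = 4, 3                    # number and length of conv filters

E = rng.normal(size=(50, d_e))     # entity embeddings (50 entities)
r = rng.normal(size=d_r)           # one relation embedding
Hyp = rng.normal(size=(d_r, n_f * l_f))            # hypernetwork (linear map)
W = rng.normal(size=(n_f * (d_e - l_f + 1), d_e))  # projection back to d_e

filters = (r @ Hyp).reshape(n_f, l_f)              # relation-specific filters
subject = E[0]
fmap = np.stack([np.convolve(subject, f, mode="valid") for f in filters])
scores = E @ (fmap.reshape(-1) @ W)                # score every candidate object
print(scores.shape)                                # (50,): one score per entity
```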
The emerging technique of deep learning has been widely applied in many different areas. However, when adopted in a specific domain, this technique should be combined with domain knowledge to improve efficiency and accuracy. In particular, when analyzing applications of deep learning to sentiment analysis, we found that current approaches suffer from the following drawbacks: (i) existing works have not paid much attention to the importance of different types of sentiment terms, which is an important concept in this area; and (ii) the loss function currently employed does not well reflect the severity of sentiment misclassification errors. To overcome these problems, we propose to combine domain knowledge with deep learning. Our proposal includes using sentiment scores, learned by regression, to augment the training data, and introducing a penalty matrix to enhance the cross-entropy loss function. In our experiments, we achieved a significant improvement in classification results.
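One plausible way to fold a penalty matrix into the cross-entropy loss, sketched below with illustrative 3-class penalties; the paper's exact formulation may differ. The idea is that distant misclassifications (e.g., negative predicted as positive) incur a larger loss than adjacent ones.

```python
# Minimal sketch of a penalty-weighted cross-entropy for sentiment classes.
import numpy as np

P = np.array([[0.0, 1.0, 2.0],    # true: negative
              [1.0, 0.0, 1.0],    # true: neutral
              [2.0, 1.0, 0.0]])   # true: positive (distant errors cost more)

def penalized_cross_entropy(probs, y_true):
    """probs: (n, 3) predicted distributions; y_true: (n,) class indices."""
    n = len(y_true)
    ce = -np.log(probs[np.arange(n), y_true] + 1e-12)
    # Expected penalty of the predicted distribution given the true class.
    penalty = (P[y_true] * probs).sum(axis=1)
    return (ce * (1.0 + penalty)).mean()

probs = np.array([[0.1, 0.2, 0.7],   # negative predicted as positive: far miss
                  [0.1, 0.7, 0.2]])  # negative predicted as neutral: near miss
print(penalized_cross_entropy(probs, np.array([0, 0])))
```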
We report an evaluation of the effectiveness of existing knowledge base embedding models for relation prediction and for relation extraction on a wide range of benchmarks. We also introduce a new benchmark, much larger and more complex than previous ones, to help validate the effectiveness of models on both tasks. The results demonstrate that knowledge base embedding models are generally effective for relation prediction but, with the existing strategies, are unable to improve the state-of-the-art neural relation extraction model; the results also point out limitations of existing methods.
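For context, a minimal sketch of relation prediction with a TransE-style embedding model, one representative of the knowledge base embedding family evaluated: given a head and a tail entity, every relation r is ranked by how well h + r approximates t. Random embeddings stand in for a trained model.

```python
# Minimal sketch: rank candidate relations between two entities with TransE.
import numpy as np

rng = np.random.default_rng(0)
n_ent, n_rel, dim = 100, 12, 16
E = rng.normal(size=(n_ent, dim))        # entity embeddings
R = rng.normal(size=(n_rel, dim))        # relation embeddings

def predict_relation(h, t):
    # TransE score: -||e_h + r - e_t||; higher means a more plausible relation.
    scores = -np.linalg.norm(E[h] + R - E[t], axis=1)
    return np.argsort(-scores)           # relations ranked best-first

print(predict_relation(3, 7)[:3])        # top-3 predicted relations
```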