Words in a natural language not only transmit information but also evolve with the development of civilization and human migration. The same is true for music. To understand the complex structure behind music, we introduce an algorithm called the Essential Element Network (EEN) to encode audio into text. The network is obtained by calculating the correlations between scales, time, and volume. Optimizing EEN so that the frequency and rank of the clustering coefficient follow Zipf's law enables us to generate the semantic relationships and regard them as words. We map these encoded words into the scale-temporal space, which helps us systematically organize the syntax in the deep structure of music. Our algorithm provides precise descriptions of the complex network behind the music, as opposed to the black-box nature of other deep learning approaches. As a result, the experience and properties accumulated through these processes can offer not only a new approach to the applications of Natural Language Processing (NLP) but also an easier and more objective way to analyze the evolution and development of music.
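The Zipf's law criterion mentioned in this abstract can be illustrated with a small rank-frequency fit. The sketch below is not the EEN algorithm: it assumes a plain list of already encoded tokens (a hypothetical stand-in for the paper's clustering-coefficient statistics) and estimates the slope of log-frequency against log-rank by ordinary least squares. The name `zipf_slope` and the toy corpus are illustrative assumptions.

```python
import math
from collections import Counter


def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank).

    A slope close to -1 is the classic Zipf's law signature.
    """
    counts = sorted(Counter(tokens).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct tokens")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Hypothetical toy corpus whose counts are exactly 12 / rank: 12, 6, 4, 3.
toy = ["a"] * 12 + ["b"] * 6 + ["c"] * 4 + ["d"] * 3
slope = zipf_slope(toy)
```

Because the toy counts are exactly inversely proportional to rank, the fitted slope is -1 up to rounding. On real data the slope would only approximate that value, and how close is close enough is a modelling choice the abstract does not specify.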
The integration of a near-space information network (NSIN) with the reconfigurable intelligent surface (RIS) is envisioned to significantly enhance the communication performance of future wireless communication systems by proactively altering wireless channels. This paper investigates the problem of deploying a RIS-integrated NSIN to provide energy-efficient, ultra-reliable and low-latency communications (URLLC) services. We mathematically formulate this problem as a resource optimization problem, aiming to maximize the effective throughput and minimize the system power consumption, subject to URLLC and physical resource constraints. The formulated problem is challenging in terms of accurate channel estimation, RIS phase alignment, theoretical analysis, and effective solution. We propose a joint resource allocation algorithm to handle these challenges. In this algorithm, we develop an accurate channel estimation approach by exploring message passing and optimize phase shifts of RIS reflecting elements to further increase the channel gain. In addition, we derive an analysis-friendly expression of decoding error probability and decompose the problem into two-layered optimization problems by analyzing the monotonicity, which makes the formulated problem analytically tractable. Extensive simulations have been conducted to verify the performance of the proposed algorithm. Simulation results show that the proposed algorithm can achieve outstanding channel estimation performance and is more energy-efficient than diverse benchmark algorithms.
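The abstract does not state its analysis-friendly expression for the decoding error probability. As a point of reference only, the sketch below evaluates the widely used finite-blocklength normal approximation for an AWGN channel, which is an assumption here and not the paper's derived formula. The function name and parameters are hypothetical.

```python
import math


def decoding_error_probability(snr, blocklength, payload_bits):
    """Normal approximation of the error probability for short packets.

    snr          : linear signal-to-noise ratio, must be > 0
    blocklength  : number of channel uses
    payload_bits : number of information bits in the packet
    """
    if snr <= 0 or blocklength <= 0:
        raise ValueError("snr and blocklength must be positive")
    capacity = math.log2(1.0 + snr)
    # Channel dispersion of the AWGN channel, in bits^2 per channel use.
    dispersion = (1.0 - 1.0 / (1.0 + snr) ** 2) * math.log2(math.e) ** 2
    rate = payload_bits / blocklength
    x = (capacity - rate) * math.sqrt(blocklength / dispersion)
    # Gaussian Q-function written with the complementary error function.
    return 0.5 * math.erfc(x / math.sqrt(2.0))


eps_low_snr = decoding_error_probability(3.0, 100, 150)
eps_high_snr = decoding_error_probability(7.0, 100, 150)
```

For a fixed rate and blocklength, the approximation decreases as the SNR grows. Monotonic behaviour of this kind is what typically allows an outer search over one variable to be separated from an inner problem, although the exact decomposition used in the paper is not given in the abstract.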
3D dynamic point cloud (DPC) compression relies on mining its temporal context, which faces significant challenges due to DPC's sparsity and non-uniform structure. Existing methods are limited in capturing sufficient temporal dependencies. Therefore, this paper proposes a learning-based DPC compression framework that uses a hierarchical block-matching-based inter-prediction module to compensate and compress the DPC geometry in latent space. Specifically, we propose a hierarchical motion estimation and motion compensation (Hie-ME/MC) framework for flexible inter-prediction, which dynamically selects the granularity of optical flow to encapsulate the motion information accurately. To improve the motion estimation efficiency of the proposed inter-prediction module, we further design a KNN-attention block matching (KABM) network that determines the impact of potential corresponding points based on the geometry and feature correlation. Finally, we compress the residual and the multi-scale optical flow with a fully-factorized deep entropy model. Experimental results on the MPEG-specified Owlii Dynamic Human Dynamic Point Cloud (Owlii) dataset show that our framework outperforms the previous state-of-the-art methods and the MPEG standard V-PCC v18 in inter-frame low-delay mode.
Spectral Embedding (SE) has often been used to map data points from non-linear manifolds to linear subspaces for the purpose of classification and clustering. Despite its significant advantages, the subspace structure of data in the original space is not preserved in the embedding space. To address this issue, subspace clustering has been proposed, which replaces the SE graph affinity with a self-expression matrix. It works well if the data lies in a union of linear subspaces; however, the performance may degrade in real-world applications where data often spans non-linear manifolds. To address this problem, we propose a novel structure-aware deep spectral embedding that combines a spectral embedding loss and a structure preservation loss. To this end, a deep neural network architecture is proposed that simultaneously encodes both types of information and aims to generate structure-aware spectral embedding. The subspace structure of the input data is encoded by using attention-based self-expression learning. The proposed algorithm is evaluated on six publicly available real-world datasets. The results demonstrate the excellent clustering performance of the proposed algorithm compared to the existing state-of-the-art methods. The proposed algorithm has also exhibited better generalization to unseen data points, and it is scalable to larger datasets without requiring significant computational resources.
Deep learning based channel state information (CSI) feedback in frequency division duplex systems has drawn widespread attention in both academia and industry. In this paper, we focus on integrating the Type-II codebook in the wireless communication standards with deep learning to enhance the performance of CSI feedback. In contrast to the Release 16 Type-II codebook addressed by existing deep learning based studies, the Type-II codebook in Release 17 (R17) exploits the angular-delay-domain partial reciprocity between uplink and downlink channels to select part of the angular-delay-domain ports for measuring and feeding back the downlink CSI, where the performance of conventional deep learning based methods is limited due to the deficiency of sparse structures. To address this issue, we propose two new perspectives of adopting deep learning to improve the R17 Type-II codebook. Firstly, considering the low signal-to-noise ratio of uplink channels, deep learning is utilized to accurately select the dominant angular-delay-domain ports, where the focal loss is harnessed to solve the class imbalance problem. Secondly, we propose to adopt deep learning to reconstruct the downlink CSI based on the feedback of the R17 Type-II codebook at the base station, where the information of sparse structures can be effectively leveraged. Furthermore, a weighted shortcut module is designed to facilitate the accurate reconstruction, and a two-stage loss function that combines the mean squared error and sum rate is proposed for adapting to practical multi-user scenarios. Simulation results demonstrate that our proposed deep learning based port selection and CSI reconstruction methods can improve the sum rate performance compared with the traditional R17 Type-II codebook and deep learning benchmarks.
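The focal loss used for port selection can be sketched for a single binary decision (port selected or not). This is the standard binary focal loss, not the authors' training code. The defaults `gamma=2.0` and `alpha=0.25` are commonly used values and are assumptions, since the abstract reports no hyperparameters.

```python
import math


def binary_focal_loss(prob, label, gamma=2.0, alpha=0.25):
    """Focal loss for one binary prediction.

    prob  : predicted probability of the positive class, in (0, 1)
    label : 1 for a dominant (selected) port, 0 otherwise
    gamma : focusing parameter; gamma = 0 gives weighted cross-entropy
    alpha : weight of the positive class
    """
    if not 0.0 < prob < 1.0:
        raise ValueError("prob must lie strictly between 0 and 1")
    p_t = prob if label == 1 else 1.0 - prob
    alpha_t = alpha if label == 1 else 1.0 - alpha
    # The factor (1 - p_t) ** gamma shrinks the loss of easy examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy_negative = binary_focal_loss(0.1, 0)  # confidently correct "not selected"
hard_positive = binary_focal_loss(0.1, 1)  # missed dominant port
```

With few dominant ports among many candidates, the easy negatives would otherwise dominate the summed loss. The modulating factor keeps each of them small, so the rare, hard positives carry more of the gradient.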
Recurrent neural networks are a powerful means to cope with time series. We show how autoregressive linear, i.e., linearly activated, recurrent neural networks (LRNNs) can approximate any time-dependent function f(t) given by a number of function values. The approximation can effectively be learned by simply solving a linear equation system; no backpropagation or similar methods are needed. Furthermore, and this is probably the main contribution of this article, the size of an LRNN can be reduced significantly in one step after inspecting the spectrum of the network transition matrix, i.e., its eigenvalues, by taking only the most relevant components. Therefore, in contrast to other approaches, we learn not only the network weights but also the network architecture. LRNNs have interesting properties: they end up in elliptical trajectories in the long run and allow the prediction of further values and compact representations of functions. We demonstrate this by several experiments, among them multiple superimposed oscillators (MSO), robotic soccer, and predicting stock prices. LRNNs outperform the previous state-of-the-art for the MSO task with a minimal number of units.
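The idea of learning a linear recurrence by solving a linear equation system can be sketched on the smallest possible case. The example below is a two-coefficient autoregressive stand-in, not the LRNN construction of the article: it fits x[t] = a * x[t-1] + b * x[t-2] to a sampled sine wave through the 2x2 normal equations. The function names and the test signal are illustrative assumptions.

```python
import math


def fit_ar2(series):
    """Fit x[t] = a * x[t-1] + b * x[t-2] by solving the 2x2 normal equations."""
    s11 = s12 = s22 = r1 = r2 = 0.0
    for t in range(2, len(series)):
        u, v, y = series[t - 1], series[t - 2], series[t]
        s11 += u * u
        s12 += u * v
        s22 += v * v
        r1 += u * y
        r2 += v * y
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("singular system; the series carries too little information")
    # Cramer's rule for [[s11, s12], [s12, s22]] @ [a, b] = [r1, r2].
    a = (r1 * s22 - r2 * s12) / det
    b = (s11 * r2 - s12 * r1) / det
    return a, b


def predict(series, a, b, steps):
    """Run the fitted linear recurrence forward for `steps` values."""
    prev2, prev1 = series[-2], series[-1]
    out = []
    for _ in range(steps):
        nxt = a * prev1 + b * prev2
        out.append(nxt)
        prev2, prev1 = prev1, nxt
    return out


omega = 0.3
train = [math.sin(omega * t) for t in range(50)]
a, b = fit_ar2(train)
forecast = predict(train, a, b, 5)
```

For a pure sine the exact recurrence is a = 2 cos(omega) and b = -1, so the transition matrix [[a, b], [1, 0]] has a complex-conjugate eigenvalue pair of modulus sqrt(-b) = 1. Eigenvalues on the unit circle give a state that neither decays nor grows, which matches the long-run elliptical trajectories the abstract describes.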
Users online tend to join polarized groups of like-minded peers around shared narratives, forming echo chambers. The echo chamber effect and opinion polarization may be driven by several factors, including human biases in information consumption and personalized recommendations produced by feed algorithms. Until now, studies have mainly used opinion dynamics models to explore the mechanisms behind the emergence of polarization and echo chambers. The objective was to determine the key factors contributing to these phenomena and identify their interplay. However, the validation of model predictions with empirical data still displays two main drawbacks: a lack of systematicity and a reliance on qualitative analysis. In our work, we bridge this gap by providing a method to numerically compare the opinion distributions obtained from simulations with those measured on social media. To validate this procedure, we develop an opinion dynamics model that takes into account the interplay between human and algorithmic factors. We subject our model to empirical testing with data from diverse social media platforms and benchmark it against two state-of-the-art models. To further enhance our understanding of social media platforms, we provide a synthetic description of their characteristics in terms of the model's parameter space. This representation has the potential to facilitate the refinement of feed algorithms, thus mitigating the detrimental effects of extreme polarization on online discourse.
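A numerical comparison between simulated and measured opinion distributions can be sketched with a histogram-based distance. The abstract does not name the metric it uses. The Jensen-Shannon divergence below is one reasonable choice and is an assumption, as are the opinion range [-1, 1], the bin count, and the toy samples.

```python
import math


def histogram(opinions, bins=10, lo=-1.0, hi=1.0):
    """Normalized histogram of opinions assumed to lie in [lo, hi]."""
    if not opinions:
        raise ValueError("opinions must be non-empty")
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in opinions:
        idx = min(int((x - lo) / width), bins - 1)
        counts[max(idx, 0)] += 1
    total = len(opinions)
    return [c / total for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical samples: both are polarized, the third is consensual.
simulated = histogram([-0.9, -0.8, 0.8, 0.9], bins=4)
measured = histogram([-0.85, -0.75, 0.7, 0.95], bins=4)
consensus = histogram([-0.1, 0.0, 0.1, 0.2], bins=4)
score_match = js_divergence(simulated, measured)
score_mismatch = js_divergence(simulated, consensus)
```

A score of this kind turns a visual comparison of two histograms into a single number that can be minimized over model parameters, which is what makes a systematic, quantitative validation possible.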
This two-part paper develops a paradigmatic theory and detailed methods of joint electricity market design using reinforcement-learning (RL)-based simulation. In Part 2, this theory is further demonstrated by elaborating detailed methods of designing an electricity spot market (ESM), together with a reserved capacity product (RC) in the ancillary service market (ASM) and a virtual bidding (VB) product in the financial market (FM). Following the theory proposed in Part 1, market design options in the joint market are first specified. Then, a Markov game model is developed, in which we show how to incorporate market design options and uncertain risks in the model formulation. A multi-agent proximal policy optimization (MAPPO) algorithm is elaborated as a practical implementation of the generalized market simulation method developed in Part 1. Finally, the case study demonstrates how to pick the best market design options by using some of the market operation performance indicators proposed in Part 1, based on the simulation results generated by implementing the MAPPO algorithm. The impacts of different market design options on market participants' bidding strategy preferences are also discussed.
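The core of proximal policy optimization, on which MAPPO builds, is the clipped surrogate objective evaluated per sample. The sketch below shows only that term for one agent and one sample, under the assumption of the standard PPO formulation. The clipping range `eps=0.2` is a commonly used value, not one taken from this paper, and the function name is hypothetical.

```python
def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Clipped surrogate objective for a single sample.

    ratio     : new policy probability divided by old policy probability
    advantage : estimated advantage of the sampled action (e.g. a bid)
    eps       : clipping range around a ratio of 1
    """
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    # Taking the minimum makes the objective a pessimistic bound, so large
    # policy changes are never rewarded beyond the clipped value.
    return min(ratio * advantage, clipped_ratio * advantage)


gain_capped = ppo_clip_objective(1.5, 1.0)    # improvement capped at 1 + eps
loss_floored = ppo_clip_objective(0.5, -1.0)  # penalty held at (1 - eps) * advantage
```

In a multi-agent market simulation each participant would maximize the average of this term over its own sampled bids. How observations, rewards, and critics are shared between agents is specific to the paper and is not covered by this sketch.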
Computing is a critical driving force in the development of human civilization. In recent years, we have witnessed the emergence of intelligent computing, a new computing paradigm that is reshaping traditional computing and promoting the digital revolution in the era of big data, artificial intelligence, and the internet of things with new computing theories, architectures, methods, systems, and applications. Intelligent computing has greatly broadened the scope of computing, extending it from traditional computing on data to increasingly diverse computing paradigms such as perceptual intelligence, cognitive intelligence, autonomous intelligence, and human-computer fusion intelligence. Intelligence and computing followed different paths of evolution and development for a long time but have become increasingly intertwined in recent years: intelligent computing is not only intelligence-oriented but also intelligence-driven. Such cross-fertilization has prompted the emergence and rapid advancement of intelligent computing. Intelligent computing is still in its infancy, and an abundance of innovations in the theories, systems, and applications of intelligent computing is expected to occur soon. We present the first comprehensive survey of literature on intelligent computing, covering its theoretical fundamentals, the technological fusion of intelligence and computing, important applications, challenges, and future perspectives. We believe that this survey is highly timely and will provide a comprehensive reference and offer valuable insights into intelligent computing for academic and industrial researchers and practitioners.
Convolutional Neural Networks (CNNs) have achieved impressive performance on many computer vision tasks, such as object detection, image recognition, and image retrieval. These achievements benefit from the CNNs' outstanding capability to learn the input features with deep layers of neuron structures and an iterative training process. However, these learned features are hard to identify and interpret from a human vision perspective, causing a lack of understanding of the CNNs' internal working mechanisms. To improve CNN interpretability, CNN visualization is widely used as a qualitative analysis method, which translates the internal features into visually perceptible patterns. Many CNN visualization works have been proposed in the literature to interpret the CNN from the perspectives of network structure, operation, and semantic concept. In this paper, we aim to provide a comprehensive survey of several representative CNN visualization methods, including Activation Maximization, Network Inversion, Deconvolutional Neural Networks (DeconvNet), and Network Dissection based visualization. These methods are presented in terms of motivations, algorithms, and experimental results. Based on these visualization methods, we also discuss their practical applications to demonstrate the significance of CNN interpretability in areas such as network design, optimization, and security enhancement.
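Activation Maximization, the first method listed in this abstract, searches for an input that maximizes a chosen unit's activation by gradient ascent on the input itself. The sketch below applies that idea to a single linear unit with an L2 penalty, a deliberately tiny stand-in for a CNN neuron. The weights, step size, and penalty strength are illustrative assumptions.

```python
def activation_maximization(weights, steps=200, lr=0.1, l2=1.0):
    """Gradient ascent on the input x of the objective
    f(x) = w . x - (l2 / 2) * ||x||^2.

    The L2 term plays the role of the regularizer that keeps the
    synthesized input bounded; without it, x would grow without limit.
    """
    x = [0.0] * len(weights)
    for _ in range(steps):
        # Analytic gradient of f with respect to each input component.
        grad = [w - l2 * xi for w, xi in zip(weights, x)]
        x = [xi + lr * g for xi, g in zip(x, grad)]
    return x


unit_weights = [0.5, -1.0, 2.0]  # hypothetical weights of the unit
preferred_input = activation_maximization(unit_weights)
```

For this toy objective the maximizer is known in closed form, x = w / l2, so the ascent converges to the unit's own weight pattern. In a real CNN the gradient comes from backpropagation through the network, and the resulting input is rendered as an image that shows what the unit responds to.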