Timely anomaly detection is essential in the telecom domain because it helps identify and characterize irregular patterns, abnormal behaviors, and network anomalies, contributing to enhanced service quality and operational efficiency. Precisely forecasting and removing predictable time series patterns is a vital component of time series anomaly detection. While state-of-the-art methods aim to maximize forecasting accuracy, they do so at the cost of computational performance. In a system composed of a large number of time series variables, e.g., cell Key Performance Indicators (KPIs), the time and space complexity of the forecasting method employed is of crucial importance. Quartile-Based Seasonality Decomposition (QBSD) is a live forecasting method proposed in this paper to strike an optimal trade-off between computational complexity and forecasting accuracy. This paper compares the performance of QBSD with that of state-of-the-art forecasting methods and assesses their applicability to practical anomaly detection. To demonstrate the efficacy of the proposed solution, experimental evaluation was conducted using publicly available datasets as well as a telecom KPI dataset.
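The abstract does not spell out how QBSD works, so the following is only a minimal sketch of the general idea of a quartile-based seasonal baseline, not the authors' algorithm. It assumes a known, fixed season length (`period`), a history that starts at phase 0 of the cycle, and the conventional Tukey fence multiplier `k=1.5`; the function names are hypothetical.

```python
from statistics import quantiles

def seasonal_quartile_bounds(history, period, k=1.5):
    """Per-phase (median, lower fence, upper fence) from whole past cycles.

    Assumption: history[0] is at phase 0; a trailing partial cycle is ignored.
    """
    n_cycles = len(history) // period
    if n_cycles < 2:
        raise ValueError("need at least two full cycles of history")
    bounds = []
    for phase in range(period):
        # All past observations at this position in the cycle.
        samples = [history[c * period + phase] for c in range(n_cycles)]
        q1, q2, q3 = quantiles(samples, n=4, method="inclusive")
        iqr = q3 - q1
        bounds.append((q2, q1 - k * iqr, q3 + k * iqr))
    return bounds

def flag_anomalies(values, bounds, start_phase=0):
    """Flag each value that falls outside the fences for its phase."""
    flags = []
    for i, value in enumerate(values):
        _, lower, upper = bounds[(start_phase + i) % len(bounds)]
        flags.append(value < lower or value > upper)
    return flags

# Usage: five identical daily cycles sampled four times a day, then one spike.
history = [10.0, 20.0, 30.0, 20.0] * 5
bounds = seasonal_quartile_bounds(history, period=4)
flags = flag_anomalies([10.0, 20.0, 90.0, 20.0], bounds)
```

The per-phase median plays the role of the seasonal forecast, and the fences give a threshold that needs only order statistics, which is the kind of low-cost computation the abstract argues for.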
Learning new concepts sequentially is a major weakness of modern neural networks, which hinders their use in non-stationary environments. Their propensity to fit the current data distribution to the detriment of previously acquired knowledge leads to catastrophic forgetting. In this work, we tackle the problem of Spoken Language Understanding in a continual learning setting. We first define a class-incremental scenario for the SLURP dataset. Then, we propose three knowledge distillation (KD) approaches to mitigate forgetting for a sequence-to-sequence transformer model: the first KD method is applied to the encoder output (audio-KD), and the other two work on the decoder output, either directly on the token-level (tok-KD) or on the sequence-level (seq-KD) distributions. We show that seq-KD substantially improves all the performance metrics, and that its combination with audio-KD further decreases the average WER and enhances the entity prediction metric.
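The abstract does not give the exact loss used for tok-KD. As a hedged illustration, one common token-level formulation is the Kullback-Leibler divergence between the previous model's (teacher) and the current model's (student) per-token output distributions, averaged over decoding steps. The function names and the `temperature` parameter below are assumptions for this sketch.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over one token's vocabulary logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def token_kd_loss(teacher_logits, student_logits, temperature=1.0):
    """Mean over decoding steps of KL(teacher || student)."""
    if len(teacher_logits) != len(student_logits):
        raise ValueError("teacher and student must cover the same steps")
    total = 0.0
    for t_row, s_row in zip(teacher_logits, student_logits):
        p = softmax(t_row, temperature)
        q = softmax(s_row, temperature)
        # Terms with p_i == 0 contribute nothing to the divergence.
        total += sum(pi * (math.log(pi) - math.log(qi))
                     for pi, qi in zip(p, q) if pi > 0.0)
    return total / len(teacher_logits)

# Usage: two decoding steps over a three-word vocabulary.
teacher = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
student = [[1.5, 0.7, -0.5], [0.0, 0.5, 2.0]]
loss = token_kd_loss(teacher, student)
```

In a continual learning setup, a term of this kind would typically be added to the ordinary task loss on the new data, so that the student stays close to the teacher on what was learned before.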
Denoising diffusion probabilistic models, initially proposed for realistic image generation, have recently shown success in various perception tasks (e.g., object detection and image segmentation) and are increasingly gaining attention in computer vision. However, extending such models to multi-frame human pose estimation is non-trivial due to the additional temporal dimension in videos. More importantly, learning representations that focus on keypoint regions is crucial for accurate localization of human joints, yet it remains unclear how diffusion-based methods can be adapted to achieve this objective. In this paper, we present DiffPose, a novel diffusion architecture that formulates video-based human pose estimation as a conditional heatmap generation problem. First, to better leverage temporal information, we propose the SpatioTemporal Representation Learner, which aggregates visual evidence across frames and uses the resulting features as a condition in each denoising step. In addition, we present a mechanism called Lookup-based MultiScale Feature Interaction that determines the correlations between local joints and global contexts across multiple scales. This mechanism generates fine-grained representations that focus on keypoint regions. Altogether, by extending diffusion models, we show two unique characteristics of DiffPose on the pose estimation task: (i) the ability to combine multiple sets of pose estimates to improve prediction accuracy, particularly for challenging joints, and (ii) the ability to adjust the number of iterative steps for feature refinement without retraining the model. DiffPose sets new state-of-the-art results on three benchmarks: PoseTrack2017, PoseTrack2018, and PoseTrack21.
Network slicing (NS) is a key technology in 5G networks that enables the customization and efficient sharing of network resources to support the diverse requirements of next-generation services. This paper proposes a resource allocation scheme for NS based on the Fisher-market model and the Trading-post mechanism. The scheme aims to achieve efficient resource utilization while ensuring multi-level fairness, adapting to dynamic load conditions, and protecting the service level agreements (SLAs) of slice tenants. In the proposed scheme, each service provider (SP) is allocated a budget representing its infrastructure share or purchasing power in the market. SPs acquire different resources by spending their budgets to offer services to different classes of users, classified based on their service needs and priorities. The scheme assumes that SPs employ the $\alpha$-fairness criterion to deliver services to their subscribers. The resource allocation problem is formulated as a convex optimization problem to find a market equilibrium (ME) solution that provides allocation and resource pricing. A privacy-preserving learning algorithm is developed to enable SPs to reach the ME in a decentralized manner. The performance of the proposed scheme is evaluated through theoretical analysis and extensive numerical simulations, comparing it with the Social Optimal and Static Proportional sharing schemes.
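To make the Trading-post mechanism concrete, here is a minimal sketch under simplifying assumptions that are not taken from the paper: SPs have linear utilities (rather than the $\alpha$-fair utilities the scheme assumes), every SP values at least one resource, and bids are updated with the standard proportional-response rule. In a trading post, each SP splits its budget into bids, the price of a resource is the total bid on it divided by its capacity, and each SP receives a share proportional to its bid. All names are hypothetical.

```python
def trading_post(bids, capacities):
    """Prices and allocations induced by a bid matrix (rows = SPs)."""
    n_res = len(capacities)
    prices = [sum(row[j] for row in bids) / capacities[j]
              for j in range(n_res)]
    alloc = [[row[j] / prices[j] if prices[j] > 0 else 0.0
              for j in range(n_res)] for row in bids]
    return prices, alloc

def proportional_response(budgets, valuations, capacities, iterations=200):
    """Each SP re-bids in proportion to the utility each resource gave it.

    Assumption: linear utilities u_i = sum_j valuations[i][j] * x_ij,
    with at least one positive valuation per SP.
    """
    n_res = len(capacities)
    # Start by spreading every budget evenly over the resources.
    bids = [[budget / n_res] * n_res for budget in budgets]
    for _ in range(iterations):
        _, alloc = trading_post(bids, capacities)
        new_bids = []
        for budget, vals, row in zip(budgets, valuations, alloc):
            gains = [v * x for v, x in zip(vals, row)]
            utility = sum(gains)
            new_bids.append([budget * g / utility for g in gains])
        bids = new_bids
    return trading_post(bids, capacities)

# Usage: two SPs with equal budgets, each valuing a different resource.
prices, alloc = proportional_response(
    budgets=[1.0, 1.0],
    valuations=[[1.0, 0.0], [0.0, 1.0]],
    capacities=[1.0, 1.0],
)
```

Each SP only needs the current prices and its own allocation to update its bids, which is in the spirit of the decentralized, privacy-preserving procedure the abstract describes, although the paper's actual learning algorithm may differ.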
In backscatter communication (BC), a passive tag transmits information by just affecting an external electromagnetic field through load modulation. Thereby, the feed current of the excited tag antenna is modulated by adapting the passive termination load. This paper studies the achievable information rates with a freely adaptable passive load. As a prerequisite, we unify monostatic, bistatic, and ambient BC with circuit-based system modeling. We present the crucial insight that channel capacity is described by existing results on peak-power-limited quadrature Gaussian channels, because the steady-state tag current phasor lies on a disk. Consequently, we derive the channel capacity for the case of an unmodulated external field, for general passive, purely reactive, or purely resistive tag loads. We find that modulating both resistance and reactance is important for very high rates. We discuss the capacity-achieving load statistics, rate asymptotics, technical conclusions, and rate losses from value-range-constrained loads (which are found to be small for moderate constraints). We then demonstrate that near-capacity rates can be attained by more practical schemes: (i) amplitude-and-phase-shift keying on the reflection coefficient and (ii) simple load circuits of a few switched resistors and capacitors. Finally, we draw conclusions for the ambient BC channel capacity in important special cases.
Neural networks have proven to be effective at solving machine learning tasks but it is unclear whether they learn any relevant causal relationships, while their black-box nature makes it difficult for modellers to understand and debug them. We propose a novel method overcoming these issues by allowing a two-way interaction whereby neural-network-empowered machines can expose the underpinning learnt causal graphs and humans can contest the machines by modifying the causal graphs before re-injecting them into the machines. The learnt models are guaranteed to conform to the graphs and adhere to expert knowledge, some of which can also be given up-front. By building a window into the model behaviour and enabling knowledge injection, our method allows practitioners to debug networks based on the causal structure discovered from the data and underpinning the predictions. Experiments with real and synthetic tabular data show that our method improves predictive performance up to 2.4x while producing parsimonious networks, up to 7x smaller in the input layer, compared to SOTA regularised networks.
Radix conversion of numeric data is an indispensable component of any complete analysis of digital computation. In this paper, we first propose a binary encoding for mixed-radix digits. Second, we give a variant of rANS coding based on this conversion that supports parallel decoding. Simulations show that, in serial mode, the proposed coding achieves a higher throughput than the baseline (a speed-up factor of about 2x) without loss of compression ratio, and that it outperforms the existing 2-way interleaving implementation.
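For readers unfamiliar with mixed-radix digits, the sketch below shows the plain positional conversion between a digit sequence with per-position radices and a single integer. It is textbook mixed-radix arithmetic, not the binary encoding or the rANS variant proposed in the paper, and the function names are hypothetical.

```python
def mixed_radix_encode(digits, radices):
    """Pack digits (most significant first) into one integer."""
    if len(digits) != len(radices):
        raise ValueError("digits and radices must have the same length")
    value = 0
    for digit, radix in zip(digits, radices):
        if not 0 <= digit < radix:
            raise ValueError(f"digit {digit} out of range for radix {radix}")
        value = value * radix + digit
    return value

def mixed_radix_decode(value, radices):
    """Recover the digit sequence by peeling off the least significant first."""
    digits = []
    for radix in reversed(radices):
        value, digit = divmod(value, radix)
        digits.append(digit)
    digits.reverse()
    return digits

# Usage: digits 1, 2, 3 in radices 2, 3, 5.
radices = [2, 3, 5]
packed = mixed_radix_encode([1, 2, 3], radices)
unpacked = mixed_radix_decode(packed, radices)
# The packed value fits in 5 bits, whereas storing each digit
# separately in binary would take 1 + 2 + 3 = 6 bits.
bits_needed = (2 * 3 * 5 - 1).bit_length()
```

Note that decoding here is inherently sequential, since each digit depends on the quotient left by the previous step; this is the kind of dependency that a parallel-decodable design has to work around.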
We consider the problem of blob detection for uncertain images, such as images that have to be inferred from noisy measurements. Extending recent work motivated by astronomical applications, we propose an approach that represents the uncertainty in the position and size of a blob by a region in a three-dimensional scale space. Motivated by classic tube methods such as the taut-string algorithm, these regions are obtained from level sets of the minimizer of a total variation functional within a high-dimensional tube. The resulting non-smooth optimization problem is challenging to solve; we compare various numerical approaches for its solution and relate them to the literature on constrained total variation denoising. Finally, the proposed methodology is illustrated in numerical experiments on deconvolution and on models related to astrophysics, where it is shown to represent the uncertainty in the detected blobs in a precise and physically interpretable way.
The development of autonomous agents which can interact with other agents to accomplish a given task is a core area of research in artificial intelligence and machine learning. Towards this goal, the Autonomous Agents Research Group develops novel machine learning algorithms for autonomous systems control, with a specific focus on deep reinforcement learning and multi-agent reinforcement learning. Research problems include scalable learning of coordinated agent policies and inter-agent communication; reasoning about the behaviours, goals, and composition of other agents from limited observations; and sample-efficient learning based on intrinsic motivation, curriculum learning, causal inference, and representation learning. This article provides a broad overview of the ongoing research portfolio of the group and discusses open problems for future directions.
A large number of real-world graphs or networks are inherently heterogeneous, involving a diversity of node types and relation types. Heterogeneous graph embedding aims to embed the rich structural and semantic information of a heterogeneous graph into low-dimensional node representations. Existing models usually define multiple metapaths in a heterogeneous graph to capture the composite relations and guide neighbor selection. However, these models either omit node content features, discard intermediate nodes along the metapath, or only consider one metapath. To address these three limitations, we propose a new model named Metapath Aggregated Graph Neural Network (MAGNN) to boost the final performance. Specifically, MAGNN employs three major components: the node content transformation to encapsulate input node attributes, the intra-metapath aggregation to incorporate intermediate semantic nodes, and the inter-metapath aggregation to combine messages from multiple metapaths. Extensive experiments on three real-world heterogeneous graph datasets for node classification, node clustering, and link prediction show that MAGNN achieves more accurate prediction results than state-of-the-art baselines.
The cross-domain recommendation technique is an effective way of alleviating the data sparsity in recommender systems by leveraging the knowledge from relevant domains. Transfer learning is a class of algorithms underlying these techniques. In this paper, we propose a novel transfer learning approach for cross-domain recommendation by using neural networks as the base model. We assume that hidden layers in two base networks are connected by cross mappings, leading to the collaborative cross networks (CoNet). CoNet enables dual knowledge transfer across domains by introducing cross connections from one base network to another and vice versa. CoNet is achieved in multi-layer feedforward networks by adding dual connections and joint loss functions, which can be trained efficiently by back-propagation. The proposed model is evaluated on two real-world datasets and it outperforms baseline models by relative improvements of 3.56\% in MRR and 8.94\% in NDCG, respectively.
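The cross connections can be illustrated with a single hidden layer. This is a minimal sketch, assuming that each network's next hidden state adds a linear map of the other network's current hidden state before a ReLU; the abstract does not specify the activation or whether the two directions share one transfer matrix, so separate matrices `cross_ab` and `cross_ba` are used here, and all names are hypothetical.

```python
def matvec(matrix, vector):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def relu(vector):
    return [max(0.0, x) for x in vector]

def cross_layer(h_a, h_b, w_a, w_b, cross_ab, cross_ba):
    """One layer of two base networks joined by dual cross connections.

    cross_ab carries domain A's hidden state into network B,
    cross_ba carries domain B's hidden state into network A.
    """
    next_a = relu([own + other for own, other
                   in zip(matvec(w_a, h_a), matvec(cross_ba, h_b))])
    next_b = relu([own + other for own, other
                   in zip(matvec(w_b, h_b), matvec(cross_ab, h_a))])
    return next_a, next_b

# Usage: identity weights make the effect of the cross terms easy to see.
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
h_a, h_b = [1.0, -2.0], [3.0, 4.0]
isolated = cross_layer(h_a, h_b, identity, identity, zeros, zeros)
coupled = cross_layer(h_a, h_b, identity, identity, identity, identity)
```

With the cross matrices set to zero, the two networks reduce to independent feedforward layers, which makes clear that knowledge transfer happens only through the cross terms.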