As 5G and beyond networks become increasingly complex and heterogeneous, a wide variety of emerging applications imposes diverse and stringent requirements on them. The complexity and diversity of Telecom networks place an increasing strain on maintenance and operation efforts. Moreover, strict security and privacy requirements make it challenging for mobile operators to leverage network data. To detect network faults and mitigate future failures, prior work has focused on traditional ML/DL methods for locating anomalies in networks. These approaches, although powerful, do not consider the intertwined nature of embedded, software-intensive Radio Access Network (RAN) systems. In this paper, we propose a bi-level federated Graph Neural Network model for anomaly detection and diagnosis that detects anomalies in Telecom networks in a privacy-preserving manner while minimizing communication costs. Our method conceptualizes Telecom data as a bi-level temporal graph. The first level captures the interactions between RAN nodes exposed to different deployment scenarios in the network, while each individual RAN node is further elaborated into its software (SW) execution graph. Additionally, we use Federated Learning to address privacy and security limitations. Furthermore, we study the performance of the anomaly detection model under three settings: (1) centralized, (2) Federated Learning, and (3) Personalized Federated Learning, using real-world data from an operational network. Our comprehensive experiments show that the Personalized Federated Temporal Graph Neural Network method outperforms the most commonly used anomaly detection techniques.
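The bi-level structure admits a compact implementation sketch. Below is a minimal PyTorch sketch under our reading of the abstract, not the authors' implementation: the temporal and federated components are omitted, and names such as `GraphLayer` and `BiLevelEncoder` are hypothetical. Each RAN node's SW execution graph is pooled into a node feature, which then participates in message passing over the RAN interaction graph.

```python
# Minimal bi-level graph encoder sketch (hypothetical; PyTorch assumed).
import torch
import torch.nn as nn

class GraphLayer(nn.Module):
    """One round of mean-aggregation message passing: h' = ReLU(W [h || mean(neighbors)])."""
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(2 * dim, dim)

    def forward(self, h, adj):
        deg = adj.sum(-1, keepdim=True).clamp(min=1)
        neigh = (adj @ h) / deg                      # mean over neighbours
        return torch.relu(self.lin(torch.cat([h, neigh], dim=-1)))

class BiLevelEncoder(nn.Module):
    """Lower level: pool each RAN node's SW-execution graph into a node feature.
    Upper level: message passing over the RAN interaction graph."""
    def __init__(self, dim):
        super().__init__()
        self.sw_layer = GraphLayer(dim)
        self.ran_layer = GraphLayer(dim)
        self.score = nn.Linear(dim, 1)               # per-node anomaly score

    def forward(self, sw_feats, sw_adjs, ran_adj):
        # sw_feats: list of (n_i, dim) tensors, one SW graph per RAN node
        node_h = torch.stack([self.sw_layer(x, a).mean(0)
                              for x, a in zip(sw_feats, sw_adjs)])
        node_h = self.ran_layer(node_h, ran_adj)
        return torch.sigmoid(self.score(node_h)).squeeze(-1)
```

In the federated settings, only the parameters of such an encoder (not the raw SW execution graphs) would leave each operator's premises, which is what yields the privacy and communication-cost benefits the abstract claims.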
The emergence of pretrained models has significantly impacted fields ranging from Natural Language Processing (NLP) and Computer Vision to relational datasets. Traditionally, these models are assessed by fine-tuning them on downstream tasks. However, this raises the question of how to evaluate them more efficiently and effectively. In this study, we explore a novel approach in which we leverage the meta features associated with each entity as a source of worldly knowledge and employ the entity representations produced by the models. We propose using the consistency between these representations and the meta features as a metric for evaluating pretrained models. Our method's effectiveness is demonstrated across various domains, including models trained on relational datasets, large language models, and image models.
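One plausible instantiation of such a consistency metric (the paper's exact definition may differ) is a cross-validated linear probe from frozen entity embeddings to each categorical meta feature: if the representations encode the worldly knowledge the meta features carry, the probe should predict them well. A sketch using scikit-learn:

```python
# Hedged sketch of a meta-feature consistency score (illustrative, not the
# paper's exact metric): mean cross-validated accuracy of a linear probe
# trained from frozen entity embeddings to each categorical meta feature.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def consistency_score(embeddings: np.ndarray, meta_features: np.ndarray) -> float:
    """embeddings: (n_entities, d); meta_features: (n_entities, k) categorical codes."""
    scores = []
    for j in range(meta_features.shape[1]):
        probe = LogisticRegression(max_iter=1000)
        scores.append(cross_val_score(probe, embeddings,
                                      meta_features[:, j], cv=5).mean())
    return float(np.mean(scores))
```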
Feature Selection is a crucial procedure in Data Science tasks such as classification, since it identifies the relevant variables, thus making classification procedures more interpretable, cheaper in terms of measurement, and more effective by reducing noise and overfitting. The relevance of features in a classification procedure is linked to the fact that misclassification costs are frequently asymmetric, since false positive and false negative cases may have very different consequences. However, off-the-shelf Feature Selection procedures seldom take such cost-sensitivity of errors into account. In this paper, we propose a mathematical-optimization-based Feature Selection procedure embedded in one of the most popular classification procedures, namely Support Vector Machines, that accommodates asymmetric misclassification costs. The key idea is to replace the traditional margin maximization by minimizing the number of features selected, while imposing upper bounds on the false positive and false negative rates. The problem is written as an integer linear problem plus a quadratic convex problem for Support Vector Machines with both linear and radial kernels. The reported numerical experience demonstrates the usefulness of the proposed Feature Selection procedure: our results on benchmark data sets show that a substantial decrease in the number of features is obtained, whilst the desired trade-off between false positive and false negative rates is achieved.
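A schematic version of the optimization problem, as we read the abstract (the paper's exact model may differ in how the rate constraints are linearized), looks as follows. Binary variables $z_j$ indicate whether feature $j$ is selected, big-$M$ constraints tie $z$ to the SVM coefficients $w$, and per-class misclassification counts are bounded:

```latex
% Schematic cost-sensitive feature-selection SVM (our reading of the abstract).
% I_+ and I_- are the index sets of positive and negative training samples;
% with hinge slacks, \xi_i > 1 flags a misclassified point. The indicator
% constraints are nonconvex as written and would be linearized with binary
% variables in the actual integer formulation.
\begin{align*}
\min_{w,\,b,\,z,\,\xi} \quad & \sum_{j=1}^{p} z_j \\
\text{s.t.} \quad & y_i \,(w^\top x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \\
& |w_j| \le M z_j, \qquad z_j \in \{0,1\}, \\
& \frac{1}{|I_-|} \sum_{i \in I_-} \mathbf{1}[\xi_i > 1] \le \alpha_{\mathrm{FP}}, \qquad
  \frac{1}{|I_+|} \sum_{i \in I_+} \mathbf{1}[\xi_i > 1] \le \alpha_{\mathrm{FN}}.
\end{align*}
```

Setting $\alpha_{\mathrm{FP}} \neq \alpha_{\mathrm{FN}}$ is what encodes the asymmetry of misclassification costs.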
Large Language Model (LLM) editing modifies factual information in LLMs. Locate-and-Edit (L\&E) methods accomplish this by finding where relevant information is stored within the neural network, and editing the weights at that location. The goal of editing is to modify the response of an LLM to a proposition independently of its phrasing, while not modifying its response to other related propositions. Existing methods are limited to binary propositions, which represent straightforward binary relations between a subject and an object. Furthermore, existing methods rely on semantic subject labels, which may not be available or even well-defined in practice. In this paper, we show that both of these issues can be effectively sidestepped with a simple and fast localization method called Gradient Tracing (GT). This localization method allows editing arbitrary propositions instead of just binary ones, and does so without the need for subject labels. As propositions always have a truth value, our experiments prompt an LLM as a boolean classifier and edit its True/False response to propositions. Our method applies GT for location tracing, and then edits the model at that location using a mild variant of Rank-One Model Editing (ROME). On datasets of binary propositions derived from the CounterFact dataset, we show that our method -- without access to subject labels -- performs close to state-of-the-art L\&E methods that do have access to subject labels. We then introduce a new dataset, Factual Accuracy Classification Test (FACT), which includes non-binary propositions and for which subject labels are not generally applicable, and which is therefore beyond the scope of existing L\&E methods. Nevertheless, we show that editing on FACT is possible with our method.
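As we read it, Gradient Tracing picks the (layer, token) position whose hidden state carries the largest gradient of the True/False classification loss; the sketch below, assuming a HuggingFace-style model with `output_hidden_states` and a batch of one, is an illustration rather than the authors' code. Here `tf_loss_fn` would be, e.g., cross-entropy between the final-token logits and the desired " True"/" False" token.

```python
# Hedged sketch of Gradient Tracing (GT): one forward/backward pass, then pick
# the (layer, token) hidden state with the largest gradient norm as the edit
# location. Assumes a HuggingFace-style interface and batch size 1.
import torch

def gradient_trace(model, input_ids, tf_loss_fn):
    """Return the (layer, token) pair whose hidden state has the largest grad norm."""
    out = model(input_ids, output_hidden_states=True)
    for h in out.hidden_states:
        h.retain_grad()                          # keep grads on non-leaf activations
    tf_loss_fn(out.logits).backward()
    norms = torch.stack([h.grad.norm(dim=-1).squeeze(0)   # (T,) per layer
                         for h in out.hidden_states])     # -> (L, T)
    layer, token = divmod(norms.argmax().item(), norms.shape[1])
    return layer, token
```

The returned location would then be handed to the ROME-style rank-one weight update described in the abstract.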
Environmental disasters such as floods, hurricanes, and wildfires have increasingly threatened communities worldwide, prompting various mitigation strategies. Among these, property buyouts have emerged as a prominent approach to reducing vulnerability to future disasters. This strategy involves governments purchasing at-risk properties from willing sellers and converting the land into open space, ostensibly reducing future disaster risk and impact. However, the aftermath of these buyouts, particularly concerning land-use patterns and community impacts, remains under-explored. This research aims to fill this gap by employing techniques such as satellite imagery analysis and deep learning to study these patterns. To achieve this goal, we employed FEMA's Hazard Mitigation Grant Program (HMGP) buyout dataset, encompassing 41,004 addresses of buyout properties from 1989 to 2017. Leveraging Google's Maps Static API, we gathered 40,053 satellite images corresponding to these buyout lands. Subsequently, we implemented five cutting-edge machine learning models to evaluate their performance in classifying land cover types. Notably, this task involved multi-class classification, and our model achieved an outstanding ROC-AUC score of 98.86%.
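The abstract benchmarks five models without naming them; a representative fine-tuning setup for this kind of multi-class land-cover classification, sketched here with a torchvision ResNet-50 and a hypothetical number of land-cover classes, might look like this:

```python
# Representative land-cover fine-tuning sketch (one of many possible models;
# the class count and hyperparameters are illustrative assumptions).
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 4  # hypothetical number of land-cover types

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)   # replace classifier head
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One optimization step over a batch of satellite image tiles."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```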
This manuscript delves into the intersection of genomics and phenotypic prediction, focusing on the statistical innovation required to navigate the complexities introduced by noisy covariates and confounders. The primary emphasis is on the development of advanced robust statistical models tailored for genomic prediction from single nucleotide polymorphism (SNP) data collected from genome-wide association studies (GWAS) in plant and animal breeding and multi-field trials. The manuscript explores the limitations of traditional marker-assisted recurrent selection, highlighting the significance of incorporating all estimated effects of marker loci into the statistical framework, and aims to reduce the high dimensionality of GWAS data while preserving critical information. This paper introduces a new robust statistical framework for genomic prediction, employing one-stage and two-stage linear mixed model analyses that utilize the popular robust minimum density power divergence estimator (MDPDE) to estimate genetic effects on phenotypic traits. The study illustrates the superior performance of the proposed MDPDE-based genomic prediction and associated heritability estimation procedures over existing competitors through extensive empirical experiments on artificial datasets and application to a real-life maize breeding dataset. The results showcase the robustness and accuracy of the proposed MDPDE-based approaches, especially in the presence of data contamination, emphasizing their potential applications in improving breeding programs and advancing genomic prediction of phenotypic traits.
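For reference, the MDPDE the abstract invokes is the classical estimator of Basu et al. (1998): for a model family $\{f_\theta\}$ and tuning parameter $\alpha > 0$, it minimizes the empirical density power divergence, trading efficiency for robustness and recovering the maximum likelihood estimator as $\alpha \to 0$:

```latex
% Minimum density power divergence estimator (Basu et al., 1998).
% \alpha > 0 tunes the robustness/efficiency trade-off; \alpha \to 0 gives the MLE.
\hat{\theta}_\alpha
  = \arg\min_{\theta}\;
    \int f_\theta(x)^{1+\alpha}\,dx
    \;-\; \Bigl(1 + \tfrac{1}{\alpha}\Bigr) \frac{1}{n} \sum_{i=1}^{n} f_\theta(X_i)^{\alpha}.
```

The down-weighting by $f_\theta(X_i)^{\alpha}$ is what makes the fitted genetic effects resistant to the contaminated phenotypic observations the abstract emphasizes.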
Multimodal Large Language Models (MLLMs) are experiencing rapid growth, yielding a plethora of noteworthy contributions in recent months. The prevailing trend involves adopting data-driven methodologies, wherein diverse instruction-following datasets are collected. However, a prevailing challenge persists in these approaches: limited visual perception ability, since CLIP-like encoders are employed for extracting visual information from inputs. Though these encoders are pre-trained on billions of image-text pairs, they still grapple with the information loss dilemma, given that textual captions only partially capture the contents depicted in images. To address this limitation, this paper proposes to improve the visual perception ability of MLLMs through a mixture-of-experts knowledge enhancement mechanism. Specifically, we introduce a novel method that incorporates multi-task encoders and visual tools into the existing MLLM training and inference pipeline, aiming to provide a more comprehensive and accurate summarization of visual inputs. Extensive experiments evaluate its effectiveness in advancing MLLMs, showcasing improved visual perception achieved through the integration of visual experts.
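One way to read the mixture-of-experts mechanism (a hedged sketch under our assumptions, not the authors' architecture) is that several frozen visual experts each summarize the image, and a lightweight gate reweights their token embeddings before they enter the LLM:

```python
# Hedged mixture-of-visual-experts sketch (hypothetical module names; PyTorch).
import torch
import torch.nn as nn

class VisualExpertMixer(nn.Module):
    def __init__(self, expert_dims, llm_dim):
        super().__init__()
        # one projection per expert, mapping its features into the LLM's space
        self.proj = nn.ModuleList(nn.Linear(d, llm_dim) for d in expert_dims)
        self.gate = nn.Linear(llm_dim, 1)          # per-token relevance gate

    def forward(self, expert_feats):
        # expert_feats: list of (batch, tokens_k, dim_k), one per frozen expert
        toks = torch.cat([p(f) for p, f in zip(self.proj, expert_feats)], dim=1)
        weights = torch.sigmoid(self.gate(toks))   # soft gate in [0, 1] per token
        return toks * weights                      # gated visual tokens for the LLM
```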
Although numerous R-peak detectors have been proposed in the literature, their robustness and performance levels may significantly deteriorate in low-quality and noisy signals acquired from mobile electrocardiogram (ECG) sensors, such as Holter monitors. Recently, this issue has been addressed by deep 1-D convolutional neural networks (CNNs) that have achieved state-of-the-art performance levels in Holter monitors; however, they pose a high complexity level that requires a special parallelized hardware setup for real-time processing. On the other hand, their performance deteriorates when a compact network configuration is used instead. This is an expected outcome, as recent studies have demonstrated that the learning performance of CNNs is limited due to their strictly homogeneous configuration with the sole linear neuron model. In this study, to further boost the peak detection performance along with an elegant computational efficiency, we propose 1-D Self-Organized Operational Neural Networks (Self-ONNs) with generative neurons. The most crucial advantage of 1-D Self-ONNs over conventional ONNs is their self-organization capability, which obviates the need to search for the best operator set per neuron, since each generative neuron can create the optimal operator during training. The experimental results over the China Physiological Signal Challenge-2020 (CPSC) dataset with more than one million ECG beats show that the proposed 1-D Self-ONNs can significantly surpass the state-of-the-art deep CNN with less computational complexity. Results demonstrate that the proposed solution achieves a 99.10% F1-score, 99.79% sensitivity, and 98.42% positive predictivity on the CPSC dataset, which is the best R-peak detection performance reported to date.
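A generative neuron replaces the fixed linear convolution with a learned truncated Maclaurin series: one kernel per power of the input, so each neuron shapes its own nonlinear operator during training. A minimal 1-D layer in this spirit (an illustrative sketch, not the authors' implementation; `q` is the series order):

```python
# Minimal 1-D generative-neuron layer in the spirit of Self-ONNs (PyTorch sketch).
import torch
import torch.nn as nn

class SelfONN1d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size, q=3):
        super().__init__()
        # one convolution kernel per power of the input (truncated Maclaurin series)
        self.convs = nn.ModuleList(
            nn.Conv1d(in_ch, out_ch, kernel_size,
                      padding=kernel_size // 2, bias=(p == 1))
            for p in range(1, q + 1))

    def forward(self, x):
        x = torch.tanh(x)  # bound the input so higher powers remain stable
        return sum(conv(x ** p) for p, conv in enumerate(self.convs, start=1))
```

With `q=1` this reduces to an ordinary convolutional layer, which is the homogeneous linear-neuron limitation the abstract argues against.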
Graph Neural Networks (GNNs) have recently become increasingly popular due to their ability to learn complex systems of relations or interactions arising in a broad spectrum of problems ranging from biology and particle physics to social networks and recommendation systems. Despite the plethora of different models for deep learning on graphs, few approaches have been proposed thus far for dealing with graphs that present some sort of dynamic nature (e.g. evolving features or connectivity over time). In this paper, we present Temporal Graph Networks (TGNs), a generic, efficient framework for deep learning on dynamic graphs represented as sequences of timed events. Thanks to a novel combination of memory modules and graph-based operators, TGNs are able to significantly outperform previous approaches while at the same time being more computationally efficient. We furthermore show that several previous models for learning on dynamic graphs can be cast as specific instances of our framework. We perform a detailed ablation study of different components of our framework and devise the best configuration that achieves state-of-the-art performance on several transductive and inductive prediction tasks for dynamic graphs.
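The core of the memory module can be sketched compactly: on each timed interaction, a message is formed from the two endpoints' memories plus edge features, and a recurrent cell updates the memory. The sketch below is a simplification of the TGN paper's design (raw event time stands in for its learned time encoding, and only the source node is updated):

```python
# Simplified TGN memory module sketch (PyTorch; illustrative, not the full model).
import torch
import torch.nn as nn

class TGNMemory(nn.Module):
    def __init__(self, num_nodes, mem_dim, edge_dim):
        super().__init__()
        self.register_buffer("memory", torch.zeros(num_nodes, mem_dim))
        # message = [src memory || dst memory || edge features || event time]
        self.updater = nn.GRUCell(2 * mem_dim + edge_dim + 1, mem_dim)

    def update(self, src, dst, t, edge_feat):
        # src, dst: (batch,) node ids; t: (batch, 1) times; edge_feat: (batch, edge_dim)
        msg = torch.cat([self.memory[src], self.memory[dst], edge_feat, t], dim=-1)
        self.memory[src] = self.updater(msg, self.memory[src]).detach()
```

Node embeddings for prediction are then computed by graph-based operators over these memories and recent neighbors, which is where the second half of the framework comes in.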
We propose a novel single shot object detection network named Detection with Enriched Semantics (DES). Our motivation is to enrich the semantics of object detection features within a typical deep detector, via a semantic segmentation branch and a global activation module. The segmentation branch is supervised by weak segmentation ground-truth, i.e., no extra annotation is required. In conjunction with that, we employ a global activation module which learns the relationship between channels and object classes in a self-supervised manner. Comprehensive experimental results on both the PASCAL VOC and MS COCO detection datasets demonstrate the effectiveness of the proposed method. In particular, with a VGG16-based DES, we achieve an mAP of 81.7 on VOC2007 test and an mAP of 32.8 on COCO test-dev, with an inference speed of 31.5 milliseconds per image on a Titan Xp GPU. With a lower-resolution version, we achieve an mAP of 79.7 on VOC2007 with an inference speed of 13.0 milliseconds per image.
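A global activation module in the spirit described could be sketched as follows (a hedged illustration under our reading, not the paper's exact module): global average pooling squeezes each channel, a small head relates channels to class scores, and the resulting weights recalibrate the feature map.

```python
# Hedged global-activation sketch: channel squeeze -> class scores -> channel
# reweighting (illustrative PyTorch; module names are hypothetical).
import torch
import torch.nn as nn

class GlobalActivation(nn.Module):
    def __init__(self, channels, num_classes):
        super().__init__()
        self.cls = nn.Linear(channels, num_classes)      # channel -> class scores
        self.excite = nn.Linear(num_classes, channels)   # class -> channel weights

    def forward(self, feat):                             # feat: (B, C, H, W)
        squeezed = feat.mean(dim=(2, 3))                 # global average pool -> (B, C)
        class_scores = self.cls(squeezed)                # head for the auxiliary objective
        weights = torch.sigmoid(self.excite(class_scores))[:, :, None, None]
        return feat * weights, class_scores              # recalibrated features
```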
Deep neural networks (DNNs) have been found to be vulnerable to adversarial examples resulting from adding small-magnitude perturbations to inputs. Such adversarial examples can mislead DNNs to produce adversary-selected results. Different attack strategies have been proposed to generate adversarial examples, but producing them with high perceptual quality and efficiency still requires more research effort. In this paper, we propose AdvGAN, which generates adversarial examples with generative adversarial networks (GANs) that can learn and approximate the distribution of original instances. With AdvGAN, once the generator is trained, it can generate adversarial perturbations efficiently for any instance, potentially accelerating adversarial training as a defense. We apply AdvGAN in both semi-whitebox and black-box attack settings. In semi-whitebox attacks, there is no need to access the original target model after the generator is trained, in contrast to traditional white-box attacks. In black-box attacks, we dynamically train a distilled model for the black-box model and optimize the generator accordingly. Adversarial examples generated by AdvGAN on different target models have high attack success rates under state-of-the-art defenses compared to other attacks. Our attack placed first with 92.76% accuracy on a public MNIST black-box attack challenge.
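The generator objective combines three terms: a GAN term that makes perturbed inputs look realistic to the discriminator, an adversarial term that pushes the target model toward the adversary-selected label, and a norm term that keeps the perturbation small. A hedged sketch of this loss (the weighting and bounding scheme here are illustrative assumptions, not the paper's exact hyperparameters):

```python
# Hedged AdvGAN generator-loss sketch (PyTorch). G: generator, D: discriminator,
# f: target classifier; eps bounds the perturbation, alpha/beta weight the terms.
import torch
import torch.nn.functional as F

def advgan_generator_loss(G, D, f, x, target, eps=0.3, alpha=1.0, beta=10.0):
    delta = eps * torch.tanh(G(x))                   # bounded perturbation
    x_adv = torch.clamp(x + delta, 0.0, 1.0)
    d_out = D(x_adv)
    loss_gan = F.binary_cross_entropy_with_logits(   # fool the discriminator
        d_out, torch.ones_like(d_out))
    loss_adv = F.cross_entropy(f(x_adv), target)     # hit the adversary-selected label
    loss_pert = delta.flatten(1).norm(dim=1).mean()  # keep the perturbation small
    return loss_gan + beta * loss_adv + alpha * loss_pert
```

In the black-box setting, `f` would be the dynamically trained distilled model rather than the target model itself.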