In ports, a variety of tasks are carried out, and scheduling these tasks is crucial due to its significant impact on productivity, making the generation of precise plans essential. This study proposes a method to solve the Quay Crane Scheduling Problem (QCSP), a representative task scheduling problem in ports known to be NP-Hard, more quickly and accurately. First, the study suggests a method to create more accurate work plans for Quay Cranes (QCs) by learning from actual port data to accurately predict the working speed of QCs. Next, a Surrogate Model is proposed by combining a Machine Learning (ML) model with a Genetic Algorithm (GA), which is widely used to solve complex optimization problems, enabling faster and more precise exploration of solutions. Unlike methods that use fixed-dimensional chromosome encoding, the proposed methodology can provide solutions for encodings of various dimensions. To validate the performance of the newly proposed methodology, comparative experiments were conducted, demonstrating faster search speeds and improved fitness scores. The method proposed in this study can be applied not only to QCSP but also to various NP-Hard problems, and it opens up possibilities for the further development of advanced search algorithms by combining heuristic algorithms with ML models.
With the proliferation of mobile devices, the need for an efficient model to restore any degraded image has become increasingly significant and impactful. Traditional approaches typically involve training dedicated models for each specific degradation, resulting in inefficiency and redundancy. More recent solutions either introduce additional modules to learn visual prompts significantly increasing model size or incorporate cross-modal transfer from large language models trained on vast datasets, adding complexity to the system architecture. In contrast, our approach, termed RAM, takes a unified path that leverages inherent similarities across various degradations to enable both efficient and comprehensive restoration through a joint embedding mechanism without scaling up the model or relying on large multimodal models. Specifically, we examine the sub-latent space of each input, identifying key components and reweighting them in a gated manner. This intrinsic degradation awareness is further combined with contextualized attention in an X-shaped framework, enhancing local-global interactions. Extensive benchmarking in an all-in-one restoration setting confirms RAM's SOTA performance, reducing model complexity by approximately 82% in trainable parameters and 85% in FLOPs. Our code and models will be publicly available.
High-quality datasets are critical for training machine learning models, as inconsistencies in feature generation can hinder the accuracy and reliability of threat detection. For this reason, ensuring the quality of the data in network intrusion detection datasets is important. A key component of this is using reliable tools to generate the flows and features present in the datasets. This paper investigates the impact of flow exporters on the performance and reliability of machine learning models for intrusion detection. Using HERA, a tool designed to export flows and extract features, the raw network packets of two widely used datasets, UNSW-NB15 and CIC-IDS2017, were processed from PCAP files to generate new versions of these datasets. These were compared to the original ones in terms of their influence on the performance of several models, including Random Forest, XGBoost, LightGBM, and Explainable Boosting Machine. The results obtained were significant. Models trained on the HERA version of the datasets consistently outperformed those trained on the original dataset, showing improvements in accuracy and indicating a better generalisation. This highlighted the importance of flow generation in the model's ability to differentiate between benign and malicious traffic.
Object detection is a critical task in computer vision, with applications in various domains such as autonomous driving and urban scene monitoring. However, deep learning-based approaches often demand large volumes of annotated data, which are costly and difficult to acquire, particularly in complex and unpredictable real-world environments. This dependency significantly hampers the generalization capability of existing object detection techniques. To address this issue, we introduce a novel single-domain object detection generalization method, named GoDiff, which leverages a pre-trained model to enhance generalization in unseen domains. Central to our approach is the Pseudo Target Data Generation (PTDG) module, which employs a latent diffusion model to generate pseudo-target domain data that preserves source domain characteristics while introducing stylistic variations. By integrating this pseudo data with source domain data, we diversify the training dataset. Furthermore, we introduce a cross-style instance normalization technique to blend style features from different domains generated by the PTDG module, thereby increasing the detector's robustness. Experimental results demonstrate that our method not only enhances the generalization ability of existing detectors but also functions as a plug-and-play enhancement for other single-domain generalization methods, achieving state-of-the-art performance in autonomous driving scenarios.
Various pipes are extensively used in both industrial settings and daily life, but the pipe inspection especially those with narrow sizes are still very challenging with tremendous time and manufacturing consumed. Quadrupedal robots, inspired from patrol dogs, can be a substitution of traditional solutions but always suffer from navigation and locomotion difficulties. In this paper, we introduce a Reinforcement Learning (RL) based method to train a policy enabling the quadrupedal robots to cross narrow pipes adaptively. A new privileged visual information and a new reward function are defined to tackle the problems. Experiments on both simulation and real world scenarios were completed, demonstrated that the proposed method can achieve the pipe-crossing task even with unexpected obstacles inside.
Signed graphs allow for encoding positive and negative relations between nodes and are used to model various online activities. Node representation learning for signed graphs is a well-studied task with important applications such as sign prediction. While the size of datasets is ever-increasing, recent methods often sacrifice scalability for accuracy. We propose a novel message-passing layer architecture called Graph Spring Network (GSN) modeled after spring forces. We combine it with a Graph Neural Ordinary Differential Equations (ODEs) formalism to optimize the system dynamics in embedding space to solve a downstream prediction task. Once the dynamics is learned, embedding generation for novel datasets is done by solving the ODEs in time using a numerical integration scheme. Our GSN layer leverages the fast-to-compute edge vector directions and learnable scalar functions that only depend on nodes' distances in latent space to compute the nodes' positions. Conversely, Graph Convolution and Graph Attention Network layers rely on learnable vector functions that require the full positions of input nodes in latent space. We propose a specific implementation called Spring-Neural-Network (SPR-NN) using a set of small neural networks mimicking attracting and repulsing spring forces that we train for link sign prediction. Experiments show that our method achieves accuracy close to the state-of-the-art methods with node generation time speedup factors of up to 28,000 on large graphs.
Code explanation plays a crucial role in the software engineering domain, aiding developers in grasping code functionality efficiently. Recent work shows that the performance of LLMs for code explanation improves in a few-shot setting, especially when the few-shot examples are selected intelligently. State-of-the-art approaches for such Selective Shot Learning (SSL) include token-based and embedding-based methods. However, these SSL approaches have been evaluated on proprietary LLMs, without much exploration on open-source Code-LLMs. Additionally, these methods lack consideration for programming language syntax. To bridge these gaps, we present a comparative study and propose a novel SSL method (SSL_ner) that utilizes entity information for few-shot example selection. We present several insights and show the effectiveness of SSL_ner approach over state-of-the-art methods across two datasets. To the best of our knowledge, this is the first systematic benchmarking of open-source Code-LLMs while assessing the performances of the various few-shot examples selection approaches for the code explanation task.
Data rebalancing techniques, including oversampling and undersampling, are a common approach to addressing the challenges of imbalanced data. To tackle unresolved problems related to both oversampling and undersampling, we propose a new undersampling approach that: (i) avoids the pitfalls of noise and overlap caused by synthetic data and (ii) avoids the pitfall of under-fitting caused by random undersampling. Instead of undersampling majority data randomly, our method undersamples datapoints based on their ability to improve model loss. Using improved model loss as a proxy measurement for classification performance, our technique assesses a datapoint's impact on loss and rejects those unable to improve it. In so doing, our approach rejects majority datapoints redundant to datapoints already accepted and, thereby, finds an optimal subset of majority training data for classification. The accept/reject component of our algorithm is motivated by a bilevel optimization problem uniquely formulated to identify the optimal training set we seek. Experimental results show our proposed technique with F1 scores up to 10% higher than state-of-the-art methods.
The adaptive processing of structured data is a long-standing research topic in machine learning that investigates how to automatically learn a mapping from a structured input to outputs of various nature. Recently, there has been an increasing interest in the adaptive processing of graphs, which led to the development of different neural network-based methodologies. In this thesis, we take a different route and develop a Bayesian Deep Learning framework for graph learning. The dissertation begins with a review of the principles over which most of the methods in the field are built, followed by a study on graph classification reproducibility issues. We then proceed to bridge the basic ideas of deep learning for graphs with the Bayesian world, by building our deep architectures in an incremental fashion. This framework allows us to consider graphs with discrete and continuous edge features, producing unsupervised embeddings rich enough to reach the state of the art on several classification tasks. Our approach is also amenable to a Bayesian nonparametric extension that automatizes the choice of almost all model's hyper-parameters. Two real-world applications demonstrate the efficacy of deep learning for graphs. The first concerns the prediction of information-theoretic quantities for molecular simulations with supervised neural models. After that, we exploit our Bayesian models to solve a malware-classification task while being robust to intra-procedural code obfuscation techniques. We conclude the dissertation with an attempt to blend the best of the neural and Bayesian worlds together. The resulting hybrid model is able to predict multimodal distributions conditioned on input graphs, with the consequent ability to model stochasticity and uncertainty better than most works. Overall, we aim to provide a Bayesian perspective into the articulated research field of deep learning for graphs.
Causality can be described in terms of a structural causal model (SCM) that carries information on the variables of interest and their mechanistic relations. For most processes of interest the underlying SCM will only be partially observable, thus causal inference tries to leverage any exposed information. Graph neural networks (GNN) as universal approximators on structured input pose a viable candidate for causal learning, suggesting a tighter integration with SCM. To this effect we present a theoretical analysis from first principles that establishes a novel connection between GNN and SCM while providing an extended view on general neural-causal models. We then establish a new model class for GNN-based causal inference that is necessary and sufficient for causal effect identification. Our empirical illustration on simulations and standard benchmarks validate our theoretical proofs.
Deep learning has revolutionized many machine learning tasks in recent years, ranging from image classification and video processing to speech recognition and natural language understanding. The data in these tasks are typically represented in the Euclidean space. However, there is an increasing number of applications where data are generated from non-Euclidean domains and are represented as graphs with complex relationships and interdependency between objects. The complexity of graph data has imposed significant challenges on existing machine learning algorithms. Recently, many studies on extending deep learning approaches for graph data have emerged. In this survey, we provide a comprehensive overview of graph neural networks (GNNs) in data mining and machine learning fields. We propose a new taxonomy to divide the state-of-the-art graph neural networks into different categories. With a focus on graph convolutional networks, we review alternative architectures that have recently been developed; these learning paradigms include graph attention networks, graph autoencoders, graph generative networks, and graph spatial-temporal networks. We further discuss the applications of graph neural networks across various domains and summarize the open source codes and benchmarks of the existing algorithms on different learning tasks. Finally, we propose potential research directions in this fast-growing field.