Preserving cultural heritage is of paramount importance. In the domain of art restoration, developing a computer vision model capable of effectively restoring deteriorated images of art pieces was difficult, but now we have a good computer vision state-of-art. Traditional restoration methods are often time-consuming and require extensive expertise. The aim of this work is to design an automated solution based on computer vision models that can enhance and reconstruct degraded artworks, improving their visual quality while preserving their original characteristics and artifacts. The model should handle a diverse range of deterioration types, including but not limited to noise, blur, scratches, fading, and other common forms of degradation. We adapt the current state-of-art for the image super-resolution based on the Diffusion Model (DM) and fine-tune it for Image art restoration. Our results show that instead of fine-tunning multiple different models for different kinds of degradation, fine-tuning one super-resolution. We train it on multiple datasets to make it robust. code link: //github.com/Naagar/art_restoration_DM
As the IT industry advances, system log data becomes increasingly crucial. Many computer systems rely on log texts for management due to restricted access to source code. The need for log anomaly detection is growing, especially in real-world applications, but identifying anomalies in rapidly accumulating logs remains a challenging task. Traditional deep learning-based anomaly detection models require dataset-specific training, leading to corresponding delays. Notably, most methods only focus on sequence-level log information, which makes the detection of subtle anomalies harder, and often involve inference processes that are difficult to utilize in real-time. We introduce RAPID, a model that capitalizes on the inherent features of log data to enable anomaly detection without training delays, ensuring real-time capability. RAPID treats logs as natural language, extracting representations using pre-trained language models. Given that logs can be categorized based on system context, we implement a retrieval-based technique to contrast test logs with the most similar normal logs. This strategy not only obviates the need for log-specific training but also adeptly incorporates token-level information, ensuring refined and robust detection, particularly for unseen logs. We also propose the core set technique, which can reduce the computational cost needed for comparison. Experimental results show that even without training on log data, RAPID demonstrates competitive performance compared to prior models and achieves the best performance on certain datasets. Through various research questions, we verified its capability for real-time detection without delay.
Several visual tasks, such as pedestrian detection and image-to-image translation, are challenging to accomplish in low light using RGB images. Heat variation of objects in thermal images can be used to overcome this. In this work, an end-to-end framework, which consists of a generative network and a detector network, is proposed to translate RGB image into Thermal ones and compare generated thermal images with real data. We have collected images from two different locations using the Parrot Anafi Thermal drone. After that, we created a two-stream network, preprocessed, augmented, the image data, and trained the generator and discriminator models from scratch. The findings demonstrate that it is feasible to translate RGB training data to thermal data using GAN. As a result, thermal data can now be produced more quickly and affordably, which is useful for security and surveillance applications.
The fundamental diagram serves as the foundation of traffic flow modeling for almost a century. With the increasing availability of road sensor data, deterministic parametric models have proved inadequate in describing the variability of real-world data, especially in congested area of the density-flow diagram. In this paper we estimate the stochastic density-flow relation introducing a nonparametric method called convex quantile regression. The proposed method does not depend on any prior functional form assumptions, but thanks to the concavity constraints, the estimated function satisfies the theoretical properties of the density-flow curve. The second contribution is to develop the new convex quantile regression with bags (CQRb) approach to facilitate practical implementation of CQR to the real-world data. We illustrate the CQRb estimation process using the road sensor data from Finland in years 2016-2018. Our third contribution is to demonstrate the excellent out-of-sample predictive power of the proposed CQRb method in comparison to the standard parametric deterministic approach.
The remarkable generative capabilities of denoising diffusion models have raised new concerns regarding the authenticity of the images we see every day on the Internet. However, the vast majority of existing deepfake detection models are tested against previous generative approaches (e.g. GAN) and usually provide only a "fake" or "real" label per image. We believe a more informative output would be to augment the per-image label with a localization map indicating which regions of the input have been manipulated. To this end, we frame this task as a weakly-supervised localization problem and identify three main categories of methods (based on either explanations, local scores or attention), which we compare on an equal footing by using the Xception network as the common backbone architecture. We provide a careful analysis of all the main factors that parameterize the design space: choice of method, type of supervision, dataset and generator used in the creation of manipulated images; our study is enabled by constructing datasets in which only one of the components is varied. Our results show that weakly-supervised localization is attainable, with the best performing detection method (based on local scores) being less sensitive to the looser supervision than to the mismatch in terms of dataset or generator.
In the realm of the Internet of Things (IoT), deploying deep learning models to process data generated or collected by IoT devices is a critical challenge. However, direct data transmission can cause network congestion and inefficient execution, given that IoT devices typically lack computation and communication capabilities. Centralized data processing in data centers is also no longer feasible due to concerns over data privacy and security. To address these challenges, we present an innovative Edge-assisted U-Shaped Split Federated Learning (EUSFL) framework, which harnesses the high-performance capabilities of edge servers to assist IoT devices in model training and optimization process. In this framework, we leverage Federated Learning (FL) to enable data holders to collaboratively train models without sharing their data, thereby enhancing data privacy protection by transmitting only model parameters. Additionally, inspired by Split Learning (SL), we split the neural network into three parts using U-shaped splitting for local training on IoT devices. By exploiting the greater computation capability of edge servers, our framework effectively reduces overall training time and allows IoT devices with varying capabilities to perform training tasks efficiently. Furthermore, we proposed a novel noise mechanism called LabelDP to ensure that data features and labels can securely resist reconstruction attacks, eliminating the risk of privacy leakage. Our theoretical analysis and experimental results demonstrate that EUSFL can be integrated with various aggregation algorithms, maintaining good performance across different computing capabilities of IoT devices, and significantly reducing training time and local computation overhead.
We study single-image super-resolution algorithms for photons at collider experiments based on generative adversarial networks. We treat the energy depositions of simulated electromagnetic showers of photons and neutral-pion decays in a toy electromagnetic calorimeter as 2D images and we train super-resolution networks to generate images with an artificially increased resolution by a factor of four in each dimension. The generated images are able to reproduce features of the electromagnetic showers that are not obvious from the images at nominal resolution. Using the artificially-enhanced images for the reconstruction of shower-shape variables and of the position of the shower center results in significant improvements. We additionally investigate the utilization of the generated images as a pre-processing step for deep-learning photon-identification algorithms and observe improvements in the case of training samples of small size.
This paper considers image change detection with only a small number of samples, which is a significant problem in terms of a few annotations available. A major impediment of image change detection task is the lack of large annotated datasets covering a wide variety of scenes. Change detection models trained on insufficient datasets have shown poor generalization capability. To address the poor generalization issue, we propose using simple image processing methods for generating synthetic but informative datasets, and design an early fusion network based on object detection which could outperform the siamese neural network. Our key insight is that the synthetic data enables the trained model to have good generalization ability for various scenarios. We compare the model trained on the synthetic data with that on the real-world data captured from a challenging dataset, CDNet, using six different test sets. The results demonstrate that the synthetic data is informative enough to achieve higher generalization ability than the insufficient real-world data. Besides, the experiment shows that utilizing a few (often tens of) samples to fine-tune the model trained on the synthetic data will achieve excellent results.
Heterogeneous graph neural networks (HGNNs) as an emerging technique have shown superior capacity of dealing with heterogeneous information network (HIN). However, most HGNNs follow a semi-supervised learning manner, which notably limits their wide use in reality since labels are usually scarce in real applications. Recently, contrastive learning, a self-supervised method, becomes one of the most exciting learning paradigms and shows great potential when there are no labels. In this paper, we study the problem of self-supervised HGNNs and propose a novel co-contrastive learning mechanism for HGNNs, named HeCo. Different from traditional contrastive learning which only focuses on contrasting positive and negative samples, HeCo employs cross-viewcontrastive mechanism. Specifically, two views of a HIN (network schema and meta-path views) are proposed to learn node embeddings, so as to capture both of local and high-order structures simultaneously. Then the cross-view contrastive learning, as well as a view mask mechanism, is proposed, which is able to extract the positive and negative embeddings from two views. This enables the two views to collaboratively supervise each other and finally learn high-level node embeddings. Moreover, two extensions of HeCo are designed to generate harder negative samples with high quality, which further boosts the performance of HeCo. Extensive experiments conducted on a variety of real-world networks show the superior performance of the proposed methods over the state-of-the-arts.
Deep learning techniques have received much attention in the area of image denoising. However, there are substantial differences in the various types of deep learning methods dealing with image denoising. Specifically, discriminative learning based on deep learning can ably address the issue of Gaussian noise. Optimization models based on deep learning are effective in estimating the real noise. However, there has thus far been little related research to summarize the different deep learning techniques for image denoising. In this paper, we offer a comparative study of deep techniques in image denoising. We first classify the deep convolutional neural networks (CNNs) for additive white noisy images; the deep CNNs for real noisy images; the deep CNNs for blind denoising and the deep CNNs for hybrid noisy images, which represents the combination of noisy, blurred and low-resolution images. Then, we analyze the motivations and principles of the different types of deep learning methods. Next, we compare the state-of-the-art methods on public denoising datasets in terms of quantitative and qualitative analysis. Finally, we point out some potential challenges and directions of future research.
Although measuring held-out accuracy has been the primary approach to evaluate generalization, it often overestimates the performance of NLP models, while alternative approaches for evaluating models either focus on individual tasks or on specific behaviors. Inspired by principles of behavioral testing in software engineering, we introduce CheckList, a task-agnostic methodology for testing NLP models. CheckList includes a matrix of general linguistic capabilities and test types that facilitate comprehensive test ideation, as well as a software tool to generate a large and diverse number of test cases quickly. We illustrate the utility of CheckList with tests for three tasks, identifying critical failures in both commercial and state-of-art models. In a user study, a team responsible for a commercial sentiment analysis model found new and actionable bugs in an extensively tested model. In another user study, NLP practitioners with CheckList created twice as many tests, and found almost three times as many bugs as users without it.