With the recent advancement of Artificial Intelligence (AI) and the emergence of Large Language Models (LLMs), AI-based code generation tools have achieved significant progress and become a practical solution for software development. GitHub Copilot, referred to as AI pair programmer, utilizes machine learning models that are trained on a large corpus of code snippets to generate code suggestions or auto-complete code using natural language processing. Despite its popularity, there is little empirical evidence on the actual experiences of software developers who work with Copilot. To this end, we conducted an empirical study to understand the issues and challenges that developers face when using Copilot in practice, as well as their underlying causes and potential solutions. We collected data from 476 GitHub issues, 706 GitHub discussions, and 184 Stack Overflow posts, and identified the issues, causes that trigger the issues, and solutions that resolve the issues when using Copilot. Our results reveal that (1) Usage Issue and Compatibility Issue are the most common problems faced by Copilot users, (2) Copilot Internal Issue, Network Connection Issue, and Editor/IDE Compatibility Issue are identified as the most frequent causes, and (3) Bug Fixed by Copilot, Modify Configuration/Setting, and Use Suitable Version are the predominant solutions. Based on the results, we delve into the main challenges users encounter when implementing Copilot in practical development, the possible impact of Copilot on the coding process, aspects in which Copilot can be further enhanced, and potential new features desired by Copilot users.
The proposed architecture, Dual Attentive U-Net with Feature Infusion (DAU-FI Net), addresses challenges in semantic segmentation, particularly on multiclass imbalanced datasets with limited samples. DAU-FI Net integrates multiscale spatial-channel attention mechanisms and feature injection to enhance precision in object localization. The core employs a multiscale depth-separable convolution block, capturing localized patterns across scales. This block is complemented by a spatial-channel squeeze and excitation (scSE) attention unit, modeling inter-dependencies between channels and spatial regions in feature maps. Additionally, additive attention gates refine segmentation by connecting encoder-decoder pathways. To augment the model, engineered features using Gabor filters for textural analysis, Sobel and Canny filters for edge detection are injected guided by semantic masks to expand the feature space strategically. Comprehensive experiments on a challenging sewer pipe and culvert defect dataset and a benchmark dataset validate DAU-FI Net's capabilities. Ablation studies highlight incremental benefits from attention blocks and feature injection. DAU-FI Net achieves state-of-the-art mean Intersection over Union (IoU) of 95.6% and 98.8% on the defect test set and benchmark respectively, surpassing prior methods by 8.9% and 12.6%, respectively. Ablation studies highlight incremental benefits from attention blocks and feature injection. The proposed architecture provides a robust solution, advancing semantic segmentation for multiclass problems with limited training data. Our sewer-culvert defects dataset, featuring pixel-level annotations, opens avenues for further research in this crucial domain. Overall, this work delivers key innovations in architecture, attention, and feature engineering to elevate semantic segmentation efficacy.
We investigate two efficient time discretizations for the post-processing technique of discontinuous Galerkin (DG) methods to solve hyperbolic conservation laws. The post-processing technique, which is applied at the final time of the DG method, can enhance the accuracy of the original DG solution (spatial superconvergence). One main difficulty of the post-processing technique is that the spatial superconvergence after post-processing needs to be matched with proper temporary accuracy. If the semi-discretized system (ODE system after spatial discretization) is under-resolved in time, then the space superconvergence will be concealed. In this paper, we focus our investigation on the recently designed SDG method and derive its explicit scheme from a correction process based on the DG weak formulation. We also introduce another similar technique, namely the spectral deferred correction (SDC) method. A comparison is made among both proposed time discretization techniques with the standard third-order Runge-Kutta method through several numerical examples, to conclude that both the SDG and SDC methods are efficient time discretization techniques for exploiting the spatial superconvergence of the DG methods.
We present a study on the integration of Large Language Models (LLMs) in tabular data classification, emphasizing an efficient framework. Building upon existing work done in TabLLM (arXiv:2210.10723), we introduce three novel serialization techniques, including the standout LaTeX serialization method. This method significantly boosts the performance of LLMs in processing domain-specific datasets, Our method stands out for its memory efficiency and ability to fully utilize complex data structures. Through extensive experimentation, including various serialization approaches like feature combination and importance, we demonstrate our work's superiority in accuracy and efficiency over traditional models.
To combat the potential misuse of Natural Language Generation (NLG) technology, a variety of algorithms have been developed for the detection of AI-generated texts. Traditionally, this task is treated as a binary classification problem. Although supervised learning has demonstrated promising results, acquiring labeled data for detection purposes poses real-world challenges and the risk of overfitting. In an effort to address these issues, we delve into the realm of zero-shot machine-generated text detection. Existing zero-shot detectors, typically designed for specific tasks or topics, often assume uniform testing scenarios, limiting their practicality. In our research, we explore various advanced Large Language Models (LLMs) and their specialized variants, contributing to this field in several ways. In empirical studies, we uncover a significant correlation between topics and detection performance. Secondly, we delve into the influence of topic shifts on zero-shot detectors. These investigations shed light on the adaptability and robustness of these detection methods across diverse topics. The code is available at \url{//github.com/yfzhang114/robustness-detection}.
With the rapid increase in the Internet of Things (IoT), the amount of data produced and processed is also increased. Cloud Computing facilitates the storage, processing, and analysis of data as needed. However, cloud computing devices are located far away from the IoT devices. Fog computing has emerged as a small cloud computing paradigm that is near to the edge devices and handles the task very efficiently. Fog nodes have a small storage capability than the cloud node but it is designed and deployed near to the edge device so that request must be accessed efficiently and executes in time. In this survey paper we have investigated and analysed the main challenges and issues raised in scheduling the task in fog computing environment. To the best of our knowledge there is no comprehensive survey paper on challenges in task scheduling of fog computing paradigm. In this survey paper research is conducted from 2018 to 2021 and most of the paper selection is done from 2020-2021. Moreover, this survey paper organizes the task scheduling approaches and technically plans the identified challenges and issues. Based on the identified issues, we have highlighted the future work directions in the field of task scheduling in fog computing environment.
Predicting the future trajectories of nearby objects plays a pivotal role in Robotics and Automation such as autonomous driving. While learning-based trajectory prediction methods have achieved remarkable performance on public benchmarks, the generalization ability of these approaches remains questionable. The poor generalizability on unseen domains, a well-recognized defect of data-driven approaches, can potentially harm the real-world performance of trajectory prediction models. We are thus motivated to improve generalization ability of models instead of merely pursuing high accuracy on average. Due to the lack of benchmarks for quantifying the generalization ability of trajectory predictors, we first construct a new benchmark called argoverse-shift, where the data distributions of domains are significantly different. Using this benchmark for evaluation, we identify that the domain shift problem seriously hinders the generalization of trajectory predictors since state-of-the-art approaches suffer from severe performance degradation when facing those out-of-distribution scenes. To enhance the robustness of models against domain shift problems, we propose a plug-and-play strategy for domain normalization in trajectory prediction. Our strategy utilizes the Frenet coordinate frame for modeling and can effectively narrow the domain gap of different scenes caused by the variety of road geometry and topology. Experiments show that our strategy noticeably boosts the prediction performance of the state-of-the-art in domains that were previously unseen to the models, thereby improving the generalization ability of data-driven trajectory prediction methods.
We propose expanding the shared Transformer module to produce and initialize Transformers of varying depths, enabling adaptation to diverse resource constraints. Drawing an analogy to genetic expansibility, we term such module as learngene. To identify the expansion mechanism, we delve into the relationship between the layer's position and its corresponding weight value, and find that linear function appropriately approximates this relationship. Building on this insight, we present Transformer as Linear Expansion of learnGene (TLEG), a novel approach for flexibly producing and initializing Transformers of diverse depths. Specifically, to learn learngene, we firstly construct an auxiliary Transformer linearly expanded from learngene, after which we train it through employing soft distillation. Subsequently, we can produce and initialize Transformers of varying depths via linearly expanding the well-trained learngene, thereby supporting diverse downstream scenarios. Extensive experiments on ImageNet-1K demonstrate that TLEG achieves comparable or better performance in contrast to many individual models trained from scratch, while reducing around 2x training cost. When transferring to several downstream classification datasets, TLEG surpasses existing initialization methods by a large margin (e.g., +6.87% on iNat 2019 and +7.66% on CIFAR-100). Under the situation where we need to produce models of varying depths adapting for different resource constraints, TLEG achieves comparable results while reducing around 19x parameters stored to initialize these models and around 5x pre-training costs, in contrast to the pre-training and fine-tuning approach. When transferring a fixed set of parameters to initialize different models, TLEG presents better flexibility and competitive performance while reducing around 2.9x parameters stored to initialize, compared to the pre-training approach.
This article provides quasi-optimal a priori error estimates for an optimal control problem constrained by an elliptic obstacle problem where the finite element discretization is carried out using the symmetric interior penalty discontinuous Galerkin method. The main proofs are based on the improved $L^2$-error estimates for the obstacle problem, the discrete maximum principle, and a well-known quadratic growth property. The standard (restrictive) assumptions on mesh are not assumed here.
Code translation aims to convert source code from one programming language (PL) to another. Given the promising abilities of large language models (LLMs) in code synthesis, researchers are exploring their potential to automate code translation. The prerequisite for advancing the state of LLM-based code translation is to understand their promises and limitations over existing techniques. To that end, we present a large-scale empirical study to investigate the ability of general LLMs and code LLMs for code translation across pairs of different languages, including C, C++, Go, Java, and Python. Our study, which involves the translation of 1,700 code samples from three benchmarks and two real-world projects, reveals that LLMs are yet to be reliably used to automate code translation -- with correct translations ranging from 2.1% to 47.3% for the studied LLMs. Further manual investigation of unsuccessful translations identifies 15 categories of translation bugs. We also compare LLM-based code translation with traditional non-LLM-based approaches. Our analysis shows that these two classes of techniques have their own strengths and weaknesses. Finally, insights from our study suggest that providing more context to LLMs during translation can help them produce better results. To that end, we propose a prompt-crafting approach based on the symptoms of erroneous translations; this improves the performance of LLM-based code translation by 5.5% on average. Our study is the first of its kind, in terms of scale and breadth, that provides insights into the current limitations of LLMs in code translation and opportunities for improving them. Our dataset -- consisting of 1,700 code samples in five PLs with 10K+ tests, 43K+ translated code, 1,725 manually labeled bugs, and 1,365 bug-fix pairs -- can help drive research in this area.
Deep Convolutional Neural Networks (CNNs) are a special type of Neural Networks, which have shown state-of-the-art results on various competitive benchmarks. The powerful learning ability of deep CNN is largely achieved with the use of multiple non-linear feature extraction stages that can automatically learn hierarchical representation from the data. Availability of a large amount of data and improvements in the hardware processing units have accelerated the research in CNNs and recently very interesting deep CNN architectures are reported. The recent race in deep CNN architectures for achieving high performance on the challenging benchmarks has shown that the innovative architectural ideas, as well as parameter optimization, can improve the CNN performance on various vision-related tasks. In this regard, different ideas in the CNN design have been explored such as use of different activation and loss functions, parameter optimization, regularization, and restructuring of processing units. However, the major improvement in representational capacity is achieved by the restructuring of the processing units. Especially, the idea of using a block as a structural unit instead of a layer is gaining substantial appreciation. This survey thus focuses on the intrinsic taxonomy present in the recently reported CNN architectures and consequently, classifies the recent innovations in CNN architectures into seven different categories. These seven categories are based on spatial exploitation, depth, multi-path, width, feature map exploitation, channel boosting and attention. Additionally, it covers the elementary understanding of the CNN components and sheds light on the current challenges and applications of CNNs.