Software engineering (SE) is a dynamic field that involves multiple phases all of which are necessary to develop sustainable software systems. Machine learning (ML), a branch of artificial intelligence (AI), has drawn a lot of attention in recent years thanks to its ability to analyze massive volumes of data and extract useful patterns from data. Several studies have focused on examining, categorising, and assessing the application of ML in SE processes. We conducted a literature review on primary studies to address this gap. The study was carried out following the objective and the research questions to explore the current state of the art in applying machine learning techniques in software engineering processes. The review identifies the key areas within software engineering where ML has been applied, including software quality assurance, software maintenance, software comprehension, and software documentation. It also highlights the specific ML techniques that have been leveraged in these domains, such as supervised learning, unsupervised learning, and deep learning. Keywords: machine learning, deep learning, software engineering, natural language processing, source code
Genetic programming systems often use large training sets to evaluate the quality of candidate solutions for selection, which is often computationally expensive. Down-sampling training sets has long been used to decrease the computational cost of evaluation in a wide range of application domains. More specifically, recent studies have shown that both random and informed down-sampling can substantially improve problem-solving success for GP systems that use the lexicase parent selection algorithm. We test whether these down-sampling techniques can also improve problem-solving success in the context of three other commonly used selection methods, fitness-proportionate, tournament, implicit fitness sharing plus tournament selection, across six program synthesis GP problems. We verified that down-sampling can significantly improve the problem-solving success for all three of these other selection schemes, demonstrating its general efficacy. We discern that the selection pressure imposed by the selection scheme does not interact with the down-sampling method. However, we find that informed down-sampling can improve problem solving success significantly over random down-sampling when the selection scheme has a mechanism for diversity maintenance like lexicase or implicit fitness sharing. Overall, our results suggest that down-sampling should be considered more often when solving test-based problems, regardless of the selection scheme in use.
Language model (LM) distillation is a trending area that aims to distil the knowledge residing in a large teacher LM to a small student one. While various methods have been proposed to maximize the effectiveness of the distillation, significant challenges persist, particularly when there is a substantial capacity gap between the teacher and student LMs. This issue, often referred to as the \textit{curse} of capacity gap, suggests that a larger teacher does not necessarily result in a superior student compared to one distilled from a smaller teacher. In other words, there is likely an optimal teacher yielding the best student along the scaling course of the teacher. However, the curse of capacity gap can not be tackled without notable compute overhead, as indicated in previous studies. In the context of large LMs (LLMs), previously viable approaches become much less meaningful, as it is an impossible triangle to distill an expected student from an optimal teacher student with small compute overhead. Fortunately, the impossible triangle can fortunately be possible provided an inducted \textit{law} of capacity gap. In this paper, we take the spirits of scaling law and reveal that the optimal teacher scale almost consistently follows a linear scaling with the student scale across different model architectures and data scales. The law later guides us to distil a 3B student LM (termed \textsc{MiniMA}) from LLaMA2-7B. \textsc{MiniMA} is demonstrated to outperform a wide range of 3B competitors and could even compete with several 7B models.
Dual quaternions and dual quaternion matrices have found widespread applications in robotic research, with their spectral theory been extensively studied in recent years. This paper delves into the realm of eigenvalue computation and related problems specific to dual quaternion Hermitian matrices. We establish the connection between dual quaternion matrices and their dual complex adjoint matrices, concerning eigenvalue problems and matrix rank-k approximations. By integrating the dual complex adjoint matrix, we refine the power method for eigenvalue computation for dual quaternion Hermitian matrices, achieving greater numerical efficiency. Furthermore, leveraging the eigen-decomposition of dual complex adjoint matrices, we introduce a novel approach for calculating all eigenpairs of dual quaternion Hermitian matrices. This method surpasses the power method in terms of accuracy and speed and addresses its limitations, as exemplified by its application to the eigenvalue computation of Laplacian matrices, where our algorithm demonstrates significant advantages. Additionally, we apply the improved power method and optimal rank-k approximations to pose graph optimization problem, enhancing efficiency and success rates, especially under low observation conditions.
A long-standing question in automatic speech recognition research is how to attribute errors to the ability of a model to model the acoustics, versus its ability to leverage higher-order context (lexicon, morphology, syntax, semantics). We validate a novel approach which models error rates as a function of relative textual predictability, and yields a single number, $k$, which measures the effect of textual predictability on the recognizer. We use this method to demonstrate that a Wav2Vec 2.0-based model makes greater stronger use of textual context than a hybrid ASR model, in spite of not using an explicit language model, and also use it to shed light on recent results demonstrating poor performance of standard ASR systems on African-American English. We demonstrate that these mostly represent failures of acoustic--phonetic modelling. We show how this approach can be used straightforwardly in diagnosing and improving ASR.
The convergence rate of a Markov chain to its stationary distribution is typically assessed using the concept of total variation mixing time. However, this worst-case measure often yields pessimistic estimates and is challenging to infer from observations. In this paper, we advocate for the use of the average-mixing time as a more optimistic and demonstrably easier-to-estimate alternative. We further illustrate its applicability across a range of settings, from two-point to countable spaces, and discuss some practical implications.
Autonomous vehicle control is generally divided in two main areas; trajectory planning and tracking. Currently, the trajectory planning is mostly done by particle or kinematic model-based optimization controllers. The output of these planners, since they do not consider CG height and its effects, is not unique for different vehicle types, especially for high CG vehicles. As a result, the tracking controller may have to work hard to avoid vehicle handling and comfort constraints while trying to realize these sub-optimal trajectories. This paper tries to address this problem by considering a planner with simplified double track model with estimation of lateral and roll based load transfer using steady state equations and a simplified tire model to reduce solver workload. The developed planner is compared with the widely used particle and kinematic model planners in collision avoidance scenarios in both high and low acceleration conditions and with different vehicle heights.
We consider a new scheduling problem on parallel identical machines in which the number of machines is initially not known, but it follows a given probability distribution. Only after all jobs are assigned to a given number of bags, the actual number of machines is revealed. Subsequently, the jobs need to be assigned to the machines without splitting the bags. This is the stochastic version of a related problem introduced by Stein and Zhong [SODA 2018, TALG 2020] and it is, for example, motivated by bundling jobs that need to be scheduled by data centers. We present two PTASs for the stochastic setting, computing job-to-bag assignments that (i) minimize the expected maximum machine load and (ii) maximize the expected minimum machine load (like in the Santa Claus problem), respectively. The former result follows by careful enumeration combined with known PTASs. For the latter result, we introduce an intricate dynamic program that we apply to a suitably rounded instance.
Deep learning-based models are widely deployed in autonomous driving areas, especially the increasingly noticed end-to-end solutions. However, the black-box property of these models raises concerns about their trustworthiness and safety for autonomous driving, and how to debug the causality has become a pressing concern. Despite some existing research on the explainability of autonomous driving, there is currently no systematic solution to help researchers debug and identify the key factors that lead to the final predicted action of end-to-end autonomous driving. In this work, we propose a comprehensive approach to explore and analyze the causality of end-to-end autonomous driving. First, we validate the essential information that the final planning depends on by using controlled variables and counterfactual interventions for qualitative analysis. Then, we quantitatively assess the factors influencing model decisions by visualizing and statistically analyzing the response of key model inputs. Finally, based on the comprehensive study of the multi-factorial end-to-end autonomous driving system, we have developed a strong baseline and a tool for exploring causality in the close-loop simulator CARLA. It leverages the essential input sources to obtain a well-designed model, resulting in highly competitive capabilities. As far as we know, our work is the first to unveil the mystery of end-to-end autonomous driving and turn the black box into a white one. Thorough close-loop experiments demonstrate that our method can be applied to end-to-end autonomous driving solutions for causality debugging. Code will be available at //github.com/bdvisl/DriveInsight.
Graph neural networks (GNNs) have been demonstrated to be a powerful algorithmic model in broad application fields for their effectiveness in learning over graphs. To scale GNN training up for large-scale and ever-growing graphs, the most promising solution is distributed training which distributes the workload of training across multiple computing nodes. However, the workflows, computational patterns, communication patterns, and optimization techniques of distributed GNN training remain preliminarily understood. In this paper, we provide a comprehensive survey of distributed GNN training by investigating various optimization techniques used in distributed GNN training. First, distributed GNN training is classified into several categories according to their workflows. In addition, their computational patterns and communication patterns, as well as the optimization techniques proposed by recent work are introduced. Second, the software frameworks and hardware platforms of distributed GNN training are also introduced for a deeper understanding. Third, distributed GNN training is compared with distributed training of deep neural networks, emphasizing the uniqueness of distributed GNN training. Finally, interesting issues and opportunities in this field are discussed.
In pace with developments in the research field of artificial intelligence, knowledge graphs (KGs) have attracted a surge of interest from both academia and industry. As a representation of semantic relations between entities, KGs have proven to be particularly relevant for natural language processing (NLP), experiencing a rapid spread and wide adoption within recent years. Given the increasing amount of research work in this area, several KG-related approaches have been surveyed in the NLP research community. However, a comprehensive study that categorizes established topics and reviews the maturity of individual research streams remains absent to this day. Contributing to closing this gap, we systematically analyzed 507 papers from the literature on KGs in NLP. Our survey encompasses a multifaceted review of tasks, research types, and contributions. As a result, we present a structured overview of the research landscape, provide a taxonomy of tasks, summarize our findings, and highlight directions for future work.