In software engineering, careful configuration of software tools is crucial for ensuring optimal performance in complex systems. However, selecting optimal configurations is made harder by the high-dimensional search spaces of modern applications. Conventional trial-and-error or intuition-driven methods are both inefficient and error-prone, impeding scalability and reproducibility. In this study, we explore the use of Large Language Models (LLMs) to streamline the software configuration process. We identify hyperparameter configuration for machine learning components within intelligent applications as particularly challenging, due to its extensive search space and performance-critical nature. Existing methods, including Bayesian optimization, have limitations regarding initial setup, computational cost, and convergence efficiency. Our work presents a novel approach that employs LLMs, such as ChatGPT, to identify starting conditions and narrow down the search space, improving configuration efficiency. We conducted a series of experiments to investigate the variability of LLM-generated responses, uncovering intriguing findings such as potential response caching and consistent behavior based on domain-specific keywords. Furthermore, our results from hyperparameter optimization experiments reveal the potential of LLMs in expediting initialization processes and optimizing configurations. While our initial insights are promising, they also indicate the need for further in-depth investigations and experiments in this domain.
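To make the warm-starting idea concrete, here is a minimal sketch (not the study's implementation) of seeding a tuner with LLM-suggested values. The `query_llm` helper is a hypothetical placeholder for an actual ChatGPT API call, and the canned response exists only so the sketch runs end-to-end.

```python
# Illustrative sketch: use an LLM to propose a starting point and a
# narrowed search range before running, e.g., Bayesian optimization.
import json

def query_llm(prompt: str) -> str:
    # Placeholder for a real LLM call (e.g., a ChatGPT API request).
    # Returns a canned response here so the sketch is runnable.
    return json.dumps({
        "learning_rate": {"start": 3e-4, "range": [1e-4, 1e-3]},
        "batch_size":   {"start": 64,   "range": [32, 128]},
    })

prompt = (
    "Suggest starting values and narrowed search ranges for tuning "
    "learning_rate and batch_size of a ResNet-18 image classifier."
)
suggestion = json.loads(query_llm(prompt))

# Seed the tuner with the LLM's starting point and restrict the
# search space to the LLM's suggested ranges.
initial_point = {k: v["start"] for k, v in suggestion.items()}
search_space  = {k: tuple(v["range"]) for k, v in suggestion.items()}
print(initial_point, search_space)
```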
A validation methodology is proposed and implemented for natural language software specifications of standard graphics functions. Checks are made for consistency, completeness, and lack of ambiguity in data element and function descriptions. Functions and data elements are maintained in a relational database representation. The appropriate checks are performed by sequences of database operations. The relational database manager INGRES was used to support a prototype implementation of the proposed technique. The methodology supports the development of a scenario-based prototype from the information available in the specification. This permits various function sequences to be checked without implementing the specified environment. The application of a prototype implementation of the proposed methodology to the specification of the Graphics Kernel System (GKS) software package demonstrates the practicability of the method. Several inconsistencies in GKS related to the definition of data elements have been identified.
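As an illustration of expressing such checks as database operations, the following sketch uses a hypothetical two-table schema (not the INGRES prototype's actual schema) and SQLite in place of INGRES to flag data elements that are referenced by a function but never defined:

```python
# Illustrative sketch of a completeness check as a relational query:
# find data elements referenced by functions but never defined.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE data_elements (name TEXT PRIMARY KEY);
    CREATE TABLE function_uses (func TEXT, element TEXT);
    INSERT INTO data_elements VALUES ('POLYLINE_INDEX');
    INSERT INTO function_uses VALUES ('SET_POLYLINE_INDEX', 'POLYLINE_INDEX'),
                                     ('INQUIRE_FILL_STYLE', 'FILL_STYLE');
""")
undefined = con.execute("""
    SELECT DISTINCT func, element FROM function_uses
    WHERE element NOT IN (SELECT name FROM data_elements)
""").fetchall()
print(undefined)   # [('INQUIRE_FILL_STYLE', 'FILL_STYLE')]
```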
Developers heavily rely on Application Programming Interfaces (APIs) from libraries to build their software. As software evolves, developers may need to replace the used libraries with alternate libraries, a process known as library migration. Doing this manually can be tedious, time-consuming, and prone to errors. Automated migration techniques can help alleviate some of this burden. However, designing effective automated migration techniques requires understanding the types of code changes required to transform client code that used the old library to the new library. This paper contributes an empirical study that provides a holistic view of Python library migrations, both in terms of the code changes required in a migration and the typical development effort involved. We manually label 3,096 migration-related code changes in 335 Python library migrations from 311 client repositories spanning 141 library pairs from 35 domains. Based on our labeled data, we derive a taxonomy for describing migration-related code changes, PyMigTax. Leveraging PyMigTax and our labeled data, we investigate various characteristics of Python library migrations, such as the types of program elements and properties of API mappings, the combinations of types of migration-related code changes in a migration, and the typical development effort required for a migration. Our findings highlight various potential shortcomings of current library migration tools. For example, we find that 40% of library pairs have API mappings that involve non-function program elements, while most library migration techniques typically assume that function calls from the source library will map to (one or more) function calls from the target library. As an approximation for the development effort involved, we find that, on average, a developer needs to learn about 4 APIs and 2 API mappings to perform a migration, and ... (truncated)
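As a hypothetical illustration (not drawn from the paper's labeled data) of an API mapping involving a non-function program element, consider migrating from `urllib.request` to `requests`, where a method call maps to an attribute access:

```python
# Before migration: urllib.request
from urllib.request import urlopen
body = urlopen("https://example.com").read()        # method call

# After migration: requests. The read() call maps to a non-function
# program element, the 'content' attribute.
import requests
body = requests.get("https://example.com").content  # attribute access
```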
In terms of energy efficiency and computational speed, neuromorphic electronics based on non-volatile memory devices is expected to be one of the most promising hardware candidates for future artificial intelligence (AI). However, catastrophic forgetting, in which networks rapidly overwrite previously learned weights when learning new tasks, remains a pivotal hurdle in both digital and analog AI chips for unleashing the true power of brain-like computing. To address catastrophic forgetting in the context of online memory storage, a complex synapse model (the Benna-Fusi model) has recently been proposed [1], whose synaptic weight and internal variables evolve according to diffusion dynamics. In this work, by designing a proton transistor with a series of charge-diffusion-controlled storage components, we have experimentally realized the Benna-Fusi artificial complex synapse. The memory consolidation from coupled storage components is revealed by both numerical simulations and experimental observations. Different memory timescales for the complex synapse are engineered by the diffusion length of the charge carriers and by the capacity and number of coupled storage components. The advantage of the demonstrated complex synapse in both memory capacity and memory consolidation is revealed by neural network simulations of face familiarity detection. Our experimental realization of the complex synapse suggests a promising approach to enhancing memory capacity and enabling continual learning.
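For reference, the Benna-Fusi model couples a cascade of variables $u_1, \dots, u_m$ through diffusion-like dynamics (a sketch in our notation: $u_1$ is read out as the synaptic weight, external input enters at $u_1$, and timescales are separated by, e.g., geometrically increasing capacities $C_k$ and decreasing couplings $g_{k,k+1}$):
$$C_k \frac{du_k}{dt} = g_{k-1,k}\,(u_{k-1} - u_k) + g_{k,k+1}\,(u_{k+1} - u_k), \qquad k = 1, \dots, m.$$
In the device described here, the coupled charge-storage components play the role of the $u_k$, with diffusion lengths and capacities setting the $g$ and $C$ values.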
Automated Program Repair (APR) is a vital area in software engineering aimed at generating automatic patches for vulnerable programs. While numerous techniques have been proposed for repairing classical programs, the realm of quantum programming lacks a comparable automated repair technique. In this initial exploration, we investigate the use of ChatGPT for quantum program repair and evaluate its performance on Bugs4Q, a benchmark suite of quantum program bugs. Our findings demonstrate the feasibility of employing ChatGPT for quantum program repair. Specifically, we assess ChatGPT's ability to address bugs within the Bugs4Q benchmark, revealing its success in repairing 29 out of 38 bugs. This research represents a promising step towards automating the repair process for quantum programs.
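To illustrate the kind of defect involved, here is a hypothetical Bugs4Q-style bug and repair in Qiskit (an illustrative example, not one of the 38 benchmark bugs):

```python
# Hypothetical Bugs4Q-style bug: the circuit is sampled without ever
# being measured, so execution returns empty counts.
from qiskit import QuantumCircuit

# Buggy: Bell state prepared but never measured.
buggy = QuantumCircuit(2, 2)
buggy.h(0)
buggy.cx(0, 1)

# Repaired (the kind of patch ChatGPT is prompted to produce):
fixed = QuantumCircuit(2, 2)
fixed.h(0)
fixed.cx(0, 1)
fixed.measure([0, 1], [0, 1])   # added measurements fix the bug
```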
The recent surge in 3D data acquisition has spurred the development of geometric deep learning models for point cloud processing, boosted by the remarkable success of transformers in natural language processing. While point cloud transformers (PTs) have achieved impressive results recently, their quadratic scaling with respect to the point cloud size poses a significant scalability challenge for real-world applications. To address this issue, we propose the Adaptive Point Cloud Transformer (AdaPT), a standard PT model augmented by an adaptive token selection mechanism. AdaPT dynamically reduces the number of tokens during inference, enabling efficient processing of large point clouds. Furthermore, we introduce a budget mechanism to flexibly adjust the computational cost of the model at inference time without the need for retraining or fine-tuning separate models. Our extensive experimental evaluation on point cloud classification tasks demonstrates that AdaPT significantly reduces computational complexity while maintaining competitive accuracy compared to standard PTs. The code for AdaPT is made publicly available.
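A minimal sketch of budgeted token selection (our simplification, not the AdaPT implementation): score each token, keep only the fraction allowed by the budget, and run attention on the reduced set, so the quadratic cost applies to far fewer tokens.

```python
# Minimal sketch: score tokens, keep the top-k within a budget, and
# attend over the reduced set.
import torch
import torch.nn as nn

class TokenSelector(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # per-token importance score

    def forward(self, x: torch.Tensor, budget: float) -> torch.Tensor:
        # x: (batch, n_tokens, dim); budget in (0, 1] sets the kept fraction.
        k = max(1, int(budget * x.shape[1]))
        scores = self.score(x).squeeze(-1)           # (batch, n_tokens)
        idx = scores.topk(k, dim=1).indices          # top-k tokens per sample
        return x.gather(1, idx.unsqueeze(-1).expand(-1, -1, x.shape[-1]))

x = torch.randn(4, 1024, 256)        # e.g., 1024 point tokens
selector = TokenSelector(256)
attn = nn.MultiheadAttention(256, num_heads=8, batch_first=True)
kept = selector(x, budget=0.25)      # keep 25% of tokens at inference
out, _ = attn(kept, kept, kept)      # quadratic cost now on 256 tokens
```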
Symmetric multilevel diversity coding (SMDC) is a source coding problem where the independent sources are ordered according to their importance. It was shown that separately encoding independent sources (referred to as ``\textit{superposition coding}'') is optimal. In this paper, we consider an $(L,s)$ \textit{sliding secure} SMDC problem with security priority, where each source $X_{\alpha}~(s\leq \alpha\leq L)$ is kept perfectly secure if no more than $\alpha-s$ encoders are accessible. The reconstruction requirements of the $L$ sources are the same as in classical SMDC. The special case of the $(L,s)$ sliding secure SMDC problem in which the first $s-1$ sources are constants is called the $(L,s)$ \textit{multilevel secret sharing} problem. For $s=1$, the two problems coincide, and we show that superposition coding is optimal. The rate regions for the $(3,2)$ problems are characterized. It is shown that superposition coding is suboptimal for both problems. The key idea behind the rate savings from joint encoding is that the previous source $X_{\alpha-1}$ can be used as the secret key for $X_{\alpha}$. Based on this idea, we propose a coding scheme that achieves the minimum sum rate of the general $(L,s)$ multilevel secret sharing problem. Moreover, superposition coding of the $s$ sets of sources $X_1$, $X_2$, $\cdots$, $X_{s-1}$, $(X_s, X_{s+1}, \cdots, X_L)$ achieves the minimum sum rate of the general sliding secure SMDC problem.
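To see why the previous source can serve as a key, suppose $X_{\alpha-1}$ is uniform and rate-matched with $X_{\alpha}$ (a simplified illustration of the idea, not the paper's general scheme). An encoder can then store
$$Y = X_{\alpha} \oplus X_{\alpha-1}$$
instead of $X_{\alpha}$: a decoder that already reconstructs $X_{\alpha-1}$ recovers $X_{\alpha} = Y \oplus X_{\alpha-1}$, while $Y$ alone is independent of $X_{\alpha}$ (a one-time pad), so perfect secrecy is obtained without spending extra rate on a dedicated key.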
A critical goal of autonomy and artificial intelligence is enabling autonomous robots to rapidly adapt in dynamic and uncertain environments. Classic adaptive control and safe control provide stability and safety guarantees but are limited to specific system classes. In contrast, policy adaptation based on reinforcement learning (RL) offers versatility and generalizability but presents safety and robustness challenges. We propose SafeDPA, a novel RL and control framework that simultaneously tackles the problems of policy adaptation and safe reinforcement learning. SafeDPA jointly learns adaptive policy and dynamics models in simulation, predicts environment configurations, and fine-tunes dynamics models with few-shot real-world data. A safety filter based on the Control Barrier Function (CBF), applied on top of the RL policy, is introduced to ensure safety during real-world deployment. We provide theoretical safety guarantees for SafeDPA and show its robustness against learning errors and extra perturbations. Comprehensive experiments on (1) classic control problems (Inverted Pendulum), (2) simulation benchmarks (Safety Gym), and (3) a real-world agile robotics platform (RC Car) demonstrate the superiority of SafeDPA in both safety and task performance over state-of-the-art baselines. In particular, SafeDPA demonstrates notable generalizability, achieving a 300% increase in safety rate compared to the baselines under unseen disturbances in real-world experiments.
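For intuition, here is a minimal sketch of a CBF-style safety filter for a control-affine system (our simplification with a single constraint and a closed-form projection, not the SafeDPA code). `Lf_h` and `Lg_h` denote the Lie derivatives of the barrier function h along the dynamics, and `alpha` is a class-K gain:

```python
# Minimal sketch: project the RL action onto the half-space of actions
# satisfying the CBF condition  Lf_h + Lg_h @ u + alpha * h >= 0.
import numpy as np

def safety_filter(u_rl, Lf_h, Lg_h, h, alpha=1.0):
    # Constraint rewritten as  a @ u >= b  for a control-affine system.
    a, b = np.asarray(Lg_h), -(Lf_h + alpha * h)
    if a @ u_rl >= b:
        return u_rl                       # RL action already safe
    # Closed-form minimum-norm correction onto the constraint boundary
    return u_rl + (b - a @ u_rl) / (a @ a) * a

u_safe = safety_filter(u_rl=np.array([1.0, 0.0]),
                       Lf_h=-0.5, Lg_h=[0.2, 1.0], h=0.1)
print(u_safe)
```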
Software engineering is a domain characterized by intricate decision-making processes, often relying on nuanced intuition and consultation. Recent advancements in deep learning have started to revolutionize software engineering practices through elaborate designs implemented at various stages of software development. In this paper, we present an innovative paradigm that leverages large language models (LLMs) throughout the entire software development process, streamlining and unifying key processes through natural language communication and thereby eliminating the need for specialized models at each phase. At the core of this paradigm lies ChatDev, a virtual chat-powered software development company that mirrors the established waterfall model, meticulously dividing the development process into four distinct chronological stages: designing, coding, testing, and documenting. Each stage engages a team of agents, such as programmers, code reviewers, and test engineers, fostering collaborative dialogue and facilitating a seamless workflow. The chat chain acts as a facilitator, breaking down each stage into atomic subtasks. This enables dual roles, in which solutions are proposed and validated through context-aware communication, leading to the efficient resolution of specific subtasks. Our analysis of ChatDev highlights its remarkable efficacy in software generation, enabling the completion of the entire software development process in under seven minutes at a cost of less than one dollar. It not only identifies and alleviates potential vulnerabilities but also rectifies potential hallucinations while maintaining commendable efficiency and cost-effectiveness. The potential of ChatDev unveils fresh possibilities for integrating LLMs into the realm of software development.
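A minimal sketch of the chat-chain control flow (our simplification, not the ChatDev code): each stage pairs two agent roles that exchange context-aware messages on a subtask. The `query_llm` helper is a hypothetical placeholder that a real system would back with an LLM API:

```python
# Illustrative sketch of a chat chain: four waterfall stages, each a
# short dual-role dialogue whose transcript becomes shared context.
def query_llm(role: str, context: str) -> str:
    # Placeholder for a real LLM call; returns a canned reply here.
    return f"[{role}] response given: ...{context[-40:]}"

PHASES = ["designing", "coding", "testing", "documenting"]

context = "Task: build a todo-list app."
for phase in PHASES:
    for _ in range(2):                     # a few proposal/review turns
        proposal = query_llm(f"{phase}/instructor", context)
        review = query_llm(f"{phase}/assistant", proposal)
        context += "\n" + proposal + "\n" + review
print(context)
```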
The conjoining of dynamical systems and deep learning has become a topic of great interest. In particular, neural differential equations (NDEs) demonstrate that neural networks and differential equations are two sides of the same coin. Traditional parameterised differential equations are a special case. Many popular neural network architectures, such as residual networks and recurrent networks, are discretisations of differential equations. NDEs are suitable for tackling generative problems, dynamical systems, and time series (particularly in physics, finance, ...) and are thus of interest to both modern machine learning and traditional mathematical modelling. NDEs offer high-capacity function approximation, strong priors on model space, the ability to handle irregular data, memory efficiency, and a wealth of available theory on both sides. This doctoral thesis provides an in-depth survey of the field. Topics include: neural ordinary differential equations (e.g. for hybrid neural/mechanistic modelling of physical systems); neural controlled differential equations (e.g. for learning functions of irregular time series); and neural stochastic differential equations (e.g. to produce generative models capable of representing complex stochastic dynamics, or sampling from complex high-dimensional distributions). Further topics include: numerical methods for NDEs (e.g. reversible differential equation solvers, backpropagation through differential equations, Brownian reconstruction); symbolic regression for dynamical systems (e.g. via regularised evolution); and deep implicit models (e.g. deep equilibrium models, differentiable optimisation). We anticipate this thesis will be of interest to anyone interested in the marriage of deep learning with dynamical systems, and hope it will provide a useful reference for the current state of the art.
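The "residual networks are discretisations" point can be made concrete in a few lines: an explicit Euler step y_{n+1} = y_n + dt * f_theta(y_n) of the ODE dy/dt = f_theta(y) is exactly a residual block (a minimal PyTorch sketch, not code from the thesis):

```python
# Minimal sketch: a residual network as an explicit Euler
# discretisation of the ODE  dy/dt = f_theta(y).
import torch
import torch.nn as nn

class NeuralODEEuler(nn.Module):
    def __init__(self, dim: int, steps: int = 16, dt: float = 1.0 / 16):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(),
                               nn.Linear(dim, dim))  # the vector field f_theta
        self.steps, self.dt = steps, dt

    def forward(self, y: torch.Tensor) -> torch.Tensor:
        for _ in range(self.steps):        # each step is a residual block
            y = y + self.dt * self.f(y)
        return y

model = NeuralODEEuler(dim=8)
print(model(torch.randn(4, 8)).shape)      # torch.Size([4, 8])
```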
We introduce an approach for deep reinforcement learning (RL) that improves upon the efficiency, generalization capacity, and interpretability of conventional approaches through structured perception and relational reasoning. It uses self-attention to iteratively reason about the relations between entities in a scene and to guide a model-free policy. Our results show that in a novel navigation and planning task called Box-World, our agent finds interpretable solutions that improve upon baselines in terms of sample complexity, ability to generalize to more complex scenes than experienced during training, and overall performance. In the StarCraft II Learning Environment, our agent achieves state-of-the-art performance on six mini-games -- surpassing human grandmaster performance on four. By considering architectural inductive biases, our work opens new directions for overcoming important, but stubborn, challenges in deep RL.
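A minimal sketch of the relational core (our simplification, not the paper's full architecture): entity vectors attend to one another via multi-head self-attention, and a pooled summary feeds a model-free policy head:

```python
# Minimal sketch: entities reason about pairwise relations through
# self-attention; a pooled summary produces action logits.
import torch
import torch.nn as nn

class RelationalBlock(nn.Module):
    def __init__(self, dim: int = 64, heads: int = 4, n_actions: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.policy = nn.Linear(dim, n_actions)

    def forward(self, entities: torch.Tensor) -> torch.Tensor:
        # entities: (batch, n_entities, dim), one vector per scene object
        x, _ = self.attn(entities, entities, entities)  # pairwise relations
        x = x + entities                                # residual connection
        return self.policy(x.max(dim=1).values)         # pool, then logits

logits = RelationalBlock()(torch.randn(2, 10, 64))
print(logits.shape)   # torch.Size([2, 4])
```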