Improper parsing of attacker-controlled input is a leading source of software security vulnerabilities, especially when programmers transcribe informal format descriptions in RFCs into efficient parsing logic in low-level, memory unsafe languages. Several researchers have proposed formal specification languages for data formats from which efficient code can be extracted. However, distilling informal requirements into formal specifications is challenging and, despite their benefits, new, formal languages are hard for people to learn and use. In this work, we present 3DGen, a framework that makes use of AI agents to transform mixed informal input, including natural language documents (i.e., RFCs) and example inputs into format specifications in a language called 3D. To support humans in understanding and trusting the generated specifications, 3DGen uses symbolic methods to also synthesize test inputs that can be validated against an external oracle. Symbolic test generation also helps in distinguishing multiple plausible solutions. Through a process of repeated refinement, 3DGen produces a 3D specification that conforms to a test suite, and which yields safe, efficient, provably correct, parsing code in C. We have evaluated 3DGen on 20 Internet standard formats, demonstrating the potential for AI-agents to produce formally verified C code at a non-trivial scale. A key enabler is the use of a domain-specific language to limit AI outputs to a class for which automated, symbolic analysis is tractable.
Cooperative driving, enabled by communication between automated vehicle systems, promises significant benefits to fuel efficiency, road capacity, and safety over single-vehicle driver assistance systems such as adaptive cruise control (ACC). However, the responsible development and implementation of these algorithms poses substantial challenges due to the need for extensive real-world testing. We address this issue and introduce OpenConvoy, an open and extensible framework designed for the implementation and assessment of cooperative driving policies on physical connected and autonomous vehicles (CAVs). We demonstrate the capabilities of OpenConvoy through a series of experiments on a convoy of multi-scale vehicles controlled by Platooning to show the stability of our system across vehicle configurations and its ability to effectively measure convoy cohesion across driving scenarios including varying degrees of communication loss.
Rust is a programming language that combines memory safety and low-level control, providing C-like performance while guaranteeing the absence of undefined behaviors by default. Rust's growing popularity has prompted research on safe and correct transpiling of existing code-bases to Rust. Existing work falls into two categories: rule-based and large language model (LLM)-based. While rule-based approaches can theoretically produce correct transpilations that maintain input-output equivalence to the original, they often yield unreadable Rust code that uses unsafe subsets of the Rust language. On the other hand, while LLM-based approaches typically produce more readable, maintainable, and safe code, they do not provide any guarantees about correctness. In this work, we present VERT, a tool that can produce readable Rust transpilations with formal guarantees of correctness. VERT's only requirement is that there is Web Assembly compiler for the source language, which is true for most major languages. VERT first uses the Web Assembly compiler to obtain an oracle Rust program. In parallel, VERT uses an LLM to generate a readable candidate Rust program. This candidate is verified against the oracle, and if verification fails, we regenerate a new candidate transpilation until verification succeeds. We evaluate VERT by transpiling a suite of 1,394 programs taken from competitive programming style benchmarks. Combining Anthropic's Claude-2 and VERT increases Rust transpilations passing property-based testing from 31% to 54% and bounded model-checking from 1% to 42% compared to using Claude alone. In addition, we evaluate VERT's ability to generate non-trivial safe Rust on programs taken from real-world C projects that make significant use of pointers. Our results provide insights into the limitations of LLMs to write safe Rust.
Federated learning is highly valued due to its high-performance computing in distributed environments while safeguarding data privacy. To address resource heterogeneity, researchers have proposed a semi-asynchronous federated learning (SAFL) architecture. However, the performance gap between different aggregation targets in SAFL remain unexplored. In this paper, we systematically compare the performance between two algorithm modes, FedSGD and FedAvg that correspond to aggregating gradients and models, respectively. Our results across various task scenarios indicate these two modes exhibit a substantial performance gap. Specifically, FedSGD achieves higher accuracy and faster convergence but experiences more severe fluctuates in accuracy, whereas FedAvg excels in handling straggler issues but converges slower with reduced accuracy.
Large language models (LLMs) significantly enhance the performance of various applications, but they are computationally intensive and energy-demanding. This makes it challenging to deploy them on devices with limited resources, such as personal computers and mobile/wearable devices, and results in substantial inference costs in resource-rich environments like cloud servers. To extend the use of LLMs, we introduce a low-rank decomposition approach to effectively compress these models, tailored to the requirements of specific applications. We observe that LLMs pretrained on general datasets contain many redundant components not needed for particular applications. Our method focuses on identifying and removing these redundant parts, retaining only the necessary elements for the target applications. Specifically, we represent the weight matrices of LLMs as a linear combination of base components. We then prune the irrelevant bases and enhance the model with new bases beneficial for specific applications. Deep compression results on the Llama 2-7b and -13B models, conducted on target applications including mathematical reasoning and code generation, show that our method significantly reduces model size while maintaining comparable accuracy to state-of-the-art low-rank compression techniques.
Effective ownership of software artifacts, particularly code, is crucial for accountability, knowledge sharing, and code quality enhancement. Researchers have proposed models linking ownership of software artifacts with developer performance and code quality. Our study aims to systematically examine various ownership models and provide a structured literature overview. Conducting a systematic literature review, we identified 79 relevant papers published between 2005 and 2022. We developed a taxonomy of ownership artifacts based on type, owners, and degree of ownership, along with compiling modeling variables and analytics types used in each study. Additionally, we assessed the replication status of each study. As a result, we identified nine distinct software artifacts whose ownership has been discussed in the literature, with "Code" being the most frequently analyzed artifact. We found that only three papers (3.79%) provided code and data, whereas nine papers (11.4%) provided only data. Using our systematic literature review results, we replicated experiments on nine priority projects at \texttt{Brightsquid}. The company aimed to compare its code quality against ownership factors in other teams, so we conducted a replication study using their data. Unlike prior studies, we found no strong correlation between minor contributors and bug numbers. Surprisingly, we found no strong link between the total number of developers modifying a file and bug counts, contrasting previous findings. However, we observed a significant correlation between major contributors and bug counts, diverging from earlier research.
Finding the root causes of anomalies in cloud computing systems quickly is crucial to ensure availability and efficiency since accurate root causes can guide engineers to take appropriate actions to address the anomalies and maintain customer satisfaction. However, it is difficult to investigate and identify the root causes based on large-scale and high-dimension monitoring data collected from complex cloud computing environments. Due to the inherently dynamic characteristics of cloud computing systems, the existing approaches in practice largely rely on manual analyses for flexibility and reliability, but massive unpredictable factors and high data complexity make the process time-consuming. Despite recent advances in automated detection and investigation approaches, the speed and quality of root cause analyses remain limited by the lack of expert involvement in these approaches. The limitations found in the current solutions motivate us to propose a visual analytics approach that facilitates the interactive investigation of the anomaly root causes in cloud computing systems. We identified three challenges, namely, a) modeling databases for the root cause investigation, b) inferring root causes from large-scale time series, and c) building comprehensible investigation results. In collaboration with domain experts, we addressed these challenges with RCInvestigator, a novel visual analytics system that establishes a tight collaboration between human and machine and assists experts in investigating the root causes of cloud computing system anomalies. We evaluated the effectiveness of RCInvestigator through two use cases based on real-world data and received positive feedback from experts.
In practical distributed systems, workers are typically not homogeneous, and due to differences in hardware configurations and network conditions, can have highly varying processing times. We consider smooth nonconvex finite-sum (empirical risk minimization) problems in this setup and introduce a new parallel method, Freya PAGE, designed to handle arbitrarily heterogeneous and asynchronous computations. By being robust to "stragglers" and adaptively ignoring slow computations, Freya PAGE offers significantly improved time complexity guarantees compared to all previous methods, including Asynchronous SGD, Rennala SGD, SPIDER, and PAGE, while requiring weaker assumptions. The algorithm relies on novel generic stochastic gradient collection strategies with theoretical guarantees that can be of interest on their own, and may be used in the design of future optimization methods. Furthermore, we establish a lower bound for smooth nonconvex finite-sum problems in the asynchronous setup, providing a fundamental time complexity limit. This lower bound is tight and demonstrates the optimality of Freya PAGE in the large-scale regime, i.e., when $\sqrt{m} \geq n$, where $n$ is # of workers, and $m$ is # of data samples.
Log parsing, a vital task for interpreting the vast and complex data produced within software architectures faces significant challenges in the transition from academic benchmarks to the industrial domain. Existing log parsers, while highly effective on standardized public datasets, struggle to maintain performance and efficiency when confronted with the sheer scale and diversity of real-world industrial logs. These challenges are two-fold: 1) massive log templates: The performance and efficiency of most existing parsers will be significantly reduced when logs of growing quantities and different lengths; 2) Complex and changeable semantics: Traditional template-matching algorithms cannot accurately match the log templates of complicated industrial logs because they cannot utilize cross-language logs with similar semantics. To address these issues, we propose ECLIPSE, Enhanced Cross-Lingual Industrial log Parsing with Semantic Entropy-LCS, since cross-language logs can robustly parse industrial logs. On the one hand, it integrates two efficient data-driven template-matching algorithms and Faiss indexing. On the other hand, driven by the powerful semantic understanding ability of the Large Language Model (LLM), the semantics of log keywords were accurately extracted, and the retrieval space was effectively reduced. Notably, we launch a Chinese and English cross-platform industrial log parsing benchmark ECLIPSE- BENCH to evaluate the performance of mainstream parsers in industrial scenarios. Our experimental results across public benchmarks and ECLIPSE- BENCH underscore the superior performance and robustness of our proposed ECLIPSE. Notably, ECLIPSE both delivers state-of-the-art performance when compared to strong baselines and preserves a significant edge in processing efficiency.
The software supply chain comprises a highly complex set of operations, processes, tools, institutions and human factors involved in creating a piece of software. A number of high-profile attacks that exploit a weakness in this complex ecosystem have spurred research in identifying classes of supply chain attacks. Yet, practitioners often lack the necessary information to understand their security posture and implement suitable defenses against these attacks. We argue that the next stage of software supply chain security research and development will benefit greatly from a defense-oriented approach that focuses on holistic bottom-up solutions. To this end, this paper introduces the AStRA model, a framework for representing fundamental software supply chain elements and their causal relationships. Using this model, we identify software supply chain security objectives that are needed to mitigate common attacks and systematize knowledge on recent and well-established security techniques for their ability to meet these objectives. We validate our model against prior attacks and taxonomies. Finally, we identify emergent research gaps and propose opportunities to develop novel software development tools and systems that are secure-by-design.
Autonomic computing investigates how systems can achieve (user) specified control outcomes on their own, without the intervention of a human operator. Autonomic computing fundamentals have been substantially influenced by those of control theory for closed and open-loop systems. In practice, complex systems may exhibit a number of concurrent and inter-dependent control loops. Despite research into autonomic models for managing computer resources, ranging from individual resources (e.g., web servers) to a resource ensemble (e.g., multiple resources within a data center), research into integrating Artificial Intelligence (AI) and Machine Learning (ML) to improve resource autonomy and performance at scale continues to be a fundamental challenge. The integration of AI/ML to achieve such autonomic and self-management of systems can be achieved at different levels of granularity, from full to human-in-the-loop automation. In this article, leading academics, researchers, practitioners, engineers, and scientists in the fields of cloud computing, AI/ML, and quantum computing join to discuss current research and potential future directions for these fields. Further, we discuss challenges and opportunities for leveraging AI and ML in next generation computing for emerging computing paradigms, including cloud, fog, edge, serverless and quantum computing environments.