我的美女教师在线观看免费,狼友视频首页,欧美野外激情一区二区群干,国产一级婬片A视频免费观看

Flaky tests can pass or fail non-deterministically, without alterations to a software system. Such tests are frequently encountered by developers and hinder the credibility of test suites. State-of-the-art research incorporates machine learning solutions into flaky test detection and achieves reasonably good accuracy. Moreover, the majority of automated flaky test repair solutions are designed for specific types of flaky tests. This research work proposes a novel categorization framework, called FlaKat, which uses machine-learning classifiers for fast and accurate prediction of the category of a given flaky test that reflects its root cause. Sampling techniques are applied to address the imbalance between flaky test categories in the International Dataset of Flaky Test (IDoFT). A new evaluation metric, called Flakiness Detection Capacity (FDC), is proposed for measuring the accuracy of classifiers from the perspective of information theory and provides proof for its effectiveness. The final FDC results are also in agreement with F1 score regarding which classifier yields the best flakiness classification.

相關內容

模型評估

關注 1730

機器學習系統設計系統評估標準

MoDELS · 查準率/準確率 · Processing（編程語言） · 優化器 · 可約的 ·

2024 年 4 月 15 日

TMPQ-DM: Joint Timestep Reduction and Quantization Precision Selection for Efficient Diffusion Models

Haojun Sun,Chen Tang,Zhi Wang,Yuan Meng,Jingyan jiang,Xinzhu Ma,Wenwu Zhu

Diffusion models have emerged as preeminent contenders in the realm of generative models. Distinguished by their distinctive sequential generative processes, characterized by hundreds or even thousands of timesteps, diffusion models progressively reconstruct images from pure Gaussian noise, with each timestep necessitating full inference of the entire model. However, the substantial computational demands inherent to these models present challenges for deployment, quantization is thus widely used to lower the bit-width for reducing the storage and computing overheads. Current quantization methodologies primarily focus on model-side optimization, disregarding the temporal dimension, such as the length of the timestep sequence, thereby allowing redundant timesteps to continue consuming computational resources, leaving substantial scope for accelerating the generative process. In this paper, we introduce TMPQ-DM, which jointly optimizes timestep reduction and quantization to achieve a superior performance-efficiency trade-off, addressing both temporal and model optimization aspects. For timestep reduction, we devise a non-uniform grouping scheme tailored to the non-uniform nature of the denoising process, thereby mitigating the explosive combinations of timesteps. In terms of quantization, we adopt a fine-grained layer-wise approach to allocate varying bit-widths to different layers based on their respective contributions to the final generative performance, thus rectifying performance degradation observed in prior studies. To expedite the evaluation of fine-grained quantization, we further devise a super-network to serve as a precision solver by leveraging shared quantization results. These two design components are seamlessly integrated within our framework, enabling rapid joint exploration of the exponentially large decision space via a gradient-free evolutionary search algorithm.

AI · INFORMS · 可辨認的 · Taxonomy · CASE ·

2024 年 4 月 14 日

AINeedsPlanner: A Workbook to Support Effective Collaboration Between AI Experts and Clients

Dae Hyun Kim,Hyungyu Shin,Shakhnozakhon Yadgarova,Jinho Son,Hariharan Subramonyam,Juho Kim

Clients often partner with AI experts to develop AI applications tailored to their needs. In these partnerships, careful planning and clear communication are critical, as inaccurate or incomplete specifications can result in misaligned model characteristics, expensive reworks, and potential friction between collaborators. Unfortunately, given the complexity of requirements ranging from functionality, data, and governance, effective guidelines for collaborative specification of requirements in client-AI expert collaborations are missing. In this work, we introduce AINeedsPlanner, a workbook that AI experts and clients can use to facilitate effective interchange and clear specifications. The workbook is based on (1) an interview of 10 completed AI application project teams, which identifies and characterizes steps in AI application planning and (2) a study with 12 AI experts, which defines a taxonomy of AI experts' information needs and dimensions that affect the information needs. Finally, we demonstrate the workbook's utility with two case studies in real-world settings.

原點 · 可辨認的 · 跡 · 相似度 · Engineering ·

2024 年 4 月 11 日

KestRel: Relational Verification Using E-Graphs for Program Alignment

Robert Dickerson,Prasita Mukherjee,Benjamin Delaware

Many interesting program properties involve the execution of multiple programs, including observational equivalence, noninterference, co-termination, monotonicity, and idempotency. One popular approach to reasoning about these sorts of relational properties is to construct and verify a product program: a program whose correctness implies that the individual programs exhibit the desired relational property. A key challenge in product program construction is finding a good alignment of the original programs. An alignment puts subparts of the original programs into correspondence so that their similarities can be exploited in order to simplify verification. We propose an approach to product program construction that uses e-graphs, equality saturation, and algebraic realignment rules to efficiently represent and build verifiable product programs. A key ingredient of our solution is a novel data-driven extraction technique that uses execution traces of product programs to identify candidate solutions that are semantically well-aligned. We have implemented a relational verification engine based on our proposed approach, called KestRel, and use it to evaluate our approach over a suite of benchmarks taken from the relational verification literature.

路徑 · 優化器 · 約束 · 線性的 · 回合 ·

2024 年 4 月 10 日

TOPPQuad: Dynamically-Feasible Time Optimal Path Parametrization for Quadrotors

Katherine Mao,Igor Spasojevic,M. Ani Hsieh,Vijay Kumar

Planning time-optimal trajectories for quadrotors in cluttered environments is a challenging, non-convex problem. This paper addresses minimizing the traversal time of a given collision-free geometric path without violating bounds on individual motor thrusts of the vehicle. Previous approaches have either relied on convex relaxations that do not guarantee dynamic feasibility, or have generated overly conservative time parametrizations. We propose TOPPQuad, a time-optimal path parameterization algorithm for quadrotors which explicitly incorporates quadrotor rigid body dynamics and constraints such as bounds on inputs (including motor speeds) and state of the vehicle (including the pose, linear and angular velocity and acceleration). We demonstrate the ability of the planner to generate faster trajectories that respect hardware constraints of the robot compared to several planners with relaxed notions of dynamic feasibility. We also demonstrate how TOPPQuad can be used to plan trajectories for quadrotors that utilize bidirectional motors. Overall, the proposed approach paves a way towards maximizing the efficacy of autonomous micro aerial vehicles while ensuring their safety.

機器人 · Integration · 回合 · 設計 · 論文 ·

2024 年 4 月 7 日

Legibot: Generating Legible Motions for Service Robots Using Cost-Based Local Planners

Javad Amirian,Mouad Abrini,Mohamed Chetouani

With the increasing presence of social robots in various environments and applications, there is an increasing need for these robots to exhibit socially-compliant behaviors. Legible motion, characterized by the ability of a robot to clearly and quickly convey intentions and goals to the individuals in its vicinity, through its motion, holds significant importance in this context. This will improve the overall user experience and acceptance of robots in human environments. In this paper, we introduce a novel approach to incorporate legibility into local motion planning for mobile robots. This can enable robots to generate legible motions in real-time and dynamic environments. To demonstrate the effectiveness of our proposed methodology, we also provide a robotic stack designed for deploying legibility-aware motion planning in a social robot, by integrating perception and localization components.

tuning · MoDELS · 語言模型化 · Performer · Agent ·

2024 年 4 月 4 日

CMAT: A Multi-Agent Collaboration Tuning Framework for Enhancing Small Language Models

Xuechen Liang,Meiling Tao,Tianyu Shi,Yiting Xie

Open large language models (LLMs) have significantly advanced the field of natural language processing, showcasing impressive performance across various tasks.Despite the significant advancements in LLMs, their effective operation still relies heavily on human input to accurately guide the dialogue flow, with agent tuning being a crucial optimization technique that involves human adjustments to the model for better response to such guidance.Addressing this dependency, our work introduces the TinyAgent model, trained on a meticulously curated high-quality dataset. We also present the Collaborative Multi-Agent Tuning (CMAT) framework, an innovative system designed to augment language agent capabilities through adaptive weight updates based on environmental feedback. This framework fosters collaborative learning and real-time adaptation among multiple intelligent agents, enhancing their context-awareness and long-term memory. In this research, we propose a new communication agent framework that integrates multi-agent systems with environmental feedback mechanisms, offering a scalable method to explore cooperative behaviors. Notably, our TinyAgent-7B model exhibits performance on par with GPT-3.5, despite having fewer parameters, signifying a substantial improvement in the efficiency and effectiveness of LLMs.

DFP · Performer · Learning · ForCES · Extensibility ·

2024 年 4 月 3 日

MRSch: Multi-Resource Scheduling for HPC

Boyang Li,Yuping Fan,Matthew Dearing,Zhiling Lan,Paul Richy,William Allcocky,Michael Papka

Emerging workloads in high-performance computing (HPC) are embracing significant changes, such as having diverse resource requirements instead of being CPU-centric. This advancement forces cluster schedulers to consider multiple schedulable resources during decision-making. Existing scheduling studies rely on heuristic or optimization methods, which are limited by an inability to adapt to new scenarios for ensuring long-term scheduling performance. We present an intelligent scheduling agent named MRSch for multi-resource scheduling in HPC that leverages direct future prediction (DFP), an advanced multi-objective reinforcement learning algorithm. While DFP demonstrated outstanding performance in a gaming competition, it has not been previously explored in the context of HPC scheduling. Several key techniques are developed in this study to tackle the challenges involved in multi-resource scheduling. These techniques enable MRSch to learn an appropriate scheduling policy automatically and dynamically adapt its policy in response to workload changes via dynamic resource prioritizing. We compare MRSch with existing scheduling methods through extensive tracebase simulations. Our results demonstrate that MRSch improves scheduling performance by up to 48% compared to the existing scheduling methods.

推斷 · ReQuEST · 縮放 · Learning · Wireless Networks ·

2024 年 3 月 31 日

Sponge: Inference Serving with Dynamic SLOs Using In-Place Vertical Scaling

Kamran Razavi,Saeid Ghafouri,Max Mühlh?user,Pooyan Jamshidi,Lin Wang

Mobile and IoT applications increasingly adopt deep learning inference to provide intelligence. Inference requests are typically sent to a cloud infrastructure over a wireless network that is highly variable, leading to the challenge of dynamic Service Level Objectives (SLOs) at the request level. This paper presents Sponge, a novel deep learning inference serving system that maximizes resource efficiency while guaranteeing dynamic SLOs. Sponge achieves its goal by applying in-place vertical scaling, dynamic batching, and request reordering. Specifically, we introduce an Integer Programming formulation to capture the resource allocation problem, providing a mathematical model of the relationship between latency, batch size, and resources. We demonstrate the potential of Sponge through a prototype implementation and preliminary experiments and discuss future works.

BART · 圖 · MoDELS · 知識圖譜 · 生成模型 ·

2021 年 1 月 21 日

KG-BART: Knowledge Graph-Augmented BART for Generative Commonsense Reasoning

Ye Liu,Yao Wan,Lifang He,Hao Peng,Philip S. Yu

from arxiv, 10 pages, 7 figures, Appear in AAAI 2021

Generative commonsense reasoning which aims to empower machines to generate sentences with the capacity of reasoning over a set of concepts is a critical bottleneck for text generation. Even the state-of-the-art pre-trained language generation models struggle at this task and often produce implausible and anomalous sentences. One reason is that they rarely consider incorporating the knowledge graph which can provide rich relational information among the commonsense concepts. To promote the ability of commonsense reasoning for text generation, we propose a novel knowledge graph augmented pre-trained language generation model KG-BART, which encompasses the complex relations of concepts through the knowledge graph and produces more logical and natural sentences as output. Moreover, KG-BART can leverage the graph attention to aggregate the rich concept semantics that enhances the model generalization on unseen concept sets. Experiments on benchmark CommonGen dataset verify the effectiveness of our proposed approach by comparing with several strong pre-trained language generation models, particularly KG-BART outperforms BART by 5.80, 4.60, in terms of BLEU-3, 4. Moreover, we also show that the generated context by our model can work as background scenarios to benefit downstream commonsense QA tasks.

Performer · 判別器 · 正例 · 假陽性 · 監督 ·

2018 年 5 月 24 日

DSGAN: Generative Adversarial Training for Distant Supervision Relation Extraction

Pengda Qin,Weiran Xu,William Yang Wang

Distant supervision can effectively label data for relation extraction, but suffers from the noise labeling problem. Recent works mainly perform soft bag-level noise reduction strategies to find the relatively better samples in a sentence bag, which is suboptimal compared with making a hard decision of false positive samples in sentence level. In this paper, we introduce an adversarial learning framework, which we named DSGAN, to learn a sentence-level true-positive generator. Inspired by Generative Adversarial Networks, we regard the positive samples generated by the generator as the negative samples to train the discriminator. The optimal generator is obtained until the discrimination ability of the discriminator has the greatest decline. We adopt the generator to filter distant supervision training dataset and redistribute the false positive instances into the negative set, in which way to provide a cleaned dataset for relation classification. The experimental results show that the proposed strategy significantly improves the performance of distant supervision relation extraction comparing to state-of-the-art systems.