On open source software (OSS) platforms such as GitHub, forking and accepting pull-requests is an important way for OSS projects to receive contributions, especially from external contributors who cannot commit directly to the source repositories. A large number of forks is often considered an indicator of a project's popularity. While extensive studies have been conducted to understand the reasons for forking, communication between forks, and the features and impacts of forks, there are few quantitative measures that provide a simple yet informative way to gain insights about an OSS project's forks besides their count. Inspired by studies on biodiversity and OSS team diversity, in this paper, we propose an approach to measure the diversity of an OSS project's forks (i.e., its fork population). We devise a novel fork entropy metric based on Rao's quadratic entropy to measure such diversity according to the forks' modifications to project files. With properties including symmetry, continuity, and monotonicity, the proposed fork entropy metric is effective in quantifying the diversity of a project's fork population. To further examine the usefulness of the proposed metric, we conduct empirical studies with data retrieved from fifty projects on GitHub. We observe significant correlations between a project's fork entropy and different outcome variables, including the project's external productivity measured by the number of external contributors' commits, the acceptance rate of external contributors' pull-requests, and the number of reported bugs. We also observe significant interactions between fork entropy and other factors such as the number of forks. The results suggest that fork entropy effectively enriches our understanding of OSS projects' forks beyond the simple number of forks, and can potentially support further research and applications.
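To make the idea of a diversity score based on Rao's quadratic entropy more concrete, the following is a minimal sketch, not the paper's actual metric. Rao's quadratic entropy is commonly written as $Q = \sum_{i}\sum_{j} d_{ij}\, p_i\, p_j$, where $p_i$ is the weight of item $i$ and $d_{ij}$ is a dissimilarity between items $i$ and $j$ (with $d_{ii} = 0$). The abstract does not state how the dissimilarity between forks is defined, so the sketch assumes, purely for illustration, that each fork is represented by the set of files it modified, that forks are weighted uniformly, and that dissimilarity is the Jaccard distance between those sets. The function names are hypothetical.

```python
from itertools import combinations


def jaccard_distance(a: set, b: set) -> float:
    """Return 1 - |intersection| / |union|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def rao_quadratic_entropy(items, weights=None, distance=jaccard_distance):
    """Rao's quadratic entropy: sum over ordered pairs (i, j) of d_ij * p_i * p_j.

    items    -- one set of modified files per fork (an illustrative assumption)
    weights  -- p_i for each item; defaults to uniform weights 1/n
    distance -- symmetric dissimilarity with distance(x, x) == 0
    """
    n = len(items)
    if n == 0:
        return 0.0
    if weights is None:
        weights = [1.0 / n] * n
    total = 0.0
    # Each unordered pair appears twice in the ordered double sum, hence the factor 2.
    for i, j in combinations(range(n), 2):
        total += 2.0 * weights[i] * weights[j] * distance(items[i], items[j])
    return total


# Hypothetical fork population: two forks touch the same file, one touches another.
forks = [{"a.py"}, {"a.py"}, {"b.py"}]
print(rao_quadratic_entropy(forks))
```

Under these assumptions, a population of forks that all modify the same files scores 0, and the score grows as forks modify increasingly different sets of files.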

Related content

Identifiable · Analysis · Online · Google Play · Automator

November 2, 2023

Inclusiveness Matters: A Large-Scale Analysis of User Feedback

Nowshin Nawar Arony,Ze Shi Li,Bowen Xu,Daniela Damian

In an era of rapidly expanding software usage, catering to the diverse needs of users from various backgrounds has become a critical challenge. Inclusiveness, representing a core human value, is frequently overlooked during software development, leading to user dissatisfaction. Users often engage in discourse on online platforms where they indicate their concerns. In this study, we leverage user feedback from three popular online sources, Reddit, Google Play Store, and Twitter, for 50 of the most popular apps in the world to reveal the inclusiveness-related concerns from end users. Using a Socio-Technical Grounded Theory approach, we analyzed 23,107 posts across the three sources and identified 1,211 inclusiveness-related posts. We organize our empirical results in a taxonomy for inclusiveness comprising 6 major categories: Fairness, Technology, Privacy, Demography, Usability, and Other Human Values. To explore automated support for identifying inclusiveness-related posts, we experimented with five state-of-the-art pre-trained large language models (LLMs) and found that these models' effectiveness is high and yet varied depending on the data source. GPT-2 performed best on Reddit, BERT on the Google Play Store, and BART on Twitter. Our study provides an in-depth view of inclusiveness-related user feedback from the most popular apps and online sources. We provide implications and recommendations that can be used to bridge the gap between user expectations and software so that software developers can resonate with the varied and evolving needs of the wide spectrum of users.

Block · Learning · GROUP · Performer · Better

November 1, 2023

Measuring the Impact of Distractors on Student Learning Gains while Using Proof Blocks

Seth Poulsen,Hongxuan Chen,Yael Gertner,Benjamin Cosman,Matthew West,Geoffrey L Herman

from arxiv, arXiv admin note: text overlap with arXiv:2211.09609

Background: Proof Blocks is a software tool that enables students to construct proofs by assembling prewritten lines and gives them automated feedback. Prior work on learning gains from Proof Blocks has focused on comparing learning gains from Proof Blocks against other learning activities such as writing proofs or reading. Purpose: The study described in this paper aims to compare learning gains from different variations of Proof Blocks. Specifically, we attempt to quantify the difference in learning gains for students who complete Proof Blocks problems with and without distractors. Methods: We conducted a randomized controlled trial with three experimental groups: a control group that completed an off-topic Proof Blocks activity, one that completed a Proof Blocks activity without distractors, and one that completed a Proof Blocks activity with distractors. All three groups read a book chapter on proof by induction before completing their activity. Findings: The group that completed the Proof Blocks activity with distractors performed better on the posttest than the group that completed the Proof Blocks activity without distractors, who in turn performed better than the group that completed the off-topic Proof Blocks activity. However, none of these differences were statistically significant. While the results of this study are inconclusive, we hope that it can serve as a foundation for future work.

RE · Analysis · RE · Engineering · Processing (programming language)

November 1, 2023

Advancing Requirements Engineering through Generative AI: Assessing the Role of LLMs

Chetan Arora,John Grundy,Mohamed Abdelrazek

Requirements Engineering (RE) is a critical phase in software development including the elicitation, analysis, specification, and validation of software requirements. Despite the importance of RE, it remains a challenging process due to the complexities of communication, uncertainty in the early stages, and inadequate automation support. In recent years, large language models (LLMs) have shown significant promise in diverse domains, including natural language processing, code generation, and program understanding. This chapter explores the potential of LLMs in driving RE processes, aiming to improve the efficiency and accuracy of requirements-related tasks. We propose key directions and a SWOT analysis for research and development in using LLMs for RE, focusing on the potential for requirements elicitation, analysis, specification, and validation. We further present the results from a preliminary evaluation in this context.

Estimation/Estimator · Target domain · Learning · Statistic · Comprehensibility

October 31, 2023

Text-Transport: Toward Learning Causal Effects of Natural Language

Victoria Lin,Louis-Philippe Morency,Eli Ben-Michael

from arxiv, Accepted to EMNLP 2023

As language technologies gain prominence in real-world settings, it is important to understand how changes to language affect reader perceptions. This can be formalized as the causal effect of varying a linguistic attribute (e.g., sentiment) on a reader's response to the text. In this paper, we introduce Text-Transport, a method for estimation of causal effects from natural language under any text distribution. Current approaches for valid causal effect estimation require strong assumptions about the data, meaning the data from which one can estimate valid causal effects often is not representative of the actual target domain of interest. To address this issue, we leverage the notion of distribution shift to describe an estimator that transports causal effects between domains, bypassing the need for strong assumptions in the target domain. We derive statistical guarantees on the uncertainty of this estimator, and we report empirical results and analyses that support the validity of Text-Transport across data settings. Finally, we use Text-Transport to study a realistic setting--hate speech on social media--in which causal effects do shift significantly between text domains, demonstrating the necessity of transport when conducting causal inference on natural language.

MoDELS · Feasible · Code · Backbone · Processing (programming language)

October 31, 2023

On Extracting Specialized Code Abilities from Large Language Models: A Feasibility Study

Zongjie Li,Chaozheng Wang,Pingchuan Ma,Chaowei Liu,Shuai Wang,Daoyuan Wu,Cuiyun Gao,Yang Liu

from arxiv, 13 pages

Recent advances in large language models (LLMs) significantly boost their usage in software engineering. However, training a well-performing LLM demands a substantial workforce for data collection and annotation. Moreover, training datasets may be proprietary or partially open, and the process often requires a costly GPU cluster. The intellectual property value of commercial LLMs makes them attractive targets for imitation attacks, but creating an imitation model with comparable parameters still incurs high costs. This motivates us to explore a practical and novel direction: slicing commercial black-box LLMs using medium-sized backbone models. In this paper, we explore the feasibility of launching imitation attacks on LLMs to extract their specialized code abilities, such as "code synthesis" and "code translation." We systematically investigate the effectiveness of launching code ability extraction attacks under different code-related tasks with multiple query schemes, including zero-shot, in-context, and Chain-of-Thought. We also design response checks to refine the outputs, leading to an effective imitation training process. Our results show promising outcomes, demonstrating that with a reasonable number of queries, attackers can train a medium-sized backbone model to replicate specialized code behaviors similar to the target LLMs. We summarize our findings and insights to help researchers better understand the threats posed by imitation attacks, including revealing a practical attack surface for generating adversarial code examples against LLMs.

WEB · Language modeling · MoDELS · Prompt · INTERACT

October 31, 2023

AllTogether: Investigating the Efficacy of Spliced Prompt for Web Navigation using Large Language Models

Jiarun Liu,Wentao Hu,Chunhong Zhang

from arxiv, Include wrong information in comment. Should be 7 pages and not published yet

Large Language Models (LLMs) have emerged as promising agents for web navigation tasks, interpreting objectives and interacting with web pages. However, the efficiency of spliced prompts for such tasks remains underexplored. We introduce AllTogether, a standardized prompt template that enhances task context representation, thereby improving LLMs' performance in HTML-based web navigation. We evaluate the efficacy of this approach through prompt learning and instruction finetuning based on open-source Llama-2 and API-accessible GPT models. Our results reveal that models like GPT-4 outperform smaller models in web navigation tasks. Additionally, we find that the length of the HTML snippet and the history trajectory significantly influence performance, and prior step-by-step instructions prove less effective than real-time environmental feedback. Overall, we believe our work provides valuable insights for future research in LLM-driven web agents.

MIMO · Layer · Analysis · Error rate · Block

October 31, 2023

Multi-Domain Polarization for Enhancing the Physical Layer Security of MIMO Systems

Luping Xiang,Yao Zeng,Jie Hu,Kun Yang,Lajos Hanzo

A novel Physical Layer Security (PLS) framework is conceived for enhancing the security of wireless communication systems by exploiting multi-domain polarization in Multiple-Input Multiple-Output (MIMO) systems. We design a sophisticated key generation scheme based on multi-domain polarization, and the corresponding receivers. An in-depth analysis of the system's secrecy rate is provided, demonstrating the confidentiality of our approach in the presence of eavesdroppers having strong computational capabilities. More explicitly, our simulation results and theoretical analysis corroborate the advantages of the proposed scheme in terms of its bit error rate (BER), block error rate (BLER), and maximum achievable secrecy rate. Our findings indicate that the innovative PLS framework effectively enhances the security and reliability of wireless communication systems. For instance, in a $4\times4$ MIMO setup, the proposed PLS strategy exhibits an improvement of $2$ dB compared to conventional MIMO systems at a BLER of $2\cdot 10^{-5}$, while the eavesdropper's BLER reaches $1$.

Learning · Readability · Code · Automator · Identifiable

October 30, 2023

LILO: Learning Interpretable Libraries by Compressing and Documenting Code

Gabriel Grand,Lionel Wong,Matthew Bowers,Theo X. Olausson,Muxin Liu,Joshua B. Tenenbaum,Jacob Andreas

While large language models (LLMs) now excel at code generation, a key aspect of software development is the art of refactoring: consolidating code into libraries of reusable and readable programs. In this paper, we introduce LILO, a neurosymbolic framework that iteratively synthesizes, compresses, and documents code to build libraries tailored to particular problem domains. LILO combines LLM-guided program synthesis with recent algorithmic advances in automated refactoring from Stitch: a symbolic compression system that efficiently identifies optimal lambda abstractions across large code corpora. To make these abstractions interpretable, we introduce an auto-documentation (AutoDoc) procedure that infers natural language names and docstrings based on contextual examples of usage. In addition to improving human readability, we find that AutoDoc boosts performance by helping LILO's synthesizer to interpret and deploy learned abstractions. We evaluate LILO on three inductive program synthesis benchmarks for string editing, scene reasoning, and graphics composition. Compared to existing neural and symbolic methods - including the state-of-the-art library learning algorithm DreamCoder - LILO solves more complex tasks and learns richer libraries that are grounded in linguistic knowledge.

Origin · MoDELS · Guidance · Optimizer · Example

October 30, 2023

IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Generative AI

Bochuan Cao,Changjiang Li,Ting Wang,Jinyuan Jia,Bo Li,Jinghui Chen

from arxiv, 21 pages, 11 figures, 9 tables. Accepted by NeurIPS 2023

Diffusion-based image generation models, such as Stable Diffusion or DALL-E 2, are able to learn from given images and generate high-quality samples following the guidance from prompts. For instance, they can be used to create artistic images that mimic the style of an artist based on his/her original artworks or to maliciously edit the original images for fake content. However, such ability also brings serious ethical issues without proper authorization from the owner of the original images. In response, several attempts have been made to protect the original images from such unauthorized data usage by adding imperceptible perturbations, which are designed to mislead the diffusion model and make it unable to properly generate new samples. In this work, we introduce a perturbation purification platform, named IMPRESS, to evaluate the effectiveness of imperceptible perturbations as a protective measure. IMPRESS is based on the key observation that imperceptible perturbations could lead to a perceptible inconsistency between the original image and the diffusion-reconstructed image, which can be used to devise a new optimization strategy for purifying the image, which may weaken the protection of the original image from unauthorized data usage (e.g., style mimicking, malicious editing). The proposed IMPRESS platform offers a comprehensive evaluation of several contemporary protection methods, and can be used as an evaluation platform for future protection methods.

Task-oriented dialogue system · Score · Better · Estimation/Estimator · Correlation coefficient

November 4, 2019

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

Sarik Ghazarian,Ralph Weischedel,Aram Galstyan,Nanyun Peng

User engagement is a critical metric for evaluating the quality of open-domain dialogue systems. Prior work has focused on conversation-level engagement by using heuristically constructed features such as the number of turns and the total time of the conversation. In this paper, we investigate the possibility and efficacy of estimating utterance-level engagement and define a novel metric, predictive engagement, for automatic evaluation of open-domain dialogue systems. Our experiments demonstrate that (1) human annotators have high agreement on assessing utterance-level engagement scores; (2) conversation-level engagement scores can be predicted from properly aggregated utterance-level engagement scores. Furthermore, we show that the utterance-level engagement scores can be learned from data. These scores can improve automatic evaluation metrics for open-domain dialogue systems, as shown by correlation with human judgements. This suggests that predictive engagement can be used as real-time feedback for training better dialogue models.