亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Experimentation is an essential method for causal inference in any empirical discipline. Crossover-design experiments are common in Software Engineering (SE) research. In these, subjects apply more than one treatment in different orders. This design increases the amount of obtained data and deals with subject variability but introduces threats to internal validity like the learning and carryover effect. Vegas et al. reviewed the state of practice for crossover designs in SE research and provided guidelines on how to address its threats during data analysis while still harnessing its benefits. In this paper, we reflect on the impact of these guidelines and review the state of analysis of crossover design experiments in SE publications between 2015 and March 2024. To this end, by conducting a forward snowballing of the guidelines, we survey 136 publications reporting 67 crossover-design experiments and evaluate their data analysis against the provided guidelines. The results show that the validity of data analyses has improved compared to the original state of analysis. Still, despite the explicit guidelines, only 29.5% of all threats to validity were addressed properly. While the maturation and the optimal sequence threats are properly addressed in 35.8% and 38.8% of all studies in our sample respectively, the carryover threat is only modeled in about 3% of the observed cases. The lack of adherence to the analysis guidelines threatens the validity of the conclusions drawn from crossover design experiments

相關內容

Recent advances in diffusion models have significantly improved text-to-image (T2I) generation, but they often struggle to balance fine-grained precision with high-level control. Methods like ControlNet and T2I-Adapter excel at following sketches by seasoned artists but tend to be overly rigid, replicating unintentional flaws in sketches from novice users. Meanwhile, coarse-grained methods, such as sketch-based abstraction frameworks, offer more accessible input handling but lack the precise control needed for detailed, professional use. To address these limitations, we propose KnobGen, a dual-pathway framework that democratizes sketch-based image generation by seamlessly adapting to varying levels of sketch complexity and user skill. KnobGen uses a Coarse-Grained Controller (CGC) module for high-level semantics and a Fine-Grained Controller (FGC) module for detailed refinement. The relative strength of these two modules can be adjusted through our knob inference mechanism to align with the user's specific needs. These mechanisms ensure that KnobGen can flexibly generate images from both novice sketches and those drawn by seasoned artists. This maintains control over the final output while preserving the natural appearance of the image, as evidenced on the MultiGen-20M dataset and a newly collected sketch dataset.

Prior research has revealed that certain abstract concepts are linearly represented as directions in the representation space of LLMs, predominantly centered around English. In this paper, we extend this investigation to a multilingual context, with a specific focus on human values-related concepts (i.e., value concepts) due to their significance for AI safety. Through our comprehensive exploration covering 7 types of human values, 16 languages and 3 LLM series with distinct multilinguality (e.g., monolingual, bilingual and multilingual), we first empirically confirm the presence of value concepts within LLMs in a multilingual format. Further analysis on the cross-lingual characteristics of these concepts reveals 3 traits arising from language resource disparities: cross-lingual inconsistency, distorted linguistic relationships, and unidirectional cross-lingual transfer between high- and low-resource languages, all in terms of value concepts. Moreover, we validate the feasibility of cross-lingual control over value alignment capabilities of LLMs, leveraging the dominant language as a source language. Ultimately, recognizing the significant impact of LLMs' multilinguality on our results, we consolidate our findings and provide prudent suggestions on the composition of multilingual data for LLMs pre-training.

Learning is a key motivator behind information search behavior. With the emergence of LLM-based chatbots, students are increasingly turning to these tools as their primary resource for acquiring knowledge. However, the transition from traditional resources like textbooks and web searches raises concerns among educators. They worry that these fully-automated LLMs might lead students to delegate critical steps of search as learning. In this paper, we systematically uncover three main concerns from educators' perspectives. In response to these concerns, we conducted a mixed-methods study with 92 university students to compare three learning sources with different automation levels. Our results show that LLMs support comprehensive understanding of key concepts without promoting passive learning, though their effectiveness in knowledge retention was limited. Additionally, we found that academic performance impacted both learning outcomes and search patterns. Notably, higher-competence learners engaged more deeply with content through reading-intensive behaviors rather than relying on search activities.

Diffusion models have revolted the field of text-to-image generation recently. The unique way of fusing text and image information contributes to their remarkable capability of generating highly text-related images. From another perspective, these generative models imply clues about the precise correlation between words and pixels. In this work, a simple but effective method is proposed to utilize the attention mechanism in the denoising network of text-to-image diffusion models. Without re-training nor inference-time optimization, the semantic grounding of phrases can be attained directly. We evaluate our method on Pascal VOC 2012 and Microsoft COCO 2014 under weakly-supervised semantic segmentation setting and our method achieves superior performance to prior methods. In addition, the acquired word-pixel correlation is found to be generalizable for the learned text embedding of customized generation methods, requiring only a few modifications. To validate our discovery, we introduce a new practical task called "personalized referring image segmentation" with a new dataset. Experiments in various situations demonstrate the advantages of our method compared to strong baselines on this task. In summary, our work reveals a novel way to extract the rich multi-modal knowledge hidden in diffusion models for segmentation.

A program's exceptional behavior can substantially complicate its control flow, and hence accurately reasoning about the program's correctness. On the other hand, formally verifying realistic programs is likely to involve exceptions -- a ubiquitous feature in modern programming languages. In this paper, we present a novel approach to verify the exceptional behavior of Java programs, which extends our previous work on ByteBack. ByteBack works on a program's bytecode, while providing means to specify the intended behavior at the source-code level; this approach sets ByteBack apart from most state-of-the-art verifiers that target source code. To explicitly model a program's exceptional behavior in a way that is amenable to formal reasoning, we introduce Vimp: a high-level bytecode representation that extends the Soot framework's Grimp with verification-oriented features, thus serving as an intermediate layer between bytecode and the Boogie intermediate verification language. Working on bytecode through this intermediate layer brings flexibility and adaptability to new language versions and variants: as our experiments demonstrate, ByteBack can verify programs involving exceptional behavior in all versions of Java, as well as in Scala and Kotlin (two other popular JVM languages).

ABCDE is a sophisticated technique for evaluating differences between very large clusterings. Its main metric that characterizes the magnitude of the difference between two clusterings is the JaccardDistance, which is a true distance metric in the space of all clusterings of a fixed set of (weighted) items. The JaccardIndex is the complementary metric that characterizes the similarity of two clusterings. Its relationship with the JaccardDistance is simple: JaccardDistance + JaccardIndex = 1. This paper decomposes the JaccardDistance and the JaccardIndex further. In each case, the decomposition yields Impact and Quality metrics. The Impact metrics measure aspects of the magnitude of the clustering diff, while Quality metrics use human judgements to measure how much the clustering diff improves the quality of the clustering. The decompositions of this paper offer more and deeper insight into a clustering change. They also unlock new techniques for debugging and exploring the nature of the clustering diff. The new metrics are mathematically well-behaved and they are interrelated via simple equations. While the work can be seen as an alternative formal framework for ABCDE, we prefer to view it as complementary. It certainly offers a different perspective on the magnitude and the quality of a clustering change, and users can use whatever they want from each approach to gain more insight into a change.

Investing in an Information Security Management System (ISMS) enhances organizational competitiveness and protects information assets. However, introducing an ISMS consumes significant resources; for instance, implementing an ISMS according to the ISO27001 standard involves documenting 116 different controls. This paper discusses how Kempower, a Finnish company, has effectively used generative AI to create and implement an ISMS, significantly reducing the resources required. This research studies how the use of generative AI can enhance the process of creating an ISMS. We conducted seven semi-structured interviews held with various stakeholders of the ISMS project, who had varying levels experience in cyber security and AI.

The advent of large language models marks a revolutionary breakthrough in artificial intelligence. With the unprecedented scale of training and model parameters, the capability of large language models has been dramatically improved, leading to human-like performances in understanding, language synthesizing, and common-sense reasoning, etc. Such a major leap-forward in general AI capacity will change the pattern of how personalization is conducted. For one thing, it will reform the way of interaction between humans and personalization systems. Instead of being a passive medium of information filtering, large language models present the foundation for active user engagement. On top of such a new foundation, user requests can be proactively explored, and user's required information can be delivered in a natural and explainable way. For another thing, it will also considerably expand the scope of personalization, making it grow from the sole function of collecting personalized information to the compound function of providing personalized services. By leveraging large language models as general-purpose interface, the personalization systems may compile user requests into plans, calls the functions of external tools to execute the plans, and integrate the tools' outputs to complete the end-to-end personalization tasks. Today, large language models are still being developed, whereas the application in personalization is largely unexplored. Therefore, we consider it to be the right time to review the challenges in personalization and the opportunities to address them with LLMs. In particular, we dedicate this perspective paper to the discussion of the following aspects: the development and challenges for the existing personalization system, the newly emerged capabilities of large language models, and the potential ways of making use of large language models for personalization.

Knowledge graph embedding (KGE) is a increasingly popular technique that aims to represent entities and relations of knowledge graphs into low-dimensional semantic spaces for a wide spectrum of applications such as link prediction, knowledge reasoning and knowledge completion. In this paper, we provide a systematic review of existing KGE techniques based on representation spaces. Particularly, we build a fine-grained classification to categorise the models based on three mathematical perspectives of the representation spaces: (1) Algebraic perspective, (2) Geometric perspective, and (3) Analytical perspective. We introduce the rigorous definitions of fundamental mathematical spaces before diving into KGE models and their mathematical properties. We further discuss different KGE methods over the three categories, as well as summarise how spatial advantages work over different embedding needs. By collating the experimental results from downstream tasks, we also explore the advantages of mathematical space in different scenarios and the reasons behind them. We further state some promising research directions from a representation space perspective, with which we hope to inspire researchers to design their KGE models as well as their related applications with more consideration of their mathematical space properties.

In pace with developments in the research field of artificial intelligence, knowledge graphs (KGs) have attracted a surge of interest from both academia and industry. As a representation of semantic relations between entities, KGs have proven to be particularly relevant for natural language processing (NLP), experiencing a rapid spread and wide adoption within recent years. Given the increasing amount of research work in this area, several KG-related approaches have been surveyed in the NLP research community. However, a comprehensive study that categorizes established topics and reviews the maturity of individual research streams remains absent to this day. Contributing to closing this gap, we systematically analyzed 507 papers from the literature on KGs in NLP. Our survey encompasses a multifaceted review of tasks, research types, and contributions. As a result, we present a structured overview of the research landscape, provide a taxonomy of tasks, summarize our findings, and highlight directions for future work.

北京阿比特科技有限公司