Astrophysical simulations are computation-, memory-, and thus energy-intensive, and therefore require new hardware advances for progress. Stony Brook University recently expanded its computing cluster "SeaWulf" with the addition of 94 new nodes featuring Intel Sapphire Rapids Xeon Max series CPUs. We present a performance and power efficiency study of this hardware performed with FLASH: a multi-scale, multi-physics, adaptive mesh-based software instrument. We extend this study to compare performance to that of Stony Brook's Ookami testbed, which features ARM-based A64FX-700 processors, and SeaWulf's AMD EPYC Milan and Intel Skylake nodes. Our application is a stellar explosion known as a thermonuclear (Type Ia) supernova, and for this 3D problem FLASH includes operators for hydrodynamics, gravity, and nuclear burning, in addition to routines for the material equation of state. We perform a strong-scaling study with a 220 GB problem size to explore both single- and multi-node performance. Our study explores the performance of different MPI mappings and the distribution of processors across nodes. From these tests, we determined the optimal configuration to balance runtime and energy consumption for our application.
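As an illustration of the quantities a strong-scaling and energy study of this kind typically reports, the sketch below computes speedup, parallel efficiency, and energy-to-solution from per-run measurements. It is a minimal sketch, not the authors' analysis: the `Run` record, the `strong_scaling_table` helper, and every number in the example are hypothetical, and it assumes a constant mean power draw per node over the run.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One measurement of a fixed-size problem (hypothetical record layout)."""
    nodes: int
    runtime_s: float
    mean_power_w_per_node: float  # assumed constant over the run


def strong_scaling_table(runs):
    """Return speedup, efficiency, and energy relative to the smallest run."""
    ordered = sorted(runs, key=lambda r: r.nodes)
    base = ordered[0]
    rows = []
    for r in ordered:
        speedup = base.runtime_s / r.runtime_s
        # Ideal strong scaling gives speedup equal to the node ratio.
        efficiency = speedup / (r.nodes / base.nodes)
        # Joules = watts * seconds; 3.6e6 J per kWh.
        energy_kwh = r.nodes * r.mean_power_w_per_node * r.runtime_s / 3.6e6
        rows.append({
            "nodes": r.nodes,
            "speedup": speedup,
            "efficiency": efficiency,
            "energy_kwh": energy_kwh,
        })
    return rows


if __name__ == "__main__":
    # Made-up numbers purely for illustration; not results from the study.
    example = [Run(1, 1000.0, 500.0), Run(2, 540.0, 500.0), Run(4, 300.0, 500.0)]
    for row in strong_scaling_table(example):
        print(row)
```

When efficiency falls below 1, adding nodes shortens the runtime but raises the energy-to-solution, which is the trade-off a runtime-versus-energy comparison has to balance.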

Related content

Design · SimPLe · Graph · Data visualization · Processing (programming language)

October 31, 2024

MAVIL: Design of a Multidimensional Assessment of Visual Data Literacy and its Application in a Representative Survey

Antonia Saske,Torsten Möller,Laura Koesten,Judith Staudner,Sylvia Kritzinger

from arxiv, 9 pages, 8 figures, 1 table

The ability to read, interpret, and critique data visualizations has mainly been assessed using data visualization tasks like value retrieval. Although evidence on different facets of Visual Data Literacy (VDL) is well established in visualization research and includes numeracy, graph familiarity, or aesthetic elements, these facets have not been sufficiently considered in ability assessments. Here, VDL is considered a multidimensional ability whose facets can be partially self-assessed. We introduce an assessment in which VDL is deconstructed as a process of understanding, in reference to frameworks from the learning sciences. MAVIL, Multidimensional Assessment of Visual Data Literacy, is composed of six ability dimensions: General Impression/Abstract Thinking, Graph Elements/Familiarity, Aesthetic Perception, Visualization Criticism, Data Reading Tasks and Numeracy/Topic Knowledge. MAVIL was designed for general audiences and implemented in a survey (n=438), representative of Austria's age groups (18-74 years) and gender split. The survey mirrors the population's VDL and shows the perception of two climate data visualizations, a line chart and a bar chart. We found that $48\%$ of respondents make mistakes with the simple charts, while $5\%$ believe that they cannot summarize the visualization content. About a quarter have deficits in comprehending simple data units, and $19-20\%$ are unfamiliar with each displayed chart type.

Layer · Hidden state · Attention · Tokenizer · MoDELS

October 31, 2024

Attend First, Consolidate Later: On the Importance of Attention in Different LLM Layers

Amit Ben-Artzy,Roy Schwartz

In decoder-based LLMs, the representation of a given layer serves two purposes: as input to the next layer during the computation of the current token; and as input to the attention mechanism of future tokens. In this work, we show that the importance of the latter role might be overestimated. To show that, we start by manipulating the representations of previous tokens; e.g., by replacing the hidden states at some layer k with random vectors. Our experiments with four LLMs and four tasks show that this operation often leads to a small to negligible drop in performance. Importantly, this happens if the manipulation occurs in the top part of the model, i.e., when k is in the final 30-50% of the layers. In contrast, doing the same manipulation in earlier layers might lead to chance-level performance. We continue by switching the hidden state of certain tokens with hidden states of other tokens from another prompt; e.g., replacing the word "Italy" with "France" in "What is the capital of Italy?". We find that when applying this switch in the top 1/3 of the model, the model ignores it (answering "Rome"). However, if we apply it before, the model conforms to the switch ("Paris"). Our results hint at a two-stage process in transformer-based LLMs: the first part gathers input from previous tokens, while the second mainly processes that information internally.

Vision · MoDELS · Comprehensibility · Next · Tokenizer

October 30, 2024

Towards Unifying Understanding and Generation in the Era of Vision Foundation Models: A Survey from the Autoregression Perspective

Shenghao Xie,Wenqiang Zu,Mingyang Zhao,Duo Su,Shilong Liu,Ruohua Shi,Guoqi Li,Shanghang Zhang,Lei Ma

from arxiv, 17 pages, 1 table, 2 figures

Autoregression in large language models (LLMs) has shown impressive scalability by unifying all language tasks into the next-token prediction paradigm. Recently, there has been growing interest in extending this success to vision foundation models. In this survey, we review the recent advances and discuss future directions for autoregressive vision foundation models. First, we present the trend for the next generation of vision foundation models, i.e., unifying both understanding and generation in vision tasks. We then analyze the limitations of existing vision foundation models, and present a formal definition of autoregression with its advantages. Later, we categorize autoregressive vision foundation models by their vision tokenizers and autoregression backbones. Finally, we discuss several promising research challenges and directions. To the best of our knowledge, this is the first survey to comprehensively summarize autoregressive vision foundation models under the trend of unifying understanding and generation. A collection of related resources is available at //github.com/EmmaSRH/ARVFM.

Attention · Tensor · MoDELS · Full · Processing (programming language)

October 28, 2024

Long Sequence Modeling with Attention Tensorization: From Sequence to Tensor Learning

Aosong Feng,Rex Ying,Leandros Tassiulas

As the demand for processing extended textual data grows, the ability to handle long-range dependencies and maintain computational efficiency is more critical than ever. One of the key issues for long-sequence modeling using attention-based models is the mismatch between the limited-range modeling power of full attention and the long-range token dependency in the input sequence. In this work, we propose to scale up the attention receptive field by tensorizing long input sequences into compact tensor representations followed by attention on each transformed dimension. The resulting Tensorized Attention can be adopted as an efficient transformer backbone to extend input context length with improved memory and time efficiency. We show that the proposed attention tensorization encodes token dependencies as a multi-hop attention process, and is equivalent to a Kronecker decomposition of full attention. Extensive experiments show that tensorized attention can be used to adapt pretrained LLMs with improved efficiency. Notably, Llama-8B with tensorization is trained under 32,768 context length and can steadily extrapolate to 128k length during inference with $11\times$ speedup, compared to full attention with FlashAttention-2.

MoDELS · TOOLS · Reviewer · Round · INTERACT

October 26, 2024

Preparing for Super-Reactivity: Early Fault-Detection in the Development of Exceedingly Complex Reactive Systems

David Harel,Assaf Marron

We introduce the term Super-Reactive Systems to refer to reactive systems whose construction and behavior are complex, constantly changing and evolving, and heavily interwoven with other systems and the physical world. Finding hidden faults in such systems early in planning and development is critical for human safety, the environment, society and the economy. However, the complexity of the system and its interactions and the absence of adequate technical details pose a great obstacle. We propose an architecture for models and tools to overcome such barriers and enable simulation, systematic analysis, and fault detection and handling, early in the development of super-reactive systems. The approach is facilitated by the inference and abstraction capabilities and the power and knowledge afforded by large language models and associated AI tools. It is based on: (i) deferred, just-in-time interpretation of model elements that are stored in natural language form, and (ii) early capture of tacit interdependencies among seemingly orthogonal requirements.

Sphering · Scaling · Output · Large language model · Code

October 21, 2024

SPHERE: Scaling Personalized Feedback in Programming Classrooms with Structured Review of LLM Outputs

Xiaohang Tang,Sam Wong,Marcus Huynh,Zicheng He,Yalong Yang,Yan Chen

Effective personalized feedback is crucial for learning programming. However, providing personalized, real-time feedback in large programming classrooms poses significant challenges for instructors. This paper introduces SPHERE, an interactive system that leverages Large Language Models (LLMs) and structured LLM output review to scale personalized feedback for in-class coding activities. SPHERE employs two key components: an Issue Recommendation Component that identifies critical patterns in students' code and discussion, and a Feedback Review Component that uses a "strategy-detail-verify" approach for efficient feedback creation and verification. An in-lab, between-subject study demonstrates SPHERE's effectiveness in improving feedback quality and the overall feedback review process compared to a baseline system using off-the-shelf LLM outputs. This work contributes a novel approach to scaling personalized feedback in programming education, addressing the challenges of real-time response, issue prioritization, and large-scale personalization.

Performer · Spanned subspace · Confidence · Rank · Coverage

October 19, 2024

Overcoming Common Flaws in the Evaluation of Selective Classification Systems

Jeremias Traub,Till J. Bungert,Carsten T. Lüth,Michael Baumgartner,Klaus H. Maier-Hein,Lena Maier-Hein,Paul F Jaeger

Selective Classification, wherein models can reject low-confidence predictions, promises reliable translation of machine-learning based classification systems to real-world scenarios such as clinical diagnostics. While current evaluation of these systems typically assumes fixed working points based on pre-defined rejection thresholds, methodological progress requires benchmarking the general performance of systems akin to the $\mathrm{AUROC}$ in standard classification. In this work, we define 5 requirements for multi-threshold metrics in selective classification regarding task alignment, interpretability, and flexibility, and show how current approaches fail to meet them. We propose the Area under the Generalized Risk Coverage curve ($\mathrm{AUGRC}$), which meets all requirements and can be directly interpreted as the average risk of undetected failures. We empirically demonstrate the relevance of $\mathrm{AUGRC}$ on a comprehensive benchmark spanning 6 data sets and 13 confidence scoring functions. We find that the proposed metric substantially changes metric rankings on 5 out of the 6 data sets.
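To make the "average risk of undetected failures" reading concrete, here is a minimal sketch of an empirical area under a generalized risk-coverage curve. It rests on stated assumptions rather than on the authors' reference implementation: the generalized risk at a given coverage is taken to be the fraction of all samples that are both accepted and misclassified, samples are accepted in order of decreasing confidence, and the area is approximated by a plain average over the n coverage levels. The function name `augrc` and the tie handling (stable sort) are illustrative choices.

```python
def augrc(confidence, failure):
    """Empirical area under a generalized risk-coverage curve (sketch).

    confidence: per-sample confidence scores (higher = more trusted).
    failure:    per-sample 0/1 flags, 1 if the prediction is wrong.
    """
    n = len(confidence)
    if n == 0 or n != len(failure):
        raise ValueError("inputs must be non-empty and of equal length")
    # Accept samples from most to least confident (ties keep input order).
    order = sorted(range(n), key=lambda i: -confidence[i])
    accepted_failures = 0
    area = 0.0
    for i in order:
        accepted_failures += failure[i]
        # Assumed generalized risk: accepted AND wrong, over all n samples.
        area += accepted_failures / n
    return area / n


if __name__ == "__main__":
    # Toy data: the single error is ranked last (good) or first (bad).
    print(augrc([0.9, 0.8, 0.7, 0.6], [0, 0, 0, 1]))
    print(augrc([0.9, 0.8, 0.7, 0.6], [1, 0, 0, 0]))
```

Under these assumptions a confidence function that ranks failures last scores lower (better) than one that ranks them first, even though the underlying classifier accuracy is identical.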

Annotation · Learning · MoDELS · Controller · Learner

October 16, 2024

Learning to Predict Usage Options of Product Reviews with LLM-Generated Labels

Leo Kohlenberg,Leonard Horns,Frederic Sadrieh,Nils Kiele,Matthis Clausen,Konstantin Ketterer,Avetis Navasardyan,Tamara Czinczoll,Gerard de Melo,Ralf Herbrich

from arxiv, 9 pages

Annotating large datasets can be challenging. However, crowd-sourcing is often expensive and can lack quality, especially for non-trivial tasks. We propose a method of using LLMs as few-shot learners for annotating data in a complex natural language task where we learn a standalone model to predict usage options for products from customer reviews. We also propose a new evaluation metric for this scenario, HAMS4, that can be used to compare a set of strings with multiple reference sets. Learning a custom model offers individual control over energy efficiency and privacy measures compared to using the LLM directly for the sequence-to-sequence task. We compare this data annotation approach with other traditional methods and demonstrate how LLMs can enable considerable cost savings. We find that the quality of the resulting data exceeds the level attained by third-party vendor services and that GPT-4-generated labels even reach the level of domain experts. We make the code and generated labels publicly available.

Learning · Chatbot · AI · Continuity · ChatGPT

October 10, 2024

Investigating Developers' Preferences for Learning and Issue Resolution Resources in the ChatGPT Era

Ahmad Tayeb,Mohammad D. Alahmadi,Elham Tajik,Sonia Haiduc

from arxiv, International Conference on Software Maintenance and Evolution (ICSME 2024)

The landscape of software developer learning resources has continuously evolved, with recent trends favoring engaging formats like video tutorials. The emergence of Large Language Models (LLMs) like ChatGPT presents a new learning paradigm. While existing research explores the potential of LLMs in software development and education, their impact on developers' learning and solution-seeking behavior remains unexplored. To address this gap, we conducted a survey targeting software developers and computer science students, gathering 341 responses, of which 268 were completed and analyzed. This study investigates how AI chatbots like ChatGPT have influenced developers' learning preferences when acquiring new skills, exploring technologies, and resolving programming issues. Through quantitative and qualitative analysis, we explore whether AI tools supplement or replace traditional learning resources such as video tutorials, written tutorials, and Q&A forums. Our findings reveal a nuanced view: while video tutorials continue to be highly preferred for their comprehensive coverage, a significant number of respondents view AI chatbots as potential replacements for written tutorials, underscoring a shift towards more interactive and personalized learning experiences. Additionally, AI chatbots are increasingly considered valuable supplements to video tutorials, indicating their growing role in the developers' learning resources. These insights offer valuable directions for educators and the software development community by shedding light on the evolving preferences toward learning resources in the era of ChatGPT.

Faster R-CNN · domain shift · R-CNN · Object detection · Reducible

March 8, 2018

Domain Adaptive Faster R-CNN for Object Detection in the Wild

Yuhua Chen,Wen Li,Christos Sakaridis,Dengxin Dai,Luc Van Gool

from arxiv, Accepted to CVPR 2018

Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch will lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc., and 2) the instance-level shift, such as object appearance, size, etc. We build our approach on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, on the image level and the instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory, and are implemented by learning a domain classifier in an adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets including Cityscapes, KITTI, SIM10K, etc. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.