青柠在线观看免费高清1,99视频在线播放喷射,国模私拍视频一区二区

We consider an extension of the classical Total Store Order (TSO) semantics by expanding it to turn-based 2-player safety games. During her turn, a player can select any of the communicating processes and perform its next transition. We consider different formulations of the safety game problem depending on whether one player or both of them transfer messages from the process buffers to the shared memory. We give the complete decidability picture for all the possible alternatives.

相關內容

LMCS

關注 0

計算機科學中的邏輯方法是一種完全開放存取的、免費的電子期刊。委員會歡迎就涉及廣義邏輯方法的計算機科學理論和實踐領域發表的論文。論文以傳統的方式進行評審，每張論文有兩個或更多的評審。版權歸作者所有。官網鏈接： · Taxonomy · 可辨認的 · 張成子空間 · CASE ·

2024 年 4 月 23 日

Super Mario in the Pernicious Kingdoms: Classifying glitches in old games

Llewellyn Forward,Io Limmer,Joseph Hallett,Dan Page

from arxiv, Presented at the 8th International Workshop on Games and Software Engineering (GAS), April 14 2024. Co-located with ICSE

In a case study spanning four classic Super Mario games and the analysis of 237 known glitches within them, we classify a variety of weaknesses that are exploited by speedrunners to enable them to beat games quickly and in surprising ways. Using the Seven Pernicious Kingdoms software defect taxonomy and the Common Weakness Enumeration, we categorize the glitches by the weaknesses that enable them. We identify 7 new weaknesses that appear specific to games and which are not covered by current software weakness taxonomies.

可理解性 · INTERACT · prototype · 設計 · 知識 (knowledge) ·

2024 年 4 月 22 日

Designing forecasting software for forecast users: Empowering non-experts to create and understand their own forecasts

Richard Stromer,Oskar Triebe,Chad Zanocco,Ram Rajagopal

from arxiv, 10 pages

Forecasts inform decision-making in nearly every domain. Forecasts are often produced by experts with rare or hard to acquire skills. In practice, forecasts are often used by domain experts and managers with little forecasting expertise. Our study focuses on how to design forecasting software that empowers non-expert users. We study how users can make use of state-of-the-art forecasting methods, embed their domain knowledge, and how they build understanding and trust towards generated forecasts. To do so, we co-designed a forecasting software prototype using feedback from users and then analyzed their interactions with our prototype. Our results identified three main considerations for non-expert users: (1) a safe stepwise approach facilitating causal understanding and trust; (2) a white box model supporting human-reasoning-friendly components; (3) the inclusion of domain knowledge. This paper contributes insights into how non-expert users interact with forecasting software and by recommending ways to design more accessible forecasting software.

估計/估計量 · 全 · CASE · 得分 · 樣本 ·

2024 年 4 月 22 日

On diffusion-based generative models and their error bounds: The log-concave case with full convergence estimates

Stefano Bruno,Ying Zhang,Dong-Young Lim,?mer Deniz Akyildiz,Sotirios Sabanis

We provide full theoretical guarantees for the convergence behaviour of diffusion-based generative models under the assumption of strongly log-concave data distributions while our approximating class of functions used for score estimation is made of Lipschitz continuous functions. We demonstrate via a motivating example, sampling from a Gaussian distribution with unknown mean, the powerfulness of our approach. In this case, explicit estimates are provided for the associated optimization problem, i.e. score approximation, while these are combined with the corresponding sampling estimates. As a result, we obtain the best known upper bound estimates in terms of key quantities of interest, such as the dimension and rates of convergence, for the Wasserstein-2 distance between the data distribution (Gaussian with unknown mean) and our sampling algorithm. Beyond the motivating example and in order to allow for the use of a diverse range of stochastic optimizers, we present our results using an $L^2$-accurate score estimation assumption, which crucially is formed under an expectation with respect to the stochastic optimizer and our novel auxiliary process that uses only known information. This approach yields the best known convergence rate for our sampling algorithm.

穩健性 · Performer · 語言模型化 · 自動問答 · 泛函 ·

2024 年 4 月 22 日

Typos that Broke the RAG's Back: Genetic Attack on RAG Pipeline by Simulating Documents in the Wild via Low-level Perturbations

Sukmin Cho,Soyeong Jeong,Jeongyeon Seo,Taeho Hwang,Jong C. Park

from arxiv, Under Review

The robustness of recent Large Language Models (LLMs) has become increasingly crucial as their applicability expands across various domains and real-world applications. Retrieval-Augmented Generation (RAG) is a promising solution for addressing the limitations of LLMs, yet existing studies on the robustness of RAG often overlook the interconnected relationships between RAG components or the potential threats prevalent in real-world databases, such as minor textual errors. In this work, we investigate two underexplored aspects when assessing the robustness of RAG: 1) vulnerability to noisy documents through low-level perturbations and 2) a holistic evaluation of RAG robustness. Furthermore, we introduce a novel attack method, the Genetic Attack on RAG (\textit{GARAG}), which targets these aspects. Specifically, GARAG is designed to reveal vulnerabilities within each component and test the overall system functionality against noisy documents. We validate RAG robustness by applying our \textit{GARAG} to standard QA datasets, incorporating diverse retrievers and LLMs. The experimental results show that GARAG consistently achieves high attack success rates. Also, it significantly devastates the performance of each component and their synergy, highlighting the substantial risk that minor textual inaccuracies pose in disrupting RAG systems in the real world.

INFORMS · 知識 (knowledge) · ASSETS · 大語言模型 · 會議 ·

2024 年 4 月 21 日

Modal-adaptive Knowledge-enhanced Graph-based Financial Prediction from Monetary Policy Conference Calls with LLM

Kun Ouyang,Yi Liu,Shicheng Li,Ruihan Bao,Keiko Harimoto,Xu Sun

from arxiv, Accepted by LREC Coling 2024 -FinNLP (oral)

Financial prediction from Monetary Policy Conference (MPC) calls is a new yet challenging task, which targets at predicting the price movement and volatility for specific financial assets by analyzing multimodal information including text, video, and audio. Although the existing work has achieved great success using cross-modal transformer blocks, it overlooks the potential external financial knowledge, the varying contributions of different modalities to financial prediction, as well as the innate relations among different financial assets. To tackle these limitations, we propose a novel Modal-Adaptive kNowledge-enhAnced Graph-basEd financial pRediction scheme, named MANAGER. Specifically, MANAGER resorts to FinDKG to obtain the external related knowledge for the input text. Meanwhile, MANAGER adopts BEiT-3 and Hidden-unit BERT (HuBERT) to extract the video and audio features, respectively. Thereafter, MANAGER introduces a novel knowledge-enhanced cross-modal graph that fully characterizes the semantic relations among text, external knowledge, video and audio, to adaptively utilize the information in different modalities, with ChatGLM2 as the backbone. Extensive experiments on a publicly available dataset Monopoly verify the superiority of our model over cutting-edge methods.

HTTPS · Automator · state-of-the-art · Processing（編程語言） · 基 ·

2024 年 4 月 19 日

scenario.center: Methods from Real-world Data to a Scenario Database

Michael Schuldes,Christoph Glasmacher,Lutz Eckstein

from arxiv, 8 pages, 9 figures, 2 tables; Accepted to be published as part of the 35th IEEE Intelligent Vehicles Symposium, June 2 - 5, 2024, Korea

Scenario-based testing is a promising method to develop, verify and validate automated driving systems (ADS) since pure on-road testing seems inefficient for complex traffic environments. A major challenge for this approach is the provision and management of a sufficient number of scenarios to test a system. The provision, generation, and management of scenario at scale is investigated in current research. This paper presents the scenario database scenario.center ( //scenario.center ) to process and manage scenario data covering the needs of scenario-based testing approaches comprehensively and automatically. Thereby, requirements for such databases are described. Based on those, a four-step approach is proposed. Firstly, a common input format with defined quality requirements is defined. This is utilized for detecting events and base scenarios automatically. Furthermore, methods for searchability, evaluation of data quality and different scenario generation methods are proposed to allow a broad applicability serving different needs. For evaluation, the methodology is compared to state-of-the-art scenario databases. Finally, the application and capabilities of the database are shown by applying the methodology to the inD dataset. A public demonstration of the database interface is provided at //scenario.center .

MoDELS · Boosting（一種模型訓練加速方式） · Performer · 不變 · 穩健性 ·

2024 年 4 月 18 日

Omniview-Tuning: Boosting Viewpoint Invariance of Vision-Language Pre-training Models

Shouwei Ruan,Yinpeng Dong,Hanqing Liu,Yao Huang,Hang Su,Xingxing Wei

from arxiv, 20 pages

Vision-Language Pre-training (VLP) models like CLIP have achieved remarkable success in computer vision and particularly demonstrated superior robustness to distribution shifts of 2D images. However, their robustness under 3D viewpoint variations is still limited, which can hinder the development for real-world applications. This paper successfully addresses this concern while keeping VLPs' original performance by breaking through two primary obstacles: 1) the scarcity of training data and 2) the suboptimal fine-tuning paradigms. To combat data scarcity, we build the Multi-View Caption (MVCap) dataset -- a comprehensive collection of over four million multi-view image-text pairs across more than 100K objects, providing more potential for VLP models to develop generalizable viewpoint-invariant representations. To address the limitations of existing paradigms in performance trade-offs and training efficiency, we design a novel fine-tuning framework named Omniview-Tuning (OVT). Specifically, OVT introduces a Cross-Viewpoint Alignment objective through a minimax-like optimization strategy, which effectively aligns representations of identical objects from diverse viewpoints without causing overfitting. Additionally, OVT fine-tunes VLP models in a parameter-efficient manner, leading to minimal computational cost. Extensive experiments on various VLP models with different architectures validate that OVT significantly improves the models' resilience to viewpoint shifts and keeps the original performance, establishing a pioneering standard for boosting the viewpoint invariance of VLP models.

Analysis · Agent · Processing（編程語言） · Performer · Seven ·

2024 年 4 月 18 日

mABC: multi-Agent Blockchain-Inspired Collaboration for root cause analysis in micro-services architecture

Wei Zhang,Hongcheng Guo,Jian Yang,Yi Zhang,Chaoran Yan,Zhoujin Tian,Hangyuan Ji,Zhoujun Li,Tongliang Li,Tieqiao Zheng,Chao Chen,Yi Liang,Xu Shi,Liangfan Zheng,Bo Zhang

The escalating complexity of micro-services architecture in cloud-native technologies poses significant challenges for maintaining system stability and efficiency. To conduct root cause analysis (RCA) and resolution of alert events, we propose a pioneering framework, multi-Agent Blockchain-inspired Collaboration for root cause analysis in micro-services architecture (mABC), to revolutionize the AI for IT operations (AIOps) domain, where multiple agents based on the powerful large language models (LLMs) perform blockchain-inspired voting to reach a final agreement following a standardized process for processing tasks and queries provided by Agent Workflow. Specifically, seven specialized agents derived from Agent Workflow each provide valuable insights towards root cause analysis based on their expertise and the intrinsic software knowledge of LLMs collaborating within a decentralized chain. To avoid potential instability issues in LLMs and fully leverage the transparent and egalitarian advantages inherent in a decentralized structure, mABC adopts a decision-making process inspired by blockchain governance principles while considering the contribution index and expertise index of each agent. Experimental results on the public benchmark AIOps challenge dataset and our created train-ticket dataset demonstrate superior performance in accurately identifying root causes and formulating effective solutions, compared to previous strong baselines. The ablation study further highlights the significance of each component within mABC, with Agent Workflow, multi-agent, and blockchain-inspired voting being crucial for achieving optimal performance. mABC offers a comprehensive automated root cause analysis and resolution in micro-services architecture and achieves a significant improvement in the AIOps domain compared to existing baselines

數據集 · INFORMS · 張成子空間 · 模型評估 · Pivotal（公司） ·

2024 年 4 月 18 日

emrQA-msquad: A Medical Dataset Structured with the SQuAD V2.0 Framework, Enriched with emrQA Medical Information

Jimenez Eladio,Hao Wu

from arxiv, The dataset is available in //huggingface.co/datasets/Eladio/emrqa-msquad

Machine Reading Comprehension (MRC) holds a pivotal role in shaping Medical Question Answering Systems (QAS) and transforming the landscape of accessing and applying medical information. However, the inherent challenges in the medical field, such as complex terminology and question ambiguity, necessitate innovative solutions. One key solution involves integrating specialized medical datasets and creating dedicated datasets. This strategic approach enhances the accuracy of QAS, contributing to advancements in clinical decision-making and medical research. To address the intricacies of medical terminology, a specialized dataset was integrated, exemplified by a novel Span extraction dataset derived from emrQA but restructured into 163,695 questions and 4,136 manually obtained answers, this new dataset was called emrQA-msquad dataset. Additionally, for ambiguous questions, a dedicated medical dataset for the Span extraction task was introduced, reinforcing the system's robustness. The fine-tuning of models such as BERT, RoBERTa, and Tiny RoBERTa for medical contexts significantly improved response accuracy within the F1-score range of 0.75 to 1.00 from 10.1% to 37.4%, 18.7% to 44.7% and 16.0% to 46.8%, respectively. Finally, emrQA-msquad dataset is publicy available at //huggingface.co/datasets/Eladio/emrqa-msquad.

TSC · Performer · 學成 · state-of-the-art · 深度學習 ·

2019 年 3 月 14 日

Deep learning for time series classification: a review

H. Ismail Fawaz,G. Forestier,J. Weber,L. Idoumghar,P. Muller

from arxiv, Accepted at Data Mining and Knowledge Discovery

Time Series Classification (TSC) is an important and challenging problem in data mining. With the increase of time series data availability, hundreds of TSC algorithms have been proposed. Among these methods, only a few have considered Deep Neural Networks (DNNs) to perform this task. This is surprising as deep learning has seen very successful applications in the last years. DNNs have indeed revolutionized the field of computer vision especially with the advent of novel deeper architectures such as Residual and Convolutional Neural Networks. Apart from images, sequential data such as text and audio can also be processed with DNNs to reach state-of-the-art performance for document classification and speech recognition. In this article, we study the current state-of-the-art performance of deep learning algorithms for TSC by presenting an empirical study of the most recent DNN architectures for TSC. We give an overview of the most successful deep learning applications in various time series domains under a unified taxonomy of DNNs for TSC. We also provide an open source deep learning framework to the TSC community where we implemented each of the compared approaches and evaluated them on a univariate TSC benchmark (the UCR/UEA archive) and 12 multivariate time series datasets. By training 8,730 deep learning models on 97 time series datasets, we propose the most exhaustive study of DNNs for TSC to date.