国产特级黄色片A级无毛视频,精品国产91久久久久久久下载,成人爽A毛片在线视频九色,自拍偷拍一区视频,99精品国产高清久久久

This paper introduces Alympics (Olympics for Agents), a systematic simulation framework utilizing Large Language Model (LLM) agents for game theory research. Alympics creates a versatile platform for studying complex game theory problems, bridging the gap between theoretical game theory and empirical investigations by providing a controlled environment for simulating human-like strategic interactions with LLM agents. In our pilot case study, the "Water Allocation Challenge," we explore Alympics through a challenging strategic game focused on the multi-round auction on scarce survival resources. This study demonstrates the framework's ability to qualitatively and quantitatively analyze game determinants, strategies, and outcomes. Additionally, we conduct a comprehensive human assessment and an in-depth evaluation of LLM agents in strategic decision-making scenarios. Our findings not only expand the understanding of LLM agents' proficiency in emulating human strategic behavior but also highlight their potential in advancing game theory knowledge, thereby enriching our understanding of both game theory and empowering further research into strategic decision-making domains with LLM agents. Codes, prompts, and all related resources are available at //github.com/microsoft/Alympics.

相關內容

大語言模型

關注 56

大語言模型是基于海量文本數據訓練的深度學習模型。它不僅能夠生成自然語言文本，還能夠深入理解文本含義，處理各種自然語言任務，如文本摘要、問答、翻譯等。2023年，大語言模型及其在人工智能領域的應用已成為全球科技研究的熱點，其在規模上的增長尤為引人注目，參數量已從最初的十幾億躍升到如今的一萬億。參數量的提升使得模型能夠更加精細地捕捉人類語言微妙之處，更加深入地理解人類語言的復雜性。在過去的一年里，大語言模型在吸納新知識、分解復雜任務以及圖文對齊等多方面都有顯著提升。隨著技術的不斷成熟，它將不斷拓展其應用范圍，為人類提供更加智能化和個性化的服務，進一步改善人們的生活和生產方式。

MoDELS · 解碼 · 可辨認的 · 可約的 · INFORMS ·

2024 年 2 月 28 日

IBD: Alleviating Hallucinations in Large Vision-Language Models via Image-Biased Decoding

Lanyun Zhu,Deyi Ji,Tianrun Chen,Peng Xu,Jieping Ye,Jun Liu

Despite achieving rapid developments and with widespread applications, Large Vision-Language Models (LVLMs) confront a serious challenge of being prone to generating hallucinations. An over-reliance on linguistic priors has been identified as a key factor leading to these hallucinations. In this paper, we propose to alleviate this problem by introducing a novel image-biased decoding (IBD) technique. Our method derives the next-token probability distribution by contrasting predictions from a conventional LVLM with those of an image-biased LVLM, thereby amplifying the correct information highly correlated with image content while mitigating the hallucinatory errors caused by excessive dependence on text. We further conduct a comprehensive statistical analysis to validate the reliability of our method, and design an adaptive adjustment strategy to achieve robust and flexible handling under varying conditions. Experimental results across multiple evaluation metrics verify that our method, despite not requiring additional training data and only with a minimal increase in model parameters, can significantly reduce hallucinations in LVLMs and enhance the truthfulness of the generated response.

語言模型化 · 分解 · Prompt · MoDELS · 大語言模型 ·

2024 年 2 月 28 日

Decomposed Prompting: Unveiling Multilingual Linguistic Structure Knowledge in English-Centric Large Language Models

Ercong Nie,Shuzhou Yuan,Bolei Ma,Helmut Schmid,Michael F?rber,Frauke Kreuter,Hinrich Schütze

from arxiv, 18 pages, 7 figures

Despite the predominance of English in their training data, English-centric Large Language Models (LLMs) like GPT-3 and LLaMA display a remarkable ability to perform multilingual tasks, raising questions about the depth and nature of their cross-lingual capabilities. This paper introduces the decomposed prompting approach to probe the linguistic structure understanding of these LLMs in sequence labeling tasks. Diverging from the single text-to-text prompt, our method generates for each token of the input sentence an individual prompt which asks for its linguistic label. We assess our method on the Universal Dependencies part-of-speech tagging dataset for 38 languages, utilizing both English-centric and multilingual LLMs. Our findings show that decomposed prompting surpasses the iterative prompting baseline in efficacy and efficiency under zero- and few-shot settings. Further analysis reveals the influence of evaluation methods and the use of instructions in prompts. Our multilingual investigation shows that English-centric language models perform better on average than multilingual models. Our study offers insights into the multilingual transferability of English-centric LLMs, contributing to the understanding of their multilingual linguistic knowledge.

同質 · MoDELS · Learning · INTERACT · Extensibility ·

2024 年 2 月 28 日

EchoTrack: Auditory Referring Multi-Object Tracking for Autonomous Driving

Jiacheng Lin,Jiajun Chen,Kunyu Peng,Xuan He,Zhiyong Li,Rainer Stiefelhagen,Kailun Yang

from arxiv, The source code and datasets will be made publicly available at //github.com/lab206/EchoTrack

This paper introduces the task of Auditory Referring Multi-Object Tracking (AR-MOT), which dynamically tracks specific objects in a video sequence based on audio expressions and appears as a challenging problem in autonomous driving. Due to the lack of semantic modeling capacity in audio and video, existing works have mainly focused on text-based multi-object tracking, which often comes at the cost of tracking quality, interaction efficiency, and even the safety of assistance systems, limiting the application of such methods in autonomous driving. In this paper, we delve into the problem of AR-MOT from the perspective of audio-video fusion and audio-video tracking. We put forward EchoTrack, an end-to-end AR-MOT framework with dual-stream vision transformers. The dual streams are intertwined with our Bidirectional Frequency-domain Cross-attention Fusion Module (Bi-FCFM), which bidirectionally fuses audio and video features from both frequency- and spatiotemporal domains. Moreover, we propose the Audio-visual Contrastive Tracking Learning (ACTL) regime to extract homogeneous semantic features between expressions and visual objects by learning homogeneous features between different audio and video objects effectively. Aside from the architectural design, we establish the first set of large-scale AR-MOT benchmarks, including Echo-KITTI, Echo-KITTI+, and Echo-BDD. Extensive experiments on the established benchmarks demonstrate the effectiveness of the proposed EchoTrack model and its components. The source code and datasets will be made publicly available at //github.com/lab206/EchoTrack.

Integration · 多樣性 · 步幅 · Performance · Processing（編程語言） ·

2024 年 2 月 28 日

Distributed Intelligent Integrated Sensing and Communications: The 6G-DISAC Approach

Emilio Calvanese Strinati,George C. Alexandropoulos,Madhusudan Giyyarpuram,Philippe Sehier,Sami Mekki,Vincenzo Sciancalepore,Maximilian Stark,Mohamed Sana,Benoit Denis,Maurizio Crozzoli,Navid Amani,Placido Mursia,Raffaele D Errico,Mauro Boldi,Francesca Costanzo,Francois Rivet,Henk Wymeerschx

from arxiv, Submitted for conference publication

This paper introduces the concept of Distributed Intelligent integrated Sensing and Communications (DISAC), which expands the capabilities of Integrated Sensing and Communications (ISAC) towards distributed architectures. Additionally, the DISAC framework integrates novel waveform design with new semantic and goal-oriented communication paradigms, enabling ISAC technologies to transition from traditional data fusion to the semantic composition of diverse sensed and shared information. This progress facilitates large-scale, energy-efficient support for high-precision spatial-temporal processing, optimizing ISAC resource utilization, and enabling effective multi-modal sensing performance. Addressing key challenges such as efficient data management and connect-compute resource utilization, 6G- DISAC stands to revolutionize applications in diverse sectors including transportation, healthcare, and industrial automation. Our study encapsulates the project vision, methodologies, and potential impact, marking a significant stride towards a more connected and intelligent world.

Extensibility · Automator · Integration · 相互獨立的 · data integrity ·

2024 年 2 月 27 日

SmartQC: An Extensible DLT-Based Framework for Trusted Data Workflows in Smart Manufacturing

Alan McGibney,Tharindu Ranathunga,Roman Pospisil

from arxiv, 33 Pages, 9 Figures, Under Peer Review Process

Recent developments in Distributed Ledger Technology (DLT), including Blockchain offer new opportunities in the manufacturing domain, by providing mechanisms to automate trust services (digital identity, trusted interactions, and auditable transactions) and when combined with other advanced digital technologies (e.g. machine learning) can provide a secure backbone for trusted data flows between independent entities. This paper presents an DLT-based architectural pattern and technology solution known as SmartQC that aims to provide an extensible and flexible approach to integrating DLT technology into existing workflows and processes. SmartQC offers an opportunity to make processes more time efficient, reliable, and robust by providing two key features i) data integrity through immutable ledgers and ii) automation of business workflows leveraging smart contracts. The paper will present the system architecture, extensible data model and the application of SmartQC in the context of example smart manufacturing applications.

Analysis · 無監督 · Backbone · 設計 · 圖片分類 ·

2024 年 2 月 25 日

Key Design Choices in Source-Free Unsupervised Domain Adaptation: An In-depth Empirical Analysis

Andrea Maracani,Raffaello Camoriano,Elisa Maiettini,Davide Talon,Lorenzo Rosasco,Lorenzo Natale

This study provides a comprehensive benchmark framework for Source-Free Unsupervised Domain Adaptation (SF-UDA) in image classification, aiming to achieve a rigorous empirical understanding of the complex relationships between multiple key design factors in SF-UDA methods. The study empirically examines a diverse set of SF-UDA techniques, assessing their consistency across datasets, sensitivity to specific hyperparameters, and applicability across different families of backbone architectures. Moreover, it exhaustively evaluates pre-training datasets and strategies, particularly focusing on both supervised and self-supervised methods, as well as the impact of fine-tuning on the source domain. Our analysis also highlights gaps in existing benchmark practices, guiding SF-UDA research towards more effective and general approaches. It emphasizes the importance of backbone architecture and pre-training dataset selection on SF-UDA performance, serving as an essential reference and providing key insights. Lastly, we release the source code of our experimental framework. This facilitates the construction, training, and testing of SF-UDA methods, enabling systematic large-scale experimental analysis and supporting further research efforts in this field.

大語言模型 · 語言模型化 · 自動問答 · MoDELS · 得分 ·

2024 年 2 月 25 日

EHRNoteQA: A Patient-Specific Question Answering Benchmark for Evaluating Large Language Models in Clinical Settings

Sunjun Kweon,Jiyoun Kim,Heeyoung Kwak,Dongchul Cha,Hangyul Yoon,Kwanghyun Kim,Seunghyun Won,Edward Choi

from arxiv, Under Review

This study introduces EHRNoteQA, a novel patient-specific question answering benchmark tailored for evaluating Large Language Models (LLMs) in clinical environments. Based on MIMIC-IV Electronic Health Record (EHR), a team of three medical professionals has curated the dataset comprising 962 unique questions, each linked to a specific patient's EHR clinical notes. What makes EHRNoteQA distinct from existing EHR-based benchmarks is as follows: Firstly, it is the first dataset to adopt a multi-choice question answering format, a design choice that effectively evaluates LLMs with reliable scores in the context of automatic evaluation, compared to other formats. Secondly, it requires an analysis of multiple clinical notes to answer a single question, reflecting the complex nature of real-world clinical decision-making where clinicians review extensive records of patient histories. Our comprehensive evaluation on various large language models showed that their scores on EHRNoteQA correlate more closely with their performance in addressing real-world medical questions evaluated by clinicians than their scores from other LLM benchmarks. This underscores the significance of EHRNoteQA in evaluating LLMs for medical applications and highlights its crucial role in facilitating the integration of LLMs into healthcare systems. The dataset will be made available to the public under PhysioNet credential access, promoting further research in this vital field.

GANs · 數據集 · KDD · Attention · MoDELS ·

2024 年 2 月 25 日

Attention-GAN for Anomaly Detection: A Cutting-Edge Approach to Cybersecurity Threat Management

Mohammed Abo Sen

This paper proposes an innovative Attention-GAN framework for enhancing cybersecurity, focusing on anomaly detection. In response to the challenges posed by the constantly evolving nature of cyber threats, the proposed approach aims to generate diverse and realistic synthetic attack scenarios, thereby enriching the dataset and improving threat identification. Integrating attention mechanisms with Generative Adversarial Networks (GANs) is a key feature of the proposed method. The attention mechanism enhances the model's ability to focus on relevant features, essential for detecting subtle and complex attack patterns. In addition, GANs address the issue of data scarcity by generating additional varied attack data, encompassing known and emerging threats. This dual approach ensures that the system remains relevant and effective against the continuously evolving cyberattacks. The KDD Cup and CICIDS2017 datasets were used to validate this model, which exhibited significant improvements in anomaly detection. It achieved an accuracy of 99.69% on the KDD dataset and 97.93% on the CICIDS2017 dataset, with precision, recall, and F1-scores above 97%, demonstrating its effectiveness in recognizing complex attack patterns. This study contributes significantly to cybersecurity by providing a scalable and adaptable solution for anomaly detection in the face of sophisticated and dynamic cyber threats. The exploration of GANs for data augmentation highlights a promising direction for future research, particularly in situations where data limitations restrict the development of cybersecurity systems. The attention-GAN framework has emerged as a pioneering approach, setting a new benchmark for advanced cyber-defense strategies.

SPIN · 優化器 · 變換 · 模型評估 · Better ·

2024 年 2 月 24 日

Spin: An Efficient Secure Computation Framework with GPU Acceleration

Wuxuan Jiang,Xiangjun Song,Shenbai Hong,Haijun Zhang,Wenxin Liu,Bo Zhao,Wei Xu,Yi Li

Accuracy and efficiency remain challenges for multi-party computation (MPC) frameworks. Spin is a GPU-accelerated MPC framework that supports multiple computation parties and a dishonest majority adversarial setup. We propose optimized protocols for non-linear functions that are critical for machine learning, as well as several novel optimizations specific to attention that is the fundamental unit of Transformer models, allowing Spin to perform non-trivial CNNs training and Transformer inference without sacrificing security. At the backend level, Spin leverages GPU, CPU, and RDMA-enabled smart network cards for acceleration. Comprehensive evaluations demonstrate that Spin can be up to $2\times$ faster than the state-of-the-art for deep neural network training. For inference on a Transformer model with 18.9 million parameters, our attention-specific optimizations enable Spin to achieve better efficiency, less communication, and better accuracy.

線性的 · 泛函 · 近似 · 穩健性 · 源領域 ·

2024 年 2 月 23 日

Distributionally Robust Off-Dynamics Reinforcement Learning: Provable Efficiency with Linear Function Approximation

Zhishuai Liu,Pan Xu

from arxiv, 30 pages, 4 figures. To appear in the proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS)

We study off-dynamics Reinforcement Learning (RL), where the policy is trained on a source domain and deployed to a distinct target domain. We aim to solve this problem via online distributionally robust Markov decision processes (DRMDPs), where the learning algorithm actively interacts with the source domain while seeking the optimal performance under the worst possible dynamics that is within an uncertainty set of the source domain's transition kernel. We provide the first study on online DRMDPs with function approximation for off-dynamics RL. We find that DRMDPs' dual formulation can induce nonlinearity, even when the nominal transition kernel is linear, leading to error propagation. By designing a $d$-rectangular uncertainty set using the total variation distance, we remove this additional nonlinearity and bypass the error propagation. We then introduce DR-LSVI-UCB, the first provably efficient online DRMDP algorithm for off-dynamics RL with function approximation, and establish a polynomial suboptimality bound that is independent of the state and action space sizes. Our work makes the first step towards a deeper understanding of the provable efficiency of online DRMDPs with linear function approximation. Finally, we substantiate the performance and robustness of DR-LSVI-UCB through different numerical experiments.