
We introduce the Dutch Model Benchmark: DUMB. The benchmark includes a diverse set of datasets for low-, medium- and high-resource tasks. The total set of nine tasks includes four tasks that were previously not available in Dutch. Instead of relying on a mean score across tasks, we propose Relative Error Reduction (RER), which compares the DUMB performance of language models to a strong baseline that can be referred to in the future even when assessing different sets of language models. Through a comparison of 14 pre-trained language models (mono- and multilingual, of varying sizes), we assess the internal consistency of the benchmark tasks, as well as the factors that likely enable high performance. Our results indicate that current Dutch monolingual models underperform and suggest training larger Dutch models with other architectures and pre-training objectives. At present, the highest performance is achieved by DeBERTaV3 (large), XLM-R (large) and mDeBERTaV3 (base). In addition to highlighting the best strategies for training larger Dutch models, DUMB will foster further research on Dutch. A public leaderboard is available at //dumbench.nl.
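To make the idea of Relative Error Reduction concrete, here is a minimal sketch. It assumes the common definition of relative error reduction (the share of the baseline's error that a model removes) and assumes scores are percentages where higher is better; the abstract does not spell out the exact formula, and the function names and example scores are hypothetical.

```python
from typing import Dict


def relative_error_reduction(model_score: float, baseline_score: float) -> float:
    """Return the percentage of the baseline's error removed by the model.

    Assumption: scores are percentages in [0, 100] and higher is better,
    so the error of a system is 100 minus its score.
    """
    baseline_error = 100.0 - baseline_score
    model_error = 100.0 - model_score
    if baseline_error == 0:
        raise ValueError("baseline has no error left to reduce")
    return (baseline_error - model_error) / baseline_error * 100.0


def mean_rer(model_scores: Dict[str, float], baseline_scores: Dict[str, float]) -> float:
    """Average the per-task RER over all tasks the baseline was scored on."""
    rers = [
        relative_error_reduction(model_scores[task], baseline_scores[task])
        for task in baseline_scores
    ]
    return sum(rers) / len(rers)


# Hypothetical scores: the model halves the error on one task
# and increases it by half on the other.
baseline = {"task_a": 80.0, "task_b": 80.0}
model = {"task_a": 90.0, "task_b": 70.0}
print(relative_error_reduction(90.0, 80.0))  # 50.0
print(mean_rer(model, baseline))             # 0.0
```

Unlike a plain mean of task scores, this normalizes each task by how much room for improvement the baseline leaves, so a two-point gain on a nearly solved task counts for more than a two-point gain on a hard one.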

Related content

MoDELS


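The ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS) is the premier conference series for model-driven software and systems engineering, organized with the support of ACM SIGSOFT and IEEE TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. Its attendees come from diverse backgrounds, including researchers, academics, engineers, and industry professionals. MODELS 2019 is a forum where participants can exchange cutting-edge research results and innovative practical experience around modeling and model-driven software and systems. This year's edition will give the modeling community an opportunity to further advance the foundations of modeling and to present innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, open source, and sustainability. Official website: · Performer · Reviewer · Analysis · Functional ·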

November 30, 2023

Scalable and Lightweight Post-Quantum Authentication for Internet of Things

Attila A. Yavuz,Saleh Darzi,Saif E. Nouma

from arXiv, 9 pages

Internet of Things (IoT) applications are composed of massive quantities of resource-limited devices that collect sensitive data with long-term operational and security requirements. With the threat of emerging quantum computers, Post-Quantum Cryptography (PQC) is a critical requirement for IoTs. In particular, digital signatures offer scalable authentication with non-repudiation and are an essential tool for IoTs. However, as seen in NIST PQC standardization, post-quantum signatures are extremely costly for resource-limited IoTs. Hence, there is a significant need for quantum-safe signatures that respect the processing, memory, and bandwidth limitations of IoTs. In this paper, we create a new lightweight quantum-safe digital signature referred to as INFinity-HORS (INF-HORS), which is (to the best of our knowledge) the first signer-optimal hash-based signature with (polynomially) unbounded signing capability. INF-HORS enables a verifier to non-interactively construct one-time public keys from a master public key via encrypted function evaluations. This strategy avoids the performance bottleneck of hash-based standards (e.g., SPHINCS+) by eliminating hyper-tree structures. It also does not require a trusted party or non-colluding servers to distribute public keys. Our performance analysis confirms that INF-HORS is orders of magnitude more efficient in signer computation than selected NIST PQC schemes (e.g., SPHINCS+, Dilithium, Falcon), with a small memory footprint.
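For readers unfamiliar with the building block behind the name, the following is a minimal sketch of the classic HORS one-time hash-based signature, not of INF-HORS itself: it omits the master public key and the encrypted function evaluations described in the abstract. The parameters `T` and `K`, the choice of SHA-256, and all function names are illustrative assumptions, and each key pair here must sign only one message.

```python
import hashlib
import secrets
from typing import List, Tuple

T = 256  # number of secret values in a key (assumed toy parameter)
K = 16   # number of secret values revealed per signature (assumed)


def _h(data: bytes) -> bytes:
    """Hash function used both as the one-way function and for the message."""
    return hashlib.sha256(data).digest()


def keygen() -> Tuple[List[bytes], List[bytes]]:
    """Secret key: T random strings. Public key: their hashes."""
    sk = [secrets.token_bytes(32) for _ in range(T)]
    pk = [_h(s) for s in sk]
    return sk, pk


def _indices(message: bytes) -> List[int]:
    """Map the message to K indices in [0, T) using the first K digest bytes.

    With T = 256, each byte of the digest is directly one index.
    """
    digest = _h(message)
    return [digest[i] for i in range(K)]


def sign(sk: List[bytes], message: bytes) -> List[bytes]:
    """Reveal the secret values selected by the message digest."""
    return [sk[i] for i in _indices(message)]


def verify(pk: List[bytes], message: bytes, signature: List[bytes]) -> bool:
    """Check that each revealed value hashes to the selected public value."""
    if len(signature) != K:
        return False
    return all(_h(s) == pk[i] for s, i in zip(signature, _indices(message)))


sk, pk = keygen()
sig = sign(sk, b"sensor reading 42")
print(verify(pk, b"sensor reading 42", sig))  # True
print(verify(pk, b"sensor reading 43", sig))  # False
```

Signing only selects stored values, which is why HORS-style schemes are attractive for constrained signers; the cost is that every signature leaks part of the secret key, so a fresh key is needed per message.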

Integration · AI · INFORMS · MoDELS · Datasets ·

November 30, 2023

Navigating Privacy and Copyright Challenges Across the Data Lifecycle of Generative AI

Dawen Zhang,Boming Xia,Yue Liu,Xiwei Xu,Thong Hoang,Zhenchang Xing,Mark Staples,Qinghua Lu,Liming Zhu

The advent of Generative AI has marked a significant milestone in artificial intelligence, demonstrating remarkable capabilities in generating realistic images, texts, and data patterns. However, these advancements come with heightened concerns over data privacy and copyright infringement, primarily due to the reliance on vast datasets for model training. Traditional approaches like differential privacy, machine unlearning, and data poisoning only offer fragmented solutions to these complex issues. Our paper delves into the multifaceted challenges of privacy and copyright protection within the data lifecycle. We advocate for integrated approaches that combine technical innovation with ethical foresight, holistically addressing these concerns by investigating and devising solutions that are informed by the lifecycle perspective. This work aims to catalyze a broader discussion and inspire concerted efforts towards data privacy and copyright integrity in Generative AI.

Language modeling · MoDELS · tuning · INTERACT · Datasets ·

November 30, 2023

Automatic Construction of a Korean Toxic Instruction Dataset for Ethical Tuning of Large Language Models

Sungjoo Byun,Dongjun Jang,Hyemi Jo,Hyopil Shin

from arXiv, NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following

Caution: this paper may include material that could be offensive or distressing. The advent of Large Language Models (LLMs) necessitates the development of training approaches that mitigate the generation of unethical language and aptly manage toxic user queries. Given the challenges related to human labor and the scarcity of data, we present KoTox, comprising 39K unethical instruction-output pairs. This collection of automatically generated toxic instructions refines the training of LLMs and establishes a foundational framework for improving LLMs' ethical awareness and response to various toxic inputs, promoting more secure and responsible interactions in Natural Language Processing (NLP) applications.

Comprehensibility · MoDELS · Datasets · Taxonomy · Correlation coefficient ·

November 29, 2023

VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models

Shicheng Li,Lei Li,Shuhuai Ren,Yuanxin Liu,Yi Liu,Rundong Gao,Xu Sun,Lu Hou

from arXiv, 23 pages, 6 figures, 18 tables, data is available at //github.com/lscpku/VITATECS

The ability to perceive how objects change over time is a crucial ingredient in human intelligence. However, current benchmarks cannot faithfully reflect the temporal understanding abilities of video-language models (VidLMs) due to the existence of static visual shortcuts. To remedy this issue, we present VITATECS, a diagnostic VIdeo-Text dAtaset for the evaluation of TEmporal Concept underStanding. Specifically, we first introduce a fine-grained taxonomy of temporal concepts in natural language in order to diagnose the capability of VidLMs to comprehend different temporal aspects. Furthermore, to disentangle the correlation between static and temporal information, we generate counterfactual video descriptions that differ from the original one only in the specified temporal aspect. We employ a semi-automatic data collection framework using large language models and human-in-the-loop annotation to obtain high-quality counterfactual descriptions efficiently. Evaluation of representative video-language understanding models confirms their deficiency in temporal understanding, revealing the need for greater emphasis on the temporal elements in video-language research.

Cascade · Reducible · High throughput · Optimizer · Edge ·

November 29, 2023

Cascade: A Platform for Delay-Sensitive Edge Intelligence

Weijia Song,Thiago Garrett,Yuting Yang,Mingzhao Liu,Edward Tremel,Lorenzo Rosa,Andrea Merlina,Roman Vitenberg,Ken Birman

from arXiv, 14 pages, 12 figures

Interactive intelligent computing applications are increasingly prevalent, creating a need for AI/ML platforms optimized to reduce per-event latency while maintaining high throughput and efficient resource management. Yet many intelligent applications run on AI/ML platforms that optimize for high throughput even at the cost of high tail-latency. Cascade is a new AI/ML hosting platform intended to untangle this puzzle. Innovations include a legacy-friendly storage layer that moves data with minimal copying and a "fast path" that collocates data and computation to maximize responsiveness. Our evaluation shows that Cascade reduces latency by orders of magnitude with no loss of throughput.

MoDELS · GPT-4V · Comprehensibility · Analysis · Model evaluation ·

November 28, 2023

HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination & Visual Illusion in Large Vision-Language Models

Tianrui Guan,Fuxiao Liu,Xiyang Wu,Ruiqi Xian,Zongxia Li,Xiaoyu Liu,Xijun Wang,Lichang Chen,Furong Huang,Yaser Yacoob,Dinesh Manocha,Tianyi Zhou

We introduce HallusionBench, a comprehensive benchmark designed for the evaluation of image-context reasoning. This benchmark presents significant challenges to advanced large visual-language models (LVLMs), such as GPT-4V(Vision) and LLaVA-1.5, by emphasizing nuanced understanding and interpretation of visual data. The benchmark comprises 346 images paired with 1129 questions, all meticulously crafted by human experts. We introduce a novel structure for these visual questions designed to establish control groups. This structure enables us to conduct a quantitative analysis of the models' response tendencies, logical consistency, and various failure modes. In our evaluation on HallusionBench, we benchmarked 13 different models, highlighting a 31.42% question-pair accuracy achieved by the state-of-the-art GPT-4V. Notably, all other evaluated models achieve accuracy below 16%. Moreover, our analysis not only highlights the observed failure modes, including language hallucination and visual illusion, but also deepens the understanding of these pitfalls. Our comprehensive case studies within HallusionBench shed light on the challenges of hallucination and illusion in LVLMs. Based on these insights, we suggest potential pathways for their future improvement. The benchmark and codebase can be accessed at //github.com/tianyi-lab/HallusionBench.
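The question-pair accuracy quoted above can be illustrated with a short sketch. It assumes that a pair counts as correct only when every question in the pair is answered correctly; the abstract does not define the metric, so this reading, the `(pair_id, is_correct)` record layout, and the function name are all assumptions rather than the benchmark's actual scoring code.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def question_pair_accuracy(results: Iterable[Tuple[str, bool]]) -> float:
    """Return the percentage of pairs whose questions were all answered correctly.

    `results` holds one (pair_id, is_correct) record per question
    (hypothetical layout chosen for this sketch).
    """
    by_pair: Dict[str, List[bool]] = defaultdict(list)
    for pair_id, is_correct in results:
        by_pair[pair_id].append(bool(is_correct))
    if not by_pair:
        raise ValueError("no results to score")
    correct_pairs = sum(all(answers) for answers in by_pair.values())
    return 100.0 * correct_pairs / len(by_pair)


# Toy data: the model gets both questions of pair p1 right,
# but only one of the two questions in pair p2.
toy_results = [("p1", True), ("p1", True), ("p2", True), ("p2", False)]
print(question_pair_accuracy(toy_results))  # 50.0
```

Under this reading the metric is stricter than per-question accuracy (75% on the toy data), since a model that guesses consistently gets no credit for a control pair it answers only half correctly.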

Reducible · Performer · Optimizer · FAST · Analysis ·

November 28, 2023

On the Analysis and Optimization of Fast Conditional Handover with Hand Blockage for Mobility

Subhyal Bin Iqbal,Salman Nadaf,Ahmad Awada,Umur Karabulut,Philipp Schulz,Gerhard P. Fettweis

from arXiv, accepted for publication in IEEE Access

Although frequency range 2 (FR2) systems are an essential part of 5G-Advanced and future 3GPP releases, the mobility performance of multi-panel user equipment (MPUE) with hand blockage is still an area open for research and standardization. In this article, a comprehensive study on the mobility performance of MPUE with hand blockage is performed for conditional handover (CHO) and its potential enhancement denoted by fast conditional handover (FCHO). In contrast to CHO, in FCHO the MPUE can reuse earlier target cell preparations after each handover to autonomously execute subsequent handovers. This both saves the signaling overhead associated with the reconfiguration and re-preparation of target cells after each handover and reduces mobility failures. Results show that FCHO offers considerable mobility performance gains compared to CHO for different hand blockage cases that depend on the hand position around the MPUE. For the worst-case hand blockage scenario, mobility failures are reduced by 10.5% and 19.3% for the 60 km/h and 120 km/h mobility scenarios, respectively. This gain comes at the expense of reserving the handover resources of an MPUE for a longer time, given that the target cell configurations are not necessarily released after each handover. In this article, the longer resource reservation problem in FCHO is analysed and three different resource reservation optimization techniques are introduced. Results show that these optimization techniques not only reduce the resource reservation time but also significantly reduce the signaling overhead, at the possible expense of a tolerable degradation in mobility performance.

Language modeling · INFORMS · Vectorization · MoDELS · Processing (programming language) ·

November 27, 2023

Applications of Large Language Models in Data Processing: Innovative Approaches to Segmenting and Renewing Information

Yu-Chen Lin,Akhilesh Kumar,Wen-Liang Zhang,Norman Chang,Muhammad Zakir,Rucha Apte,Chao Wang,Jyh-Shing Roger Jang

Our paper investigates effective methods for code generation in "specific-domain" applications, including the use of Large Language Models (LLMs) for data segmentation and renewal, as well as stimulating deeper thinking in LLMs through prompt adjustments. Using a real company product as an example, we provide user manuals, API documentation, and other data. The ideas discussed in this paper help segment and then convert this data into semantic vectors to better reflect their true positioning. Subsequently, user requirements are transformed into vectors to retrieve the most relevant content, achieving about 70% accuracy in simple to medium-complexity tasks through various prompt techniques. This paper is the first to enhance specific-domain code generation effectiveness from this perspective. Additionally, we experiment with generating more scripts from a limited number using llama2-based fine-tuning to test its effectiveness in professional domain code generation. This is a challenging and promising field, and once achieved, it will not only lead to breakthroughs in LLM development across multiple industries but also enable LLMs to understand and learn any new knowledge effectively.
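The segment, embed, and retrieve pipeline described above can be sketched as follows. This is a toy illustration only: it stands in a bag-of-words count vector for the semantic embeddings the paper would obtain from a model, splits documents on blank lines, and uses invented manual text; none of the function names or sample data come from the paper.

```python
import math
import re
from collections import Counter
from typing import List


def segment(document: str) -> List[str]:
    """Split a document into chunks at blank lines (assumed segmentation rule)."""
    return [part.strip() for part in re.split(r"\n\s*\n", document) if part.strip()]


def embed(text: str) -> Counter:
    """Toy stand-in for a semantic embedding: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, chunks: List[str], top_k: int = 1) -> List[str]:
    """Return the top_k chunks most similar to the user requirement."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]


# Hypothetical API manual, invented for this example.
manual = """create_report(path) writes a PDF report to disk

set_voltage(value) configures the supply voltage

Installation requires Python 3"""

chunks = segment(manual)
print(retrieve("how do I set the voltage", chunks))
```

In a real system the retrieved chunk would be placed in the prompt alongside the user requirement, so the quality of segmentation directly bounds what the model can see when generating code.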

Knowledge · Learning · MoDELS · Graph · entity ·

November 29, 2022

Lifelong Embedding Learning and Transfer for Growing Knowledge Graphs

Yuanning Cui,Yuxin Wang,Zequn Sun,Wenqiang Liu,Yiqiao Jiang,Kexin Han,Wei Hu

from arXiv, accepted at the 37th AAAI Conference on Artificial Intelligence (AAAI 2023)

Existing knowledge graph (KG) embedding models have primarily focused on static KGs. However, real-world KGs do not remain static, but rather evolve and grow in tandem with the development of KG applications. Consequently, new facts and previously unseen entities and relations continually emerge, necessitating an embedding model that can quickly learn and transfer new knowledge through growth. Motivated by this, we delve into an expanding field of KG embedding in this paper, i.e., lifelong KG embedding. We consider knowledge transfer and retention of the learning on growing snapshots of a KG without having to learn embeddings from scratch. The proposed model includes a masked KG autoencoder for embedding learning and update, with an embedding transfer strategy to inject the learned knowledge into the new entity and relation embeddings, and an embedding regularization method to avoid catastrophic forgetting. To investigate the impacts of different aspects of KG growth, we construct four datasets to evaluate the performance of lifelong KG embedding. Experimental results show that the proposed model outperforms the state-of-the-art inductive and lifelong embedding baselines.

Vision · Model evaluation · Reducible · Computer vision · DNN ·

March 24, 2020

A Survey of Methods for Low-Power Deep Learning and Computer Vision

Abhinav Goel,Caleb Tung,Yung-Hsiang Lu,George K. Thiruvathukal

from arXiv, accepted for publication at the 2020 IEEE 6th World Forum on Internet of Things (WF-IoT), New Orleans, LA, USA, 2020

Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy-, computation-, and memory-intensive. This impedes the deployment of large DNNs in low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirement, energy consumption, and number of operations without significantly decreasing the accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically with regard to inference, and discusses the methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. We analyze the accuracy, advantages, disadvantages, and potential solutions to the problems with the techniques in each category. We also discuss new evaluation metrics as a guideline for future research.
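As an illustration of the first category, parameter quantization and pruning, here is a minimal sketch on a flat list of weights. It assumes global magnitude pruning and symmetric uniform quantization, two common textbook variants; the survey covers many methods and this is not taken from any specific one. Function names and the example weights are hypothetical.

```python
from typing import List, Tuple


def prune_by_magnitude(weights: List[float], sparsity: float) -> List[float]:
    """Zero out the given fraction of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_uniform(weights: List[float], bits: int = 8) -> Tuple[List[int], float]:
    """Map weights to signed integers in [-qmax, qmax] with one shared scale."""
    if not weights:
        raise ValueError("no weights to quantize")
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]


weights = [0.5, -0.01, 0.25, 0.02, -1.0, 0.003]
print(prune_by_magnitude(weights, sparsity=0.5))  # [0.5, 0.0, 0.25, 0.0, -1.0, 0.0]

q, scale = quantize_uniform(weights, bits=8)
restored = dequantize(q, scale)
print(max(abs(w - r) for w, r in zip(weights, restored)) <= scale / 2 + 1e-12)  # True
```

Pruning saves memory and operations only if the runtime exploits the zeros, and quantization trades a bounded rounding error (at most half the scale per weight here) for smaller storage and cheaper integer arithmetic.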