
The proliferation of offensive language online necessitates effective detection mechanisms, especially in multilingual contexts. This study addresses the challenge by developing and introducing novel datasets for offensive language detection in three major Nigerian languages: Hausa, Yoruba, and Igbo. We collected data from Twitter, and native speakers manually annotated it to create a dataset for each of the three languages. We then evaluated pre-trained language models on detecting offensive language in these datasets. The best-performing model achieved an accuracy of 90%. To further support research in offensive language detection, we plan to make the datasets and our models publicly available.

Related content

Dataset


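A dataset, also called a data set or data collection, is a collection made up of data.
A data set (or dataset) is a collection of data, usually presented in tabular form. Each column represents a particular variable. Each row corresponds to a given member of the data set in question and lists that member's value for each variable, such as the height and weight of an object or the values of random numbers. Each value is known as a datum. The data set may comprise data for one or more members, corresponding to the number of rows.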

Statistic · INTERACT · Learning · Estimation/Estimator · Comprehensibility

15 July 2024

A unified theory and statistical learning approach for traffic conflict detection

Yiru Jiao,Simeon C. Calvert,Sander van Cranenburgh,Hans van Lint

from arxiv, 21 pages, 9 figures, prepared for submission

This study proposes a unified theory and statistical learning approach for traffic conflict detection, addressing the long-existing call for a consistent and comprehensive methodology to evaluate the collision risk that emerges in road user interactions. The proposed theory assumes a context-dependent probabilistic collision risk and frames conflict detection as estimating the risk by statistical learning from observed proximities and contextual variables. Three primary tasks are integrated: representing interaction context from selected observables, inferring proximity distributions in different contexts, and applying extreme value theory to relate conflict intensity with conflict probability. As a result, this methodology is adaptable to various road users and interaction scenarios, enhancing its applicability without the need for pre-labelled conflict data. Demonstration experiments are executed using real-world trajectory data, with the unified metric trained on lane-changing interactions on German highways and applied to near-crash events from the 100-Car Naturalistic Driving Study in the U.S. The experiments demonstrate the methodology's ability to provide effective collision warnings, generalise across different datasets and traffic environments, cover a broad range of conflicts, and deliver a long-tailed distribution of conflict intensity. This study contributes to traffic safety by offering a consistent and explainable methodology for conflict detection applicable across various scenarios. Its societal implications include enhanced safety evaluations of traffic infrastructures, more effective collision warning systems for autonomous and driving assistance systems, and a deeper understanding of road user behaviour in different traffic conditions, contributing to a potential reduction in accident rates and improving overall traffic safety.

Siamese · Networking · Neural Networks · Round · Robot

15 July 2024

An experimental evaluation of Siamese Neural Networks for robot localization using omnidirectional imaging in indoor environments

J. J. Cabrera,V. Román,A. Gil,O. Reinoso,L. Payá

from arxiv, Published: 08 July 2024 Paper link: //link.springer.com/content/pdf/10.1007/s10462-024-10840-0.pdf

The objective of this paper is to address the localization problem using omnidirectional images captured by a catadioptric vision system mounted on the robot. For this purpose, we explore the potential of Siamese Neural Networks for modeling indoor environments using panoramic images as the unique source of information. Siamese Neural Networks are characterized by their ability to generate a similarity function between two inputs, in this case, between two panoramic images. In this study, Siamese Neural Networks composed of two Convolutional Neural Networks (CNNs) are used. The output of each CNN is a descriptor which is used to characterize each image. The dissimilarity of the images is computed by measuring the distance between these descriptors. This fact makes Siamese Neural Networks particularly suitable to perform image retrieval tasks. First, we evaluate an initial task strongly related to localization that consists in detecting whether two images have been captured in the same or in different rooms. Next, we assess Siamese Neural Networks in the context of a global localization problem. The results outperform previous techniques for solving the localization task using the COLD-Freiburg dataset, in a variety of lighting conditions, especially when using images captured in cloudy and night conditions.
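
The descriptor comparison described in this abstract can be illustrated with a minimal sketch. It assumes each CNN branch has already produced a fixed-length descriptor. The function names, the example vectors, and the threshold value are hypothetical and not taken from the paper, and Euclidean distance is only one possible choice of metric.

```python
import math

def descriptor_distance(a, b):
    """Euclidean distance between two equal-length descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def same_room(a, b, threshold=0.5):
    """Hypothetical decision rule: a small distance is read as 'same room'."""
    return descriptor_distance(a, b) < threshold

def retrieve_nearest(query, database):
    """Image retrieval: index of the stored descriptor closest to the query."""
    return min(range(len(database)),
               key=lambda i: descriptor_distance(query, database[i]))

# Made-up descriptors standing in for CNN outputs.
stored = [[5.0, 5.0], [1.0, 1.2], [0.0, -3.0]]
best_match = retrieve_nearest([1.0, 1.0], stored)
```

In a global localization setting, the pose attached to the retrieved image would serve as the position estimate. In practice the threshold would have to be tuned on validation data.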

Infinite · Linear · Optimizer · Extensibility · CASE

13 July 2024

Infinite families of optimal and minimal codes over rings using simplicial complexes

Yanan Wu,Tingting Pang,Nian Li,Yanbin Pan,Xiangyong Zeng

from arxiv, 26 pages

In this paper, several infinite families of codes over the extension of non-unital non-commutative rings are constructed utilizing general simplicial complexes. Thanks to the special structure of the defining sets, the principal parameters of these codes are characterized. In particular, when the employed simplicial complexes are generated by a single maximal element, we determine their Lee weight distributions completely. Furthermore, by considering the Gray image codes and the corresponding subfield-like codes, numerous linear codes over $\mathbb{F}_q$ are also obtained, where $q$ is a prime power. Certain conditions are given to ensure that the above linear codes are (Hermitian) self-orthogonal in the case of $q=2,3,4$. It is noteworthy that most of the derived codes over $\mathbb{F}_q$ satisfy the Ashikhmin-Barg condition for minimality. Besides, we obtain two infinite families of distance-optimal codes over $\mathbb{F}_q$ with respect to the Griesmer bound.
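
For readers unfamiliar with the Griesmer bound mentioned in this abstract, the sketch below evaluates it. The bound states that a linear $[n, k, d]$ code over $\mathbb{F}_q$ satisfies $n \ge \sum_{i=0}^{k-1} \lceil d / q^i \rceil$. The function names are illustrative, and the example parameters are those of the well-known binary simplex and Hamming codes, not codes from the paper.

```python
def griesmer_bound(k, d, q):
    """Smallest length n permitted by the Griesmer bound for an [n, k, d] code over F_q."""
    # Integer ceiling division: ceil(d / m) == -(-d // m), which avoids float rounding.
    return sum(-(-d // q**i) for i in range(k))

def meets_griesmer(n, k, d, q):
    """True if the code length equals the Griesmer bound exactly."""
    return n == griesmer_bound(k, d, q)

def distance_optimal_by_griesmer(n, k, d, q):
    """True if the Griesmer bound rules out any [n, k, d + 1] code over F_q."""
    return griesmer_bound(k, d + 1, q) > n

# Binary simplex code [7, 3, 4]: 4 + 2 + 1 = 7, so the bound is met with equality.
simplex_length = griesmer_bound(3, 4, 2)
```

A code that meets the bound with equality is also distance-optimal in this sense, since increasing $d$ can only increase the sum.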

Knowledge · Link prediction · Graph · Weight · Networking

12 July 2024

CausalLP: Learning causal relations with weighted knowledge graph link prediction

Utkarshani Jaimini,Cory Henson,Amit P. Sheth

from arxiv, 9 pages, 8 figures

Causal networks are useful in a wide variety of applications, from medical diagnosis to root-cause analysis in manufacturing. In practice, however, causal networks are often incomplete with missing causal relations. This paper presents a novel approach, called CausalLP, that formulates the issue of incomplete causal networks as a knowledge graph completion problem. More specifically, the task of finding new causal relations in an incomplete causal network is mapped to the task of knowledge graph link prediction. The use of knowledge graphs to represent causal relations enables the integration of external domain knowledge; and as an added complexity, the causal relations have weights representing the strength of the causal association between entities in the knowledge graph. Two primary tasks are supported by CausalLP: causal explanation and causal prediction. An evaluation of this approach uses a benchmark dataset of simulated videos for causal reasoning, CLEVRER-Humans, and compares the performance of multiple knowledge graph embedding algorithms. Two distinct dataset splitting approaches are used for evaluation: (1) random-based split, which is the method typically employed to evaluate link prediction algorithms, and (2) Markov-based split, a novel data split technique that utilizes the Markovian property of causal relations. Results show that using weighted causal relations improves causal link prediction over the baseline without weighted relations.

Identifiable · Less · Reducible · Principle · Discretization

11 July 2024

New limiter regions for multidimensional flows

James Woodfield,Hilary Weller,Colin J Cotter

Accurate transport algorithms are crucial for computational fluid dynamics, and more accurate and efficient schemes are always in development. One-dimensional limiting is commonly employed to suppress nonphysical oscillations. However, the application of such limiters can reduce accuracy. It is important to identify the weakest set of sufficient conditions required of the limiter so as to allow the development of successful numerical algorithms. The main goal of this paper is to identify new, less restrictive sufficient conditions for flux form incompressible advection to remain monotonic. We identify additional necessary conditions for incompressible flux form advection to be monotonic, demonstrating that the Spekreijse limiter region is not sufficient for incompressible flux form advection to remain monotonic. Then a convex combination argument is used to derive new sufficient conditions that are less restrictive than the Sweby region for a discrete maximum principle. This allows the introduction of two new more general limiter regions suitable for flux form incompressible advection.
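
As background for the limiter regions discussed in this abstract, the sketch below checks some classic one-dimensional flux limiters against the standard Sweby TVD region, $\phi(r) = 0$ for $r \le 0$ and $0 \le \phi(r) \le \min(2r, 2)$ for $r > 0$. It does not implement the new regions proposed in the paper. The function names, sample points, and tolerance are illustrative assumptions.

```python
def minmod(r):
    """Minmod limiter."""
    return max(0.0, min(1.0, r))

def van_leer(r):
    """Van Leer limiter."""
    return (r + abs(r)) / (1.0 + abs(r))

def superbee(r):
    """Superbee limiter, which follows the upper edge of the Sweby region."""
    return max(0.0, min(2.0 * r, 1.0), min(r, 2.0))

def lax_wendroff(r):
    """Unlimited Lax-Wendroff choice phi = 1, included as a counterexample."""
    return 1.0

def in_sweby_region(phi, r, tol=1e-12):
    """Check whether phi(r) lies in the Sweby TVD region at slope ratio r."""
    value = phi(r)
    if r <= 0.0:
        return abs(value) <= tol
    return -tol <= value <= min(2.0 * r, 2.0) + tol

# Sample slope ratios, including extrema (r <= 0) and steep gradients.
samples = [-2.0, -0.5, 0.0, 0.25, 0.5, 1.0, 2.0, 10.0]
all_inside = all(in_sweby_region(phi, r)
                 for phi in (minmod, van_leer, superbee)
                 for r in samples)
```

The unlimited choice fails the check for small positive $r$ (for example $r = 0.25$, where $\phi = 1 > 2r$), which is the kind of behaviour limiters are designed to prevent.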

Networking · INFORMS · Feel · Design · Diversity

11 July 2024

Authenticity and exclusion: social media recommendation algorithms and the dynamics of belonging in professional networks

Nil-Jana Akpinar,Sina Fazelpour

Homophily - the attraction of similarity - profoundly influences social interactions, affecting associations, information disclosure, and the dynamics of social exchanges. Organizational studies reveal that when professional and personal boundaries overlap, individuals from minority backgrounds often encounter a dilemma between authenticity and inclusion due to these homophily-driven dynamics: if they disclose their genuine interests, they risk exclusion from the broader conversation. Conversely, to gain inclusion, they might feel pressured to assimilate. How might the nature and design of social media platforms, where different conversational contexts frequently collapse, and the recommender algorithms that are at the heart of these platforms, which can prioritize content based on network structure and historical user engagement, impact these dynamics? In this paper, we employ agent-based simulations to investigate this question. Our findings indicate a decline in the visibility of professional content generated by minority groups, a trend that is exacerbated over time by recommendation algorithms. Within these minority communities, users who closely resemble the majority group tend to receive greater visibility. We examine the philosophical and design implications of our results, discussing their relevance to questions of informational justice, inclusion, and the epistemic benefits of diversity.

Generative AI · MoDELS · Design · AI · Automator

11 July 2024

Natural language is not enough: Benchmarking multi-modal generative AI for Verilog generation

Kaiyan Chang,Zhirong Chen,Yunhao Zhou,Wenlong Zhu,kun wang,Haobo Xu,Cangyuan Li,Mengdi Wang,Shengwen Liang,Huawei Li,Yinhe Han,Ying Wang

from arxiv, Accepted by ICCAD 2024

Natural language interfaces have exhibited considerable potential in the automation of Verilog generation derived from high-level specifications through the utilization of large language models, garnering significant attention. Nevertheless, this paper elucidates that visual representations contribute essential contextual information critical to design intent for hardware architectures possessing spatial complexity, potentially surpassing the efficacy of natural-language-only inputs. Expanding upon this premise, our paper introduces an open-source benchmark for multi-modal generative models tailored for Verilog synthesis from visual-linguistic inputs, addressing both singular and complex modules. Additionally, we introduce an open-source visual and natural language Verilog query language framework to facilitate efficient and user-friendly multi-modal queries. To evaluate the performance of the proposed multi-modal hardware generative AI in Verilog generation tasks, we compare it with a popular method that relies solely on natural language. Our results demonstrate a significant accuracy improvement in the multi-modal generated Verilog compared to queries based solely on natural language. We hope to reveal a new approach to hardware design in the large-hardware-design-model era, thereby fostering a more diversified and productive approach to hardware design.

Large language models · Language modeling · Confidence · MoDELS · Artificial intelligence

11 July 2024

On the attribution of confidence to large language models

Geoff Keeling,Winnie Street

from arxiv, 22 pages, 0 figures

Credences are mental states corresponding to degrees of confidence in propositions. Attribution of credences to Large Language Models (LLMs) is commonplace in the empirical literature on LLM evaluation. Yet the theoretical basis for LLM credence attribution is unclear. We defend three claims. First, our semantic claim is that LLM credence attributions are (at least in general) correctly interpreted literally, as expressing truth-apt beliefs on the part of scientists that purport to describe facts about LLM credences. Second, our metaphysical claim is that the existence of LLM credences is at least plausible, although current evidence is inconclusive. Third, our epistemic claim is that LLM credence attributions made in the empirical literature on LLM evaluation are subject to non-trivial sceptical concerns. It is a distinct possibility that even if LLMs have credences, LLM credence attributions are generally false because the experimental techniques used to assess LLM credences are not truth-tracking.

state-of-the-art · Processing (programming language) · Analysis · Language processing · MoDELS

11 July 2024

LLMs' morphological analyses of complex FST-generated Finnish words

Anssi Moisio,Mathias Creutz,Mikko Kurimo

from arxiv, To appear at the CMCL Workshop at ACL 2024

Rule-based language processing systems have been overshadowed by neural systems in terms of utility, but it remains unclear whether neural NLP systems, in practice, learn the grammar rules that humans use. This work aims to shed light on the issue by evaluating state-of-the-art LLMs in a task of morphological analysis of complex Finnish noun forms. We generate the forms using an FST tool, and they are unlikely to have occurred in the training sets of the LLMs, therefore requiring morphological generalisation capacity. We find that GPT-4-turbo has some difficulties in the task while GPT-3.5-turbo struggles and smaller models Llama2-70B and Poro-34B fail nearly completely.

entity · Named entity recognition · CASE · Extensibility · MoDELS

14 November 2019

Enhanced Meta-Learning for Cross-lingual Named Entity Recognition with Minimal Resources

Qianhui Wu,Zijia Lin,Guoxin Wang,Hui Chen,Börje F. Karlsson,Biqing Huang,Chin-Yew Lin

from arxiv, This paper is accepted by AAAI2020

For languages with no annotated resources, transferring knowledge from rich-resource languages is an effective solution for named entity recognition (NER). While all existing methods directly transfer from a source-learned model to a target language, in this paper, we propose to fine-tune the learned model with a few similar examples given a test case, which could benefit the prediction by leveraging the structural and semantic information conveyed in such similar examples. To this end, we present a meta-learning algorithm to find a good model parameter initialization that can adapt quickly to the given test case, and propose to construct multiple pseudo-NER tasks for meta-training by computing sentence similarities. To further improve the model's generalization ability across different languages, we introduce a masking scheme and augment the loss function with an additional maximum term during meta-training. We conduct extensive experiments on cross-lingual named entity recognition with minimal resources over five target languages. The results show that our approach significantly outperforms existing state-of-the-art methods across the board.