In the context of content-based recommender systems, the aim of this paper is to determine how better profiles can be built and how these affect the recommendation process based on the incorporation of temporality, i.e. the inclusion of time in the recommendation process, and topicality, i.e. the representation of texts associated with users and items using topics and their combination. The main contribution of the paper is to present two different ways of hybridising these two dimensions and to evaluate and compare them with other alternatives.

Related content

Processing (programming language)

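Processing is the name of an open-source programming language and its accompanying integrated development environment (IDE). It is used in the electronic arts and visual design communities to teach programming fundamentals, and it has been used in a large number of new media and interactive art works.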

Sentiment analysis · Analysis · CASES · Dataset · Integration

2 March 2024

A comprehensive cross-language framework for harmful content detection with the aid of sentiment analysis

Mohammad Dehghani

In today's digital world, social media plays a significant role in facilitating communication and content sharing. However, the exponential rise in user-generated content has led to challenges in maintaining a respectful online environment. In some cases, users have taken advantage of anonymity to use harmful language, which can negatively affect the user experience and pose serious social problems. Recognizing the limitations of manual moderation, automatic detection systems have been developed to tackle this problem. Nevertheless, several obstacles persist, including the absence of a universal definition of harmful language, inadequate datasets across languages, the need for detailed annotation guidelines, and, most importantly, the lack of a comprehensive framework. This study aims to address these challenges by introducing, for the first time, a detailed framework adaptable to any language. This framework encompasses various aspects of harmful language detection. A key component of the framework is the development of a general and detailed annotation guideline. Additionally, the integration of sentiment analysis represents a novel approach to enhancing harmful language detection. A definition of harmful language, based on a review of different related concepts, is also presented. To demonstrate the effectiveness of the proposed framework, we implement it in a challenging low-resource language. We collected a Persian dataset and applied the annotation guideline for harmful language detection and sentiment analysis. Next, we present baseline experiments utilizing machine and deep learning methods to set benchmarks. The results demonstrate the framework's high performance, achieving an accuracy of 99.4% in offensive language detection and 66.2% in sentiment analysis.

False negative · Large language model · MoDELS · Language modeling · FAST

1 March 2024

AtP*: An efficient and scalable method for localizing LLM behaviour to components

János Kramár,Tom Lieberum,Rohin Shah,Neel Nanda

Activation Patching is a method of directly computing causal attributions of behavior to model components. However, applying it exhaustively requires a sweep with cost scaling linearly in the number of model components, which can be prohibitively expensive for SoTA Large Language Models (LLMs). We investigate Attribution Patching (AtP), a fast gradient-based approximation to Activation Patching and find two classes of failure modes of AtP which lead to significant false negatives. We propose a variant of AtP called AtP*, with two changes to address these failure modes while retaining scalability. We present the first systematic study of AtP and alternative methods for faster activation patching and show that AtP significantly outperforms all other investigated methods, with AtP* providing further significant improvement. Finally, we provide a method to bound the probability of remaining false negatives of AtP* estimates.
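As a rough illustration of the idea, the sketch below compares exact activation patching with its first-order (gradient-based) approximation on a toy two-node "model". The function `toy_metric`, the activation values, the `clean`/`noise` naming, and the finite-difference gradient standing in for backpropagation are all assumptions made for this example; none of them come from the paper.

```python
import math
from typing import Callable, List

Metric = Callable[[List[float]], float]


def toy_metric(acts: List[float]) -> float:
    """Hypothetical metric: a saturating node `a` gated by a linear node `b`."""
    a, b = acts
    return math.tanh(a) * b


def numeric_grad(f: Metric, acts: List[float], eps: float = 1e-6) -> List[float]:
    """Central finite-difference gradient of f at acts (a stand-in for backprop)."""
    grads = []
    for i in range(len(acts)):
        up, down = list(acts), list(acts)
        up[i] += eps
        down[i] -= eps
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads


def activation_patching(f: Metric, clean: List[float], noise: List[float]) -> List[float]:
    """Exact effect of patching each node alone: one extra forward pass per node."""
    base = f(clean)
    effects = []
    for i in range(len(clean)):
        patched = list(clean)
        patched[i] = noise[i]
        effects.append(f(patched) - base)
    return effects


def attribution_patching(f: Metric, clean: List[float], noise: List[float]) -> List[float]:
    """First-order estimate of the same effects, using only the gradient at `clean`."""
    grads = numeric_grad(f, clean)
    return [(n - c) * g for c, n, g in zip(clean, noise, grads)]


clean = [3.0, 2.0]  # node `a` sits deep in the flat region of tanh
noise = [0.0, 1.0]
exact = activation_patching(toy_metric, clean, noise)
approx = attribution_patching(toy_metric, clean, noise)
for name, e, a in zip(["a", "b"], exact, approx):
    print(f"node {name}: exact={e:+.4f}  first-order={a:+.4f}")
```

In this toy, the estimate for the linear node `b` matches the exact effect, while the estimate for the saturated node `a` is far too small. This is one way a linear approximation can miss a component that matters; whether it corresponds to either failure mode studied in the paper is not stated in the abstract.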

Language modeling · MoDELS · Biased · INTERACT · INFORMS

1 March 2024

Dialect prejudice predicts AI decisions about people's character, employability, and criminality

Valentin Hofmann,Pratyusha Ria Kalluri,Dan Jurafsky,Sharese King

Hundreds of millions of people now interact with language models, with uses ranging from serving as a writing aid to informing hiring decisions. Yet these language models are known to perpetuate systematic racial prejudices, making their judgments biased in problematic ways about groups like African Americans. While prior research has focused on overt racism in language models, social scientists have argued that racism with a more subtle character has developed over time. It is unknown whether this covert racism manifests in language models. Here, we demonstrate that language models embody covert racism in the form of dialect prejudice: we extend research showing that Americans hold raciolinguistic stereotypes about speakers of African American English and find that language models have the same prejudice, exhibiting covert stereotypes that are more negative than any human stereotypes about African Americans ever experimentally recorded, although closest to the ones from before the civil rights movement. By contrast, the language models' overt stereotypes about African Americans are much more positive. We demonstrate that dialect prejudice has the potential for harmful consequences by asking language models to make hypothetical decisions about people, based only on how they speak. Language models are more likely to suggest that speakers of African American English be assigned less prestigious jobs, be convicted of crimes, and be sentenced to death. Finally, we show that existing methods for alleviating racial bias in language models such as human feedback training do not mitigate the dialect prejudice, but can exacerbate the discrepancy between covert and overt stereotypes, by teaching language models to superficially conceal the racism that they maintain on a deeper level. Our findings have far-reaching implications for the fair and safe employment of language technology.

Link · Functional · Generalization theory · Example · Similarity

1 March 2024

Introducing locality in some generalized AG codes

Bastien Pacifico

from arxiv, 18 pages

In 1999, Xing, Niederreiter and Lam introduced a generalization of AG codes using the evaluation at non-rational places of a function field. In this paper, we show that one can obtain a locality parameter $r$ in such codes by using only non-rational places of degree at most $r$. This is, to the best of the author's knowledge, a new way to construct locally recoverable codes (LRCs). We give an example of such a code reaching the Singleton-like bound for LRCs, and show the parameters obtained for some longer codes over $\mathbb F_3$. We then investigate similarities with certain concatenated codes. Contrary to previous methods, our construction allows one to directly obtain codes whose dimension is not a multiple of the locality. Finally, we give an asymptotic study using the Garcia-Stichtenoth tower of function fields, for both our construction and a construction of concatenated codes. We give explicit infinite families of LRCs with locality 2 over any finite field of cardinality greater than 3 following our new approach.
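For reference, the Singleton-like bound for LRCs is commonly stated as follows for a linear code of length $n$, dimension $k$, minimum distance $d$ and locality $r$. The abstract does not write the bound out, so the notation here is an assumption.

```latex
d \;\le\; n - k - \left\lceil \frac{k}{r} \right\rceil + 2
```

When $r = k$ the ceiling term equals 1 and the bound reduces to the classical Singleton bound $d \le n - k + 1$. The ceiling is also where the case of a dimension that is not a multiple of the locality enters.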

Vocabulary · Column · Controller · Text classification · INFORMS

1 March 2024

Text classification of column headers with a controlled vocabulary: leveraging LLMs for metadata enrichment

Margherita Martorana,Tobias Kuhn,Lise Stork,Jacco van Ossenbruggen

Traditional dataset retrieval systems index on metadata information rather than on the data values, and thus rely primarily on manual annotations and high-quality metadata, which are known to be labour-intensive and challenging to automate. We propose a method to support metadata enrichment with topic annotations of column headers using three Large Language Models (LLMs): ChatGPT-3.5, GoogleBard and GoogleGemini. We investigate the LLMs' ability to classify column headers based on domain-specific topics from a controlled vocabulary. We evaluate our approach by assessing the internal consistency of the LLMs, the inter-machine alignment, and the human-machine agreement for the topic classification task. Additionally, we investigate the impact of contextual information (i.e. dataset description) on the classification outcomes. Our results suggest that ChatGPT and GoogleGemini outperform GoogleBard for internal consistency as well as LLM-human alignment. Interestingly, we found that context had no impact on the LLMs' performance. This work proposes a novel approach that leverages LLMs for text classification using a controlled topic vocabulary, which has the potential to facilitate automated metadata enrichment, thereby enhancing dataset retrieval and the Findability, Accessibility, Interoperability and Reusability (FAIR) of research data on the Web.

Regularization term · Statistic · Optimizer · Generative model · MoDELS

1 March 2024

Training generative models from privatized data

Daria Reshetova, Wei-Ning Chen, Ayfer Özgür

Local differential privacy is a powerful method for privacy-preserving data collection. In this paper, we develop a framework for training Generative Adversarial Networks (GANs) on differentially privatized data. We show that entropic regularization of optimal transport - a popular regularization method in the literature that has often been leveraged for its computational benefits - enables the generator to learn the raw (unprivatized) data distribution even though it only has access to privatized samples. We prove that at the same time this leads to fast statistical convergence at the parametric rate. This shows that entropic regularization of optimal transport uniquely enables the mitigation of both the effects of privatization noise and the curse of dimensionality in statistical convergence. We provide experimental evidence to support the efficacy of our framework in practice.

Inverse reinforcement learning · Controller · MoDELS · Integration · Learning

29 February 2024

ARMCHAIR: integrated inverse reinforcement learning and model predictive control for human-robot collaboration

Angelo Caregnato-Neto, Luciano Cavalcante Siebert, Arkady Zgonnikov, Marcos Ricardo Omena de Albuquerque Maximo, Rubens Junqueira Magalhães Afonso

One of the key issues in human-robot collaboration is the development of computational models that allow robots to predict and adapt to human behavior. Much progress has been achieved in developing such models, as well as control techniques that address the autonomy problems of motion planning and decision-making in robotics. However, the integration of computational models of human behavior with such control techniques still poses a major challenge, resulting in a bottleneck for efficient collaborative human-robot teams. In this context, we present a novel architecture for human-robot collaboration: Adaptive Robot Motion for Collaboration with Humans using Adversarial Inverse Reinforcement learning (ARMCHAIR). Our solution leverages adversarial inverse reinforcement learning and model predictive control to compute optimal trajectories and decisions for a mobile multi-robot system that collaborates with a human in an exploration task. During the mission, ARMCHAIR operates without human intervention, autonomously identifying the necessity to support and acting accordingly. Our approach also explicitly addresses the network connectivity requirement of the human-robot team. Extensive simulation-based evaluations demonstrate that ARMCHAIR allows a group of robots to safely support a simulated human in an exploration scenario, preventing collisions and network disconnections, and improving the overall performance of the task.

Linear · Optimizer · Robustness · Constraint · Noise

28 February 2024

Linear shrinkage for optimization in high dimensions

Naqi Huang, Nestor Parolya, Theresia van Essen

In large-scale, data-driven applications, parameters are often only known approximately due to noise and limited data samples. In this paper, we focus on high-dimensional optimization problems with linear constraints under uncertain conditions. To find high quality solutions for which the violation of the true constraints is limited, we develop a linear shrinkage method that blends random matrix theory and robust optimization principles. It aims to minimize the Frobenius distance between the estimated and the true parameter matrix, especially when dealing with a large and comparable number of constraints and variables. This data-driven method excels in simulations, showing superior noise resilience and more stable performance in both obtaining high quality solutions and adhering to the true constraints compared to traditional robust optimization. Our findings highlight the effectiveness of our method in improving the robustness and reliability of optimization in high-dimensional, data-driven scenarios.
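The general shape of linear shrinkage can be sketched as a convex blend of a noisy estimate with a fixed target, scored by Frobenius distance to the true matrix. Everything below is an assumption for illustration: the zero target, the toy matrices, and in particular `oracle_alpha`, which uses the true matrix and is therefore unavailable in practice. The abstract does not say how the paper chooses its target or shrinkage weight.

```python
import math
from typing import List

Matrix = List[List[float]]


def inner(x: Matrix, y: Matrix) -> float:
    """Frobenius inner product: sum of elementwise products."""
    return sum(a * b for rx, ry in zip(x, y) for a, b in zip(rx, ry))


def subtract(x: Matrix, y: Matrix) -> Matrix:
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def frobenius_distance(x: Matrix, y: Matrix) -> float:
    diff = subtract(x, y)
    return math.sqrt(inner(diff, diff))


def shrink(estimate: Matrix, target: Matrix, alpha: float) -> Matrix:
    """Blend: alpha * estimate + (1 - alpha) * target."""
    return [[alpha * e + (1 - alpha) * t for e, t in zip(re, rt)]
            for re, rt in zip(estimate, target)]


def oracle_alpha(truth: Matrix, estimate: Matrix, target: Matrix) -> float:
    """Weight in [0, 1] minimizing the Frobenius distance to `truth`.

    Hypothetical helper: it needs the true matrix, so it only illustrates
    the objective, not a usable estimator.
    """
    direction = subtract(estimate, target)
    denom = inner(direction, direction)
    if denom == 0.0:
        return 1.0
    alpha = inner(subtract(truth, target), direction) / denom
    return min(1.0, max(0.0, alpha))


truth = [[1.0, 0.0], [0.0, 1.0]]
estimate = [[1.4, -0.3], [0.2, 0.7]]   # noisy observation of `truth`
target = [[0.0, 0.0], [0.0, 0.0]]      # assumed shrinkage target

alpha = oracle_alpha(truth, estimate, target)
shrunk = shrink(estimate, target, alpha)
print(f"alpha = {alpha:.3f}")
print(f"distance before = {frobenius_distance(estimate, truth):.3f}")
print(f"distance after  = {frobenius_distance(shrunk, truth):.3f}")
```

Because `alpha = 1` reproduces the raw estimate, the oracle blend can never be farther from the truth than the estimate itself. The practical difficulty is estimating a good weight from data alone.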

entity · Graph · Knowledge graph · MoDELS · Link prediction

10 August 2020

A survey of embedding models of entities and relationships for knowledge graph completion

Dat Quoc Nguyen

from arxiv, 13 pages, 2 figures and 6 tables

Knowledge graphs (KGs) of real-world facts about entities and their relationships are useful resources for a variety of natural language processing tasks. However, because knowledge graphs are typically incomplete, it is useful to perform knowledge graph completion or link prediction, i.e. predict whether a relationship not in the knowledge graph is likely to be true. This paper serves as a comprehensive survey of embedding models of entities and relationships for knowledge graph completion, summarizing up-to-date experimental results on standard benchmark datasets and pointing out potential future research directions.
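To make link prediction with embeddings concrete, here is a minimal sketch using TransE-style scoring, where a triple (head, relation, tail) is plausible when head + relation lies close to tail. The abstract does not name specific models, so the choice of TransE, the hand-written two-dimensional embeddings, and the helper names are assumptions for illustration. In practice the embeddings are learned from the graph.

```python
import math
from typing import Dict, List

Vector = List[float]

# Hypothetical, hand-picked embeddings (normally learned by training).
entity_emb: Dict[str, Vector] = {
    "Paris": [1.0, 0.0],
    "France": [1.0, 1.0],
    "Berlin": [3.0, 0.0],
    "Germany": [3.0, 1.0],
}
relation_emb: Dict[str, Vector] = {"capital_of": [0.0, 1.0]}


def transe_score(head: str, relation: str, tail: str) -> float:
    """Negative Euclidean distance between (head + relation) and tail."""
    h, r, t = entity_emb[head], relation_emb[relation], entity_emb[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head: str, relation: str, candidates: List[str]) -> List[str]:
    """Order candidate tails from most to least plausible."""
    return sorted(candidates, key=lambda c: transe_score(head, relation, c), reverse=True)


ranking = rank_tails("Paris", "capital_of", ["Germany", "France"])
print(ranking)
```

Ranking candidate tails for a given head and relation is the usual way such models are evaluated on completion benchmarks.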

contrastive · Learned · Contrastive learning · Extensibility · SSL

18 June 2020

Contrastive learning of global and local features for medical image segmentation with limited annotations

Krishna Chaitanya,Ertunc Erdil,Neerav Karani,Ender Konukoglu

from arxiv, 16 pages, 2 figures, 7 tables. This article is a pre-print and is currently under review at a conference

A key requirement for the success of supervised deep learning is a large labeled dataset - a condition that is difficult to meet in medical image analysis. Self-supervised learning (SSL) can help in this regard by providing a strategy to pre-train a neural network with unlabeled data, followed by fine-tuning for a downstream task with limited annotations. Contrastive learning, a particular variant of SSL, is a powerful technique for learning image-level representations. In this work, we propose strategies for extending the contrastive learning framework for segmentation of volumetric medical images in the semi-supervised setting with limited annotations, by leveraging domain-specific and problem-specific cues. Specifically, we propose (1) novel contrasting strategies that leverage structural similarity across volumetric medical images (domain-specific cue) and (2) a local version of the contrastive loss to learn distinctive representations of local regions that are useful for per-pixel segmentation (problem-specific cue). We carry out an extensive evaluation on three Magnetic Resonance Imaging (MRI) datasets. In the limited annotation setting, the proposed method yields substantial improvements compared to other self-supervision and semi-supervised learning techniques. When combined with a simple data augmentation technique, the proposed method reaches within 8% of benchmark performance using only two labeled MRI volumes for training, corresponding to only 4% (for ACDC) of the training data used to train the benchmark.
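A generic contrastive loss of the InfoNCE kind can be sketched as below: the loss is low when an anchor embedding is more similar to its positive than to the negatives. This is not the paper's global or local loss. The cosine similarity, the temperature value, and the toy two-dimensional vectors are assumptions for illustration.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor: Vector, positive: Vector, negatives: List[Vector],
             temperature: float = 0.1) -> float:
    """-log softmax probability of the positive among positive + negatives."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_denominator - logits[0]


anchor = [1.0, 0.0]
similar = [0.9, 0.1]     # e.g. another view of the same content
different = [0.0, 1.0]
opposite = [-1.0, 0.0]

loss_aligned = info_nce(anchor, similar, [different, opposite])
loss_misaligned = info_nce(anchor, different, [similar, opposite])
print(f"aligned positive:    {loss_aligned:.4f}")
print(f"misaligned positive: {loss_misaligned:.4f}")
```

Minimizing such a loss pulls embeddings of matching pairs together and pushes non-matching pairs apart. How pairs are defined (for example by position within a volume or by local image region) is where domain-specific choices come in.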