
This paper describes how we train BERT models to carry over a coding system developed on the paragraphs of one Hungarian literary journal to another. The aim of the coding system is to track trends in the perception of literary translation around the political transformation of 1989 in Hungary. We use 10-fold cross-validation to evaluate not only task performance but also the consistency of the annotation, and to obtain better predictions from an ensemble. Extensive hyperparameter tuning is used to obtain the best possible results and fair comparisons. To handle label imbalance, we use loss functions and metrics robust to it. We evaluate the effect of domain shift by sampling a test set from the target domain, establishing the sample size by estimating the bootstrapped confidence interval via simulations. This way, we show that our models can carry over one annotation system to the target domain. Comparisons provide further insights: learning multilabel correlations and a confidence penalty improve resistance to domain shift, and domain adaptation on OCR-ed text from another domain improves performance almost as much as adaptation on the corpus under study. See our code at //codeberg.org/zsamboki/bert-annotator-ensemble.
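
The abstract leaves the simulation details to the paper; below is a minimal sketch, assuming per-example 0/1 correctness scores and a hypothetical accuracy of 0.85, of how one can estimate a percentile-bootstrap confidence interval at candidate test-set sizes and pick the smallest size whose interval is acceptably narrow.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_ci(scores, n_boot=10_000, alpha=0.05):
    """Percentile bootstrap CI for the mean of per-example scores."""
    n = len(scores)
    idx = rng.integers(0, n, size=(n_boot, n))   # resample indices with replacement
    means = scores[idx].mean(axis=1)             # one mean per bootstrap replicate
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return lo, hi

true_acc = 0.85  # hypothetical model accuracy, not a reported figure
for n in (100, 250, 500, 1000):
    scores = rng.binomial(1, true_acc, size=n).astype(float)
    lo, hi = bootstrap_ci(scores)
    print(f"n={n:4d}  CI width={hi - lo:.3f}")
```

The interval narrows roughly as 1/sqrt(n), which is the trade-off a simulation-based sample-size choice balances against annotation cost.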

Related Content

This paper examines the complex nature of cyber attacks through an analysis of the LastPass breach. It argues for integrating human-centric considerations into cybersecurity measures, focusing on mitigating factors such as goal-directed behavior, cognitive overload, human biases (e.g., optimism, anchoring), and risky behaviors. Findings from the analysis of this breach support the perspective that addressing both the human and technical dimensions of cyber defense can significantly enhance the resilience of cyber systems against complex threats. In practice, this means that maintaining a balanced approach, while simultaneously simplifying user interactions, making users aware of biases, and discouraging risky practices, is essential for preventing cyber incidents.

This position paper explores the benefits, drawbacks, and ethical considerations of incorporating synthetic personae in HCI research, particularly focusing on the customization challenges beyond the limitations of current Large Language Models (LLMs). These perspectives are derived from the initial results of a sub-study employing vignettes to showcase the existence of bias within black-box LLMs and explore methods for manipulating them. The study aims to establish a foundation for understanding the challenges associated with these models, emphasizing the necessity of thorough testing before utilizing them to create synthetic personae for HCI research.

This paper analyzes the stochastic performance of a multiple-input multiple-output (MIMO) integrated sensing and communication (ISAC) system in a downlink scenario, where a base station (BS) transmits a dual-functional radar-communication (DFRC) signal matrix that simultaneously carries communication data to the user and senses the angular location of a target. The channel between the BS and the user is modeled as a random channel with Rayleigh fading, and the azimuth angle of the target is assumed to follow a uniform distribution. Given the randomness inherent in the network, the challenge is to choose performance metrics suited to it. To address this, for the user we employ the rate outage probability (OP) and the ergodic rate, while for the target we propose the OP of the Cramér-Rao lower bound (CRLB) for the angle of arrival and the ergodic CRLB. We derive expressions for these metrics in scenarios where the BS employs two different beamforming methods. Our approach involves computing the probability density function (PDF) of the signal-to-noise ratio for the user and of the CRLB for the target, and we demonstrate that the central limit theorem provides a viable way to derive these PDFs. In our numerical results, we characterize the trade-off between sensing and communication (S&C) by mapping the region of S&C metrics and obtaining the Pareto-optimal boundary points, confirmed by simulations.
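
The paper's derivations concern the full MIMO DFRC setup; as a much simpler illustration of the rate outage probability concept, the sketch below Monte-Carlo-estimates the OP for a single-antenna Rayleigh link, where a closed form exists to check against. The average SNR and rate threshold are arbitrary example values, not the paper's parameters.

```python
import numpy as np

rng = np.random.default_rng(1)

def rate_outage_prob_mc(snr_avg, rate_th, n=1_000_000):
    """Monte Carlo OP for Rayleigh fading: |h|^2 ~ Exp(1),
    instantaneous rate = log2(1 + snr_avg * |h|^2)."""
    gain = rng.exponential(1.0, size=n)
    rate = np.log2(1.0 + snr_avg * gain)
    return np.mean(rate < rate_th)

def rate_outage_prob_exact(snr_avg, rate_th):
    """Closed form for the single-antenna Rayleigh case."""
    return 1.0 - np.exp(-(2.0**rate_th - 1.0) / snr_avg)

snr_avg, rate_th = 10.0, 2.0  # average SNR (linear) and rate threshold (bit/s/Hz)
print(rate_outage_prob_mc(snr_avg, rate_th), rate_outage_prob_exact(snr_avg, rate_th))
```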

The paper provides a unified co-design of 1) a programming and execution model that allows spawning tasks from within vertex data at runtime, 2) language constructs for "actions" that send work to where the data resides, combined with the parallel expressiveness of local control objects (LCOs) to implement asynchronous graph-processing primitives, and 3) an innovative vertex-centric data structure, based on the concept of Rhizomes, that parallelizes both the out-degree and in-degree load of vertex objects across many cores while still presenting a single programming abstraction for each vertex. The data structure parallelizes the out-degree load of vertices hierarchically and the in-degree load laterally. The rhizomes communicate internally and remain consistent, using event-driven synchronization mechanisms, to provide a unified and correct view of the vertex. Simulated experimental results show performance gains for BFS, SSSP, and PageRank on large chip sizes for input graph datasets with highly skewed degree distributions. The improvements come from the ability to express and create fine-grain dynamic computing tasks in the form of actions, from language constructs that help the compiler generate code the runtime system uses to schedule tasks optimally, and from the data structure that shares both in- and out-degree compute workload among memory-processing elements.
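
The abstract describes the data structure only at a high level; the following toy sketch (assumptions mine: Python stand-ins, a plain reduce-style combine instead of the paper's event-driven synchronization) merely illustrates how splitting a high-degree vertex into rhizome shards divides both in- and out-degree load while keeping a single logical vertex.

```python
from dataclasses import dataclass, field

@dataclass
class Rhizome:
    """One shard of a high-degree vertex, conceptually placed near a
    memory-processing element; owns a slice of the out-edges and
    accumulates in-edge contributions locally."""
    out_edges: list[int] = field(default_factory=list)
    partial: float = 0.0

@dataclass
class Vertex:
    shards: list[Rhizome]

    def receive(self, shard_id: int, value: float) -> None:
        # In-degree load lands on the shard nearest the sender.
        self.shards[shard_id].partial += value

    def combine(self) -> float:
        # Stand-in for the paper's synchronization: an explicit reduce
        # yielding the single, consistent view of the vertex state.
        return sum(s.partial for s in self.shards)

v = Vertex([Rhizome(out_edges=[1, 2]), Rhizome(out_edges=[3])])
v.receive(0, 0.5); v.receive(1, 0.25)
print(v.combine())  # 0.75
```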

The latest developments in Natural Language Processing (NLP) have demonstrated remarkable progress on the code-text retrieval problem. As the Transformer-based models used for this task continue to grow in size, the computational cost and time required for end-to-end fine-tuning become substantial. This poses a significant challenge for adapting and utilizing these models when computational resources are limited. Motivated by these concerns, we propose a fine-tuning framework that leverages Parameter-Efficient Fine-Tuning (PEFT) techniques. Moreover, we adopt contrastive learning objectives to improve the quality of the bimodal representations learned by transformer models. Additionally, we provide extensive benchmarking of PEFT methods, the lack of which has been highlighted as a crucial gap in the literature. Based on thorough experimentation with the CodeT5+ model on two datasets, we demonstrate that the proposed fine-tuning framework can improve code-text retrieval performance while tuning at most 0.4% of the parameters.
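
As a sketch of what such a framework can look like, the snippet below pairs LoRA adapters (via the Hugging Face peft library) with an in-batch symmetric contrastive (InfoNCE-style) objective. The checkpoint name, LoRA rank, target modules, and temperature are illustrative assumptions, not the paper's reported configuration.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel
from peft import LoraConfig, get_peft_model

# One public CodeT5+ checkpoint, used here as an assumed example.
model = AutoModel.from_pretrained("Salesforce/codet5p-110m-embedding",
                                  trust_remote_code=True)
config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                    target_modules=["q", "v"])  # T5-style attention projections
model = get_peft_model(model, config)
model.print_trainable_parameters()  # a fraction of a percent of all weights

def info_nce(code_emb, text_emb, temperature=0.05):
    """Symmetric contrastive loss over in-batch negatives:
    matched (code, text) pairs sit on the diagonal of the logits."""
    code_emb = F.normalize(code_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = code_emb @ text_emb.T / temperature
    labels = torch.arange(logits.size(0), device=logits.device)
    return (F.cross_entropy(logits, labels) + F.cross_entropy(logits.T, labels)) / 2
```

Because only the adapter matrices are trainable, the optimizer state and gradient memory shrink accordingly, which is what makes fine-tuning feasible on limited hardware.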

This paper proposes an overidentifying restriction test for high-dimensional linear instrumental variable models. The novelty of the proposed test is that it allows the number of covariates and instruments to be larger than the sample size. The test is scale-invariant and is robust to heteroskedastic errors. To construct the final test statistic, we first introduce a test based on the maximum norm of multiple parameters that could be high-dimensional. The theoretical power of the maximum-norm-based test is higher than that of the modified Cragg-Donald test (Kolesár, 2018), the only existing test allowing for large-dimensional covariates. Second, following the principle of power enhancement (Fan et al., 2015), we introduce a power-enhanced test, with an asymptotically zero component used to enhance the power to detect extreme alternatives with many locally invalid instruments. Finally, an empirical example of the trade and economic growth nexus demonstrates the usefulness of the proposed test.
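
The abstract only names the construction; schematically (notation assumed here, not taken from the paper), a power-enhanced statistic combines a max-norm statistic with a screening component that vanishes under the null:

```latex
% Schematic form of a power-enhanced test; notation is assumed.
J = J_1 + J_0, \qquad
J_1 = \max_{1 \le j \le p} \frac{\lvert \hat{\beta}_j \rvert}{\widehat{\operatorname{se}}(\hat{\beta}_j)}, \qquad
\Pr\left( J_0 = 0 \mid H_0 \right) \longrightarrow 1 .
```

Here J_1 controls size through its null critical value, while the nonnegative J_0 equals zero with probability approaching one under the null but diverges under extreme alternatives with many locally invalid instruments, boosting power without distorting size.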

In this paper, we demonstrate how Large Language Models (LLMs) can effectively learn to use an off-the-shelf information retrieval (IR) system specifically when additional context is required to answer a given question. Given the performance of IR systems, the optimal strategy for question answering does not always entail external information retrieval; rather, it often involves leveraging the parametric memory of the LLM itself. Prior research has identified this phenomenon in the PopQA dataset, wherein the most popular questions are effectively addressed using the LLM's parametric memory, while less popular ones require IR system usage. Following this, we propose a tailored training approach for LLMs, leveraging existing open-domain question answering datasets: LLMs are trained to generate a special token, <RET>, when they do not know the answer to a question. Our evaluation of the Adaptive Retrieval LLM (Adapt-LLM) on the PopQA dataset shows improvements over the same LLM under three configurations: (i) retrieving information for all questions, (ii) always using the parametric memory of the LLM, and (iii) using a popularity threshold to decide when to use a retriever. Our analysis demonstrates that Adapt-LLM generates the <RET> token when it determines that it does not know how to answer a question, indicating the need for IR, while it achieves notably high accuracy when it chooses to rely on its parametric memory alone.
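
The control flow implied by the abstract is simple; the sketch below shows it with hypothetical llm_generate and retrieve stubs (the prompt formats are likewise assumptions, not the paper's templates).

```python
RET_TOKEN = "<RET>"

def llm_generate(prompt: str) -> str:
    """Hypothetical stand-in for the fine-tuned LLM's generation call."""
    raise NotImplementedError

def retrieve(question: str) -> str:
    """Hypothetical stand-in for the off-the-shelf IR system."""
    raise NotImplementedError

def adaptive_answer(question: str) -> str:
    # First pass: answer from parametric memory, or emit <RET>
    # if the model decides it needs external context.
    first = llm_generate(f"Question: {question}\nAnswer:")
    if RET_TOKEN not in first:
        return first.strip()
    # Second pass: re-ask with the retrieved context prepended.
    context = retrieve(question)
    second = llm_generate(f"Context: {context}\nQuestion: {question}\nAnswer:")
    return second.strip()
```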

In this paper, we propose a novel Feature Decomposition and Reconstruction Learning (FDRL) method for effective facial expression recognition. We view expression information as the combination of shared information (expression similarities) across different expressions and unique information (expression-specific variations) for each expression. More specifically, FDRL consists of two crucial networks: a Feature Decomposition Network (FDN) and a Feature Reconstruction Network (FRN). FDN first decomposes the basic features extracted by a backbone network into a set of facial action-aware latent features to model expression similarities. Then, FRN captures the intra-feature and inter-feature relationships among the latent features to characterize expression-specific variations, and reconstructs the expression feature. To this end, two modules, an intra-feature relation modeling module and an inter-feature relation modeling module, are developed in FRN. Experimental results on both in-the-lab databases (CK+, MMI, and Oulu-CASIA) and in-the-wild databases (RAF-DB and SFEW) show that the proposed FDRL method consistently achieves higher recognition accuracy than several state-of-the-art methods. This clearly highlights the benefit of feature decomposition and reconstruction for classifying expressions.
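
As a structural sketch only (layer choices here are assumptions, not the authors' architecture), the decomposition/reconstruction pipeline can be pictured as follows:

```python
import torch
import torch.nn as nn

class FDNSketch(nn.Module):
    """Toy stand-in for the Feature Decomposition Network: projects one
    backbone feature into K action-aware latent features (K, dims assumed)."""
    def __init__(self, in_dim=512, latent_dim=64, k=8):
        super().__init__()
        self.proj = nn.ModuleList(nn.Linear(in_dim, latent_dim) for _ in range(k))

    def forward(self, x):                                  # x: (B, in_dim)
        return torch.stack([p(x) for p in self.proj], dim=1)  # (B, K, latent_dim)

class FRNSketch(nn.Module):
    """Toy stand-in for the Feature Reconstruction Network: per-latent
    (intra) weights plus cross-latent (inter) mixing, then reconstruction."""
    def __init__(self, latent_dim=64, k=8):
        super().__init__()
        self.intra = nn.Linear(latent_dim, 1)                    # weight per latent
        self.inter = nn.MultiheadAttention(latent_dim, 4, batch_first=True)

    def forward(self, latents):                 # latents: (B, K, latent_dim)
        w = torch.sigmoid(self.intra(latents))                   # (B, K, 1)
        mixed, _ = self.inter(latents, latents, latents)         # inter-feature relations
        return (w * mixed).sum(dim=1)           # reconstructed feature (B, latent_dim)
```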

In this paper, we propose applying a meta-learning approach to low-resource automatic speech recognition (ASR). We formulate ASR for different languages as different tasks and meta-learn the initialization parameters from many pretraining languages to achieve fast adaptation to an unseen target language, via the recently proposed model-agnostic meta-learning algorithm (MAML). We evaluate the proposed approach using six languages as pretraining tasks and four languages as target tasks. Preliminary results show that the proposed method, MetaASR, significantly outperforms the state-of-the-art multitask pretraining approach on all target languages across different combinations of pretraining languages. In addition, given MAML's model-agnostic property, this paper also opens a new research direction: applying meta-learning to more speech-related applications.
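
The abstract does not spell out the training loop; below is a generic first-order MAML meta-update sketch in PyTorch, assuming a classifier-style model returning logits and tasks given as (support, query) batches. MetaASR's actual ASR losses and architecture are not reflected here.

```python
import torch

def maml_step(model, tasks, inner_lr=0.01,
              loss_fn=torch.nn.functional.cross_entropy):
    """One MAML meta-update over a batch of tasks, each ((xs, ys), (xq, yq)).
    First-order variant: inner gradients are not differentiated through."""
    meta_loss = 0.0
    for (xs, ys), (xq, yq) in tasks:
        # Inner step: adapt a copy of the parameters on the support set.
        fast = {n: p.clone() for n, p in model.named_parameters()}
        loss = loss_fn(torch.func.functional_call(model, fast, (xs,)), ys)
        grads = torch.autograd.grad(loss, list(fast.values()))
        fast = {n: p - inner_lr * g for (n, p), g in zip(fast.items(), grads)}
        # Outer objective: loss of the adapted parameters on the query set.
        meta_loss = meta_loss + loss_fn(
            torch.func.functional_call(model, fast, (xq,)), yq)
    return meta_loss / len(tasks)
```

A meta-training step then just calls maml_step, backpropagates the returned meta-loss, and applies the meta-optimizer, so the learned initialization is the one from which a few inner steps suffice on a new language.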

Visual Question Answering (VQA) models have so far struggled with counting objects in natural images. We identify soft attention in these models as a cause of a fundamental problem. To circumvent this problem, we propose a neural network component that allows robust counting from object proposals. Experiments on a toy task show the effectiveness of this component, and we obtain state-of-the-art accuracy on the number category of the VQA v2 dataset without negatively affecting other categories, even outperforming ensemble models with our single model. On a difficult balanced pair metric, the component improves counting over a strong baseline by 6.6%.
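
The soft-attention failure mode the abstract points to can be shown in a few lines: with normalized attention weights, n identical object proposals produce exactly the same attended feature as a single one, so the count is unrecoverable from the attended output.

```python
import numpy as np

f = np.array([1.0, 2.0])  # feature vector of one object proposal

for n_objects in (1, 2, 4):
    logits = np.zeros(n_objects)                     # identical objects, identical scores
    weights = np.exp(logits) / np.exp(logits).sum()  # softmax attention sums to 1
    attended = (weights[:, None] * f).sum(axis=0)    # weighted average of features
    print(n_objects, attended)                       # same vector for every count
```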
