Named Entity Recognition (NER) seeks to extract substrings within a text that name real-world objects and to determine their type (for example, whether they refer to persons or organizations). In this survey, we first present an overview of recent popular approaches, but we also examine graph- and transformer-based methods, including Large Language Models (LLMs), which have received little coverage in other surveys. Second, we focus on methods designed for datasets with scarce annotations. Third, we evaluate the performance of the main NER implementations on a variety of datasets with differing characteristics (in terms of domain, size, and number of classes). We thus provide a deep comparison of algorithms that are never considered together. Our experiments shed some light on how the characteristics of datasets affect the behavior of the methods that we compare.
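To make the task concrete, here is a minimal sketch that tags entities with a pretrained transformer; the `transformers` library and the `dslim/bert-base-NER` checkpoint are illustrative choices, not methods evaluated in the survey.

```python
# Minimal NER illustration (assumes the Hugging Face `transformers` library and
# the public `dslim/bert-base-NER` checkpoint; any CoNLL-style tagger would do).
from transformers import pipeline

ner = pipeline("token-classification",
               model="dslim/bert-base-NER",
               aggregation_strategy="simple")

text = "Barack Obama visited Microsoft headquarters in Redmond."
for ent in ner(text):
    # Each detected entity carries a type (PER, ORG, LOC, ...) and a character span.
    print(ent["entity_group"], text[ent["start"]:ent["end"]], round(float(ent["score"]), 3))
```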
This paper investigates extremal quantiles under two-way cluster dependence. We demonstrate that unconditional intermediate-order quantiles in the tails are asymptotically Gaussian. This is remarkable because two-way cluster dependence entails potential non-Gaussian limiting behavior in general, yet extremal quantiles do not suffer from this issue. Building upon this result, we extend our analysis to extremal quantile regressions of intermediate order.
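The asymptotic claim can be probed with a small Monte Carlo; the data-generating process, tail index, and cluster sizes below are illustrative assumptions, not the paper's design.

```python
# Hypothetical Monte Carlo sketch: an intermediate-order sample quantile of
# two-way clustered data still behaves approximately Gaussian across replications.
import numpy as np

rng = np.random.default_rng(0)
G, H, reps = 30, 30, 2000                       # clusters per dimension, Monte Carlo replications
estimates = []
for _ in range(reps):
    a = rng.normal(size=(G, 1))                 # row-cluster effects
    b = rng.normal(size=(1, H))                 # column-cluster effects
    e = rng.standard_t(df=4, size=(G, H))       # heavy-tailed idiosyncratic noise
    y = (a + b + e).ravel()
    k = int(y.size ** 0.7)                      # intermediate order: k -> inf, k/n -> 0
    estimates.append(np.sort(y)[k - 1])         # k-th smallest order statistic

z = (np.array(estimates) - np.mean(estimates)) / np.std(estimates)
print("skewness:", round(float(np.mean(z**3)), 3),
      "excess kurtosis:", round(float(np.mean(z**4) - 3), 3))   # both near 0 if Gaussian
```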
We construct long quantum codes from projective Reed-Muller codes, which are evaluation codes obtained by evaluating homogeneous polynomials at the points of the projective space. We obtain asymmetric and symmetric quantum codes by using the CSS construction and the Hermitian construction, respectively. By considering equivalent codes, we provide entanglement-assisted quantum error-correcting codes from projective Reed-Muller codes with flexible amounts of entanglement. Moreover, we also construct quantum codes from subfield subcodes of projective Reed-Muller codes.
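For readers unfamiliar with the CSS construction, the following toy check uses the binary [7,4,3] Hamming code rather than a projective Reed-Muller code, purely to keep the arithmetic small: a dual-containing classical code C1 ⊇ C2 = C1⊥ yields an [[n, k1 - k2]] quantum code, here the [[7,1,3]] Steane code.

```python
# Toy verification of the CSS condition C2 ⊆ C1 with the [7,4,3] Hamming code
# (illustrative stand-in; the paper uses projective Reed-Muller codes instead).
import numpy as np

# Parity-check matrix of the Hamming code C1; its rows generate C2 = C1^perp.
H = np.array([[0, 1, 1, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [1, 1, 0, 1, 0, 0, 1]])

# C2 ⊆ C1 holds iff every generator of C2 satisfies all parity checks of C1,
# i.e. H @ H.T vanishes modulo 2.
dual_containing = not np.any((H @ H.T) % 2)
n, k2 = H.shape[1], H.shape[0]
k1 = n - k2
print("dual-containing:", dual_containing)                    # True
print(f"CSS parameters: [[{n}, {k1 - k2}, d]] with d >= 3")   # the Steane code [[7, 1, 3]]
```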
Hundreds of millions of people now interact with language models, with uses ranging from serving as a writing aid to informing hiring decisions. Yet these language models are known to perpetuate systematic racial prejudices, making their judgments about groups such as African Americans biased in problematic ways. While prior research has focused on overt racism in language models, social scientists have argued that a more subtle form of racism has developed over time. It is unknown whether this covert racism manifests in language models. Here, we demonstrate that language models embody covert racism in the form of dialect prejudice: we extend research showing that Americans hold raciolinguistic stereotypes about speakers of African American English and find that language models have the same prejudice, exhibiting covert stereotypes that are more negative than any human stereotypes about African Americans ever experimentally recorded, although closest to the ones from before the civil rights movement. By contrast, the language models' overt stereotypes about African Americans are much more positive. We demonstrate that dialect prejudice has the potential for harmful consequences by asking language models to make hypothetical decisions about people based only on how they speak. Language models are more likely to suggest that speakers of African American English be assigned less prestigious jobs, be convicted of crimes, and be sentenced to death. Finally, we show that existing methods for alleviating racial bias in language models, such as human feedback training, do not mitigate the dialect prejudice but can exacerbate the discrepancy between covert and overt stereotypes by teaching language models to superficially conceal the racism that they maintain on a deeper level. Our findings have far-reaching implications for the fair and safe employment of language technology.
In 1999, Xing, Niederreiter and Lam introduced a generalization of AG codes using the evaluation at non-rational places of a function field. In this paper, we show that one can obtain a locality parameter $r$ in such codes by using only non-rational places of degree at most $r$. To the best of the author's knowledge, this is a new way to construct locally recoverable codes (LRCs). We give an example of such a code reaching the Singleton-like bound for LRCs, and report the parameters obtained for some longer codes over $\mathbb F_3$. We then investigate similarities with certain concatenated codes. Contrary to previous methods, our construction allows one to directly obtain codes whose dimension is not a multiple of the locality. Finally, we give an asymptotic study using the Garcia-Stichtenoth tower of function fields, for both our construction and a construction of concatenated codes. Following our new approach, we give explicit infinite families of LRCs with locality 2 over any finite field of cardinality greater than 3.
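For reference, the Singleton-like bound for locally recoverable codes mentioned above is the standard one of Gopalan, Huang, Simitci and Yekhanin; it is stated here as background, not as a result of the paper.

```latex
% Singleton-like bound for an [n, k, d] code with locality r:
d \le n - k - \left\lceil \frac{k}{r} \right\rceil + 2
```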
This study develops a model-based index creation approach called the Generalized Shared Component Model (GSCM), drawing on the large literature on factor models. The proposed fully Bayesian approach accommodates heteroscedastic model error, multiple shared factors, and flexible spatial priors. Moreover, unlike previous index approaches, our model provides indices together with their uncertainty. Focusing on Australian risk factor data, the proposed GSCM is used to develop the Area Indices of Behaviors Impacting Cancer product, the first area-level cancer risk factor index in Australia. This advancement aids in identifying communities with elevated cancer risk, facilitating targeted health interventions.
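As rough orientation only, a shared-component factor layout of the kind the GSCM generalizes links risk factor $k$ in area $i$ to a small number of shared latent factors; the formula below is a generic sketch, not the paper's specification, and its priors and identification constraints are assumptions.

```latex
% Generic shared-factor layout (illustrative; not the GSCM specification):
y_{ik} = \mu_k + \sum_{j=1}^{J} \lambda_{kj}\,\theta_{ij} + \varepsilon_{ik},
\qquad \varepsilon_{ik} \sim \mathcal{N}(0, \sigma_k^2)
```

with spatially structured priors (for example CAR priors) on the shared factors $\theta_{\cdot j}$ and an area-level index built from those factors.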
The deconfounder was proposed as a method for estimating causal parameters in a context with multiple causes and unobserved confounding. It is based on recovery of a latent variable from the observed causes. We disentangle the causal interpretation from the statistical estimation problem and show that the deconfounder in general estimates adjusted regression target parameters. It does so via outcome regression adjusted for the recovered latent variable, termed the substitute. We refer to the general algorithm, stripped of causal assumptions, as substitute adjustment. We give theoretical results to support that substitute adjustment estimates adjusted regression parameters when the regressors are conditionally independent given the latent variable. We also introduce a variant of our substitute adjustment algorithm that estimates an assumption-lean target parameter with minimal model assumptions. We then give finite sample bounds and asymptotic results supporting substitute adjustment estimation in the case where the latent variable takes values in a finite set. A simulation study illustrates finite sample properties of substitute adjustment. Our results support that, when the latent variable model of the regressors holds, substitute adjustment is a viable method for adjusted regression.
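The two-step idea can be sketched in a few lines; the factor-analysis substitute, the data-generating process, and the OLS adjustment below are illustrative assumptions rather than the paper's estimator.

```python
# Hypothetical sketch of substitute adjustment: (1) recover a substitute for the
# latent variable from the regressors, (2) run an outcome regression adjusted for it.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n, p = 2000, 50
U = rng.normal(size=(n, 1))                                   # unobserved common factor
X = U @ np.ones((1, p)) + rng.normal(size=(n, p))             # regressors, cond. independent given U
Y = 0.5 * X[:, 0] + 2.0 * U[:, 0] + rng.normal(size=n)        # outcome; target coefficient is 0.5

Z_hat = FactorAnalysis(n_components=1, random_state=0).fit_transform(X)   # the substitute

naive = LinearRegression().fit(X[:, [0]], Y).coef_[0]                      # confounded by U
adjusted = LinearRegression().fit(np.column_stack([X[:, [0]], Z_hat]), Y).coef_[0]
print("naive:", round(naive, 2), "substitute-adjusted:", round(adjusted, 2))  # adjusted is close to 0.5
```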
Bayesian Improved Surname Geocoding (BISG) is a ubiquitous tool for predicting race and ethnicity using an individual's geolocation and surname. Here we demonstrate that statistical dependence of surname and geolocation within racial/ethnic categories in the United States results in biases for minority subpopulations, and we introduce a raking-based improvement. Our method augments the data used by BISG--distributions of race by geolocation and race by surname--with the distribution of surname by geolocation obtained from state voter files. We validate our algorithm on state voter registration lists that contain self-identified race/ethnicity.
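For background, the standard BISG update combines the two distributions under the assumption that surname and geolocation are conditionally independent given race, which is exactly the assumption the raking step relaxes; the probabilities in this sketch are made up for illustration.

```python
# Minimal sketch of the standard BISG posterior (illustrative probabilities only).
import numpy as np

races = ["white", "black", "hispanic", "asian", "other"]
p_race_given_surname = np.array([0.70, 0.10, 0.10, 0.05, 0.05])       # e.g. from the Census surname list
p_geo_given_race     = np.array([0.002, 0.010, 0.004, 0.001, 0.003])  # share of each group living in the tract

# Bayes' rule under conditional independence:
# P(race | surname, geo) ∝ P(race | surname) * P(geo | race)
posterior = p_race_given_surname * p_geo_given_race
posterior /= posterior.sum()
for race, p in zip(races, posterior):
    print(f"{race:8s} {p:.3f}")
```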
By abstracting over well-known properties of De Bruijn's representation with nameless dummies, we design a new theory of syntax with variable binding and capture-avoiding substitution. We propose it as a simpler alternative to Fiore, Plotkin, and Turi's approach, with which we establish a strong formal link. We also show that our theory easily incorporates simple types and equations between terms.
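As a reminder of what nameless dummies buy you, here is a small sketch (Python 3.10+) of De Bruijn-indexed terms with the standard shift and substitution operations, under which substitution is capture-avoiding by construction; this is textbook material, not the paper's formal development.

```python
# De Bruijn-indexed lambda terms with the standard shift/substitution operations.
from dataclasses import dataclass

@dataclass
class Var:
    idx: int                  # De Bruijn index

@dataclass
class Lam:
    body: "Term"              # the binder; index 0 refers to it inside body

@dataclass
class App:
    fun: "Term"
    arg: "Term"

Term = Var | Lam | App

def shift(t: Term, d: int, cutoff: int = 0) -> Term:
    """Add d to every free index (those >= cutoff)."""
    match t:
        case Var(k):    return Var(k + d) if k >= cutoff else Var(k)
        case Lam(b):    return Lam(shift(b, d, cutoff + 1))
        case App(f, a): return App(shift(f, d, cutoff), shift(a, d, cutoff))

def subst(t: Term, j: int, s: Term) -> Term:
    """Replace index j in t by s, shifting s when going under a binder."""
    match t:
        case Var(k):    return s if k == j else Var(k)
        case Lam(b):    return Lam(subst(b, j + 1, shift(s, 1)))
        case App(f, a): return App(subst(f, j, s), subst(a, j, s))

# Substituting a free variable under a binder shifts it, so no capture occurs:
print(subst(Lam(Var(1)), 0, Var(7)))   # Lam(body=Var(idx=8))
```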
We show that any Lotka--Volterra tree-system associated with an $n$-vertex tree, as introduced in Quispel et al., J. Phys. A 56 (2023) 315201, preserves a rational measure. We also prove that the Kahan discretisation of these tree-systems factorises and preserves the same measure. As a consequence, for the Kahan maps of Lotka--Volterra systems related to the subclass of tree-systems corresponding to graphs with more than one $n$-vertex subtree, we are able to construct rational integrals.
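As an illustration of the Kahan discretisation used above (applied here to the plain two-dimensional predator-prey Lotka--Volterra system, not to one of the paper's tree-systems), recall that Kahan's rule replaces each quadratic term $x_i x_j$ by the polarised average $(x_i x_j' + x_i' x_j)/2$ and each linear term by its midpoint, so every step amounts to solving a linear system and the resulting map is birational.

```python
# Hypothetical sketch of the Kahan map for the 2D Lotka-Volterra system
#   dx/dt = x (a - b y),   dy/dt = y (-c + d x)
# (parameters and step size are arbitrary illustrative choices).
import numpy as np

a, b, c, d, h = 1.0, 1.0, 1.0, 1.0, 0.05

def kahan_step(x, y):
    # Linear system M @ (x_new, y_new) = rhs obtained from Kahan's rule:
    #   (x_new - x)/h =  a*(x + x_new)/2 - b*(x*y_new + x_new*y)/2
    #   (y_new - y)/h = -c*(y + y_new)/2 + d*(x*y_new + x_new*y)/2
    M = np.array([[1 - h*a/2 + h*b*y/2,  h*b*x/2],
                  [-h*d*y/2,             1 + h*c/2 - h*d*x/2]])
    rhs = np.array([x * (1 + h*a/2), y * (1 - h*c/2)])
    return np.linalg.solve(M, rhs)

x, y = 1.5, 0.7
for _ in range(1000):
    x, y = kahan_step(x, y)
print(x, y)   # the iterates stay on a closed invariant curve around (c/d, a/b)
```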
Hashing has been widely used in approximate nearest neighbor search for large-scale database retrieval because of its computational and storage efficiency. Deep hashing, which devises convolutional neural network architectures to exploit and extract the semantic information or features of images, has received increasing attention recently. In this survey, several deep supervised hashing methods for image retrieval are evaluated, and I identify three main directions for deep supervised hashing methods. Several comments are made at the end. Moreover, to break through the bottleneck of existing hashing methods, I propose a Shadow Recurrent Hashing (SRH) method as a preliminary attempt. Specifically, I devise a CNN architecture to extract the semantic features of images and design a loss function to encourage similar images to be projected close to each other. To this end, I propose a concept: the shadow of the CNN output. During the optimization process, the CNN output and its shadow guide each other so as to approach the optimal solution as closely as possible. Several experiments on the CIFAR-10 dataset show the satisfactory performance of SRH.
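To ground the survey part, the following is a generic pairwise deep-supervised-hashing objective of the kind those methods build on; it is not the proposed SRH method, and the head architecture, bit length, and margin are illustrative assumptions.

```python
# Generic pairwise deep-supervised-hashing head and loss (not SRH): a CNN
# feature is mapped to a relaxed binary code with tanh, and a contrastive-style
# loss pulls codes of same-class images together and pushes others apart.
import torch
import torch.nn as nn

class HashHead(nn.Module):
    def __init__(self, feature_dim=512, bits=48):
        super().__init__()
        self.fc = nn.Linear(feature_dim, bits)

    def forward(self, features):
        return torch.tanh(self.fc(features))                 # relaxed codes in (-1, 1)

def pairwise_hash_loss(codes, labels, margin=2.0):
    sim = (labels[:, None] == labels[None, :]).float()        # 1 if same class, else 0
    dist = torch.cdist(codes, codes)                          # Euclidean distances between codes
    pos = sim * dist.pow(2)                                   # pull similar pairs together
    neg = (1 - sim) * torch.clamp(margin - dist, min=0).pow(2)  # push dissimilar pairs apart
    return (pos + neg).mean()

features = torch.randn(8, 512)            # stand-in for CNN backbone features
labels = torch.randint(0, 10, (8,))       # e.g. CIFAR-10 class labels
loss = pairwise_hash_loss(HashHead()(features), labels)
print(loss.item())
```

At retrieval time such relaxed codes are binarized with the sign function before computing Hamming distances.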