MoDELS · Transformer models · Transformation · Performer · Correlation coefficient

March 22, 2023

Evaluating Transformer Models and Human Behaviors on Chinese Character Naming

Xiaomeng Ma, Lingyu Gao

from arXiv, accepted by TACL

Neural network models have been proposed to explain the grapheme-phoneme mapping process in humans for many alphabetic languages. These models not only successfully learned the correspondence between letter strings and their pronunciations, but also captured human behavior in nonce word naming tasks. How would neural models perform on an unknown character task for a non-alphabetic language (e.g., Chinese)? How well would they capture human behavior? In this study, we evaluate a set of transformer models and compare their performance with human behavior on an unknown Chinese character naming task. We found that the models and humans behaved very similarly: they had similar accuracy distributions for each character and a substantial overlap in answers. In addition, the models' answers are highly correlated with humans' answers. These results suggest that transformer models can capture humans' character naming behavior well.
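The abstract reports that model answers are highly correlated with human answers but does not say which correlation measure was used. As a minimal sketch, assuming a Pearson correlation over per-character accuracies, the comparison could look like the following. The function name `pearson` and the accuracy values are hypothetical and are not taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-character naming accuracies (illustrative values only,
# not data from the paper): one entry per unknown character.
human_acc = [0.90, 0.40, 0.75, 0.20, 0.60]
model_acc = [0.85, 0.35, 0.80, 0.30, 0.55]

r = pearson(human_acc, model_acc)
print(f"Pearson r between human and model accuracy: {r:.3f}")
```

A rank-based measure such as Spearman's correlation would be an equally plausible choice when only the ordering of characters by difficulty matters.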

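Related content

MoDELS

The ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS) is the premier conference series for model-driven software and systems engineering, organized with the support of ACM SIGSOFT and IEEE TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. Its participants come from diverse backgrounds and include researchers, academics, engineers, and industry professionals. MODELS 2019 is a forum where participants can exchange cutting-edge research results and innovative practical experience around modeling and model-driven software and systems. This year's edition will give the modeling community an opportunity to further advance the foundations of modeling and to propose innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, and open source, as well as sustainability. Official website:

echo (mobile app) · Decomposed · Online · Identifiable

May 12, 2023

The drivers of online polarization: fitting models to data

Carlo Michele Valensise, Matteo Cinelli, Walter Quattrociocchi

from arXiv, accepted for publication in Information Sciences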

Users online tend to join polarized groups of like-minded peers around shared narratives, forming echo chambers. The echo chamber effect and opinion polarization may be driven by several factors including human biases in information consumption and personalized recommendations produced by feed algorithms. Until now, studies have mainly used opinion dynamic models to explore the mechanisms behind the emergence of polarization and echo chambers. The objective was to determine the key factors contributing to these phenomena and identify their interplay. However, the validation of model predictions with empirical data still displays two main drawbacks: lack of systematicity and qualitative analysis. In our work, we bridge this gap by providing a method to numerically compare the opinion distributions obtained from simulations with those measured on social media. To validate this procedure, we develop an opinion dynamic model that takes into account the interplay between human and algorithmic factors. We subject our model to empirical testing with data from diverse social media platforms and benchmark it against two state-of-the-art models. To further enhance our understanding of social media platforms, we provide a synthetic description of their characteristics in terms of the model's parameter space. This representation has the potential to facilitate the refinement of feed algorithms, thus mitigating the detrimental effects of extreme polarization on online discourse.

Boosting (a model-training acceleration method) · Attention · Learning · Functional · Performer

May 12, 2023

Boosting Value Decomposition via Unit-Wise Attentive State Representation for Cooperative Multi-Agent Reinforcement Learning

Qingpeng Zhao, Yuanyang Zhu, Zichuan Liu, Zhi Wang, Chunlin Chen

In cooperative multi-agent reinforcement learning (MARL), the environmental stochasticity and uncertainties will increase exponentially when the number of agents increases, which puts hard pressure on how to come up with a compact latent representation from partial observation for boosting value decomposition. To tackle these issues, we propose a simple yet powerful method that alleviates partial observability and efficiently promotes coordination by introducing the UNit-wise attentive State Representation (UNSR). In UNSR, each agent learns a compact and disentangled unit-wise state representation outputted from transformer blocks, and produces its local action-value function. The proposed UNSR is used to boost the value decomposition with a multi-head attention mechanism for producing efficient credit assignment in the mixing network, providing an efficient reasoning path between the individual value function and joint value function. Experimental results demonstrate that our method achieves superior performance and data efficiency compared to solid baselines on the StarCraft II micromanagement challenge. Additional ablation experiments also help identify the key factors contributing to the performance of UNSR.

Identifiable · Model evaluation · Full · Logistic regression · Leave-one-out

May 11, 2023

Using Full-Text Content to Characterize and Identify Best Seller Books

Giovana D. da Silva, Filipi N. Silva, Henrique F. de Arruda, Bárbara C. e Souza, Luciano da F. Costa, Diego R. Amancio

Artistic pieces can be studied from several perspectives, one example being their reception among readers over time. In the present work, we approach this interesting topic from the standpoint of literary works, particularly assessing the task of predicting whether a book will become a best seller. Dissimilarly from previous approaches, we focused on the full content of books and considered visualization and classification tasks. We employed visualization for the preliminary exploration of the data structure and properties, involving SemAxis and linear discriminant analyses. Then, to obtain quantitative and more objective results, we employed various classifiers. Such approaches were used along with a dataset containing (i) books published from 1895 to 1924 and consecrated as best sellers by the Publishers Weekly Bestseller Lists and (ii) literary works published in the same period but not being mentioned in that list. Our comparison of methods revealed that the best-achieved result - combining a bag-of-words representation with a logistic regression classifier - led to an average accuracy of 0.75 both for the leave-one-out and 10-fold cross-validations. Such an outcome suggests that it is unfeasible to predict the success of books with high accuracy using only the full content of the texts. Nevertheless, our findings provide insights into the factors leading to the relative success of a literary work.
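The best result in the abstract combines a bag-of-words representation with a logistic regression classifier, evaluated by leave-one-out cross-validation. The sketch below illustrates that pipeline in plain Python under stated assumptions: the toy texts and labels are invented, and the hyperparameters (`lr`, `epochs`) and helper names are hypothetical. It is not the authors' implementation, and the paper's preprocessing and classifier settings are not given in the abstract.

```python
import math
from collections import Counter


def bag_of_words(text, vocab):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocab]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, lr=0.5, epochs=200):
    """Fit logistic regression with batch gradient descent (assumed settings)."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def leave_one_out_accuracy(texts, labels):
    """Hold out each document once, train on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(texts)):
        train_texts = texts[:i] + texts[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        # Build the vocabulary from the training fold only, to avoid leakage.
        vocab = sorted({w for t in train_texts for w in t.lower().split()})
        X = [bag_of_words(t, vocab) for t in train_texts]
        w, b = train_logreg(X, train_labels)
        x_test = bag_of_words(texts[i], vocab)
        score = sum(wj * xj for wj, xj in zip(w, x_test)) + b
        correct += int((1 if score > 0 else 0) == labels[i])
    return correct / len(texts)


# Invented toy corpus: 1 = "best seller", 0 = "not a best seller".
texts = [
    "love romance heart", "romance heart kiss", "love kiss heart",
    "war battle sword", "battle sword army", "war army sword",
]
labels = [1, 1, 1, 0, 0, 0]

acc = leave_one_out_accuracy(texts, labels)
print(f"Leave-one-out accuracy on the toy corpus: {acc:.2f}")
```

With N documents, leave-one-out trains N models, which is why it is usually reported alongside a cheaper k-fold estimate, as the abstract does with 10-fold cross-validation.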

Proportional · HTTPS · binary · Commentator · Statistic

May 11, 2023

Still no evidence for an effect of the proportion of non-native speakers on language complexity -- A response to Kauhanen, Einhaus & Walkden (2023)

Alexander Koplenig

from arXiv, v5 - fixed some typos and inaccuracies

In a recent paper published in the Journal of Language Evolution, Kauhanen, Einhaus & Walkden (//doi.org/10.1093/jole/lzad005, KEW) challenge the results presented in one of my papers (Koplenig, Royal Society Open Science, 6, 181274 (2019), //doi.org/10.1098/rsos.181274), in which I tried to show through a series of statistical analyses that large numbers of L2 (second language) speakers do not seem to affect the (grammatical or statistical) complexity of a language. To this end, I focus on the way in which the Ethnologue assesses language status: a language is characterised as vehicular if, in addition to being used by L1 (first language) speakers, it should also have a significant number of L2 users. KEW criticise both the use of vehicularity as a (binary) indicator of whether a language has a significant number of L2 users and the idea of imputing a zero proportion of L2 speakers to non-vehicular languages whenever a direct estimate of that proportion is unavailable. While I recognise the importance of post-publication commentary on published research, I show in this rejoinder that both points of criticism are explicitly mentioned and analysed in my paper. In addition, I also comment on other points raised by KEW and demonstrate that both alternative analyses offered by KEW do not stand up to closer scrutiny.

Top-down · MoDELS · Recurrent neural network · Cognition · Networking

May 11, 2023

Modeling Human Sentence Processing with Left-Corner Recurrent Neural Network Grammars

Ryo Yoshida, Hiroshi Noji, Yohei Oseki

from arXiv, accepted by EMNLP 2021

In computational linguistics, it has been shown that hierarchical structures make language models (LMs) more human-like. However, the previous literature has been agnostic about a parsing strategy of the hierarchical models. In this paper, we investigated whether hierarchical structures make LMs more human-like, and if so, which parsing strategy is most cognitively plausible. In order to address this question, we evaluated three LMs against human reading times in Japanese with head-final left-branching structures: Long Short-Term Memory (LSTM) as a sequential model and Recurrent Neural Network Grammars (RNNGs) with top-down and left-corner parsing strategies as hierarchical models. Our computational modeling demonstrated that left-corner RNNGs outperformed top-down RNNGs and LSTM, suggesting that hierarchical and left-corner architectures are more cognitively plausible than top-down or sequential architectures. In addition, the relationships between the cognitive plausibility and (i) perplexity, (ii) parsing, and (iii) beam size will also be discussed.

Speech recognition · Supervision · Robustness · Neural Networks · Identifiable

May 11, 2023

Investigating self-supervised, weakly supervised and fully supervised training approaches for multi-domain automatic speech recognition: a study on Bangladeshi Bangla

Ahnaf Mozib Samin, M. Humayon Kobir, Md. Mushtaq Shahriyar Rafee, M. Firoz Ahmed, Mehedi Hasan, Partha Ghosh, Shafkat Kibria, M. Shahidur Rahman

Despite huge improvements in automatic speech recognition (ASR) employing neural networks, ASR systems still suffer from a lack of robustness and generalizability issues due to domain shifting. This is mainly because principal corpus design criteria are often not identified and examined adequately while compiling ASR datasets. In this study, we investigate the robustness of the state-of-the-art transfer learning approaches such as self-supervised wav2vec 2.0 and weakly supervised Whisper as well as fully supervised convolutional neural networks (CNNs) for multi-domain ASR. We also demonstrate the significance of domain selection while building a corpus by assessing these models on a novel multi-domain Bangladeshi Bangla ASR evaluation benchmark - BanSpeech, which contains approximately 6.52 hours of human-annotated speech and 8085 utterances from 13 distinct domains. SUBAK.KO, a mostly read speech corpus for the morphologically rich language Bangla, has been used to train the ASR systems. Experimental evaluation reveals that self-supervised cross-lingual pre-training is the best strategy compared to weak supervision and full supervision to tackle the multi-domain ASR task. Moreover, the ASR models trained on SUBAK.KO face difficulty recognizing speech from domains with mostly spontaneous speech. The BanSpeech will be publicly available to meet the need for a challenging evaluation benchmark for Bangla ASR.

Language modeling · Knowledge · Identifiable · Base · Knowledge base

May 10, 2023

ANALOGYKB: Unlocking Analogical Reasoning of Language Models with A Million-scale Knowledge Base

Siyu Yuan, Jiangjie Chen, Changzhi Sun, Jiaqing Liang, Yanghua Xiao, Deqing Yang

Analogical reasoning is a fundamental cognitive ability of humans. However, current language models (LMs) still struggle to achieve human-like performance in analogical reasoning tasks due to a lack of resources for model training. In this work, we address this gap by proposing ANALOGYKB, a million-scale analogy knowledge base (KB) derived from existing knowledge graphs (KGs). ANALOGYKB identifies two types of analogies from the KGs: 1) analogies of the same relations, which can be directly extracted from the KGs, and 2) analogies of analogous relations, which are identified with a selection and filtering pipeline enabled by large LMs (InstructGPT), followed by minor human efforts for data quality control. Evaluations on a series of datasets of two analogical reasoning tasks (analogy recognition and generation) demonstrate that ANALOGYKB successfully enables LMs to achieve much better results than previous state-of-the-art methods.

MoDELS · Performer · Processing (programming language) · Learned · Robustness

September 3, 2021

Learning Neural Models for Natural Language Processing in the Face of Distributional Shift

from arXiv, PhD thesis

The dominant NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications (e.g., sentiment classification, span-prediction-based question answering, or machine translation). However, it builds upon the assumption that the data distribution is stationary, i.e., that the data is sampled from a fixed distribution both at training and test time. This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information. Moreover, it is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime. The first goal of this thesis is to characterize the different forms this shift can take in the context of natural language processing, and to propose benchmarks and evaluation metrics to measure its effect on current deep learning architectures. We then proceed to take steps to mitigate the effect of distributional shift on NLP models. To this end, we develop methods based on parametric reformulations of the distributionally robust optimization framework. Empirically, we show that these approaches yield more robust models on a selection of realistic problems. In the third and final part of this thesis, we explore ways of efficiently adapting existing models to new domains or tasks. Our contribution to this topic takes inspiration from information geometry to derive a new gradient update rule that alleviates catastrophic forgetting during adaptation.

Contextual representation · Model evaluation · Learned · Word embeddings · Layer

August 27, 2018

Dissecting Contextual Word Embeddings: Architecture and Representation

Matthew E. Peters, Mark Neumann, Luke Zettlemoyer, Wen-tau Yih

from arXiv, EMNLP 2018

Contextual word representations derived from pre-trained bidirectional language models (biLMs) have recently been shown to provide significant improvements to the state of the art for a wide range of NLP tasks. However, many questions remain as to how and why these models are so effective. In this paper, we present a detailed empirical study of how the choice of neural architecture (e.g., LSTM, CNN, or self-attention) influences both end-task accuracy and qualitative properties of the representations that are learned. We show there is a tradeoff between speed and accuracy, but all architectures learn high-quality contextual representations that outperform word embeddings for four challenging NLP tasks. Additionally, all architectures learn representations that vary with network depth, from exclusively morphology-based at the word embedding layer, through local-syntax-based in the lower contextual layers, to longer-range semantics such as coreference at the upper layers. Together, these results suggest that unsupervised biLMs, independent of architecture, are learning much more about the structure of language than previously appreciated.

Directed · Attention mechanism · Understandability · Model evaluation · Networking

November 20, 2017

DiSAN: Directional Self-Attention Network for RNN/CNN-Free Language Understanding

Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Shirui Pan, Chengqi Zhang

from arXiv, 10 pages, 8 figures; accepted at AAAI-18

Recurrent neural nets (RNN) and convolutional neural nets (CNN) are widely used on NLP tasks to capture the long-term and local dependencies, respectively. Attention mechanisms have recently attracted enormous interest due to their highly parallelizable computation, significantly less training time, and flexibility in modeling dependencies. We propose a novel attention mechanism in which the attention between elements from input sequence(s) is directional and multi-dimensional (i.e., feature-wise). A light-weight neural net, "Directional Self-Attention Network (DiSAN)", is then proposed to learn sentence embedding, based solely on the proposed attention without any RNN/CNN structure. DiSAN is only composed of a directional self-attention with temporal order encoded, followed by a multi-dimensional attention that compresses the sequence into a vector representation. Despite its simple form, DiSAN outperforms complicated RNN models on both prediction quality and time efficiency. It achieves the best test accuracy among all sentence encoding methods and improves the most recent best result by 1.02% on the Stanford Natural Language Inference (SNLI) dataset, and shows state-of-the-art test accuracy on the Stanford Sentiment Treebank (SST), Multi-Genre natural language inference (MultiNLI), Sentences Involving Compositional Knowledge (SICK), Customer Review, MPQA, TREC question-type classification and Subjectivity (SUBJ) datasets.
