
MoDELS · Sequence-to-sequence learning · Processing (programming language) · seq2seq · Learning ·

November 29, 2023

Diffusion Glancing Transformer for Parallel Sequence to Sequence Learning

Lihua Qian,Mingxuan Wang,Yang Liu,Hao Zhou

from arxiv, 8 pages, 7 figures

Previously, non-autoregressive models were widely perceived as being superior in generation efficiency but inferior in generation quality due to the difficulties of modeling multiple target modalities. To enhance the multi-modality modeling ability, we propose the diffusion glancing transformer, which employs a modality diffusion process and residual glancing sampling. The modality diffusion process is a discrete process that interpolates the multi-modal distribution along the decoding steps, and the residual glancing sampling approach guides the model to continuously learn the remaining modalities across the layers. Experimental results on various machine translation and text generation benchmarks demonstrate that DIFFGLAT achieves better generation accuracy while maintaining fast decoding speed compared with both autoregressive and non-autoregressive models.

Related content

MoDELS

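The 23rd ACM/IEEE International Conference on Model Driven Engineering Languages and Systems is the premier conference series for model-driven software and systems engineering, organized with the support of ACM SIGSOFT and IEEE TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. Attendees of MODELS come from diverse backgrounds, including researchers, academics, engineers, and industry professionals. MODELS 2019 is a forum where participants can exchange cutting-edge research results and innovative practical experience around modeling and model-driven software and systems. This year's edition will give the modeling community an opportunity to further advance the foundations of modeling and to propose innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, open source, and sustainability. Official website:

Performer · Cluster · Performance · Diversity ·

January 19, 2024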

Software Resource Disaggregation for HPC with Serverless Computing

Marcin Copik,Marcin Chrapek,Larissa Schmid,Alexandru Calotoiu,Torsten Hoefler

Aggregated HPC resources have rigid allocation systems and programming models which struggle to adapt to diverse and changing workloads. Consequently, HPC systems fail to efficiently use the large pools of unused memory and increase the utilization of idle computing resources. Prior work attempted to increase the throughput and efficiency of supercomputing systems through workload co-location and resource disaggregation. However, these methods fall short of providing a solution that can be applied to existing systems without major hardware modifications and performance losses. In this paper, we improve the utilization of supercomputers by employing the new cloud paradigm of serverless computing. We show how serverless functions provide fine-grained access to the resources of batch-managed cluster nodes. We present an HPC-oriented Function-as-a-Service (FaaS) that satisfies the requirements of high-performance applications. We demonstrate a software resource disaggregation approach where placing functions on unallocated and underutilized nodes allows idle cores and accelerators to be utilized while retaining near-native performance.

Language modeling · MoDELS · SimPLe · Tokenizer · Decoding ·

January 19, 2024

A Simple Framework to Accelerate Multilingual Language Model for Monolingual Text Generation

Jimin Hong,Gibbeum Lee,Jaewoong Cho

Recent advancements in large language models have facilitated the execution of complex language tasks, not only in English but also in non-English languages. However, the tokenizers of most language models, such as Llama, trained on English-centric corpora, tend to excessively fragment tokens in non-English languages. This issue is especially pronounced in non-Roman alphabetic languages, which are often divided at a character or even Unicode level, leading to slower text generation. To address this, our study introduces a novel framework designed to expedite text generation in these languages. This framework predicts larger linguistic units than those of conventional multilingual tokenizers and is specifically tailored to the target language, thereby reducing the number of decoding steps required. Our empirical results demonstrate that the proposed framework increases the generation speed by a factor of 1.9 compared to standard decoding while maintaining the performance of a pre-trained multilingual model on monolingual tasks.

Monte Carlo · Performer · UniFormer · Learning · Statistic ·

January 19, 2024

Learning a Prior for Monte Carlo Search by Replaying Solutions to Combinatorial Problems

Tristan Cazenave

Monte Carlo Search gives excellent results in multiple difficult combinatorial problems. Using a prior to perform non-uniform playouts during the search greatly improves the results compared to uniform playouts. Handmade heuristics tailored to the combinatorial problem are often used as priors. We propose a method to automatically compute a prior. It uses statistics on solved problems. It is a simple and general method that incurs no computational cost at playout time and that brings large performance gains. The method is applied to three difficult combinatorial problems: Latin Square Completion, Kakuro, and Inverse RNA Folding.
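As a rough illustration of what "a prior computed from statistics on solved problems" could look like, the sketch below counts how often each move appears in a set of solutions and uses the log-frequency as a bias for softmax-style playout sampling. The abstract does not say which statistic the paper uses, so the move-frequency rule, the names `learn_prior` and `sample_move`, the `default_bias` value, and the toy data are all assumptions made for this example.

```python
import math
import random
from collections import Counter


def learn_prior(solutions):
    """Map each move to a log-frequency bias computed from solved instances.

    Assumption: a solution is simply a list of hashable moves.
    """
    counts = Counter(move for solution in solutions for move in solution)
    total = sum(counts.values())
    return {move: math.log(count / total) for move, count in counts.items()}


def sample_move(legal_moves, prior, rng, default_bias=-10.0):
    """Pick one legal move with probability proportional to exp(bias).

    The prior is a precomputed table, so a playout only pays for a lookup.
    Moves never seen in a solution get the (hypothetical) default bias.
    """
    weights = [math.exp(prior.get(move, default_bias)) for move in legal_moves]
    return rng.choices(legal_moves, weights=weights, k=1)[0]


# Toy data: three "solved problems" over the moves a, b, c.
solutions = [["a", "b", "a"], ["a", "c"], ["b", "a"]]
prior = learn_prior(solutions)

rng = random.Random(0)
picks = Counter(sample_move(["a", "b", "c"], prior, rng) for _ in range(1000))
```

In this toy setting, move `a` appears in four of the seven recorded moves, so non-uniform playouts choose it more often than `b` or `c`. A uniform playout would correspond to giving every move the same bias.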

Performer · MoDELS · Stream · Speech recognition · LDA ·

January 17, 2024

Efficient Adapter Finetuning for Tail Languages in Streaming Multilingual ASR

Junwen Bai,Bo Li,Qiujia Li,Tara N. Sainath,Trevor Strohman

from arxiv, Accepted to ICASSP 2024

The end-to-end ASR model is often desired in the streaming multilingual scenario since it is easier to deploy and can benefit from pre-trained speech models such as powerful foundation models. Meanwhile, the heterogeneous nature and imbalanced data abundance of different languages may cause performance degradation, leading to asynchronous peak performance for different languages during training, especially on tail ones. Sometimes even the data itself may become unavailable as a result of the enhanced privacy protection. Existing work tends to significantly increase the model size or learn language-specific decoders to accommodate each language separately. In this study, we explore simple yet effective Language-Dependent Adapter (LDA) finetuning under a cascaded Conformer transducer framework enhanced by teacher pseudo-labeling for tail languages in the streaming multilingual ASR. The adapter only accounts for 0.4% of the full model per language. It is plugged into the frozen foundation model and is the only trainable module during the finetuning process with noisy student training. The final model merges the adapter parameters from different checkpoints for different languages. The model performance is validated on a challenging multilingual dictation dataset, which includes 39 tail languages across Latin, Greek, Arabic, etc. Our proposed method brings 12.2% word error rate reduction on average and up to 37.5% on a single locale. Furthermore, we show that our parameter-efficient LDA can match the quality of the full model finetuning, thus greatly alleviating the asynchronous peak performance issue.

Minimax · Robustness · Heteroscedasticity · Correlation coefficient · Design ·

January 16, 2024

A Note on Minimax Robustness of Designs Against Correlated or Heteroscedastic Responses

Douglas P. Wiens

We present a result according to which certain functions of covariance matrices are maximized at scalar multiples of the identity matrix. This is used to show that experimental designs that are optimal under an assumption of independent, homoscedastic responses can be minimax robust, in broad classes of alternate covariance structures. In particular it can justify the common practice of disregarding possible dependence, or heteroscedasticity, at the design stage of an experiment.

Optimizer · Design · Operation · Integration · prototype ·

January 16, 2024

Battery-Swapping Multi-Agent System for Sustained Operation of Large Planetary Fleets

Ethan Holand,Jarrod Homer,Alex Storrer,Musheeera Khandeker,Ethan F. Muhlon,Maulik Patel,Ben-oni Vainqueur,David Antaki,Naomi Cooke,Chloe Wilson,Bahram Shafai,Nathaniel Hanson,Taşkın Padır

from arxiv, 15 pages, 12 figures. To be published in IEEE Aerospace Conference 2024

We propose a novel, heterogeneous multi-agent architecture that miniaturizes rovers by outsourcing power generation to a central hub. By delegating power generation and distribution functions to this hub, the size, weight, power, and cost (SWAP-C) per rover are reduced, enabling efficient fleet scaling. As these rovers conduct mission tasks around the terrain, the hub charges an array of replacement battery modules. When a rover requires charging, it returns to the hub to initiate an autonomous docking sequence and exits with a fully charged battery. This confers an advantage over direct charging methods, such as wireless or wired charging, by replenishing a rover in minutes as opposed to hours, increasing net rover uptime. This work shares an open-source platform developed to demonstrate battery swapping on unknown field terrain. We detail our design methodologies utilized for increasing system reliability, with a focus on optimization, robust mechanical design, and verification. Optimization of the system is discussed, including the design of passive guide rails through simulation-based optimization methods which increase the valid docking configuration space by 258%. The full system was evaluated during integrated testing, where an average servicing time of 98 seconds was achieved on surfaces with a gradient up to 10°. We conclude by briefly proposing flight considerations for advancing the system toward a space-ready design. In sum, this prototype represents a proof of concept for autonomous docking and battery transfer on field terrain, advancing its Technology Readiness Level (TRL) from 1 to 3.

Graph · Networking · Learned · Performer · Deep learning ·

October 9, 2020

Temporal Graph Networks for Deep Learning on Dynamic Graphs

Emanuele Rossi,Ben Chamberlain,Fabrizio Frasca,Davide Eynard,Federico Monti,Michael Bronstein

Graph Neural Networks (GNNs) have recently become increasingly popular due to their ability to learn complex systems of relations or interactions arising in a broad spectrum of problems ranging from biology and particle physics to social networks and recommendation systems. Despite the plethora of different models for deep learning on graphs, few approaches have been proposed thus far for dealing with graphs that present some sort of dynamic nature (e.g. evolving features or connectivity over time). In this paper, we present Temporal Graph Networks (TGNs), a generic, efficient framework for deep learning on dynamic graphs represented as sequences of timed events. Thanks to a novel combination of memory modules and graph-based operators, TGNs are able to significantly outperform previous approaches while also being more computationally efficient. We furthermore show that several previous models for learning on dynamic graphs can be cast as specific instances of our framework. We perform a detailed ablation study of different components of our framework and devise the best configuration that achieves state-of-the-art performance on several transductive and inductive prediction tasks for dynamic graphs.

Named entity recognition · entity · Learned · Deep learning · Recognizable ·

March 13, 2020

A Survey on Deep Learning for Named Entity Recognition

Jing Li,Aixin Sun,Jianglei Han,Chenliang Li

from arxiv, 20 pages, 12 figures, 3 tables. arXiv admin note: text overlap with arXiv:1702.02098, arXiv:1904.10503 by other authors

Named entity recognition (NER) is the task of identifying text spans that mention named entities and classifying them into predefined categories such as person, location, organization, etc. NER serves as the basis for a variety of natural language applications such as question answering, text summarization, and machine translation. Although early NER systems are successful in producing decent recognition accuracy, they often require much human effort in carefully designing rules or features. In recent years, deep learning, empowered by continuous real-valued vector representations and semantic composition through nonlinear processing, has been employed in NER systems, yielding state-of-the-art performance. In this paper, we provide a comprehensive review on existing deep learning techniques for NER. We first introduce NER resources, including tagged NER corpora and off-the-shelf NER tools. Then, we systematically categorize existing works based on a taxonomy along three axes: distributed representations for input, context encoder, and tag decoder. Next, we survey the most representative methods for recent applied techniques of deep learning in new NER problem settings and applications. Finally, we present readers with the challenges faced by NER systems and outline future directions in this area.

Visual question answering · Question answering · MoDELS · Recognizable · Attention mechanism ·

February 15, 2018

Learning to Count Objects in Natural Images for Visual Question Answering

Yan Zhang,Jonathon Hare,Adam Prügel-Bennett

from arxiv, Published in ICLR 2018

Visual Question Answering (VQA) models have struggled with counting objects in natural images so far. We identify a fundamental problem due to soft attention in these models as a cause. To circumvent this problem, we propose a neural network component that allows robust counting from object proposals. Experiments on a toy task show the effectiveness of this component and we obtain state-of-the-art accuracy on the number category of the VQA v2 dataset without negatively affecting other categories, even outperforming ensemble models with our single model. On a difficult balanced pair metric, the component gives a substantial improvement in counting over a strong baseline by 6.6%.
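One common reading of the soft-attention problem mentioned above is that attention weights are normalized to sum to one, so the attended feature is a weighted average and the number of matching objects is lost. The sketch below illustrates that reading with a toy feature vector; the `softmax` and `attend` helpers, the two-dimensional `cat` feature, and the scores are assumptions made for this example and are not taken from the paper.

```python
import math


def softmax(scores):
    """Normalize raw attention scores into weights that sum to one."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features, scores):
    """Return the attention-weighted average of the object features."""
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]


# Toy object feature: the same "cat" vector for every detected cat.
cat = [1.0, 0.0]

one_cat = attend([cat], [2.0])             # image with a single cat
two_cats = attend([cat, cat], [2.0, 2.0])  # image with two identical cats
```

Both calls produce the same attended vector, so a model that only sees this output cannot tell one cat from two. This is the kind of information loss that motivates a separate counting component operating on object proposals.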

Modeling · Recognizable · Better · Object detection · state-of-the-art ·

January 10, 2018

From Superpixel to Human Shape Modelling for Carried Object Detection

Farnoosh Ghadiri,Robert Bergevin,Guillaume-Alexandre Bilodeau

Detecting carried objects is one of the requirements for developing systems to reason about activities involving people and objects. We present an approach to detect carried objects from a single video frame with a novel method that incorporates features from multiple scales. Initially, a foreground mask in a video frame is segmented into multi-scale superpixels. Then the human-like regions in the segmented area are identified by matching a set of extracted features from superpixels against learned features in a codebook. A carried object probability map is generated using the complement of the matching probabilities of superpixels to human-like regions and background information. A group of superpixels with high carried object probability and strong edge support is then merged to obtain the shape of the carried object. We applied our method to two challenging datasets, and results show that our method is competitive with or better than the state-of-the-art.


Related topics

Sequence-to-sequence learning

Processing (programming language)

北京阿比特科技有限公司

Registered address: Room 301-191, 3rd Floor, Building 2, No. 18 Yangfangdian Road, Haidian District, Beijing
