
Performer · Modality · Reducible · Extensibility · Class ·

December 6, 2023

ShareCMP: Polarization-Aware RGB-P Semantic Segmentation

Zhuoyan Liu, Bo Wang, Lizhi Wang, Chenyu Mao, Ye Li

from arxiv, 10 pages, 5 figures

Multimodal semantic segmentation is developing rapidly, but the RGB-Polarization (RGB-P) modality remains underexplored. To address this gap, we construct UPLight, an RGB-P segmentation benchmark with 12 typical underwater semantic classes, which provides data support for Autonomous Underwater Vehicles (AUVs) performing special perception tasks. We design ShareCMP, an RGB-P semantic segmentation framework with a shared dual-branch architecture, which reduces the number of parameters by about 26-33% compared to previous dual-branch models. It includes a Polarization Generate Attention (PGA) module designed to generate polarization modal images with richer polarization properties for the encoder. In addition, we introduce the Class Polarization-Aware Loss (CPALoss) to improve the encoder's learning and understanding of polarization modal information and to optimize the PGA module. In extensive experiments on three RGB-P benchmarks, ShareCMP achieves state-of-the-art mIoU with fewer parameters on the UPLight (92.45%), ZJU (92.7%), and MCubeS (50.99%) datasets. The code is available at https://github.com/LEFTeyex/ShareCMP.
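The mIoU figures quoted above refer to the standard mean intersection-over-union metric. As a minimal sketch of how that metric is usually computed (not the authors' evaluation code; the function name `mean_iou` and the flat label lists are assumptions made for illustration), per-class IoU is averaged over the classes that actually occur:

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over flat lists of class labels.

    pred and target are equal-length sequences of integer class ids.
    Classes absent from both pred and target are skipped.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with two classes and four pixels (hypothetical data).
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
print(f"mIoU = {score:.4f}")
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.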

Related content

Performer

Tolerance · Blockchain · Reducible · CASES · Round ·

January 29, 2024

VBFT: Veloce Byzantine Fault Tolerant Consensus for Blockchains

Mohammad M. Jalalzai, Chen Feng, Victoria Lemieux

from arxiv, arXiv admin note: substantial text overlap with arXiv:2109.14604

Low latency is one of the most desirable features of partially synchronous Byzantine consensus protocols. Existing low-latency protocols have achieved consensus with just two communication steps by reducing the maximum number of faults the protocol can tolerate (from $f = \frac{n-1}{3}$ to $f = \frac{n+1}{5}$), by relaxing protocol safety guarantees, or by using trusted hardware such as a Trusted Execution Environment. Furthermore, these two-step protocols do not support rotating leaders and low-cost view change (leader replacement), which are important features for many blockchain use cases. In this paper, we propose a protocol called VBFT, which achieves consensus in just two communication steps without sacrificing desirable features. In particular, VBFT tolerates $f = \frac{n-1}{3}$ faults (which is the best possible), guarantees strong safety for honest leaders, and requires no trusted hardware. Moreover, VBFT supports leader rotation and low-cost view change, thereby improving on prior art along multiple axes.
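To make the two fault bounds in the abstract concrete, the following sketch evaluates them for a few cluster sizes. It only applies the formulas quoted above, rounded down to whole nodes; the function names are hypothetical and nothing here models the VBFT protocol itself.

```python
def max_faults_optimal(n):
    """Largest integer f with f <= (n - 1) / 3, the bound VBFT tolerates."""
    return (n - 1) // 3


def max_faults_reduced(n):
    """Largest integer f with f <= (n + 1) / 5, the reduced bound
    attributed to earlier two-step protocols in the abstract."""
    return (n + 1) // 5


for n in (4, 10, 100):
    print(n, max_faults_optimal(n), max_faults_reduced(n))
```

For 100 nodes, the optimal bound allows 33 faulty nodes, while the reduced bound allows only 20.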

Vision · Performance · Backbone · MoDELS · Inference ·

January 26, 2024

ViR: Towards Efficient Vision Retention Backbones

Ali Hatamizadeh, Michael Ranzinger, Shiyi Lan, Jose M. Alvarez, Sanja Fidler, Jan Kautz

from arxiv, Introduction of Vision Retention Networks (ViR) for Efficient Visual Modeling

Vision Transformers (ViTs) have gained considerable popularity in recent years due to their exceptional capability to model long-range spatial dependencies and their scalability for large-scale training. Although the training parallelism of the self-attention mechanism plays an important role in retaining great performance, its quadratic complexity hinders the application of ViTs in many scenarios that demand fast inference. This effect is even more pronounced in applications that require autoregressive modeling of input features. In Natural Language Processing (NLP), a new stream of efforts has proposed parallelizable models with a recurrent formulation that allows for efficient inference in generative applications. Inspired by this trend, we propose a new class of computer vision models, dubbed Vision Retention Networks (ViR), with dual parallel and recurrent formulations, which strike an optimal balance between fast inference and parallel training with competitive performance. In particular, ViR scales favorably for image throughput and memory consumption in tasks that require higher-resolution images, due to its flexible formulation in processing large sequence lengths. ViR is the first attempt to realize dual parallel and recurrent equivalency in a general vision backbone for recognition tasks. We have validated the effectiveness of ViR through extensive experiments with different dataset sizes and various image resolutions and achieved competitive performance. Code: https://github.com/NVlabs/ViR

Normalized · Discretization · Cascade · AIM · Encoder-decoder (model) ·

January 26, 2024

UNIT-DSR: Dysarthric Speech Reconstruction System Using Speech Unit Normalization

Yuejiao Wang, Xixin Wu, Disong Wang, Lingwei Meng, Helen Meng

from arxiv, Accepted to ICASSP 2024

Dysarthric speech reconstruction (DSR) systems aim to automatically convert dysarthric speech into normal-sounding speech. The technology eases communication with speakers affected by the neuromotor disorder and enhances their social inclusion. NED-based (Neural Encoder-Decoder) systems have significantly improved the intelligibility of the reconstructed speech as compared with GAN-based (Generative Adversarial Network) approaches, but the approach is still limited by training inefficiency caused by the cascaded pipeline and auxiliary tasks of the content encoder, which may in turn affect the quality of reconstruction. Inspired by self-supervised speech representation learning and discrete speech units, we propose a Unit-DSR system, which harnesses the powerful domain-adaptation capacity of HuBERT for training efficiency improvement and utilizes speech units to constrain the dysarthric content restoration in a discrete linguistic space. Compared with NED approaches, the Unit-DSR system only consists of a speech unit normalizer and a Unit HiFi-GAN vocoder, which is considerably simpler without cascaded sub-modules or auxiliary tasks. Results on the UASpeech corpus indicate that Unit-DSR outperforms competitive baselines in terms of content restoration, reaching a 28.2% relative average word error rate reduction when compared to original dysarthric speech, and shows robustness against speed perturbation and noise.
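The "relative average word error rate reduction" quoted above is a ratio rather than a difference in percentage points. A minimal sketch of the arithmetic follows; the function name and the two WER values are hypothetical and chosen only so that the result matches the 28.2% figure, since the abstract does not state the underlying absolute WERs.

```python
def relative_wer_reduction(baseline_wer, system_wer):
    """Relative WER reduction: (baseline - system) / baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - system_wer) / baseline_wer


# Hypothetical values, NOT taken from the paper: a drop from 50.0% to 35.9% WER.
reduction = relative_wer_reduction(50.0, 35.9)
print(f"relative reduction = {reduction:.1%}")
```

With these illustrative numbers, an absolute drop of 14.1 points corresponds to a 28.2% relative reduction.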

Graph · GPU · Scalar · Networking · Neural Networks ·

January 26, 2024

GOAt: Explaining Graph Neural Networks via Graph Output Attribution

Shengyao Lu, Keith G. Mills, Jiao He, Bang Liu, Di Niu

from arxiv, ICLR 2024 Poster

Understanding the decision-making process of Graph Neural Networks (GNNs) is crucial to their interpretability. Most existing methods for explaining GNNs rely on training auxiliary models, so the explanations themselves remain black boxes. This paper introduces Graph Output Attribution (GOAt), a novel method to attribute graph outputs to input graph features, creating GNN explanations that are faithful, discriminative, and stable across similar samples. By expanding the GNN as a sum of scalar products involving node features, edge features, and activation patterns, we propose an efficient analytical method to compute the contribution of each node or edge feature to each scalar product, and we aggregate the contributions from all scalar products in the expansion to derive the importance of each node and edge. Through extensive experiments on synthetic and real-world data, we show that our method not only outperforms various state-of-the-art GNN explainers on the commonly used fidelity metric, but also exhibits stronger discriminability and stability by a remarkable margin.

Networking · Elevate · INFORMS · Performer · Integration ·

January 25, 2024

POUR-Net: A Population-Prior-Aided Over-Under-Representation Network for Low-Count PET Attenuation Map Generation

Bo Zhou, Jun Hou, Tianqi Chen, Yinchi Zhou, Xiongchao Chen, Huidong Xie, Qiong Liu, Xueqi Guo, Yu-Jung Tsai, Vladimir Y. Panin, Takuya Toyonaga, James S. Duncan, Chi Liu

from arxiv, 10 pages, 5 figures

Low-dose PET offers a valuable means of minimizing radiation exposure in PET imaging. However, the prevalent practice of employing additional CT scans to generate attenuation maps ($\mu$-maps) for PET attenuation correction significantly elevates radiation doses. To address this concern and further mitigate radiation exposure in low-dose PET exams, we propose POUR-Net, an innovative population-prior-aided over-under-representation network that aims for high-quality attenuation map generation from low-dose PET. First, POUR-Net incorporates an over-under-representation network (OUR-Net) to facilitate efficient feature extraction, encompassing both low-resolution abstracted and fine-detail features, for assisting deep generation at the full-resolution level. Second, complementing OUR-Net, a population prior generation machine (PPGM) utilizing a comprehensive CT-derived $\mu$-map dataset provides additional prior information to aid OUR-Net generation. The integration of OUR-Net and PPGM within a cascade framework enables iterative refinement of $\mu$-map generation, resulting in high-quality $\mu$-maps. Experimental results underscore the effectiveness of POUR-Net, showing it to be a promising solution for accurate CT-free low-count PET attenuation correction that also surpasses the performance of previous baseline methods.

Performer · Unsupervised · Periodic · Reducible · Continuity ·

January 25, 2024

McUDI: Model-Centric Unsupervised Degradation Indicator for Failure Prediction AIOps Solutions

Lorena Poenaru-Olaru, Luis Cruz, Jan Rellermeyer, Arie van Deursen

Due to the continuous change in operational data, AIOps solutions suffer from performance degradation over time. Although periodic retraining is the state-of-the-art technique for preserving the performance of failure prediction AIOps models over time, this technique requires a considerable amount of labeled data to retrain. In AIOps, obtaining labeled data is expensive since it requires domain experts to annotate it intensively. In this paper, we present McUDI, a model-centric unsupervised degradation indicator that is capable of detecting the exact moment the AIOps model requires retraining as a result of changes in data. We further show how employing McUDI in the maintenance pipeline of AIOps solutions can reduce the number of samples that require annotation by 30k for job failure prediction and 260k for disk failure prediction, while achieving performance similar to periodic retraining.

Multimodal · Mask · Masked language modeling · Extensibility · MoDELS ·

September 25, 2019

UNITER: Learning UNiversal Image-TExt Representations

Yen-Chun Chen, Linjie Li, Licheng Yu, Ahmed El Kholy, Faisal Ahmed, Zhe Gan, Yu Cheng, Jingjing Liu

Joint image-text embedding is the bedrock for most Vision-and-Language (V+L) tasks, where multimodal inputs are jointly processed for visual and textual understanding. In this paper, we introduce UNITER, a UNiversal Image-TExt Representation, learned through large-scale pre-training over four image-text datasets (COCO, Visual Genome, Conceptual Captions, and SBU Captions), which can power heterogeneous downstream V+L tasks with joint multimodal embeddings. We design three pre-training tasks: Masked Language Modeling (MLM), Image-Text Matching (ITM), and Masked Region Modeling (MRM, with three variants). Unlike concurrent work on multimodal pre-training that applies joint random masking to both modalities, we use conditioned masking on pre-training tasks (i.e., masked language/region modeling is conditioned on full observation of image/text). Comprehensive analysis shows that conditioned masking yields better performance than unconditioned masking. We also conduct a thorough ablation study to find an optimal setting for the combination of pre-training tasks. Extensive experiments show that UNITER achieves a new state of the art across six V+L tasks (over nine datasets), including Visual Question Answering, Image-Text Retrieval, Referring Expression Comprehension, Visual Commonsense Reasoning, Visual Entailment, and NLVR2.
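The idea of conditioned masking described above can be illustrated with a small sketch: in each training step only one modality is masked, so the other is always fully observed. This is a simplification and not the UNITER implementation; the function name, the string placeholder used for masked image regions, and the 15% default masking probability are all assumptions made for illustration.

```python
import random

MASK = "[MASK]"  # placeholder token; hypothetical stand-in for both modalities


def conditioned_mask(text_tokens, image_regions, mask_text, prob=0.15, rng=None):
    """Mask exactly one modality and leave the other fully observed.

    mask_text=True masks text tokens only (masked language modeling);
    mask_text=False masks image regions only (masked region modeling).
    """
    rng = rng or random.Random()

    def mask(seq):
        # Replace each element independently with probability `prob`.
        return [MASK if rng.random() < prob else item for item in seq]

    if mask_text:
        return mask(text_tokens), list(image_regions)
    return list(text_tokens), mask(image_regions)


rng = random.Random(0)
text, regions = conditioned_mask(
    ["a", "dog", "on", "grass"], ["r1", "r2", "r3"], mask_text=True, rng=rng
)
# Image regions are always returned unchanged when mask_text=True.
print(text, regions)
```

Joint random masking, by contrast, would apply `mask` to both sequences in the same step, so a masked word could coincide with a masked region that it describes.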

Learned · Deep learning · Identifiable · MoDELS · Object tracking ·

July 31, 2019

Deep Learning in Video Multi-Object Tracking: A Survey

Gioele Ciaparrone, Francisco Luque Sánchez, Siham Tabik, Luigi Troiano, Roberto Tagliaferri, Francisco Herrera

from arxiv, New in v2: corrected typos and various minor mistakes. Submitted to Neurocomputing. Main text: 25 pages, 5 figures, 6 tables. Summary table in appendix at the end of the paper

The problem of Multiple Object Tracking (MOT) consists in following the trajectory of different objects in a sequence, usually a video. In recent years, with the rise of Deep Learning, the algorithms that provide a solution to this problem have benefited from the representational power of deep models. This paper provides a comprehensive survey on works that employ Deep Learning models to solve the task of MOT on single-camera videos. Four main steps in MOT algorithms are identified, and an in-depth review of how Deep Learning was employed in each one of these stages is presented. A complete experimental comparison of the presented works on the three MOTChallenge datasets is also provided, identifying a number of similarities among the top-performing methods and presenting some possible future research directions.

entity · Named entity recognition · Networking · Convolution · Extensibility ·

April 3, 2019

CAN-NER: Convolutional Attention Network for Chinese Named Entity Recognition

Yuying Zhu, Guoxin Wang, Börje F. Karlsson

Named entity recognition (NER) in Chinese is essential but difficult because of the lack of natural delimiters. Therefore, Chinese Word Segmentation (CWS) is usually considered the first step for Chinese NER. However, models based on word-level embeddings and lexicon features often suffer from segmentation errors and out-of-vocabulary (OOV) words. In this paper, we investigate a Convolutional Attention Network called CAN for Chinese NER, which consists of a character-based convolutional neural network (CNN) with a local-attention layer and a gated recurrent unit (GRU) with a global self-attention layer to capture information from adjacent characters and sentence contexts. Also, compared to other models, our model does not depend on external resources such as lexicons and uses small character embeddings, which makes it more practical. Extensive experimental results show that our approach outperforms state-of-the-art methods without word embeddings or external lexicon resources on datasets from different domains, including the Weibo, MSRA, and Chinese Resume NER datasets.

Compositional GAN · INTERACT · MoDELS · Learned · entity ·

July 19, 2018

Compositional GAN: Learning Conditional Image Composition

Samaneh Azadi, Deepak Pathak, Sayna Ebrahimi, Trevor Darrell

Generative Adversarial Networks (GANs) can produce images of surprising complexity and realism, but are generally modeled to sample from a single latent source ignoring the explicit spatial interaction between multiple entities that could be present in a scene. Capturing such complex interactions between different objects in the world, including their relative scaling, spatial layout, occlusion, or viewpoint transformation is a challenging problem. In this work, we propose to model object composition in a GAN framework as a self-consistent composition-decomposition network. Our model is conditioned on the object images from their marginal distributions to generate a realistic image from their joint distribution by explicitly learning the possible interactions. We evaluate our model through qualitative experiments and user evaluations in both the scenarios when either paired or unpaired examples for the individual object images and the joint scenes are given during training. Our results reveal that the learned model captures potential interactions between the two object domains given as input to output new instances of composed scene at test time in a reasonable fashion.


