Glaucoma is an ophthalmic disease that may cause blindness, for which early detection and treatment are crucial. Fundus images and optical coherence tomography (OCT) images are both widely used modalities for diagnosing glaucoma. However, existing glaucoma grading approaches mainly utilize a single modality, ignoring the complementary information between fundus and OCT. In this paper, we propose an efficient multi-modality supervised contrastive learning framework, named COROLLA, for glaucoma grading. Through layer segmentation as well as thickness calculation and projection, retinal thickness maps are extracted from the original OCT volumes and used as a replacement modality, resulting in more efficient computation with less memory usage. Given the high structural and distributional similarities across medical image samples, we employ supervised contrastive learning to increase our models' discriminative power and improve convergence. Moreover, feature-level fusion of paired fundus images and thickness maps is conducted for enhanced diagnosis accuracy. On the GAMMA dataset, our COROLLA framework achieves superior glaucoma grading performance compared to state-of-the-art methods.
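A minimal PyTorch sketch of the two ingredients this abstract names, feature-level fusion of a fundus image and a thickness map followed by a supervised contrastive loss, might look as follows. All module names, dimensions, and the concatenation-based fusion are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionEncoder(nn.Module):
    """Encode fundus image and thickness map separately, then fuse at the feature level."""
    def __init__(self, fundus_backbone, thickness_backbone, feat_dim=512, proj_dim=128):
        super().__init__()
        self.fundus_backbone = fundus_backbone        # e.g. a CNN trunk returning feat_dim features
        self.thickness_backbone = thickness_backbone  # lighter CNN for the 2D thickness map
        self.proj = nn.Sequential(                    # projection head for contrastive learning
            nn.Linear(2 * feat_dim, feat_dim), nn.ReLU(), nn.Linear(feat_dim, proj_dim))
        self.classifier = nn.Linear(2 * feat_dim, 3)  # 3 glaucoma grades (illustrative)

    def forward(self, fundus, thickness):
        f = self.fundus_backbone(fundus)
        t = self.thickness_backbone(thickness)
        fused = torch.cat([f, t], dim=1)              # feature-level fusion by concatenation
        return self.classifier(fused), F.normalize(self.proj(fused), dim=1)

def supervised_contrastive_loss(z, labels, temperature=0.07):
    """SupCon-style loss: embeddings sharing a grade label are pulled together."""
    sim = z @ z.t() / temperature
    sim = sim - sim.max(dim=1, keepdim=True).values.detach()   # numerical stability
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask
    exp_sim = torch.exp(sim).masked_fill(self_mask, 0.0)
    log_prob = sim - torch.log(exp_sim.sum(dim=1, keepdim=True) + 1e-12)
    mean_log_prob_pos = (pos_mask * log_prob).sum(1) / pos_mask.sum(1).clamp(min=1)
    return -mean_log_prob_pos.mean()
```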
Purpose: The optic nerve head (ONH) undergoes complex and deep 3D morphological changes during the development and progression of glaucoma. Optical coherence tomography (OCT) is the current gold standard to visualize and quantify these changes; however, the resulting 3D deep-tissue information has not yet been fully exploited for the diagnosis and prognosis of glaucoma. To this end, we aimed: (1) to compare the performance of two relatively recent geometric deep learning techniques in diagnosing glaucoma from a single OCT scan of the ONH; and (2) to identify the 3D structural features of the ONH that are critical for the diagnosis of glaucoma. Methods: In this study, we included a total of 2,247 non-glaucoma and 2,259 glaucoma scans from 1,725 subjects. All subjects had their ONHs imaged in 3D with Spectralis OCT. All OCT scans were automatically segmented using deep learning to identify the major neural and connective tissues. Each ONH was then represented as a 3D point cloud. We used PointNet and a dynamic graph convolutional neural network (DGCNN) to diagnose glaucoma from such 3D ONH point clouds and to identify the critical 3D structural features of the ONH for glaucoma diagnosis. Results: Both the DGCNN (AUC: 0.97$\pm$0.01) and PointNet (AUC: 0.95$\pm$0.02) were able to accurately detect glaucoma from 3D ONH point clouds. The critical points formed an hourglass pattern, with most of them located in the inferior and superior quadrants of the ONH. Discussion: The diagnostic accuracy of both geometric deep learning approaches was excellent. Moreover, we were able to identify the critical 3D structural features of the ONH for glaucoma diagnosis, which substantially improved the transparency and interpretability of our method. Consequently, our approach may have strong potential to be used in clinical applications for the diagnosis and prognosis of a wide range of ophthalmic disorders.
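As an illustration of how a point-cloud classifier of the kind described here operates, below is a minimal PointNet-style sketch in PyTorch: a shared per-point MLP followed by symmetric max pooling and a classification head. It omits the input/feature transforms of the full PointNet and is not the authors' trained model; all layer widths are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MiniPointNet(nn.Module):
    """Minimal PointNet-style classifier for (B, N, C) point clouds."""
    def __init__(self, in_channels=3, num_classes=2):
        super().__init__()
        # Shared per-point MLP implemented with 1x1 convolutions over the point dimension.
        self.point_mlp = nn.Sequential(
            nn.Conv1d(in_channels, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.BatchNorm1d(1024), nn.ReLU())
        self.head = nn.Sequential(
            nn.Linear(1024, 256), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(256, num_classes))

    def forward(self, points):             # points: (B, N, C), e.g. 3D ONH point coordinates
        x = points.transpose(1, 2)          # -> (B, C, N) for Conv1d
        x = self.point_mlp(x)               # per-point features (B, 1024, N)
        x = x.max(dim=2).values             # symmetric max pooling over points
        return self.head(x)                 # glaucoma vs. non-glaucoma logits

# Usage: logits = MiniPointNet()(torch.randn(8, 2048, 3))
```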
Color fundus photography and optical coherence tomography (OCT) are the two most cost-effective tools for glaucoma screening. Both modalities contain prominent biomarkers indicative of suspected glaucoma. Clinically, it is often recommended to perform both screenings for a more accurate and reliable diagnosis. However, although numerous algorithms have been proposed for computer-aided diagnosis based on fundus images or OCT volumes, few methods leverage both modalities for glaucoma assessment. Inspired by the success of the Retinal Fundus Glaucoma Challenge (REFUGE) we held previously, we set up the Glaucoma grAding from Multi-Modality imAges (GAMMA) Challenge to encourage the development of fundus \& OCT-based glaucoma grading. The primary task of the challenge is to grade glaucoma from both 2D fundus images and 3D OCT scanning volumes. As part of GAMMA, we have publicly released an annotated glaucoma dataset with both 2D color fundus photographs and 3D OCT volumes, which is the first multi-modality dataset for glaucoma grading. In addition, an evaluation framework has been established to assess the performance of the submitted methods. During the challenge, 1272 results were submitted, and the top 10 teams were selected for the final stage. We analyze their results and summarize their methods in this paper. Since all of these teams submitted their source code during the challenge, a detailed ablation study is also conducted to verify the effectiveness of the particular modules they proposed. We find that many of the proposed techniques are practical for the clinical diagnosis of glaucoma. As the first in-depth study of fundus \& OCT multi-modality glaucoma grading, we believe the GAMMA Challenge will be an essential starting point for future research.
We propose the AdaPtive Noise Augmentation (PANDA) procedure to regularize the estimation and inference of generalized linear models (GLMs). PANDA iteratively optimizes the objective function given noise-augmented data until convergence to obtain the regularized model estimates. The augmented noises are designed to achieve various regularization effects, including $l_0$, bridge (lasso and ridge included), elastic net, adaptive lasso, and SCAD, as well as group lasso and fused ridge. We examine the tail bound of the noise-augmented loss function and establish the almost sure convergence of the noise-augmented loss function and its minimizer to the expected penalized loss function and its minimizer, respectively. We derive the asymptotic distributions of the regularized parameters, based on which inferences can be obtained simultaneously with variable selection. PANDA exhibits ensemble learning behaviors that help further decrease the generalization error. Computationally, PANDA is easy to code, leveraging existing software for fitting GLMs without resorting to complicated optimization techniques. We demonstrate that PANDA performs as well as or better than existing approaches with the same types of regularizers on simulated and real-life data. We show that the inferences obtained through PANDA achieve nominal or near-nominal coverage and are far more efficient than a popular existing post-selection procedure.
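To make the augmentation-as-regularization idea concrete, the small numpy sketch below shows the classical special case in which ordinary least squares on data augmented with fixed pseudo-observations reproduces ridge regression. PANDA's adaptive, iteratively re-drawn noise generalizes this principle to other penalties and to GLMs; the code is only an illustration of the underlying mechanism, not the authors' procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, lam = 200, 10, 5.0
X = rng.normal(size=(n, p))
beta_true = np.concatenate([np.ones(3), np.zeros(p - 3)])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

# Closed-form ridge estimate for reference.
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# The same estimate obtained by ordinary least squares on augmented data:
# append p pseudo-rows sqrt(lam) * I with pseudo-responses 0, so the extra
# squared-error terms sum to lam * ||beta||^2.
X_aug = np.vstack([X, np.sqrt(lam) * np.eye(p)])
y_aug = np.concatenate([y, np.zeros(p)])
beta_aug, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)

print(np.allclose(beta_ridge, beta_aug))  # True: the augmentation induces the l2 penalty
```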
Due to the high data demands of current methods, attention to zero-shot cross-lingual spoken language understanding (SLU) has grown, as such approaches greatly reduce human annotation effort. However, existing models rely solely on shared parameters, which can only perform implicit alignment across languages. We present the Global--Local Contrastive Learning Framework (GL-CLeF) to address this shortcoming. Specifically, we employ contrastive learning, leveraging bilingual dictionaries to construct multilingual views of the same utterance, and then encourage their representations to be more similar than those of negative example pairs, which explicitly aligns the representations of similar sentences across languages. In addition, a key component of GL-CLeF is the proposed Local and Global module, which achieves fine-grained cross-lingual transfer (i.e., sentence-level Local intent transfer, token-level Local slot transfer, and semantic-level Global transfer across intent and slot). Experiments on MultiATIS++ show that GL-CLeF achieves the best performance and successfully pulls the representations of similar sentences across languages closer.
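A minimal sketch of the two core ideas, dictionary-based code switching to build a multilingual view of an utterance and a contrastive loss that pulls the two views together, is given below. It assumes a hypothetical sentence encoder, is not the GL-CLeF implementation, and omits the separate Local intent/slot and Global terms.

```python
import random
import torch
import torch.nn.functional as F

def code_switch(tokens, bilingual_dict, ratio=0.3):
    """Replace a fraction of tokens with translations from a bilingual dictionary."""
    switched = []
    for tok in tokens:
        if tok in bilingual_dict and random.random() < ratio:
            switched.append(random.choice(bilingual_dict[tok]))
        else:
            switched.append(tok)
    return switched

def contrastive_loss(anchor, positive, temperature=0.1):
    """InfoNCE over a batch: anchor[i] should match positive[i]; all others are negatives."""
    anchor = F.normalize(anchor, dim=1)
    positive = F.normalize(positive, dim=1)
    logits = anchor @ positive.t() / temperature        # (B, B) similarity matrix
    targets = torch.arange(len(anchor), device=anchor.device)
    return F.cross_entropy(logits, targets)

# Usage with a hypothetical dictionary and sentence encoder:
# view = code_switch(["book", "a", "flight"], {"flight": ["vuelo"], "book": ["reservar"]})
# loss = contrastive_loss(encoder(original_batch), encoder(switched_batch))
```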
The self-supervised Masked Image Modeling (MIM) schema, which follows a "mask-and-reconstruct" pipeline of recovering content from a masked image, has recently captured increasing interest in the multimedia community, owing to its excellent ability to learn visual representations from unlabeled data. Aiming to learn representations with high-level semantics, one group of works attempts to reconstruct non-semantic pixels with a large-ratio masking strategy, which may suffer from an "over-smoothing" problem, while others directly infuse semantics into the targets in an offline way that requires extra data. Different from them, we shift the perspective to the Fourier domain, which naturally has a global view, and present a new MIM method, termed Geminated Gestalt Autoencoder (Ge$^2$-AE), for visual pre-training. Specifically, we equip our model with geminated decoders in charge of reconstructing image content from both pixel and frequency space, where each serves not only as a complement to but also as a reciprocal constraint on the other. In this way, more robust representations can be learned by the pre-trained encoder, whose effectiveness is confirmed by comparative experimental results on downstream recognition tasks. We also conduct several quantitative and qualitative experiments to investigate the learning behavior of our method. To the best of our knowledge, this is the first MIM work to approach visual pre-training through the lens of the frequency domain.
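The geminated-decoder idea can be illustrated with a short PyTorch sketch in which one reconstruction loss is computed in pixel space and a second on the 2D Fourier spectrum of the same target. The loss weights and the real/imaginary storage convention are illustrative assumptions, not the published model.

```python
import torch
import torch.nn.functional as F

def dual_domain_loss(pixel_pred, freq_pred, target, freq_weight=1.0):
    """Reconstruction loss in pixel space plus a loss on the target's 2D Fourier spectrum.

    pixel_pred, target: (B, C, H, W) images; freq_pred: predicted spectrum stored as
    (B, C, H, W, 2) real/imaginary pairs (an illustrative convention).
    """
    # Pixel-space reconstruction (as in standard MIM objectives).
    loss_pix = F.mse_loss(pixel_pred, target)

    # Frequency-space reconstruction: compare against the FFT of the target image.
    target_freq = torch.fft.fft2(target, norm="ortho")   # complex tensor
    target_freq = torch.view_as_real(target_freq)        # (B, C, H, W, 2)
    loss_freq = F.mse_loss(freq_pred, target_freq)

    return loss_pix + freq_weight * loss_freq
```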
Contrastive learning has led to substantial improvements in the quality of learned embedding representations for tasks such as image classification. However, a key drawback of existing contrastive augmentation methods is that they may modify the image content in ways that yield undesired alterations of its semantics, which can affect the performance of the model on downstream tasks. Hence, in this paper, we ask whether we can augment image data in contrastive learning such that the task-relevant semantic content of an image is preserved. For this purpose, we propose to leverage saliency-based explanation methods to create content-preserving masked augmentations for contrastive learning. Our novel explanation-driven supervised contrastive learning (ExCon) methodology serves the dual goals of encouraging nearby image embeddings to have similar content and explanation. To quantify the impact of ExCon, we conduct experiments on the CIFAR-100 and Tiny ImageNet datasets. We demonstrate that ExCon outperforms vanilla supervised contrastive learning in terms of classification, explanation quality, adversarial robustness, and probabilistic calibration in the context of distributional shift.
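A minimal sketch of a saliency-driven, content-preserving augmentation of the kind described above: pixels outside the most salient region are masked out so that the task-relevant content survives. The saliency map is assumed to come from some explanation method, and the exact masking rule is an illustrative assumption rather than the ExCon recipe.

```python
import torch

def saliency_masked_view(image, saliency, keep_ratio=0.5, fill_value=0.0):
    """Keep the top `keep_ratio` most salient pixels of an image and mask the rest.

    image:    (C, H, W) tensor
    saliency: (H, W) non-negative saliency/explanation map for this image
    """
    flat = saliency.flatten()
    k = max(1, int(keep_ratio * flat.numel()))
    threshold = flat.topk(k).values.min()              # saliency cut-off for the kept pixels
    mask = (saliency >= threshold).to(image.dtype)     # (H, W) binary keep-mask
    return image * mask.unsqueeze(0) + fill_value * (1 - mask.unsqueeze(0))

# Usage: the masked view and the original image form a positive pair for
# supervised contrastive training, together with other images of the same class.
```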
Steady-state visual evoked potential (SSVEP) recognition methods that learn from a subject's calibration data can achieve very high performance in SSVEP-based brain-computer interfaces (BCIs); however, their performance deteriorates drastically when the calibration trials are insufficient. This study develops a new method to learn from limited calibration data: it proposes and evaluates a novel adaptive data-driven spatial filtering approach for enhancing SSVEP detection. The spatial filter learned for each stimulus utilizes temporal information from the corresponding EEG trials. To introduce this temporal information into the overall procedure, a multitask learning approach based on a Bayesian framework is adopted. The performance of the proposed method was evaluated on two publicly available benchmark datasets, and the results demonstrate that our method outperforms competing methods by a significant margin.
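For readers unfamiliar with SSVEP detection, the sketch below shows the standard canonical correlation analysis (CCA) baseline on which data-driven spatial filtering methods build: each stimulus frequency is scored by the canonical correlation between the multichannel EEG trial and sinusoidal references at that frequency and its harmonics. This is the conventional calibration-free baseline, not the adaptive Bayesian multitask method proposed here.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def cca_ssvep_score(eeg, freq, fs, n_harmonics=3):
    """Canonical correlation between an EEG trial (n_samples, n_channels)
    and sine/cosine references at `freq` and its harmonics."""
    t = np.arange(eeg.shape[0]) / fs
    refs = np.column_stack(
        [f(2 * np.pi * h * freq * t)
         for h in range(1, n_harmonics + 1)
         for f in (np.sin, np.cos)])
    u, v = CCA(n_components=1).fit_transform(eeg, refs)
    return np.corrcoef(u[:, 0], v[:, 0])[0, 1]

def classify_trial(eeg, stimulus_freqs, fs):
    """Pick the stimulus frequency whose references correlate best with the trial."""
    scores = [cca_ssvep_score(eeg, f, fs) for f in stimulus_freqs]
    return stimulus_freqs[int(np.argmax(scores))]
```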
In this paper, we propose a multi-domain learning model for action recognition. The proposed method inserts domain-specific adapters between the domain-independent layers of a backbone network. Unlike a multi-head network that switches only the classification heads, our model switches both the heads and the adapters, facilitating the learning of feature representations that are universal across multiple domains. Unlike prior works, the proposed method is model-agnostic and does not assume a particular model structure. Experimental results on three popular action recognition datasets (HMDB51, UCF101, and Kinetics-400) demonstrate that the proposed method is more effective than a multi-head architecture and more efficient than training separate models for each domain.
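A compact sketch of the domain-specific adapter idea: a small bottleneck residual block is inserted after each shared (domain-independent) stage, and the adapter and classification head are switched according to the input domain. The 2D residual adapter design and layer shapes are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Small residual bottleneck adapter applied to (B, C, H, W) feature maps."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        hidden = max(1, channels // reduction)
        self.down = nn.Conv2d(channels, hidden, kernel_size=1)
        self.up = nn.Conv2d(hidden, channels, kernel_size=1)
        self.act = nn.ReLU()

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))     # residual correction to shared features

class MultiDomainNet(nn.Module):
    """Shared backbone stages with per-domain adapters and heads."""
    def __init__(self, stages, channels_per_stage, domains, num_classes_per_domain):
        super().__init__()
        self.stages = nn.ModuleList(stages)            # shared, domain-independent layers
        self.adapters = nn.ModuleDict({
            d: nn.ModuleList([Adapter(c) for c in channels_per_stage]) for d in domains})
        self.heads = nn.ModuleDict({
            d: nn.Linear(channels_per_stage[-1], n)
            for d, n in zip(domains, num_classes_per_domain)})

    def forward(self, x, domain):
        for stage, adapter in zip(self.stages, self.adapters[domain]):
            x = adapter(stage(x))                      # shared stage, then domain-specific adapter
        x = x.mean(dim=(2, 3))                         # global average pooling
        return self.heads[domain](x)
```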
Humans perceive the world by concurrently processing and fusing high-dimensional inputs from multiple modalities such as vision and audio. Machine perception models, in stark contrast, are typically modality-specific and optimised for unimodal benchmarks; hence, late-stage fusion of final representations or predictions from each modality (`late fusion') is still a dominant paradigm for multimodal video classification. Instead, we introduce a novel transformer-based architecture that uses `fusion bottlenecks' for modality fusion at multiple layers. Compared to traditional pairwise self-attention, our model forces information between different modalities to pass through a small number of bottleneck latents, requiring the model to collate and condense the most relevant information in each modality and share only what is necessary. We find that such a strategy improves fusion performance while at the same time reducing computational cost. We conduct thorough ablation studies and achieve state-of-the-art results on multiple audio-visual classification benchmarks, including AudioSet, Epic-Kitchens and VGGSound. All code and models will be released.
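The bottleneck-fusion mechanism can be sketched in a few lines of PyTorch: each modality attends only to its own tokens plus a small set of shared bottleneck tokens, and the updated bottlenecks (averaged across modalities) are the only channel through which information crosses modalities. Layer sizes and the use of nn.TransformerEncoderLayer are illustrative assumptions rather than the released model.

```python
import torch
import torch.nn as nn

class BottleneckFusionLayer(nn.Module):
    """One fusion layer: per-modality self-attention over [modality tokens || bottlenecks]."""
    def __init__(self, dim=256, heads=4, n_bottlenecks=4):
        super().__init__()
        self.audio_layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.video_layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.n = n_bottlenecks

    def forward(self, audio, video, bottleneck):
        # audio: (B, Ta, D), video: (B, Tv, D), bottleneck: (B, n, D)
        a = self.audio_layer(torch.cat([audio, bottleneck], dim=1))
        v = self.video_layer(torch.cat([video, bottleneck], dim=1))
        audio, b_audio = a[:, :-self.n], a[:, -self.n:]
        video, b_video = v[:, :-self.n], v[:, -self.n:]
        # Cross-modal information flows only through the averaged bottleneck tokens.
        return audio, video, (b_audio + b_video) / 2
```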
Deep learning has yielded state-of-the-art performance on many natural language processing tasks, including named entity recognition (NER). However, this typically requires large amounts of labeled data. In this work, we demonstrate that the amount of labeled training data can be drastically reduced when deep learning is combined with active learning. While active learning is sample-efficient, it can be computationally expensive since it requires iterative retraining. To speed this up, we introduce a lightweight architecture for NER, viz., the CNN-CNN-LSTM model, consisting of convolutional character and word encoders and a long short-term memory (LSTM) tag decoder. The model achieves nearly state-of-the-art performance on standard datasets for the task while being computationally much more efficient than the best-performing models. We carry out incremental active learning during the training process and are able to nearly match state-of-the-art performance with just 25\% of the original training data.
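To make the active-learning loop concrete, a schematic sketch of uncertainty-based sample selection is given below: the tagger is retrained on the labelled pool, the least-confident unlabelled sentences are sent for annotation, and the loop repeats. Function names such as train_model and sentence_confidence are placeholders, and least-confidence selection is one common criterion rather than necessarily the one used in the paper.

```python
import heapq

def active_learning_loop(labeled, unlabeled, budget_per_round, rounds,
                         train_model, sentence_confidence, request_labels):
    """Generic uncertainty-sampling loop (least-confidence selection).

    train_model(labeled)           -> trained tagger
    sentence_confidence(model, s)  -> model confidence in its predicted tag sequence for s
    request_labels(sentences)      -> gold annotations from the oracle/annotator
    """
    for _ in range(rounds):
        model = train_model(labeled)                   # (incremental) retraining
        # Pick the sentences the tagger is least confident about.
        queries = heapq.nsmallest(
            budget_per_round, unlabeled, key=lambda s: sentence_confidence(model, s))
        labeled.extend(request_labels(queries))        # annotate and add to the labelled pool
        unlabeled = [s for s in unlabeled if s not in queries]
    return train_model(labeled)
```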