Shawn Xu,Lin Yang,Christopher Kelly,Marcin Sieniek,Timo Kohlberger,Martin Ma,Wei-Hung Weng,Attila Kiraly,Sahar Kazemzadeh,Zakkai Melamed,Jungyeon Park,Patricia Strachan,Yun Liu,Chuck Lau,Preeti Singh,Christina Chen,Mozziyar Etemadi,Sreenivasa Raju Kalidindi,Yossi Matias,Katherine Chou,Greg S. Corrado,Shravya Shetty,Daniel Tse,Shruthi Prabhakara,Daniel Golden,Rory Pilgrim,Krish Eswaran,Andrew Sellergren

Our approach, which we call Embeddings for Language/Image-aligned X-Rays, or ELIXR, leverages a language-aligned image encoder combined or grafted onto a fixed LLM, PaLM 2, to perform a broad range of tasks. We train this lightweight adapter architecture using images paired with corresponding free-text radiology reports from the MIMIC-CXR dataset. ELIXR achieved state-of-the-art performance on zero-shot chest X-ray (CXR) classification (mean AUC of 0.850 across 13 findings), data-efficient CXR classification (mean AUCs of 0.893 and 0.898 across five findings (atelectasis, cardiomegaly, consolidation, pleural effusion, and pulmonary edema) for 1% (~2,200 images) and 10% (~22,000 images) training data), and semantic search (0.76 normalized discounted cumulative gain (NDCG) across nineteen queries, including perfect retrieval on twelve of them). Compared to existing data-efficient methods including supervised contrastive learning (SupCon), ELIXR required two orders of magnitude less data to reach similar performance. ELIXR also showed promise on CXR vision-language tasks, demonstrating overall accuracies of 58.7% and 62.5% on visual question answering and report quality assurance tasks, respectively. These results suggest that ELIXR is a robust and versatile approach to CXR AI.
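The semantic search result above is reported as normalized discounted cumulative gain (NDCG). As a rough illustration of how that metric behaves, here is a minimal sketch using the common linear-gain, log2-discount form. The abstract does not say which NDCG variant or rank cutoff the authors used, so the function names and the formula choice are assumptions rather than the paper's evaluation code.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: sum of rel_i / log2(i + 1) over ranks i = 1..n."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    """DCG of the ranking as returned, divided by the DCG of the ideal ordering.

    `relevances` lists the graded relevance of each retrieved item, in the
    order the search system returned them (hypothetical input format).
    """
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# A ranking that is already sorted by relevance scores exactly 1.0
# ("perfect retrieval"); putting the relevant item second lowers the score.
perfect = ndcg([3, 2, 1])
imperfect = ndcg([0, 1])
```

Averaging `ndcg` over all queries gives a single summary number of the kind quoted in the abstract.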

Related content

Performer

MoDELS · Functional · Comprehensibility · Better · Directed

September 22, 2023

Point spread function modelling for astronomical telescopes: a review focused on weak gravitational lensing studies

Tobias Liaudat,Jean-Luc Starck,Martin Kilbinger

from arxiv, 64 pages, 14 figures. Accepted to Frontiers in Astronomy and Space Sciences

The accurate modelling of the Point Spread Function (PSF) is of paramount importance in astronomical observations, as it allows for the correction of distortions and blurring caused by the telescope and atmosphere. PSF modelling is crucial for accurately measuring celestial objects' properties. The last decades brought us a steady increase in the power and complexity of astronomical telescopes and instruments. Upcoming galaxy surveys like Euclid and LSST will observe an unprecedented amount and quality of data. Modelling the PSF for these new facilities and surveys requires novel modelling techniques that can cope with the ever-tightening error requirements. The purpose of this review is three-fold. First, we introduce the optical background required for a more physically-motivated PSF modelling and propose an observational model that can be reused for future developments. Second, we provide an overview of the different physical contributors of the PSF, including the optic- and detector-level contributors and the atmosphere. We expect that the overview will help better understand the modelled effects. Third, we discuss the different methods for PSF modelling from the parametric and non-parametric families for ground- and space-based telescopes, with their advantages and limitations. Validation methods for PSF models are then addressed, with several metrics related to weak lensing studies discussed in detail. Finally, we explore current challenges and future directions in PSF modelling for astronomical telescopes.

Semantic similarity · Similarity · Similarity measure · Better · MoDELS

September 22, 2023

Steffen Herbold

from arxiv, Under review

Semantic similarity between natural language texts is typically measured either by looking at the overlap between subsequences (e.g., BLEU) or by using embeddings (e.g., BERTScore, S-BERT). Within this paper, we argue that when we are only interested in measuring the semantic similarity, it is better to directly predict the similarity using a fine-tuned model for such a task. Using a fine-tuned model for the STS-B from the GLUE benchmark, we define the STSScore approach and show that the resulting similarity is better aligned with our expectations on a robust semantic similarity measure than other approaches.
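To make the limitation of overlap-based measures concrete, the sketch below computes a clipped n-gram precision, which is one ingredient of BLEU and not the full metric (no brevity penalty, no averaging over n-gram orders). The function name and whitespace tokenization are illustrative assumptions. It is not the STSScore approach from the paper, which relies on a fine-tuned model.

```python
from collections import Counter

def ngram_precision(candidate, reference, n=1):
    """Fraction of candidate n-grams that also occur in the reference.

    Counts are clipped so a repeated candidate n-gram is only credited as
    often as it appears in the reference.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    total = sum(cand_ngrams.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref_ngrams[gram]) for gram, count in cand_ngrams.items())
    return overlap / total

# A paraphrase with no shared words gets zero credit, even though a human
# would judge the two sentences to be close in meaning.
same = ngram_precision("the cat sat", "the cat sat")
paraphrase = ngram_precision("a feline rested", "the cat sat")
```

Embedding-based and fine-tuned similarity models are meant to address exactly this kind of case, where surface overlap and meaning diverge.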

Identifiable · Learning · Machine Learning · MoDELS · Performer

September 21, 2023

Identification of pneumonia on chest x-ray images through machine learning

Eduardo Augusto Roeder

from arxiv, In Brazilian Portuguese, 30 pages, 16 figures. This thesis was prepared under the guidance of Prof. Dr. Akihito Inca Atahualpa Urdiales

Pneumonia is the leading infectious cause of infant death in the world. When it is identified early, the patient's prognosis can be altered, and imaging exams can help confirm the diagnosis. Performing and interpreting the exams as soon as possible is vital for good treatment, and the most common exam for this pathology is the chest X-ray. The objective of this study was to develop software that identifies the presence or absence of pneumonia in chest radiographs. The software was developed as a computational model based on machine learning using a transfer learning technique. For the training process, images were collected from a database available online containing children's chest X-ray images taken at a hospital in China. After training, the model was exposed to new images and achieved relevant results in identifying the pathology, reaching 98% sensitivity and 97.3% specificity on the test sample. It can be concluded that it is possible to develop software that identifies pneumonia in chest X-ray images.
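For readers less familiar with the two reported metrics, the following minimal sketch shows how sensitivity and specificity are derived from binary labels and predictions. The function name and the toy labels are hypothetical and are not taken from the thesis. Label 1 is assumed to mean "pneumonia present".

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = positive.

    sensitivity = TP / (TP + FN): share of true pneumonia cases detected.
    specificity = TN / (TN + FP): share of healthy cases correctly cleared.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return tp / (tp + fn), tn / (tn + fp)

# Toy example: 4 positive and 4 negative cases, one error of each kind.
labels      = [1, 1, 1, 1, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(labels, predictions)
```

Both numbers depend on the test sample, which is why the abstract qualifies them as results for the sample used for testing.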

Bounding box · Performer · Annotation · Dataset · Vision

September 21, 2023

NeuralLabeling: A versatile toolset for labeling vision datasets using Neural Radiance Fields

Floris Erich,Naoya Chiba,Yusuke Yoshiyasu,Noriaki Ando,Ryo Hanai,Yukiyasu Domae

from arxiv, 8 pages, project website: //florise.github.io/neural_labeling_web/

We present NeuralLabeling, a labeling approach and toolset for annotating a scene using either bounding boxes or meshes and generating segmentation masks, affordance maps, 2D bounding boxes, 3D bounding boxes, 6DOF object poses, depth maps and object meshes. NeuralLabeling uses Neural Radiance Fields (NeRF) as renderer, allowing labeling to be performed using 3D spatial tools while incorporating geometric clues such as occlusions, relying only on images captured from multiple viewpoints as input. To demonstrate the applicability of NeuralLabeling to a practical problem in robotics, we added ground truth depth maps to 30000 frames of transparent object RGB and noisy depth maps of glasses placed in a dishwasher captured using an RGBD sensor, yielding the Dishwasher30k dataset. We show that training a simple deep neural network with supervision using the annotated depth maps yields a higher reconstruction performance than training with the previously applied weakly supervised approach.

MoDELS · Performer · Language modeling · Topic · Dataset

September 20, 2023

MasakhaNEWS: News Topic Classification for African languages

David Ifeoluwa Adelani,Marek Masiak,Israel Abebe Azime,Jesujoba Alabi,Atnafu Lambebo Tonja,Christine Mwase,Odunayo Ogundepo,Bonaventure F. P. Dossou,Akintunde Oladipo,Doreen Nixdorf,Chris Chinenye Emezue,sana al-azzawi,Blessing Sibanda,Davis David,Lolwethu Ndolela,Jonathan Mukiibi,Tunde Ajayi,Tatiana Moteu,Brian Odhiambo,Abraham Owodunni,Nnaemeka Obiefuna,Muhidin Mohamed,Shamsuddeen Hassan Muhammad,Teshome Mulugeta Ababu,Saheed Abdullahi Salahudeen,Mesay Gemeda Yigezu,Tajuddeen Gwadabe,Idris Abdulmumin,Mahlet Taye,Oluwabusayo Awoyomi,Iyanuoluwa Shode,Tolulope Adelani,Habiba Abdulganiyu,Abdul-Hakeem Omotayo,Adetola Adeeko,Abeeb Afolabi,Anuoluwapo Aremu,Olanrewaju Samuel,Clemencia Siro,Wangari Kimotho,Onyekachi Ogbu,Chinedu Mbonu,Chiamaka Chukwuneke,Samuel Fanijo,Jessica Ojo,Oyinkansola Awosan,Tadesse Kebede,Toadoum Sari Sakayo,Pamela Nyatsine,Freedmore Sidume,Oreen Yousuf,Mardiyyah Oduwole,Tshinu Tshinu,Ussen Kimanuka,Thina Diko,Siyanda Nxakama,Sinodos Nigusse,Abdulmejid Johar,Shafie Mohamed,Fuad Mire Hassan,Moges Ahmed Mehamed,Evrard Ngabire,Jules Jules,Ivan Ssenkungu,Pontus Stenetorp

from arxiv, Accepted to IJCNLP-AACL 2023 (main conference)

African languages are severely under-represented in NLP research due to a lack of datasets covering several NLP tasks. While there are individual language-specific datasets that are being expanded to different tasks, only a handful of NLP tasks (e.g. named entity recognition and machine translation) have standardized benchmark datasets covering several geographically and typologically diverse African languages. In this paper, we develop MasakhaNEWS, a new benchmark dataset for news topic classification covering 16 languages widely spoken in Africa. We provide an evaluation of baseline models by training classical machine learning models and fine-tuning several language models. Furthermore, we explore several alternatives to full fine-tuning of language models that are better suited for zero-shot and few-shot learning, such as cross-lingual parameter-efficient fine-tuning (like MAD-X), pattern exploiting training (PET), prompting language models (like ChatGPT), and prompt-free sentence transformer fine-tuning (SetFit and Cohere Embedding API). Our evaluation in the zero-shot setting shows the potential of prompting ChatGPT for news topic classification in low-resource African languages, achieving an average performance of 70 F1 points without leveraging additional supervision like MAD-X. In the few-shot setting, we show that with as few as 10 examples per label, we achieved more than 90% (i.e. 86.0 F1 points) of the performance of full supervised training (92.6 F1 points) leveraging the PET approach.

Robots · Episode · Criterion · Controller · Robot

September 20, 2023

Open-endedness induced through a predator-prey scenario using modular robots

Dimitri Kachler,Karine Miras

This work investigates how a predator-prey scenario can induce the emergence of Open-Ended Evolution (OEE). We utilize modular robots of fixed morphologies whose controllers are subject to evolution. In both species, robots can send and receive signals and perceive the relative positions of other robots in the environment. Specifically, we introduce a feature we call a tagging system: it modifies how individuals can perceive each other and is expected to increase behavioral complexity. Our results show the emergence of adaptive strategies, demonstrating the viability of inducing OEE through predator-prey dynamics using modular robots. Such emergence, nevertheless, seemed to depend on conditioning reproduction to an explicit behavioral criterion.

MoDELS · Nuance · Biased · contrastive · Vision

September 20, 2023

The Scenario Refiner: Grounding subjects in images at the morphological level

Claudia Tagliaferri,Sofia Axioti,Albert Gatt,Denis Paperno

from arxiv, presented at the LIMO workshop (Linguistic Insights from and for Multimodal Language Processing @KONVENS 2023)

Derivationally related words, such as "runner" and "running", exhibit semantic differences which also elicit different visual scenarios. In this paper, we ask whether Vision and Language (V&L) models capture such distinctions at the morphological level, using a new methodology and dataset. We compare the results from V&L models to human judgements and find that models' predictions differ from those of human participants, in particular displaying a grammatical bias. We further investigate whether the human-model misalignment is related to model architecture. Our methodology, developed on one specific morphological contrast, can be further extended for testing models on capturing other nuanced language features.

Mixup · Data augmentation · BERT · Attention · Mixing

September 20, 2023

AttentionMix: Data augmentation method that relies on BERT attention mechanism

Dominik Lewy,Jacek Mańdziuk

The Mixup method has proven to be a powerful data augmentation technique in Computer Vision, with many successors that perform image mixing in a guided manner. One of the interesting research directions is transferring the underlying Mixup idea to other domains, e.g. Natural Language Processing (NLP). Even though there already exist several methods that apply Mixup to textual data, there is still room for new, improved approaches. In this work, we introduce AttentionMix, a novel mixing method that relies on attention-based information. While the paper focuses on the BERT attention mechanism, the proposed approach can be applied to generally any attention-based model. AttentionMix is evaluated on 3 standard sentiment classification datasets and in all three cases outperforms two benchmark approaches that utilize Mixup mechanism, as well as the vanilla BERT method. The results confirm that the attention-based information can be effectively used for data augmentation in the NLP domain.
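As background for the underlying Mixup idea mentioned above, here is a minimal sketch of vanilla Mixup: two examples and their one-hot labels are blended with a weight drawn from a Beta distribution. This is the baseline technique only, not AttentionMix, whose attention-guided mixing is not specified in the abstract. The function name, the `alpha` default, and the use of plain Python lists as feature vectors are assumptions made for illustration.

```python
import random

def mixup(x_a, y_a, x_b, y_b, alpha=0.2, rng=random):
    """Blend two examples: x = lam * x_a + (1 - lam) * x_b, same for labels.

    x_a, x_b are equal-length feature vectors (for text, these would
    typically be embeddings rather than raw tokens); y_a, y_b are one-hot
    label vectors. lam is sampled from Beta(alpha, alpha).
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam

# Mixing a "positive" and a "negative" example yields a soft label whose
# entries still sum to 1.
mixed_x, mixed_y, lam = mixup([1.0, 0.0], [1.0, 0.0],
                              [0.0, 1.0], [0.0, 1.0],
                              rng=random.Random(0))
```

Guided variants keep the same blending step but choose what to mix, or how strongly, using extra information such as attention weights.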

Image classification · Feedforward network · INTERACT · Networking · Feedforward

May 7, 2021

ResMLP: Feedforward networks for image classification with data-efficient training

Hugo Touvron,Piotr Bojanowski,Mathilde Caron,Matthieu Cord,Alaaeldin El-Nouby,Edouard Grave,Armand Joulin,Gabriel Synnaeve,Jakob Verbeek,Hervé Jégou

We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. It is a simple residual network that alternates (i) a linear layer in which image patches interact, independently and identically across channels, and (ii) a two-layer feed-forward network in which channels interact independently per patch. When trained with a modern training strategy using heavy data-augmentation and optionally distillation, it attains surprisingly good accuracy/complexity trade-offs on ImageNet. We will share our code based on the Timm library and pre-trained models.
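The two alternating sublayers described above can be sketched in a few lines. This is a simplified, dependency-free illustration: it omits normalization, learned scaling, biases, and training, and the GELU activation, function names, and weight shapes are assumptions rather than details stated in the abstract. The input `x` is a list of N patch vectors, each with d channels.

```python
import math

def matmul(a, b):
    """Plain nested-list matrix product of a (n x k) and b (k x m)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def gelu(v):
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))

def resmlp_block(x, w_patch, w1, w2):
    """One residual block. Shapes: x N x d, w_patch N x N, w1 d x h, w2 h x d."""
    # (i) Cross-patch linear layer: patches interact, and the same N x N
    #     weights are applied to every channel independently.
    mixed = matmul(w_patch, x)
    x = [[xv + mv for xv, mv in zip(xr, mr)] for xr, mr in zip(x, mixed)]
    # (ii) Two-layer feed-forward network: channels interact, and the same
    #      weights are applied to every patch independently.
    hidden = [[gelu(v) for v in row] for row in matmul(x, w1)]
    out = matmul(hidden, w2)
    return [[xv + ov for xv, ov in zip(xr, orow)] for xr, orow in zip(x, out)]

# Two patches with two channels each; the patch-mixing matrix swaps patches
# and the feed-forward output weights are zero, so only step (i) has effect.
patches = [[1.0, 2.0], [3.0, 4.0]]
result = resmlp_block(patches,
                      w_patch=[[0.0, 1.0], [1.0, 0.0]],
                      w1=[[1.0, 0.0], [0.0, 1.0]],
                      w2=[[0.0, 0.0], [0.0, 0.0]])
```

Because both sublayers are residual, setting all weights to zero leaves the input unchanged, which is a convenient sanity check for any implementation.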

BERT · Language representation · state-of-the-art · Comprehensibility · Question answering

October 11, 2018

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin,Ming-Wei Chang,Kenton Lee,Kristina Toutanova

from arxiv, 13 pages

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT representations can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE benchmark to 80.4% (7.6% absolute improvement), MultiNLI accuracy to 86.7 (5.6% absolute improvement) and the SQuAD v1.1 question answering Test F1 to 93.2 (1.5% absolute improvement), outperforming human performance by 2.0%.