蜜芽亚洲精品国产品国语在线试看,欧美精品A在线观看

We study few-shot acoustic event detection (AED) in this paper. Few-shot learning enables detection of new events with very limited labeled data. Compared to other research areas like computer vision, few-shot learning for audio recognition has been under-studied. We formulate few-shot AED problem and explore different ways of utilizing traditional supervised methods for this setting as well as a variety of meta-learning approaches, which are conventionally used to solve few-shot classification problem. Compared to supervised baselines, meta-learning models achieve superior performance, thus showing its effectiveness on generalization to new audio events. Our analysis including impact of initialization and domain discrepancy further validate the advantage of meta-learning approaches in few-shot AED.

相關內容

小樣本學習

關注 0

小樣本學習（Few-Shot Learning，以下簡稱 FSL ）用于解決當可用的數據量比較少時，如何提升神經網絡的性能。在 FSL 中，經常用到的一類方法被稱為 Meta-learning。和普通的神經網絡的訓練方法一樣，Meta-learning 也包含訓練過程和測試過程，但是它的訓練過程被稱作 Meta-training 和 Meta-testing。

Extensibility · 學成 · SSL · 目標檢測 · 3D ·

2020 年 3 月 20 日

FocalMix: Semi-Supervised Learning for 3D Medical Image Detection

Dong Wang,Yuan Zhang,Kexin Zhang,Liwei Wang

from arxiv, Accepted by CVPR 2020

Applying artificial intelligence techniques in medical imaging is one of the most promising areas in medicine. However, most of the recent success in this area highly relies on large amounts of carefully annotated data, whereas annotating medical images is a costly process. In this paper, we propose a novel method, called FocalMix, which, to the best of our knowledge, is the first to leverage recent advances in semi-supervised learning (SSL) for 3D medical image detection. We conducted extensive experiments on two widely used datasets for lung nodule detection, LUNA16 and NLST. Results show that our proposed SSL methods can achieve a substantial improvement of up to 17.3% over state-of-the-art supervised learning approaches with 400 unlabeled CT scans.

prototype · Networking · 小樣本學習 · Better · 穩健性 ·

2019 年 10 月 25 日

Meta-Learning with Dynamic-Memory-Based Prototypical Network for Few-Shot Event Detection

Shumin Deng,Ningyu Zhang,Jiaojian Kang,Yichi Zhang,Wei Zhang,Huajun Chen

from arxiv, Accepted by WSDM 2020

Event detection (ED), a sub-task of event extraction, involves identifying triggers and categorizing event mentions. Existing methods primarily rely upon supervised learning and require large-scale labeled event datasets which are unfortunately not readily available in many real-life applications. In this paper, we consider and reformulate the ED task with limited labeled data as a Few-Shot Learning problem. We propose a Dynamic-Memory-Based Prototypical Network (DMB-PN), which exploits Dynamic Memory Network (DMN) to not only learn better prototypes for event types, but also produce more robust sentence encodings for event mentions. Differing from vanilla prototypical networks simply computing event prototypes by averaging, which only consume event mentions once, our model is more robust and is capable of distilling contextual information from event mentions for multiple times due to the multi-hop mechanism of DMNs. The experiments show that DMB-PN not only deals with sample scarcity better than a series of baseline models but also performs more robustly when the variety of event types is relatively large and the instance quantity is extremely small.

學成 · 深度學習 · Vision · 計算機視覺 · Performance ·

2019 年 9 月 5 日

Scene Text Detection and Recognition: The Deep Learning Era

Shangbang Long,Xin He,Cong Yao

from arxiv, Submitted version

With the rise and development of deep learning, computer vision has been tremendously transformed and reshaped. As an important research area in computer vision, scene text detection and recognition has been inescapably influenced by this wave of revolution, consequentially entering the era of deep learning. In recent years, the community has witnessed substantial advancements in mindset, approach and performance. This survey is aimed at summarizing and analyzing the major changes and significant progresses of scene text detection and recognition in the deep learning era. Through this article, we devote to: (1) introduce new insights and ideas; (2) highlight recent techniques and benchmarks; (3) look ahead into future trends. Specifically, we will emphasize the dramatic differences brought by deep learning and the grand challenges still remained. We expect that this review paper would serve as a reference book for researchers in this field. Related resources are also collected and compiled in our Github repository: //github.com/Jyouhou/SceneTextPapers.

小樣本學習 · MoDELS · 基 · 目標檢測 · FAST ·

2018 年 12 月 5 日

Few-shot Object Detection via Feature Reweighting

Bingyi Kang,Zhuang Liu,Xin Wang,Fisher Yu,Jiashi Feng,Trevor Darrell

This work aims to solve the challenging few-shot object detection problem where only a few annotated examples are available for each object category to train a detection model. Such an ability of learning to detect an object from just a few examples is common for human vision systems, but remains absent for computer vision systems. Though few-shot meta learning offers a promising solution technique, previous works mostly target the task of image classification and are not directly applicable for the much more complicated object detection task. In this work, we propose a novel meta-learning based model with carefully designed architecture, which consists of a meta-model and a base detection model. The base detection model is trained on several base classes with sufficient samples to offer basis features. The meta-model is trained to reweight importance of features from the base detection model over the input image and adapt these features to assist novel object detection from a few examples. The meta-model is light-weight, end-to-end trainable and able to entail the base model with detection ability for novel objects fast. Through experiments we demonstrated our model can outperform baselines by a large margin for few-shot object detection, on multiple datasets and settings. Our model also exhibits fast adaptation speed to novel few-shot classes.

目標檢測 · state-of-the-art · MoDELS · Integration · Backbone ·

2018 年 7 月 27 日

Video Object Detection with an Aligned Spatial-Temporal Memory

Fanyi Xiao,Yong Jae Lee

We introduce Spatial-Temporal Memory Networks for video object detection. At its core, a novel Spatial-Temporal Memory module (STMM) serves as the recurrent computation unit to model long-term temporal appearance and motion dynamics. The STMM's design enables full integration of pretrained backbone CNN weights, which we find to be critical for accurate detection. Furthermore, in order to tackle object motion in videos, we propose a novel MatchTrans module to align the spatial-temporal memory from frame to frame. Our method produces state-of-the-art results on the benchmark ImageNet VID dataset, and our ablative studies clearly demonstrate the contribution of our different design choices. We release our code and models at //fanyix.cs.ucdavis.edu/project/stmn/project.html.

Performer · 無監督 · 學成 · 強化學習 · Automator ·

2018 年 6 月 12 日

Unsupervised Meta-Learning for Reinforcement Learning

Abhishek Gupta,Benjamin Eysenbach,Chelsea Finn,Sergey Levine

Meta-learning is a powerful tool that builds on multi-task learning to learn how to quickly adapt a model to new tasks. In the context of reinforcement learning, meta-learning algorithms can acquire reinforcement learning procedures to solve new problems more efficiently by meta-learning prior tasks. The performance of meta-learning algorithms critically depends on the tasks available for meta-training: in the same way that supervised learning algorithms generalize best to test points drawn from the same distribution as the training points, meta-learning methods generalize best to tasks from the same distribution as the meta-training tasks. In effect, meta-reinforcement learning offloads the design burden from algorithm design to task design. If we can automate the process of task design as well, we can devise a meta-learning algorithm that is truly automated. In this work, we take a step in this direction, proposing a family of unsupervised meta-learning algorithms for reinforcement learning. We describe a general recipe for unsupervised meta-reinforcement learning, and describe an effective instantiation of this approach based on a recently proposed unsupervised exploration technique and model-agnostic meta-learning. We also discuss practical and conceptual considerations for developing unsupervised meta-learning methods. Our experimental results demonstrate that unsupervised meta-reinforcement learning effectively acquires accelerated reinforcement learning procedures without the need for manual task design, significantly exceeds the performance of learning from scratch, and even matches performance of meta-learning methods that use hand-specified task distributions.

Performer · Extensibility · 穩健性 · MoDELS · 學成 ·

2018 年 6 月 8 日

Adversarial Meta-Learning

Chengxiang Yin,Jian Tang,Zhiyuan Xu,Yanzhi Wang

from arxiv, 10 pages, this paper was submitted to NIPS 2018

Meta-learning enables a model to learn from very limited data to undertake a new task. In this paper, we study the general meta-learning with adversarial samples. We present a meta-learning algorithm, ADML (ADversarial Meta-Learner), which leverages clean and adversarial samples to optimize the initialization of a learning model in an adversarial manner. ADML leads to the following desirable properties: 1) it turns out to be very effective even in the cases with only clean samples; 2) it is model-agnostic, i.e., it is compatible with any learning model that can be trained with gradient descent; and most importantly, 3) it is robust to adversarial samples, i.e., unlike other meta-learning methods, it only leads to a minor performance degradation when there are adversarial samples. We show via extensive experiments that ADML delivers the state-of-the-art performance on two widely-used image datasets, MiniImageNet and CIFAR100, in terms of both accuracy and robustness.

Extensibility · 標記空間 · 目標檢測 · 類別 · Principle ·

2018 年 4 月 12 日

Zero-Shot Object Detection

Ankan Bansal,Karan Sikka,Gaurav Sharma,Rama Chellappa,Ajay Divakaran

from arxiv, 16 pages

We introduce and tackle the problem of zero-shot object detection (ZSD), which aims to detect object classes which are not observed during training. We work with a challenging set of object classes, not restricting ourselves to similar and/or fine-grained categories cf. prior works on zero-shot classification. We follow a principled approach by first adapting visual-semantic embeddings for ZSD. We then discuss the problems associated with selecting a background class and motivate two background-aware approaches for learning robust detectors. One of these models uses a fixed background class and the other is based on iterative latent assignments. We also outline the challenge associated with using a limited number of training classes and propose a solution based on dense sampling of the semantic label space using auxiliary data with a large number of categories. We propose novel splits of two standard detection datasets - MSCOCO and VisualGenome and discuss extensive empirical results to highlight the benefits of the proposed methods. We provide useful insights into the algorithm and conclude by posing some open questions to encourage further research.

INFORMS · Performer · 查全率/召回率 · 目標檢測 · 學成 ·

2018 年 3 月 19 日

Zero-Shot Detection

Pengkai Zhu,Hanxiao Wang,Tolga Bolukbasi,Venkatesh Saligrama

from arxiv, 16 pages, 5 figures

As we move towards large-scale object detection, it is unrealistic to expect annotated training data for all object classes at sufficient scale, and so methods capable of unseen object detection are required. We propose a novel zero-shot method based on training an end-to-end model that fuses semantic attribute prediction with visual features to propose object bounding boxes for seen and unseen classes. While we utilize semantic features during training, our method is agnostic to semantic information for unseen classes at test-time. Our method retains the efficiency and effectiveness of YOLO for objects seen during training, while improving its performance for novel and unseen objects. The ability of state-of-art detection methods to learn discriminative object features to reject background proposals also limits their performance for unseen objects. We posit that, to detect unseen objects, we must incorporate semantic information into the visual domain so that the learned visual features reflect this information and leads to improved recall rates for unseen objects. We test our method on PASCAL VOC and MS COCO dataset and observed significant improvements on the average precision of unseen classes.