
With the development of deep learning, researchers have built generative models that produce realistic images. One such model, PixelCNNs combined with the Vector Quantized Variational AutoEncoder 2 (VQ-VAE-2), can generate more diverse images than other models. However, like other deep learning models, a PixelCNNs model with VQ-VAE-2, which I call PC-VQ2, requires a large amount of training data. Its practical applications are therefore often limited in domains where collecting sufficient data is difficult. To address this problem, researchers have recently proposed more data-efficient methods for training generative models from scratch with limited unlabeled data. However, no such methods have been studied for PC-VQ2s. This study takes a first step in this direction, considering image generation with PC-VQ2s and limited unlabeled data. I propose phased data augmentation, a strategy for training a PC-VQ2 from scratch with limited data. In this strategy, the ranges of the data augmentation parameters are narrowed in phases as training progresses. Quantitative evaluation shows that phased data augmentation enables a model trained with limited data to generate images that are competitive in diversity with those of a model trained with sufficient data, and that outperform them in fidelity. The evaluation suggests that the proposed method is useful for efficiently training a PC-VQ2 with limited data to generate diverse and natural images.
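The idea of narrowing augmentation ranges in phases can be sketched as a simple schedule. This is only an illustration under stated assumptions: the abstract does not say which augmentation parameters are used, how many phases there are, or how fast the ranges shrink, so the function names (`phased_range`, `sample_rotation`), the four-phase linear shrink, and the rotation range of ±30 degrees are all hypothetical.

```python
import random


def phased_range(full_range, step, total_steps, num_phases=4, min_scale=0.0):
    """Return the augmentation range to use at a given training step.

    The range shrinks around its center in discrete phases: the first phase
    uses the full range, the last phase uses the full range times min_scale.
    All defaults are assumptions for illustration, not values from the paper.
    """
    if total_steps <= 0 or num_phases <= 0:
        raise ValueError("total_steps and num_phases must be positive")
    low, high = full_range
    # Index of the current phase, clamped so late steps stay in the last phase.
    phase = min(step * num_phases // total_steps, num_phases - 1)
    if num_phases == 1:
        scale = 1.0
    else:
        # Linear shrink from 1.0 (first phase) to min_scale (last phase).
        scale = 1.0 - (1.0 - min_scale) * phase / (num_phases - 1)
    center = (low + high) / 2
    half_width = (high - low) / 2 * scale
    return (center - half_width, center + half_width)


def sample_rotation(step, total_steps, rng=random):
    """Sample a rotation angle (degrees) from the range of the current phase."""
    low, high = phased_range((-30.0, 30.0), step, total_steps)
    return rng.uniform(low, high)
```

Early in training, `sample_rotation` draws from the widest range, which adds variety to a small dataset; by the final phase the range has collapsed to its center, so the model sees nearly unaugmented images.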

Related content

VQ-VAE-2


Outliers · Active learning · Learning · Model evaluation · Unlabeled

July 7, 2023

Training Ensembles with Inliers and Outliers for Semi-supervised Active Learning

Vladan Stojnić, Zakaria Laskar, Giorgos Tolias

Deep active learning in the presence of outlier examples poses a realistic yet challenging scenario. Acquiring unlabeled data for annotation requires a delicate balance between avoiding outliers to conserve the annotation budget and prioritizing useful inlier examples for effective training. In this work, we present an approach that leverages three highly synergistic components, which are identified as key ingredients: joint classifier training with inliers and outliers, semi-supervised learning through pseudo-labeling, and model ensembling. Our work demonstrates that ensembling significantly enhances the accuracy of pseudo-labeling and improves the quality of data acquisition. By enabling semi-supervision through the joint training process, where outliers are properly handled, we observe a substantial boost in classifier accuracy through the use of all available unlabeled examples. Notably, we reveal that the integration of joint training renders explicit outlier detection, a conventional component for acquisition in prior work, unnecessary. The three key components align seamlessly with numerous existing approaches. Through empirical evaluations, we showcase that their combined use leads to a performance increase. Remarkably, despite its simplicity, our proposed approach outperforms all other methods. Code: //github.com/vladan-stojnic/active-outliers

Performer · SSL · Networking · Learning · CNN

July 6, 2023

Self-supervised learning via inter-modal reconstruction and feature projection networks for label-efficient 3D-to-2D segmentation

José Morano, Guilherme Aresta, Dmitrii Lachinov, Julia Mai, Ursula Schmidt-Erfurth, Hrvoje Bogunović

From arXiv. To appear in MICCAI 2023. Code: //github.com/j-morano/multimodal-ssl-fpn

Deep learning has become a valuable tool for the automation of certain medical image segmentation tasks, significantly relieving the workload of medical specialists. Some of these tasks require segmentation to be performed on a subset of the input dimensions, the most common case being 3D-to-2D. However, the performance of existing methods is strongly conditioned by the amount of labeled data available, as there is currently no data efficient method, e.g. transfer learning, that has been validated on these tasks. In this work, we propose a novel convolutional neural network (CNN) and self-supervised learning (SSL) method for label-efficient 3D-to-2D segmentation. The CNN is composed of a 3D encoder and a 2D decoder connected by novel 3D-to-2D blocks. The SSL method consists of reconstructing image pairs of modalities with different dimensionality. The approach has been validated in two tasks with clinical relevance: the en-face segmentation of geographic atrophy and reticular pseudodrusen in optical coherence tomography. Results on different datasets demonstrate that the proposed CNN significantly improves the state of the art in scenarios with limited labeled data by up to 8% in Dice score. Moreover, the proposed SSL method allows further improvement of this performance by up to 23%, and we show that the SSL is beneficial regardless of the network architecture.

Regularization term · Edge · Networking · Supervision · Cluster assumption

July 5, 2023

STEdge: Self-training Edge Detection with Multi-layer Teaching and Regularization

Yunfan Ye,Renjiao Yi,Zhiping Cai,Kai Xu

Learning-based edge detection has hitherto been strongly supervised with pixel-wise annotations, which are tedious to obtain manually. We study the problem of self-training edge detection, leveraging the untapped wealth of large-scale unlabeled image datasets. We design a self-supervised framework with multi-layer regularization and self-teaching. In particular, we impose a consistency regularization which enforces the outputs from each of the multiple layers to be consistent for the input image and its perturbed counterpart. We adopt L0-smoothing as the 'perturbation' to encourage edge predictions to lie on salient boundaries, following the cluster assumption in self-supervised learning. Meanwhile, the network is trained with multi-layer supervision by pseudo labels which are initialized with Canny edges and then iteratively refined by the network as the training proceeds. The regularization and self-teaching together attain a good balance of precision and recall, leading to a significant performance boost over supervised methods, with lightweight refinement on the target dataset. Furthermore, our method demonstrates strong cross-dataset generality. For example, it attains 4.8% improvement for ODS and 5.8% for OIS when tested on the unseen BIPED dataset, compared to the state-of-the-art methods.

Model selection · MoDELS · Image classification · Estimation/estimator · Generative methods

July 5, 2023

Synthetic Data for Model Selection

Alon Shoshan,Nadav Bhonker,Igor Kviatkovsky,Matan Fintz,Gerard Medioni

Recent breakthroughs in synthetic data generation approaches have made it possible to produce highly photorealistic images which are hardly distinguishable from real ones. Furthermore, synthetic generation pipelines have the potential to generate an unlimited number of images. The combination of high photorealism and scale turns synthetic data into a promising candidate for improving various machine learning (ML) pipelines. Thus far, a large body of research in this field has focused on using synthetic images for training, by augmenting and enlarging training data. In contrast to using synthetic data for training, in this work we explore whether synthetic data can be beneficial for model selection. Considering the task of image classification, we demonstrate that when data is scarce, synthetic data can be used to replace the held-out validation set, thus allowing training on a larger dataset. We also introduce a novel method to calibrate the synthetic error estimation to fit that of the real domain. We show that such calibration significantly improves the usefulness of synthetic data for model selection.

Identifiable · MoDELS · Dataset · SCAN · Learning

July 4, 2023

A Synthetic Electrocardiogram (ECG) Image Generation Toolbox to Facilitate Deep Learning-Based Scanned ECG Digitization

Kshama Kodthalu Shivashankara,Reza Sameni

Access to medical data is often limited as it contains protected health information (PHI). There are privacy concerns regarding using records containing personally identifiable information. Recent advancements have been made in applying deep learning-based algorithms for clinical diagnosis and decision-making. However, deep learning models are data-greedy, whereas the availability of medical datasets for training and evaluating these models is relatively limited. Data augmentation with so-called digital twins is an emerging technique to address this need. This paper presents a novel approach for generating synthetic electrocardiogram (ECG) images with realistic artifacts from time-series data for use in developing algorithms for digitization of ECG images. Synthetic data is generated in a privacy-preserving manner by generating distortionless ECG images on standard ECG paper background. Next, various distortions, including handwritten text artifacts, wrinkles, creases, and perspective transforms are applied to the ECG images. The artifacts are generated synthetically, without personally identifiable information. As a use case, we generated a large ECG image dataset of 21,801 records from the PhysioNet PTB-XL dataset, with 12-lead ECG time-series data from 18,869 patients. A deep ECG image digitization model was developed and trained on the synthetic dataset, and was employed to convert the synthetic images to time-series data for evaluation. The signal-to-noise ratio (SNR) was calculated to assess the image digitization quality vs the ground truth ECG time-series. The results show an average signal recovery SNR of 27 ± 2.8 dB, demonstrating the significance of the proposed synthetic ECG image dataset for training deep learning models.
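The SNR figure above compares a recovered time series against its ground truth. The abstract does not give the exact definition used, so the following is a minimal sketch assuming the common one: the ratio of reference signal power to reconstruction error power, in decibels. The function name `snr_db` is hypothetical.

```python
import math


def snr_db(reference, estimate):
    """Signal-to-noise ratio in dB of an estimate against a reference signal.

    Assumed definition: 10 * log10(sum(ref^2) / sum((ref - est)^2)).
    """
    if len(reference) != len(estimate) or not reference:
        raise ValueError("signals must be non-empty and of equal length")
    signal_power = sum(x * x for x in reference)
    noise_power = sum((x - y) ** 2 for x, y in zip(reference, estimate))
    if noise_power == 0:
        return math.inf  # perfect reconstruction
    if signal_power == 0:
        return -math.inf  # all-zero reference with a non-zero error
    return 10 * math.log10(signal_power / noise_power)
```

Under this definition, an error whose power is one hundredth of the signal power gives 20 dB, and each further factor of ten adds 10 dB.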

Extensibility · Performer · Transformation · Vision · Learning

March 8, 2023

Learning Imbalanced Data with Vision Transformers

Zhengzhuo Xu,Ruikang Liu,Shuo Yang,Zenghao Chai,Chun Yuan

From arXiv. Accepted to CVPR 2023, camera-ready version. Code: //github.com/XuZhengzhuo/LiVT

Real-world data tends to be heavily imbalanced and severely skews data-driven deep neural networks, which makes Long-Tailed Recognition (LTR) a massively challenging task. Existing LTR methods seldom train Vision Transformers (ViTs) with Long-Tailed (LT) data, while the off-the-shelf pretrained weights of ViTs always lead to unfair comparisons. In this paper, we systematically investigate the ViTs' performance in LTR and propose LiVT to train ViTs from scratch only with LT data. With the observation that ViTs suffer more severe LTR problems, we conduct Masked Generative Pretraining (MGP) to learn generalized features. With ample and solid evidence, we show that MGP is more robust than supervised approaches. In addition, Binary Cross Entropy (BCE) loss, which shows conspicuous performance with ViTs, encounters predicaments in LTR. We further propose the balanced BCE to ameliorate it with strong theoretical groundings. Specifically, we derive the unbiased extension of Sigmoid and compensate extra logit margins to deploy it. Our Bal-BCE contributes to the quick convergence of ViTs in just a few epochs. Extensive experiments demonstrate that with MGP and Bal-BCE, LiVT successfully trains ViTs well without any additional data and outperforms comparable state-of-the-art methods significantly, e.g., our ViT-B achieves 81.0% Top-1 accuracy in iNaturalist 2018 without bells and whistles. Code is available at //github.com/XuZhengzhuo/LiVT.

Data augmentation · Taxonomy · Text classification · Machine Learning · Training data

July 7, 2021

A Survey on Data Augmentation for Text Classification

Markus Bayer,Marc-André Kaufhold,Christian Reuter

From arXiv. 35 pages, 6 figures, 8 tables

Data augmentation, the artificial creation of training data for machine learning by transformations, is a widely studied research field across machine learning disciplines. While it is useful for increasing the generalization capabilities of a model, it can also address many other challenges and problems, from overcoming a limited amount of training data, through regularizing the objective, to limiting the amount of data used to protect privacy. Based on a precise description of the goals and applications of data augmentation (C1) and a taxonomy for existing works (C2), this survey is concerned with data augmentation methods for textual classification and aims to achieve a concise and comprehensive overview for researchers and practitioners (C3). Derived from the taxonomy, we divided more than 100 methods into 12 different groupings and provide state-of-the-art references expounding which methods are highly promising (C4). Finally, research perspectives that may constitute a building block for future work are given (C5).

MoDELS · Data augmentation · Transformer models · seq2seq · Transformation

March 4, 2020

Data Augmentation using Pre-trained Transformer Models

Varun Kumar,Ashutosh Choudhary,Eunah Cho

From arXiv. 7 pages

Language model based pre-trained models such as BERT have provided significant gains across different NLP tasks. In this paper, we study different types of pre-trained transformer based models such as auto-regressive models (GPT-2), auto-encoder models (BERT), and seq2seq models (BART) for conditional data augmentation. We show that prepending the class labels to text sequences provides a simple yet effective way to condition the pre-trained models for data augmentation. On three classification benchmarks, the pre-trained Seq2Seq model outperforms other models. Further, we explore how data augmentation based on different pre-trained models differs in terms of data diversity, and how well such methods preserve the class-label information.
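Prepending class labels to text can be illustrated with a few lines of data preparation. This sketch covers only the formatting step, not the fine-tuning or generation; the function names and the single-space separator are assumptions, since the abstract does not specify the exact input format used for each model.

```python
def prepend_label(label, text, sep=" "):
    """Build a label-conditioned sequence by putting the class label first."""
    return f"{label}{sep}{text}"


def build_conditioning_corpus(examples, sep=" "):
    """Turn (label, text) pairs into label-prefixed training sequences."""
    return [prepend_label(label, text, sep) for label, text in examples]


corpus = build_conditioning_corpus(
    [("positive", "a warm and funny film"), ("negative", "dull and overlong")]
)
```

A model fine-tuned on such sequences can then be prompted with a label alone, so that the text it generates is conditioned on that class.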

Data augmentation · Generalization theory · Moments · Normalized · surge

February 25, 2020

On Feature Normalization and Data Augmentation

Boyi Li,Felix Wu,Ser-Nam Lim,Serge Belongie,Kilian Q. Weinberger

Modern neural network training relies heavily on data augmentation for improved generalization. After the initial success of label-preserving augmentations, there has been a recent surge of interest in label-perturbing approaches, which combine features and labels across training samples to smooth the learned decision surface. In this paper, we propose a new augmentation method that leverages the first and second moments extracted and re-injected by feature normalization. We replace the moments of the learned features of one training image by those of another, and also interpolate the target labels. As our approach is fast, operates entirely in feature space, and mixes different signals than prior methods, one can effectively combine it with existing augmentation methods. We demonstrate its efficacy across benchmark data sets in computer vision, speech, and natural language processing, where it consistently improves the generalization performance of highly competitive baseline networks.
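The moment-swapping step described above can be sketched on plain feature vectors. This is a simplified illustration, not the paper's implementation: it treats each sample's features as a flat list, computes the mean and standard deviation over that list, and uses a hypothetical mixing weight `lam` for the labels. Which normalization layer supplies the moments and how `lam` is chosen are not stated in the abstract.

```python
import math


def moments(features, eps=1e-5):
    """Return the mean and (eps-stabilized) standard deviation of a feature list."""
    mean = sum(features) / len(features)
    var = sum((x - mean) ** 2 for x in features) / len(features)
    return mean, math.sqrt(var + eps)


def moment_exchange(feat_a, feat_b, label_a, label_b, lam=0.9):
    """Give sample A the moments of sample B and interpolate their labels.

    feat_a is normalized with its own moments, then rescaled and shifted with
    the moments of feat_b. lam is an assumed weight on label_a.
    """
    mean_a, std_a = moments(feat_a)
    mean_b, std_b = moments(feat_b)
    mixed = [(x - mean_a) / std_a * std_b + mean_b for x in feat_a]
    label = [lam * ya + (1 - lam) * yb for ya, yb in zip(label_a, label_b)]
    return mixed, label
```

After the exchange, the mixed features keep the normalized pattern of sample A but carry the first and second moments of sample B, and the target becomes a soft label between the two classes.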

Image segmentation · Networking · Loss function (machine learning) · ReLU · Functional

July 12, 2018

W-net: Bridged U-net for 2D Medical Image Segmentation

Wanli Chen,Yue Zhang,Junjun He,Yu Qiao,Yifan Chen,Hongjian Shi,Xiaoying Tang

In this paper, we focus on three problems in deep learning based medical image segmentation. Firstly, U-net, as a popular model for medical image segmentation, is difficult to train when convolutional layers increase even though a deeper network usually has a better generalization ability because of more learnable parameters. Secondly, the exponential ReLU (ELU), as an alternative of ReLU, is not much different from ReLU when the network of interest gets deep. Thirdly, the Dice loss, as one of the pervasive loss functions for medical image segmentation, is not effective when the prediction is close to ground truth and will cause oscillation during training. To address the aforementioned three problems, we propose and validate a deeper network that can fit medical image datasets that are usually small in the sample size. Meanwhile, we propose a new loss function to accelerate the learning process and a combination of different activation functions to improve the network performance. Our experimental results suggest that our network is comparable or superior to state-of-the-art methods.
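For reference, the Dice loss mentioned above is commonly written as one minus the Dice coefficient between predicted probabilities and the ground-truth mask. The sketch below is that standard soft Dice formulation with an assumed smoothing term `eps`; it is not the new loss function the paper proposes, which the abstract does not define.

```python
def soft_dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss over flattened predictions and a binary target mask.

    pred holds probabilities in [0, 1]; target holds 0/1 values.
    eps (an assumed smoothing constant) avoids division by zero on empty masks.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    dice = (2 * intersection + eps) / (total + eps)
    return 1 - dice
```

The loss is 0 for a perfect match and approaches 1 when prediction and mask do not overlap.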