
Image inpainting aims to complete the missing or corrupted regions of images with realistic contents. The prevalent approaches adopt a hybrid objective of reconstruction and perceptual quality by using generative adversarial networks. However, the reconstruction loss and the adversarial loss focus on synthesizing contents of different frequencies, and simply applying them together often leads to inter-frequency conflicts and compromised inpainting. This paper presents WaveFill, a wavelet-based inpainting network that decomposes images into multiple frequency bands and fills the missing regions in each frequency band separately and explicitly. WaveFill decomposes images by using the discrete wavelet transform (DWT), which preserves spatial information naturally. It applies an L1 reconstruction loss to the decomposed low-frequency bands and an adversarial loss to the high-frequency bands, hence effectively mitigating inter-frequency conflicts while completing images in the spatial domain. To address the inpainting inconsistency across frequency bands and fuse features with distinct statistics, we design a novel normalization scheme that aligns and fuses the multi-frequency features effectively. Extensive experiments over multiple datasets show that WaveFill achieves superior image inpainting both qualitatively and quantitatively.
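
The frequency-split objective can be illustrated with a short sketch. The snippet below is a minimal, hypothetical illustration (not the authors' code): it decomposes an image with a one-level Haar DWT, applies an L1 loss to the low-frequency band, and leaves the high-frequency bands to an adversarial critic. `critic` and the hinge-style adversarial term are assumed placeholders.

```python
import torch
import torch.nn.functional as F

def haar_dwt(x):
    """One-level 2D Haar DWT. x: (N, C, H, W) with even H and W.
    Returns the low band LL and the stacked high bands."""
    a = x[:, :, 0::2, 0::2]  # top-left of each 2x2 block
    b = x[:, :, 0::2, 1::2]  # top-right
    c = x[:, :, 1::2, 0::2]  # bottom-left
    d = x[:, :, 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2
    lh = (a + b - c - d) / 2
    hl = (a - b + c - d) / 2
    hh = (a - b - c + d) / 2
    return ll, torch.cat([lh, hl, hh], dim=1)

def frequency_split_losses(pred, target, critic):
    ll_p, hi_p = haar_dwt(pred)
    ll_t, _ = haar_dwt(target)
    loss_rec = F.l1_loss(ll_p, ll_t)  # low frequencies: L1 reconstruction
    loss_adv = -critic(hi_p).mean()   # high frequencies: adversarial term (placeholder critic)
    return loss_rec, loss_adv
```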

Related Content

Image inpainting refers to the process of reconstructing lost or damaged parts of images and videos. In a museum, for example, this work is often carried out by an experienced curator or art restorer. In the digital domain, inpainting (also known as image or video interpolation) uses algorithms to replace lost or corrupted image data, mainly in small regions and for minor defects.
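
For context, classical diffusion-based inpainting of small defects is available off the shelf, for example in OpenCV. The snippet below is a simple illustration, unrelated to the learning-based methods discussed here; `photo.png` and `mask.png` are placeholder file names, and the mask marks the pixels to be filled.

```python
import cv2

image = cv2.imread("photo.png")                      # damaged input image
mask = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)  # nonzero pixels = region to fill
restored = cv2.inpaint(image, mask, 3, cv2.INPAINT_TELEA)  # radius 3, Telea's method
cv2.imwrite("restored.png", restored)
```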

Understanding a scene in depth not only involves locating and recognizing individual objects, but also requires inferring the relationships and interactions among them. However, since the distribution of real-world relationships is severely unbalanced, existing methods perform quite poorly on the less frequent relationships. In this work, we find that the statistical correlations between object pairs and their relationships can effectively regularize the semantic space and make prediction less ambiguous, thus addressing the unbalanced distribution issue. To achieve this, we incorporate these statistical correlations into deep neural networks to facilitate scene graph generation by developing a Knowledge-Embedded Routing Network. More specifically, we show that the statistical correlations between objects appearing in images and their relationships can be explicitly represented by a structured knowledge graph, and a routing mechanism is learned to propagate messages through the graph to explore their interactions. Extensive experiments on the large-scale Visual Genome dataset demonstrate the superiority of the proposed method over current state-of-the-art competitors.
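
As a rough, assumption-laden sketch of how such statistics might be collected and used (this is not the paper's routing network), one can estimate P(predicate | subject class, object class) from training triplets and add it as a log-prior to raw predicate scores:

```python
import numpy as np

def cooccurrence_prior(triplets, num_classes, num_predicates, eps=1e-3):
    """triplets: iterable of (subject_class, predicate, object_class) ids.
    Returns P(predicate | subject, object) with Laplace-style smoothing."""
    counts = np.full((num_classes, num_classes, num_predicates), eps)
    for s, p, o in triplets:
        counts[s, o, p] += 1.0
    return counts / counts.sum(axis=-1, keepdims=True)

def biased_predicate_scores(logits, prior, subj_cls, obj_cls):
    """logits: raw predicate scores (num_predicates,) for one object pair."""
    return logits + np.log(prior[subj_cls, obj_cls])
```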

Existing image inpainting methods typically fill holes by borrowing information from surrounding image regions. They often produce unsatisfactory results when the holes overlap with or touch foreground objects due to lack of information about the actual extent of foreground and background regions within the holes. These scenarios, however, are very important in practice, especially for applications such as distracting object removal. To address the problem, we propose a foreground-aware image inpainting system that explicitly disentangles structure inference and content completion. Specifically, our model learns to predict the foreground contour first, and then inpaints the missing region using the predicted contour as guidance. We show that by this disentanglement, the contour completion model predicts reasonable contours of objects, and further substantially improves the performance of image inpainting. Experiments show that our method significantly outperforms existing methods and achieves superior inpainting results on challenging cases with complex compositions.
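
A schematic of the disentangled two-stage forward pass might look like the sketch below; the `contour_net` and `inpaint_net` modules are hypothetical placeholders, and only the ordering (contour prediction first, contour-guided completion second) reflects the described idea.

```python
import torch
import torch.nn as nn

class ForegroundAwareInpainter(nn.Module):
    def __init__(self, contour_net: nn.Module, inpaint_net: nn.Module):
        super().__init__()
        self.contour_net = contour_net  # predicts a 1-channel contour map
        self.inpaint_net = inpaint_net  # completes the image given the contour

    def forward(self, image, mask):
        # mask: (N, 1, H, W), 1 inside the hole, 0 elsewhere
        corrupted = image * (1 - mask)
        contour = self.contour_net(torch.cat([corrupted, mask], dim=1))
        completed = self.inpaint_net(torch.cat([corrupted, mask, contour], dim=1))
        # keep known pixels untouched, fill only the hole
        return completed * mask + image * (1 - mask)
```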

Generating realistic images from scene graphs requires neural networks to reason about object relationships and compositionality. As a relatively new task, how to properly ensure the generated images comply with scene graphs or how to measure task performance remains an open question. In this paper, we propose to harness scene graph context to improve image generation from scene graphs. We introduce a scene graph context network that pools features generated by a graph convolutional neural network; the pooled features are then provided to both the image generation network and the adversarial loss. With the context network, our model is trained to not only generate realistic looking images, but also to better preserve non-spatial object relationships. We also define two novel evaluation metrics, the relation score and the mean opinion relation score, for this task that directly evaluate scene graph compliance. We use both quantitative and qualitative studies to demonstrate that our proposed model outperforms the state-of-the-art on this challenging task.
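
The pooling step itself can be summarized in a minimal sketch (shapes and names are assumptions, not the paper's exact design): per-object graph features are pooled into a scene-context vector that conditions both the generator and the discriminator.

```python
import torch

def scene_context(object_features):
    """object_features: (num_objects, d) output of a graph convolutional network.
    Returns a fixed-size context vector by average pooling over objects."""
    return object_features.mean(dim=0)

def generator_input(noise, object_features):
    """noise: (d_z,) latent vector for a single sample (placeholder shape)."""
    ctx = scene_context(object_features)
    return torch.cat([noise, ctx], dim=-1)  # conditioning vector for the generator
```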

This paper studies the problem of generalized zero-shot learning, which requires the model to train on image-label pairs from some seen classes and test on the task of classifying new images from both seen and unseen classes. Most previous models try to learn a fixed one-directional mapping between visual and semantic space, while some recently proposed generative methods try to generate image features for unseen classes so that the zero-shot learning problem becomes a traditional fully-supervised classification problem. In this paper, we propose a novel model that provides a unified framework for three different approaches: visual->semantic mapping, semantic->visual mapping, and metric learning. Specifically, our proposed model consists of a feature generator that can generate various visual features given class embeddings as input, a regressor that maps each visual feature back to its corresponding class embedding, and a discriminator that learns to evaluate the closeness of an image feature and a class embedding. All three components are trained under the combination of a cyclic consistency loss and a dual adversarial loss. Experimental results show that our model not only preserves higher accuracy in classifying images from seen classes, but also performs better than existing state-of-the-art models in classifying images from unseen classes.
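
The interplay of the three components can be condensed into a short sketch (my own notation; `G`, `R`, `D` and the hinge-style adversarial form are placeholders, not necessarily the exact losses used in the paper):

```python
import torch
import torch.nn.functional as F

def training_losses(G, R, D, real_feat, class_emb, noise):
    # G: class embedding + noise -> visual feature
    fake_feat = G(torch.cat([class_emb, noise], dim=-1))
    # cyclic consistency: regress generated features back to their embedding
    loss_cyc = F.l1_loss(R(fake_feat), class_emb)
    # adversarial terms on (feature, embedding) pairs
    real_score = D(real_feat, class_emb)
    fake_score = D(fake_feat, class_emb)
    loss_d = F.relu(1 - real_score).mean() + F.relu(1 + fake_score).mean()
    loss_g = -fake_score.mean() + loss_cyc
    return loss_g, loss_d
```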

Recently, generative adversarial networks (GANs) have shown promising performance in generating realistic images. However, they often struggle to learn the complex underlying modalities in a given dataset, resulting in poor-quality generated images. To mitigate this problem, we present a novel approach called mixture of experts GAN (MEGAN), an ensemble approach of multiple generator networks. Each generator network in MEGAN specializes in generating images with a particular subset of modalities, e.g., an image class. Instead of incorporating a separate step of handcrafted clustering of multiple modalities, our proposed model is trained through end-to-end learning of multiple generators via a gating network, which is responsible for choosing the appropriate generator network for a given condition. We adopt the categorical reparameterization trick so that a categorical decision can be made in selecting a generator while maintaining the flow of gradients. We demonstrate that individual generators learn different and salient subparts of the data, and achieve a multiscale structural similarity (MS-SSIM) score of 0.2470 for CelebA and a competitive unsupervised inception score of 8.33 on CIFAR-10.
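
A small sketch of the gating idea is shown below (module names are placeholders): a gating network produces logits over the K generators, and the Gumbel-softmax categorical reparameterization keeps the hard selection differentiable.

```python
import torch
import torch.nn.functional as F

def mixture_of_generators(generators, gate_net, z, tau=1.0):
    """generators: list of K generator modules; gate_net: maps z to K logits."""
    logits = gate_net(z)                                      # (batch, K)
    weights = F.gumbel_softmax(logits, tau=tau, hard=True)    # one-hot, yet differentiable
    outputs = torch.stack([g(z) for g in generators], dim=1)  # (batch, K, C, H, W)
    return (weights[:, :, None, None, None] * outputs).sum(dim=1)
```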

Person re-identification (re-id) faces two major challenges: the lack of cross-view paired training data, and learning discriminative identity-sensitive and view-invariant features in the presence of large pose variations. In this work, we address both problems by proposing a novel deep person image generation model for synthesizing realistic person images conditioned on the pose. The model is based on a generative adversarial network (GAN) designed specifically for pose normalization in re-id, thus termed pose-normalization GAN (PN-GAN). With the synthesized images, we can learn a new type of deep re-id feature free of the influence of pose variations. We show that this feature is strong on its own and complementary to features learned from the original images. Importantly, under the transfer learning setting, we show that our model generalizes well to any new re-id dataset without the need to collect any training data for model fine-tuning. The model thus has the potential to make re-id models truly scalable.
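
One illustrative way to obtain a pose-insensitive descriptor (a hedged sketch, not necessarily the paper's exact fusion rule) is to pool the features extracted from the pose-normalized renderings and combine them with the feature of the original image, for example by element-wise max:

```python
import torch

def fused_reid_feature(feat_original, feats_pose_normalized):
    """feat_original: (d,) feature of the original image.
    feats_pose_normalized: (num_poses, d) features of the synthesized views."""
    pooled = feats_pose_normalized.max(dim=0).values  # pool over canonical poses
    return torch.maximum(feat_original, pooled)       # combine with the original feature
```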

Raindrops adhered to a glass window or camera lens can severely hamper the visibility of a background scene and degrade an image considerably. In this paper, we address the problem by visually removing raindrops, thus transforming a raindrop-degraded image into a clean one. The problem is intractable: first, the regions occluded by raindrops are not given; second, the information about the background scene in the occluded regions is mostly lost. To resolve the problem, we apply an attentive generative network using adversarial training. Our main idea is to inject visual attention into both the generative and discriminative networks. During training, our visual attention learns about raindrop regions and their surroundings. Hence, by injecting this information, the generative network pays more attention to the raindrop regions and the surrounding structures, and the discriminative network is able to assess the local consistency of the restored regions. This injection of visual attention into both the generative and discriminative networks is the main contribution of this paper. Our experiments show the effectiveness of our approach, which outperforms state-of-the-art methods quantitatively and qualitatively.
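
A compact sketch of the attention-injection idea on the generator side is shown below (placeholder modules; the blending rule is my own simplification rather than the paper's recurrent formulation):

```python
import torch
import torch.nn as nn

class AttentiveGenerator(nn.Module):
    def __init__(self, attention_net: nn.Module, decoder: nn.Module):
        super().__init__()
        self.attention_net = attention_net  # predicts a (N, 1, H, W) raindrop attention map
        self.decoder = decoder              # restores the clean image

    def forward(self, rainy_image):
        attn = torch.sigmoid(self.attention_net(rainy_image))
        restored = self.decoder(torch.cat([rainy_image, attn], dim=1))
        # trust the network inside attended (raindrop) regions, keep the input elsewhere
        return attn * restored + (1 - attn) * rainy_image
```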

Generating novel, yet realistic, images of persons is a challenging task due to the complex interplay between the different image factors, such as the foreground, background and pose information. In this work, we aim at generating such images based on a novel, two-stage reconstruction pipeline that learns a disentangled representation of the aforementioned image factors and generates novel person images at the same time. First, a multi-branched reconstruction network is proposed to disentangle and encode the three factors into embedding features, which are then combined to re-compose the input image itself. Second, three corresponding mapping functions are learned in an adversarial manner in order to map Gaussian noise to the learned embedding feature space, one for each factor. Using the proposed framework, we can manipulate the foreground, background and pose of the input image, and can also sample new embedding features to generate targeted manipulations, which provide more control over the generation process. Experiments on the Market-1501 and DeepFashion datasets show that our model not only generates realistic person images with new foregrounds, backgrounds and poses, but also manipulates the generated factors and interpolates the in-between states. Another set of experiments on Market-1501 shows that our model can also be beneficial for the person re-identification task.
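
A schematic of the multi-branched reconstruction stage might look like the sketch below (hypothetical encoder and decoder modules; only the factor-wise split and re-composition follow the description):

```python
import torch
import torch.nn as nn

class DisentangledPersonModel(nn.Module):
    def __init__(self, enc_fg, enc_bg, enc_pose, decoder):
        super().__init__()
        self.enc_fg, self.enc_bg, self.enc_pose = enc_fg, enc_bg, enc_pose
        self.decoder = decoder

    def forward(self, foreground, background, pose_map):
        # encode each factor separately, then re-compose the image from the
        # concatenated embeddings; swapping any embedding manipulates that factor
        z = torch.cat([self.enc_fg(foreground),
                       self.enc_bg(background),
                       self.enc_pose(pose_map)], dim=-1)
        return self.decoder(z)
```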

We present FusedGAN, a deep network for conditional image synthesis with controllable sampling of diverse images. Fidelity, diversity and controllable sampling are the main quality measures of a good image generation model, and most existing models fall short in all three aspects. FusedGAN can perform controllable sampling of diverse images with very high fidelity. We argue that controllability can be achieved by disentangling the generation process into various stages. In contrast to stacked GANs, where multiple stages of GANs are trained separately with full supervision of labeled intermediate images, FusedGAN has a single-stage pipeline with a built-in stacking of GANs. Unlike existing methods, which require full supervision with paired conditions and images, FusedGAN can effectively leverage more abundant images without corresponding conditions in training to produce more diverse samples with high fidelity. We achieve this by fusing two generators: one for unconditional image generation and the other for conditional image generation, where the two partly share a common latent space, thereby disentangling the generation. We demonstrate the efficacy of FusedGAN on fine-grained image generation tasks such as text-to-image and attribute-to-face generation.
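
A rough structural sketch of the fused design (my own decomposition into a shared stage and two heads; all names are assumptions) is:

```python
import torch
import torch.nn as nn

class FusedGenerator(nn.Module):
    def __init__(self, shared_stage, uncond_head, cond_head):
        super().__init__()
        self.shared_stage = shared_stage  # z -> intermediate "structure" feature
        self.uncond_head = uncond_head    # feature -> image (no condition required)
        self.cond_head = cond_head        # (feature, condition) -> image

    def forward(self, z, condition=None):
        h = self.shared_stage(z)
        if condition is None:
            # unconditional branch: can be trained on images without conditions
            return self.uncond_head(h)
        return self.cond_head(torch.cat([h, condition], dim=-1))
```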

We present LR-GAN: an adversarial image generation model which takes scene structure and context into account. Unlike previous generative adversarial networks (GANs), the proposed GAN learns to generate image background and foregrounds separately and recursively, and stitch the foregrounds on the background in a contextually relevant manner to produce a complete natural image. For each foreground, the model learns to generate its appearance, shape and pose. The whole model is unsupervised, and is trained in an end-to-end manner with gradient descent methods. The experiments demonstrate that LR-GAN can generate more natural images with objects that are more human recognizable than DCGAN.
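
The layered, recursive composition can be written as simple alpha compositing; the sketch below (placeholder shapes, not the authors' code) stitches generated foregrounds onto a generated background one at a time:

```python
import torch

def compose_layers(background, foregrounds):
    """background: (N, 3, H, W) generated background canvas.
    foregrounds: list of (appearance, alpha) pairs, where appearance is
    (N, 3, H, W) and alpha is a (N, 1, H, W) mask in [0, 1]."""
    canvas = background
    for appearance, alpha in foregrounds:
        # stitch each foreground onto the current canvas using its shape mask
        canvas = alpha * appearance + (1 - alpha) * canvas
    return canvas
```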
