Title: Diverse Image Generation via Self-Conditioned GANs

Abstract:

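This paper introduces a simple but effective unsupervised method for generating realistic and diverse images. It trains a class-conditional GAN without using manually annotated class labels. Instead, the model is conditioned on labels derived automatically by clustering in the discriminator's feature space. The clustering step automatically discovers diverse modes and explicitly requires the generator to cover them. Experiments on standard mode-collapse benchmarks show that the method outperforms several competing methods at addressing mode collapse. The method also performs well on large-scale datasets such as ImageNet and Places365, improving both image diversity and standard quality metrics compared with previous methods.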
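
The labeling step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract only says "clustering", so plain k-means with a farthest-first initialization is an assumption, and `toy_features` is a hypothetical stand-in for feature vectors taken from a discriminator.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(features, num_clusters):
    """Deterministic init (an assumption): start at features[0], then keep
    adding the feature farthest from every centroid chosen so far."""
    centroids = [list(features[0])]
    while len(centroids) < num_clusters:
        farthest = max(
            features,
            key=lambda f: min(squared_distance(f, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def assign_labels(features, centroids):
    """Give each feature the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(f, centroids[k]))
        for f in features
    ]


def cluster_pseudo_labels(features, num_clusters, num_iters=20):
    """Plain k-means; the returned labels play the role of class labels."""
    centroids = farthest_first_init(features, num_clusters)
    labels = assign_labels(features, centroids)
    for _ in range(num_iters):
        for k in range(num_clusters):
            members = [f for f, lab in zip(features, labels) if lab == k]
            if members:  # keep the old centroid if a cluster empties out
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
        new_labels = assign_labels(features, centroids)
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return labels, centroids


# Hypothetical "discriminator features": two well-separated groups.
toy_features = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
pseudo_labels, centroids = cluster_pseudo_labels(toy_features, num_clusters=2)
print(pseudo_labels)
```

In a training loop, labels like `pseudo_labels` would presumably be fed to the generator and discriminator in place of ground-truth classes and recomputed as the discriminator's features change; how often to recluster is not stated in the abstract.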

Related content

GANs

ACL 2020 · Adversarial learning · Text generation

May 5, 2020

[ACL 2020] Adversarial text generation: Improving Adversarial Text Generation

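Zhuanzhi Member Service

Zhuanzhi provides professional, trustworthy knowledge distribution services that make cognitive collaboration faster and better!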

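Autoregressive text generation models usually focus on local fluency and may produce semantically inconsistent output when generating long text. In addition, automatically generating words with similar semantics is challenging, and hand-crafted linguistic rules are difficult to apply. We consider a text planning scheme and propose a model-based imitation-learning approach to alleviate these issues. Specifically, we propose a novel guider network that attends to the generation process over a longer horizon; it assists next-word prediction and provides intermediate rewards for optimizing the generator. Extensive experiments show that the proposed method improves performance.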

CVPR 2020 · One-shot learning · Adaptive learning · Face generation

April 6, 2020

[CVPR 2020, Facebook AI] One-shot domain adaptation for face generation: One-Shot Domain Adaptation

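In this paper, we propose a framework that can generate face images falling into the same distribution as a given one-shot example. We leverage a pre-trained StyleGAN model that has already learned the generic face distribution. Given the one-shot target, we develop an iterative optimization scheme that rapidly adapts the model's weights so that the high-level distribution of the output shifts toward that of the target. To generate images of the same distribution, we introduce a style-mixing technique that transfers low-level statistics from the target to faces randomly generated by the model. In this way, we can generate an unlimited number of faces that inherit from both the distribution of generic human faces and that of the one-shot example. The newly generated faces can serve as augmented training data for other downstream tasks. This setting is appealing because it requires labeling very few examples, or even just one, in the target domain, which matches real-world face manipulations that typically arise from a variety of unknown and unique distributions. We show that the proposed one-shot adaptation approach is effective for detecting face manipulations, and we compare it qualitatively and quantitatively with other few-shot domain adaptation methods.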

Machine learning · GAN · Generator

March 28, 2020

[Google, Mila] Your GAN is Secretly an Energy-based Model and You Should Use Discriminator Driven Latent Sampling

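Topic: Your GAN is Secretly an Energy-based Model and You Should Use Discriminator Driven Latent Sampling

Abstract: The sum of a GAN's implicit generator log-density log p_g and the discriminator's logit score defines an energy function that yields the true data density when the generator is imperfect but the discriminator is optimal. This makes it possible to improve on the typical generator (with implicit density p_g). We show that samples from this modified density can be obtained by sampling in latent space according to an energy-based model induced by the sum of the latent prior log-density and the discriminator's output score. This can be done by running Markov chain Monte Carlo in the latent space and then applying the generator function, a procedure we call Discriminator Driven Latent Sampling (DDLS). We show that DDLS is highly efficient compared with previous methods that work in the high-dimensional pixel space, and that it can be applied to improve many types of previously trained GANs. We evaluate DDLS qualitatively and quantitatively on both synthetic and real datasets. On CIFAR-10, DDLS substantially improves the Inception Score of an off-the-shelf pre-trained SN-GAN from 8.22 to 9.09, which is comparable to the class-conditional BigGAN model. This achieves a new state of the art in the unconditional image synthesis setting without introducing extra parameters or additional training.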
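
To make the latent-space sampling idea concrete, here is a minimal one-dimensional sketch. It assumes a standard normal latent prior and uses unadjusted Langevin dynamics as the Markov chain Monte Carlo kernel; both are assumptions, since the abstract above names only "Markov chain Monte Carlo". The `generator` and `discriminator_logit` functions are hypothetical toys, and gradients are taken by finite differences rather than automatic differentiation.

```python
import math
import random


def generator(z):
    """Hypothetical toy generator mapping a scalar latent to a scalar sample."""
    return 2.0 * z


def discriminator_logit(x):
    """Hypothetical toy logit: the discriminator prefers samples near x = 3."""
    return -0.5 * (x - 3.0) ** 2


def energy(z):
    """E(z) = -log p_0(z) - d(G(z)), with p_0 a standard normal prior
    (additive constants dropped)."""
    log_prior = -0.5 * z * z
    return -log_prior - discriminator_logit(generator(z))


def energy_grad(z, h=1e-4):
    """Central finite-difference gradient of the energy."""
    return (energy(z + h) - energy(z - h)) / (2.0 * h)


def ddls_sample(num_steps=500, step_size=0.01, seed=0):
    """Run Langevin dynamics on z, then push the final z through the generator."""
    rng = random.Random(seed)
    z = rng.gauss(0.0, 1.0)  # initialize from the latent prior
    for _ in range(num_steps):
        noise = rng.gauss(0.0, 1.0)
        z = z - 0.5 * step_size * energy_grad(z) + math.sqrt(step_size) * noise
    return generator(z)


samples = [ddls_sample(num_steps=200, seed=s) for s in range(200)]
print(sum(samples) / len(samples))
```

With these toy functions the energy is quadratic with its minimum at z = 1.2, so the refined samples concentrate around x = 2.4 rather than around 0, the mean of the unrefined generator output. The point of the sketch is only that the chain runs in the low-dimensional latent space and the generator is applied once at the end.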

CVPR 2020 · Adversarial texture optimization · RGB-D

March 21, 2020

[CVPR 2020, Stanford] Adversarial texture optimization from RGB-D scans: Adversarial Texture Optimization

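Realistic color texture generation is an important step in RGB-D surface reconstruction, but it remains challenging in practice because of inaccuracies in the reconstructed geometry, misaligned camera poses, and view-dependent imaging artifacts. In this work, we propose a novel approach to color texture generation that uses a conditional adversarial loss obtained from weakly supervised views. Specifically, we propose a method that produces photorealistic textures for approximate surfaces, even from misaligned images, by learning an objective function. The key idea is to learn a patch-based conditional discriminator that guides the texture optimization to be tolerant of misalignments. The discriminator takes a synthesized view and a real image and evaluates whether the synthesized one is realistic under a broadened definition of realism. We train the discriminator by providing, as "real" examples, pairs of input views and their misaligned versions, so that the learned adversarial loss tolerates errors from the scans. Experiments on synthetic and real data, under both quantitative and qualitative evaluation, show the advantage of our approach over existing methods. Our code is publicly available, along with a video demonstration.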

Tsinghua University · CVPR 2020 · Adaptive networks · Resolution

March 17, 2020

[CVPR 2020, Tsinghua University] Resolution adaptive networks for efficient inference: Resolution Adaptive Networks

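In recent years, adaptive inference has attracted increasing attention because of its computational efficiency. Unlike existing work, which mainly exploits architectural redundancy in network design, this paper focuses on the spatial redundancy of input samples and proposes a novel Resolution Adaptive Network (RANet). Our motivation is that low-resolution representations are sufficient for classifying "easy" samples that contain canonical objects, while high-resolution features are useful for recognizing some "hard" ones. In RANet, input images are first routed to a lightweight sub-network that efficiently extracts coarse feature maps, and samples with high-confidence predictions exit early from this sub-network. Only the "hard" samples whose earlier predictions are unreliable activate the high-resolution paths. By adaptively processing features at different resolutions, RANet can significantly improve computational efficiency. Experiments on three classification benchmarks (CIFAR-10, CIFAR-100, and ImageNet) demonstrate the model's effectiveness in both the anytime prediction setting and the budgeted batch classification setting.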
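
The early-exit routing described above can be illustrated with a short sketch. It is not the RANet architecture itself: the two sub-networks are hypothetical stand-in functions that return class probabilities, and the fixed confidence threshold of 0.9 is an assumed value.

```python
def adaptive_predict(sample, subnetworks, threshold=0.9):
    """Try sub-networks from cheapest to most expensive and stop at the first
    one whose top class probability reaches the threshold.

    Returns (predicted_class, index_of_the_stage_that_exited).
    """
    if not subnetworks:
        raise ValueError("at least one sub-network is required")
    last_stage = len(subnetworks) - 1
    for stage, subnet in enumerate(subnetworks):
        probs = subnet(sample)
        confidence = max(probs)
        # The final stage must always answer, confident or not.
        if confidence >= threshold or stage == last_stage:
            return probs.index(confidence), stage


def low_res_subnet(sample):
    """Hypothetical cheap classifier: confident only on 'easy' samples."""
    return [0.95, 0.05] if sample["easy"] else [0.55, 0.45]


def high_res_subnet(sample):
    """Hypothetical expensive classifier that only hard samples reach."""
    return [0.10, 0.90]


stages = [low_res_subnet, high_res_subnet]
print(adaptive_predict({"easy": True}, stages))   # exits at stage 0
print(adaptive_predict({"easy": False}, stages))  # falls through to stage 1
```

Lowering `threshold` makes more samples exit at the cheap stage, which trades accuracy for compute; this is one simple way to think about the budgeted setting mentioned in the abstract.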

Generative adversarial networks · Adaptive learning · Generator · Gradient · Meta-learning

January 7, 2020

[Stanford University] Domain-adaptive few-shot generation (DAWSON: A Domain Adaptive Few Shot Generation Framework)

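Title: DAWSON: A Domain Adaptive Few Shot Generation Framework

Abstract:

Training a generative adversarial network (GAN) for a new domain from scratch requires a large amount of training data and days of training time. To address this, we propose DAWSON, a domain-adaptive few-shot generation framework for GANs based on meta-learning. A major challenge in applying meta-learning to GANs is obtaining gradients for the generator by evaluating it on development sets, because GANs are likelihood-free. To address this challenge, we propose an alternative GAN training procedure that naturally combines the two-step training procedure of GANs with the two-step training procedure of meta-learning algorithms. DAWSON is a plug-and-play framework that supports a broad family of meta-learning algorithms and various GANs with architectural variants. Building on DAWSON, we also propose MUSIC MATINEE, the first few-shot music generation model. Our experiments show that MUSIC MATINEE can quickly adapt to new domains with only tens of songs from the target domain. We also show that DAWSON can learn to generate new digits from only four samples in the MNIST dataset. We release source code implementations of DAWSON in both PyTorch and TensorFlow, generated music samples in two genres, and a lightning video.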

Author:

Weixin Liang, Stanford University. Main research interests: natural language processing, dialogue systems, and computer systems.

Regularization term · Mode collapse · Peak · Generative adversarial networks · Networking

March 18, 2019

Mode Seeking Generative Adversarial Networks for Diverse Image Synthesis

Qi Mao, Hsin-Ying Lee, Hung-Yu Tseng, Siwei Ma, Ming-Hsuan Yang

From arXiv; CVPR 2019. Code: //github.com/HelenMao/MSGAN

Most conditional generation tasks expect diverse outputs given a single conditional context. However, conditional generative adversarial networks (cGANs) often focus on the prior conditional information and ignore the input noise vectors, which contribute to the output variations. Recent attempts to resolve the mode collapse issue for cGANs are usually task-specific and computationally expensive. In this work, we propose a simple yet effective regularization term to address the mode collapse issue for cGANs. The proposed method explicitly maximizes the ratio of the distance between generated images with respect to the corresponding latent codes, thus encouraging the generators to explore more minor modes during training. This mode seeking regularization term is readily applicable to various conditional generation tasks without imposing training overhead or modifying the original network structures. We validate the proposed algorithm on three conditional image synthesis tasks including categorical generation, image-to-image translation, and text-to-image synthesis with different baseline models. Both qualitative and quantitative results demonstrate the effectiveness of the proposed regularization method for improving diversity without loss of quality.
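
As a rough illustration of the ratio described above, the following sketch computes the distance between two generated images relative to the distance between their latent codes. The abstract only states that this ratio is maximized; the mean absolute (L1) distance and the reciprocal form of the loss are assumptions made here, and the images and latent codes are hypothetical flat lists of numbers.

```python
def mean_abs_distance(a, b):
    """Mean absolute (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def mode_seeking_ratio(image_1, image_2, z_1, z_2):
    """Distance between two generated images divided by the distance
    between the latent codes that produced them."""
    latent_distance = mean_abs_distance(z_1, z_2)
    if latent_distance == 0.0:
        raise ValueError("latent codes must differ")
    return mean_abs_distance(image_1, image_2) / latent_distance


def mode_seeking_loss(image_1, image_2, z_1, z_2, eps=1e-5):
    """Assumed loss form: minimizing the reciprocal maximizes the ratio."""
    return 1.0 / (mode_seeking_ratio(image_1, image_2, z_1, z_2) + eps)


z_a, z_b = [0.0], [0.5]

# A collapsed generator maps different latent codes to the same image.
collapsed = mode_seeking_ratio([1.0, 1.0], [1.0, 1.0], z_a, z_b)
# A diverse generator produces visibly different images.
diverse = mode_seeking_ratio([0.0, 0.0], [1.0, 1.0], z_a, z_b)
print(collapsed, diverse)  # 0.0 2.0
```

A term like `mode_seeking_loss` would presumably be added to the usual generator objective with a weighting coefficient, so that a generator ignoring its noise input (ratio near zero) is penalized heavily.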

Disentangled · Learned · AIM · Feature space · MoDELS

January 21, 2018

Disentangled Person Image Generation

Liqian Ma, Qianru Sun, Stamatios Georgoulis, Luc Van Gool, Bernt Schiele, Mario Fritz

Generating novel, yet realistic, images of persons is a challenging task due to the complex interplay between the different image factors, such as the foreground, background and pose information. In this work, we aim at generating such images based on a novel, two-stage reconstruction pipeline that learns a disentangled representation of the aforementioned image factors and generates novel person images at the same time. First, a multi-branched reconstruction network is proposed to disentangle and encode the three factors into embedding features, which are then combined to re-compose the input image itself. Second, three corresponding mapping functions are learned in an adversarial manner in order to map Gaussian noise to the learned embedding feature space, for each factor respectively. Using the proposed framework, we can manipulate the foreground, background and pose of the input image, and also sample new embedding features to generate such targeted manipulations, that provide more control over the generation process. Experiments on Market-1501 and Deepfashion datasets show that our model does not only generate realistic person images with new foregrounds, backgrounds and poses, but also manipulates the generated factors and interpolates the in-between states. Another set of experiments on Market-1501 shows that our model can also be beneficial for the person re-identification task.