国产一本二本三本的区别视频-亚洲国产原创精品国语一区

from arxiv, To appear in L. Calatroni, M. Donatelli, S. Morigi, M. Prato, M. Santavesaria (Eds.): Scale Space and Variational Methods in Computer Vision. Lecture Notes in Computer Science, Springer, Cham, 2023

With well-selected data, homogeneous diffusion inpainting can reconstruct images from sparse data with high quality. While 4K colour images of size 3840 x 2160 can already be inpainted in real time, optimising the known data for applications like image compression remains challenging: Widely used stochastic strategies can take days for a single 4K image. Recently, a first neural approach for this so-called mask optimisation problem offered high speed and good quality for small images. It trains a mask generation network with the help of a neural inpainting surrogate. However, these mask networks can only output masks for the resolution and mask density they were trained for. We solve these problems and enable mask optimisation for high-resolution images through a neuroexplicit coarse-to-fine strategy. Additionally, we improve the training and interpretability of mask networks by including a numerical inpainting solver directly into the network. This allows to generate masks for 4K images in around 0.6 seconds while exceeding the quality of stochastic methods on practically relevant densities. Compared to popular existing approaches, this is an acceleration of up to four orders of magnitude.

相關內容

圖像修復

關注 53

圖像修復（英語：Inpainting）指重建的圖像和視頻中丟失或損壞的部分的過程。例如在博物館中，這項工作常由經驗豐富的博物館管理員或者藝術品修復師來進行。數碼世界中，圖像修復又稱圖像插值或視頻插值，指利用復雜的算法來替換已丟失、損壞的圖像數據，主要替換一些小區域和瑕疵。

Airfoil · PDE · 近似 · 圖形處理器 · Networking ·

2023 年 5 月 9 日

Graph Neural Networks for Airfoil Design

Florent Bonnet

The study of partial differential equations (PDE) through the framework of deep learning emerged a few years ago leading to the impressive approximations of simple dynamics. Graph neural networks (GNN) turned out to be very useful in those tasks by allowing the treatment of unstructured data often encountered in the field of numerical resolutions of PDE. However, the resolutions of harder PDE such as Navier-Stokes equations are still a challenging task and most of the work done on the latter concentrate either on simulating the flow around simple geometries or on qualitative results that looks physical for design purpose. In this study, we try to leverage the work done on deep learning for PDE and GNN by proposing an adaptation of a known architecture in order to tackle the task of approximating the solution of the two-dimensional steady-state incompressible Navier-Stokes equations over different airfoil geometries. In addition to that, we test our model not only on its performance over the volume but also on its performance to approximate surface quantities such as the wall shear stress or the isostatic pressure leading to the inference of global coefficients such as the lift and the drag of our airfoil in order to allow design exploration. This work takes place in a longer project that aims to approximate three dimensional steady-state solutions over industrial geometries.

MoDELS · Performer · 變換 · 噪聲 · 去噪 ·

2023 年 5 月 9 日

SeqDiffuSeq: Text Diffusion with Encoder-Decoder Transformers

Hongyi Yuan,Zheng Yuan,Chuanqi Tan,Fei Huang,Songfang Huang

from arxiv, Under Review

Diffusion model, a new generative modelling paradigm, has achieved great success in image, audio, and video generation. However, considering the discrete categorical nature of text, it is not trivial to extend continuous diffusion models to natural language, and text diffusion models are less studied. Sequence-to-sequence text generation is one of the essential natural language processing topics. In this work, we apply diffusion models to approach sequence-to-sequence text generation, and explore whether the superiority generation performance of diffusion model can transfer to natural language domain. We propose SeqDiffuSeq, a text diffusion model for sequence-to-sequence generation. SeqDiffuSeq uses an encoder-decoder Transformers architecture to model denoising function. In order to improve generation quality, SeqDiffuSeq combines the self-conditioning technique and a newly proposed adaptive noise schedule technique. The adaptive noise schedule has the difficulty of denoising evenly distributed across time steps, and considers exclusive noise schedules for tokens at different positional order. Experiment results illustrate the good performance on sequence-to-sequence generation in terms of text quality and inference time.

圖 · MoDELS · 邊 · 離散化 · 統計量 ·

2023 年 5 月 9 日

Efficient and Degree-Guided Graph Generation via Discrete Diffusion Modeling

Xiaohui Chen,Jiaxing He,Xu Han,Li-Ping Liu

from arxiv, ICML 2023, fix typos in an experiment table

Diffusion-based generative graph models have been proven effective in generating high-quality small graphs. However, they need to be more scalable for generating large graphs containing thousands of nodes desiring graph statistics. In this work, we propose EDGE, a new diffusion-based generative graph model that addresses generative tasks with large graphs. To improve computation efficiency, we encourage graph sparsity by using a discrete diffusion process that randomly removes edges at each time step and finally obtains an empty graph. EDGE only focuses on a portion of nodes in the graph at each denoising step. It makes much fewer edge predictions than previous diffusion-based models. Moreover, EDGE admits explicitly modeling the node degrees of the graphs, further improving the model performance. The empirical study shows that EDGE is much more efficient than competing methods and can generate large graphs with thousands of nodes. It also outperforms baseline models in generation quality: graphs generated by our approach have more similar graph statistics to those of the training graphs.

Performer · 逼真度 · Learning · MoDELS · 機器人操作平臺 ·

2023 年 5 月 7 日

Learning from synthetic data generated with GRADE

Elia Bonetto,Chenghao Xu,Aamir Ahmad

from arxiv, 5 pages, 5 tables, 2 figures. Accepted for the ICRA 2023 workshop Pretraining for Robotics (//microsoft.github.io/robotics.pretraining.workshop.icra/). arXiv admin note: substantial text overlap with arXiv:2303.04466

Recently, synthetic data generation and realistic rendering has advanced tasks like target tracking and human pose estimation. Simulations for most robotics applications are obtained in (semi)static environments, with specific sensors and low visual fidelity. To solve this, we present a fully customizable framework for generating realistic animated dynamic environments (GRADE) for robotics research, first introduced in [1]. GRADE supports full simulation control, ROS integration, realistic physics, while being in an engine that produces high visual fidelity images and ground truth data. We use GRADE to generate a dataset focused on indoor dynamic scenes with people and flying objects. Using this, we evaluate the performance of YOLO and Mask R-CNN on the tasks of segmenting and detecting people. Our results provide evidence that using data generated with GRADE can improve the model performance when used for a pre-training step. We also show that, even training using only synthetic data, can generalize well to real-world images in the same application domain such as the ones from the TUM-RGBD dataset. The code, results, trained models, and the generated data are provided as open-source at //eliabntt.github.io/grade-rr.

Performer · Better · 穩健性 · 相似度 · 解碼 ·

2023 年 5 月 5 日

Evading Watermark based Detection of AI-Generated Content

Zhengyuan Jiang,Jinghuai Zhang,Neil Zhenqiang Gong

A generative AI model -- such as DALL-E, Stable Diffusion, and ChatGPT -- can generate extremely realistic-looking content, posing growing challenges to the authenticity of information. To address the challenges, watermark has been leveraged to detect AI-generated content. Specifically, a watermark is embedded into an AI-generated content before it is released. A content is detected as AI-generated if a similar watermark can be decoded from it. In this work, we perform a systematic study on the robustness of such watermark-based AI-generated content detection. We focus on AI-generated images. Our work shows that an attacker can post-process an AI-generated watermarked image via adding a small, human-imperceptible perturbation to it, such that the post-processed AI-generated image evades detection while maintaining its visual quality. We demonstrate the effectiveness of our attack both theoretically and empirically. Moreover, to evade detection, our adversarial post-processing method adds much smaller perturbations to the AI-generated images and thus better maintain their visual quality than existing popular image post-processing methods such as JPEG compression, Gaussian blur, and Brightness/Contrast. Our work demonstrates the insufficiency of existing watermark-based detection of AI-generated content, highlighting the urgent needs of new detection methods.

Networking · MoDELS · 求逆 · 轉移概率 · GANs ·

2023 年 5 月 5 日

Generative Steganography Diffusion

Ping Wei,Qing Zhou,Zichi Wang,Zhenxing Qian,Xinpeng Zhang,Sheng Li

from arxiv, Draft for ACM-mm 2023.Shall not be reproduced without permission, rights reserved!

Generative steganography (GS) is an emerging technique that generates stego images directly from secret data. Various GS methods based on GANs or Flow have been developed recently. However, existing GAN-based GS methods cannot completely recover the hidden secret data due to the lack of network invertibility, while Flow-based methods produce poor image quality due to the stringent reversibility restriction in each module. To address this issue, we propose a novel GS scheme called "Generative Steganography Diffusion" (GSD) by devising an invertible diffusion model named "StegoDiffusion". It not only generates realistic stego images but also allows for 100\% recovery of the hidden secret data. The proposed StegoDiffusion model leverages a non-Markov chain with a fast sampling technique to achieve efficient stego image generation. By constructing an ordinary differential equation (ODE) based on the transition probability of the generation process in StegoDiffusion, secret data and stego images can be converted to each other through the approximate solver of ODE -- Euler iteration formula, enabling the use of irreversible but more expressive network structures to achieve model invertibility. Our proposed GSD has the advantages of both reversibility and high performance, significantly outperforming existing GS methods in all metrics.

GANs · Taxonomy · Vision · 生成式對抗網絡 · 計算機視覺 ·

2020 年 12 月 21 日

Generative Adversarial Networks in Computer Vision: A Survey and Taxonomy

Zhengwei Wang,Qi She,Tomas E. Ward

from arxiv, Accepted by ACM Computing Surveys, 23 November 2020

Generative adversarial networks (GANs) have been extensively studied in the past few years. Arguably their most significant impact has been in the area of computer vision where great advances have been made in challenges such as plausible image generation, image-to-image translation, facial attribute manipulation and similar domains. Despite the significant successes achieved to date, applying GANs to real-world problems still poses significant challenges, three of which we focus on here. These are: (1) the generation of high quality images, (2) diversity of image generation, and (3) stable training. Focusing on the degree to which popular GAN technologies have made progress against these challenges, we provide a detailed review of the state of the art in GAN-related research in the published scientific literature. We further structure this review through a convenient taxonomy we have adopted based on variations in GAN architectures and loss functions. While several reviews for GANs have been presented to date, none have considered the status of this field based on their progress towards addressing practical challenges relevant to computer vision. Accordingly, we review and critically discuss the most popular architecture-variant, and loss-variant GANs, for tackling these challenges. Our objective is to provide an overview as well as a critical analysis of the status of GAN research in terms of relevant progress towards important computer vision application requirements. As we do this we also discuss the most compelling applications in computer vision in which GANs have demonstrated considerable success along with some suggestions for future research directions. Code related to GAN-variants studied in this work is summarized on //github.com/sheqi/GAN_Review.

GANs · Taxonomy · Extensibility · Vision · Networks ·

2019 年 6 月 4 日

Generative Adversarial Networks: A Survey and Taxonomy

Zhengwei Wang,Qi She,Tomas E. Ward

Generative adversarial networks (GANs) have been extensively studied in the past few years. Arguably the revolutionary techniques are in the area of computer vision such as plausible image generation, image to image translation, facial attribute manipulation and similar domains. Despite the significant success achieved in computer vision field, applying GANs over real-world problems still have three main challenges: (1) High quality image generation; (2) Diverse image generation; and (3) Stable training. Considering numerous GAN-related research in the literature, we provide a study on the architecture-variants and loss-variants, which are proposed to handle these three challenges from two perspectives. We propose loss and architecture-variants for classifying most popular GANs, and discuss the potential improvements with focusing on these two aspects. While several reviews for GANs have been presented, there is no work focusing on the review of GAN-variants based on handling challenges mentioned above. In this paper, we review and critically discuss 7 architecture-variant GANs and 9 loss-variant GANs for remedying those three challenges. The objective of this review is to provide an insight on the footprint that current GANs research focuses on the performance improvement. Code related to GAN-variants studied in this work is summarized on //github.com/sheqi/GAN_Review.

FCN · 全卷積網絡 · 3D · 級聯 · MoDELS ·

2018 年 3 月 20 日

An application of cascaded 3D fully convolutional networks for medical image segmentation

Holger R. Roth,Hirohisa Oda,Xiangrong Zhou,Natsuki Shimizu,Ying Yang,Yuichiro Hayashi,Masahiro Oda,Michitaka Fujiwara,Kazunari Misawa,Kensaku Mori

from arxiv, Preprint accepted for publication in Computerized Medical Imaging and Graphics. Substantial extension of arXiv:1704.06382; Corrected references to figure numbers in this version

Recent advances in 3D fully convolutional networks (FCN) have made it feasible to produce dense voxel-wise predictions of volumetric images. In this work, we show that a multi-class 3D FCN trained on manually labeled CT scans of several anatomical structures (ranging from the large organs to thin vessels) can achieve competitive segmentation results, while avoiding the need for handcrafting features or training class-specific models. To this end, we propose a two-stage, coarse-to-fine approach that will first use a 3D FCN to roughly define a candidate region, which will then be used as input to a second 3D FCN. This reduces the number of voxels the second FCN has to classify to ~10% and allows it to focus on more detailed segmentation of the organs and vessels. We utilize training and validation sets consisting of 331 clinical CT images and test our models on a completely unseen data collection acquired at a different hospital that includes 150 CT scans, targeting three anatomical organs (liver, spleen, and pancreas). In challenging organs such as the pancreas, our cascaded approach improves the mean Dice score from 68.5 to 82.2%, achieving the highest reported average score on this dataset. We compare with a 2D FCN method on a separate dataset of 240 CT scans with 18 classes and achieve a significantly higher performance in small organs and vessels. Furthermore, we explore fine-tuning our models to different datasets. Our experiments illustrate the promise and robustness of current 3D FCN based semantic segmentation of medical images, achieving state-of-the-art results. Our code and trained models are available for download: //github.com/holgerroth/3Dunet_abdomen_cascade.

Networking · Neural Networks · 生成式對抗網絡 · 3D · 圖像分割 ·

2018 年 1 月 22 日

Fluorescence Microscopy Image Segmentation Using Convolutional Neural Network With Generative Adversarial Networks

Chichen Fu,Soonam Lee,David Joon Ho,Shuo Han,Paul Salama,Kenneth W. Dunn,Edward J. Delp

Recent advance in fluorescence microscopy enables acquisition of 3D image volumes with better quality and deeper penetration into tissue. Segmentation is a required step to characterize and analyze biological structures in the images. 3D segmentation using deep learning has achieved promising results in microscopy images. One issue is that deep learning techniques require a large set of groundtruth data which is impractical to annotate manually for microscopy volumes. This paper describes a 3D nuclei segmentation method using 3D convolutional neural networks. A set of synthetic volumes and the corresponding groundtruth volumes are generated automatically using a generative adversarial network. Segmentation results demonstrate that our proposed method is capable of segmenting nuclei successfully in 3D for various data sets.