亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Adversarial learning-based image defogging methods have been extensively studied in computer vision due to their remarkable performance. However, most existing methods have limited defogging capabilities for real cases because they are trained on the paired clear and synthesized foggy images of the same scenes. In addition, they have limitations in preserving vivid color and rich textual details in defogging. To address these issues, we develop a novel generative adversarial network, called holistic attention-fusion adversarial network (HAAN), for single image defogging. HAAN consists of a Fog2Fogfree block and a Fogfree2Fog block. In each block, there are three learning-based modules, namely, fog removal, color-texture recovery, and fog synthetic, that are constrained each other to generate high quality images. HAAN is designed to exploit the self-similarity of texture and structure information by learning the holistic channel-spatial feature correlations between the foggy image with its several derived images. Moreover, in the fog synthetic module, we utilize the atmospheric scattering model to guide it to improve the generative quality by focusing on an atmospheric light optimization with a novel sky segmentation network. Extensive experiments on both synthetic and real-world datasets show that HAAN outperforms state-of-the-art defogging methods in terms of quantitative accuracy and subjective visual quality.

相關內容

iOS 8 提供的應用間和應用跟系統的功能交互特性。
  • Today (iOS and OS X): widgets for the Today view of Notification Center
  • Share (iOS and OS X): post content to web services or share content with others
  • Actions (iOS and OS X): app extensions to view or manipulate inside another app
  • Photo Editing (iOS): edit a photo or video in Apple's Photos app with extensions from a third-party apps
  • Finder Sync (OS X): remote file storage in the Finder with support for Finder content annotation
  • Storage Provider (iOS): an interface between files inside an app and other apps on a user's device
  • Custom Keyboard (iOS): system-wide alternative keyboards

Source:

Image super-resolution (SR) techniques are used to generate a high-resolution image from a low-resolution image. Until now, deep generative models such as autoregressive models and Generative Adversarial Networks (GANs) have proven to be effective at modelling high-resolution images. VAE-based models have often been criticised for their feeble generative performance, but with new advancements such as VDVAE, there is now strong evidence that deep VAEs have the potential to outperform current state-of-the-art models for high-resolution image generation. In this paper, we introduce VDVAE-SR, a new model that aims to exploit the most recent deep VAE methodologies to improve upon the results of similar models. VDVAE-SR tackles image super-resolution using transfer learning on pretrained VDVAEs. The presented model is competitive with other state-of-the-art models, having comparable results on image quality metrics.

Stain color variation in histological images, caused by a variety of factors, is a challenge not only for the visual diagnosis of pathologists but also for cell segmentation algorithms. To eliminate the color variation, many stain normalization approaches have been proposed. However, most were designed for hematoxylin and eosin staining images and performed poorly on immunohistochemical staining images. Current cell segmentation methods systematically apply stain normalization as a preprocessing step, but the impact brought by color variation has not been quantitatively investigated yet. In this paper, we produced five groups of NeuN staining images with different colors. We applied a deep learning image-recoloring method to perform color transfer between histological image groups. Finally, we altered the color of a segmentation set and quantified the impact of color variation on cell segmentation. The results demonstrated the necessity of color normalization prior to subsequent analysis.

Neural network related methods, due to their unprecedented success in image processing, have emerged as a new set of tools in CT reconstruction with the potential to change the field. However, the lack of high-quality training data and theoretical guarantees, together with increasingly complicated network structures, make its implementation impractical. In this paper, we present a new framework (RBP-DIP) based on Deep Image Prior (DIP) and a special residual back projection (RBP) connection to tackle these challenges. Comparing to other pre-trained neural network related algorithms, the proposed framework is closer to an iterative reconstruction (IR) algorithm as it requires no training data or training process. In that case, the proposed framework can be altered (e.g, different hyperparameters and constraints) on demand, adapting to different conditions (e.g, different imaged objects, imaging instruments, and noise levels) without retraining. Experiments show that the proposed framework has significant improvements over other state-of-the-art conventional methods, as well as pre-trained and untrained models with similar network structures, especially under sparse-view, limited-angle, and low-dose conditions.

Photoacoustic tomography (PAT) has the potential to recover morphological and functional tissue properties with high spatial resolution. However, previous attempts to solve the optical inverse problem with supervised machine learning were hampered by the absence of labeled reference data. While this bottleneck has been tackled by simulating training data, the domain gap between real and simulated images remains an unsolved challenge. We propose a novel approach to PAT image synthesis that involves subdividing the challenge of generating plausible simulations into two disjoint problems: (1) Probabilistic generation of realistic tissue morphology, and (2) pixel-wise assignment of corresponding optical and acoustic properties. The former is achieved with Generative Adversarial Networks (GANs) trained on semantically annotated medical imaging data. According to a validation study on a downstream task our approach yields more realistic synthetic images than the traditional model-based approach and could therefore become a fundamental step for deep learning-based quantitative PAT (qPAT).

Few-Shot Object Detection (FSOD) methods are mainly designed and evaluated on natural image datasets such as Pascal VOC and MS COCO. However, it is not clear whether the best methods for natural images are also the best for aerial images. Furthermore, direct comparison of performance between FSOD methods is difficult due to the wide variety of detection frameworks and training strategies. Therefore, we propose a benchmarking framework that provides a flexible environment to implement and compare attention-based FSOD methods. The proposed framework focuses on attention mechanisms and is divided into three modules: spatial alignment, global attention, and fusion layer. To remain competitive with existing methods, which often leverage complex training, we propose new augmentation techniques designed for object detection. Using this framework, several FSOD methods are reimplemented and compared. This comparison highlights two distinct performance regimes on aerial and natural images: FSOD performs worse on aerial images. Our experiments suggest that small objects, which are harder to detect in the few-shot setting, account for the poor performance. Finally, we develop a novel multiscale alignment method, Cross-Scales Query-Support Alignment (XQSA) for FSOD, to improve the detection of small objects. XQSA outperforms the state-of-the-art significantly on DOTA and DIOR.

The GAN-based infrared and visible image fusion methods have gained ever-increasing attention due to its effectiveness and superiority. However, the existing methods adopt the global pixel distribution of source images as the basis for discrimination, which fails to focus on the key modality information. Moreover, the dual-discriminator based methods suffer from the confrontation between the discriminators. To this end, we propose an attention-guided and wavelet-constrained GAN for infrared and visible image fusion (AWFGAN). In this method, two unique discrimination strategies are designed to improve the fusion performance. Specifically, we introduce the spatial attention modules (SAM) into the generator to obtain the spatial attention maps, and then the attention maps are utilized to force the discrimination of infrared images to focus on the target regions. In addition, we extend the discrimination range of visible information to the wavelet subspace, which can force the generator to restore the high-frequency details of visible images. Ablation experiments demonstrate the effectiveness of our method in eliminating the confrontation between discriminators. And the comparison experiments on public datasets demonstrate the effectiveness and superiority of the proposed method.

Despite unconditional feature inversion being the foundation of many image synthesis applications, training an inverter demands a high computational budget, large decoding capacity and imposing conditions such as autoregressive priors. To address these limitations, we propose the use of adversarially robust representations as a perceptual primitive for feature inversion. We train an adversarially robust encoder to extract disentangled and perceptually-aligned image representations, making them easily invertible. By training a simple generator with the mirror architecture of the encoder, we achieve superior reconstruction quality and generalization over standard models. Based on this, we propose an adversarially robust autoencoder and demonstrate its improved performance on style transfer, image denoising and anomaly detection tasks. Compared to recent ImageNet feature inversion methods, our model attains improved performance with significantly less complexity.

Imputation of missing images via source-to-target modality translation can improve diversity in medical imaging protocols. A pervasive approach for synthesizing target images involves one-shot mapping through generative adversarial networks (GAN). Yet, GAN models that implicitly characterize the image distribution can suffer from limited sample fidelity. Here, we propose a novel method based on adversarial diffusion modeling, SynDiff, for improved performance in medical image translation. To capture a direct correlate of the image distribution, SynDiff leverages a conditional diffusion process that progressively maps noise and source images onto the target image. For fast and accurate image sampling during inference, large diffusion steps are taken with adversarial projections in the reverse diffusion direction. To enable training on unpaired datasets, a cycle-consistent architecture is devised with coupled diffusive and non-diffusive modules that bilaterally translate between two modalities. Extensive assessments are reported on the utility of SynDiff against competing GAN and diffusion models in multi-contrast MRI and MRI-CT translation. Our demonstrations indicate that SynDiff offers quantitatively and qualitatively superior performance against competing baselines.

We present a new method to learn video representations from large-scale unlabeled video data. Ideally, this representation will be generic and transferable, directly usable for new tasks such as action recognition and zero or few-shot learning. We formulate unsupervised representation learning as a multi-modal, multi-task learning problem, where the representations are shared across different modalities via distillation. Further, we introduce the concept of loss function evolution by using an evolutionary search algorithm to automatically find optimal combination of loss functions capturing many (self-supervised) tasks and modalities. Thirdly, we propose an unsupervised representation evaluation metric using distribution matching to a large unlabeled dataset as a prior constraint, based on Zipf's law. This unsupervised constraint, which is not guided by any labeling, produces similar results to weakly-supervised, task-specific ones. The proposed unsupervised representation learning results in a single RGB network and outperforms previous methods. Notably, it is also more effective than several label-based methods (e.g., ImageNet), with the exception of large, fully labeled video datasets.

This work addresses a novel and challenging problem of estimating the full 3D hand shape and pose from a single RGB image. Most current methods in 3D hand analysis from monocular RGB images only focus on estimating the 3D locations of hand keypoints, which cannot fully express the 3D shape of hand. In contrast, we propose a Graph Convolutional Neural Network (Graph CNN) based method to reconstruct a full 3D mesh of hand surface that contains richer information of both 3D hand shape and pose. To train networks with full supervision, we create a large-scale synthetic dataset containing both ground truth 3D meshes and 3D poses. When fine-tuning the networks on real-world datasets without 3D ground truth, we propose a weakly-supervised approach by leveraging the depth map as a weak supervision in training. Through extensive evaluations on our proposed new datasets and two public datasets, we show that our proposed method can produce accurate and reasonable 3D hand mesh, and can achieve superior 3D hand pose estimation accuracy when compared with state-of-the-art methods.

北京阿比特科技有限公司