Performer · Extensibility · MoDELS · CASES · state-of-the-art ·

October 12, 2021

COMISR: Compression-Informed Video Super-Resolution

Yinxiao Li, Pengchong Jin, Feng Yang, Ce Liu, Ming-Hsuan Yang, Peyman Milanfar

from arxiv, 13 pages, 10 figures

Most video super-resolution methods focus on restoring high-resolution video frames from low-resolution videos without taking compression into account. However, most videos on the web or mobile devices are compressed, and the compression can be severe when the bandwidth is limited. In this paper, we propose a new compression-informed video super-resolution model to restore high-resolution content without introducing artifacts caused by compression. The proposed model consists of three modules for video super-resolution: bi-directional recurrent warping, detail-preserving flow estimation, and Laplacian enhancement. All three modules are used to deal with compression properties such as the location of the intra-frames in the input and smoothness in the output frames. For thorough performance evaluation, we conducted extensive experiments on standard datasets with a wide range of compression rates, covering many real video use cases. We showed that our method not only recovers high-resolution content on uncompressed frames from the widely-used benchmark datasets, but also achieves state-of-the-art performance in super-resolving compressed videos based on numerous quantitative metrics. We also evaluated the proposed method by simulating streaming from YouTube to demonstrate its effectiveness and robustness. The source code and trained models are available at //github.com/google-research/google-research/tree/master/comisr.
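As a rough illustration of what a Laplacian enhancement step can look like, the sketch below adds a high-frequency residual (the frame minus its Gaussian-blurred copy) back onto a frame. This is a generic unsharp-style formulation, not the authors' implementation: the function names, the `sigma` and `alpha` parameters, and the reflect padding are all assumptions made for the example.

```python
import numpy as np


def gaussian_kernel_1d(sigma: float, radius: int) -> np.ndarray:
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(frame: np.ndarray, sigma: float = 1.5) -> np.ndarray:
    """Blur a 2-D grayscale frame with a separable Gaussian filter."""
    radius = max(1, int(3 * sigma))
    kernel = gaussian_kernel_1d(sigma, radius)
    # Reflect padding keeps the output the same size as the input.
    padded = np.pad(frame.astype(np.float64), radius, mode="reflect")
    # Filter along rows, then along columns.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )


def laplacian_enhance(
    frame: np.ndarray, alpha: float = 1.0, sigma: float = 1.5
) -> np.ndarray:
    """Add the high-frequency residual back to the frame.

    alpha (hypothetical) controls how strongly detail is boosted.
    """
    frame = frame.astype(np.float64)
    residual = frame - gaussian_blur(frame, sigma)
    return frame + alpha * residual


if __name__ == "__main__":
    # A vertical step edge: the enhanced frame overshoots around the edge.
    step = np.zeros((16, 16))
    step[:, 8:] = 1.0
    enhanced = laplacian_enhance(step)
    print(enhanced.shape, float(enhanced.min()), float(enhanced.max()))
```

Flat regions pass through unchanged because the blur of a constant is the same constant, so only edges and textures are amplified.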

Related content

Performer

Dataset · Scaling · Performer · AIM · MoDELS ·

December 3, 2021

Action Units That Constitute Trainable Micro-expressions (and A Large-scale Synthetic Dataset)

Yuchi Liu, Zhongdao Wang, Tom Gedeon, Liang Zheng

Due to the expensive data collection process, micro-expression datasets are generally much smaller in scale than those in other computer vision fields, rendering large-scale training less stable and feasible. In this paper, we aim to develop a protocol to automatically synthesize micro-expression training data that 1) are on a large scale and 2) allow us to train recognition models with strong accuracy on real-world test sets. Specifically, we discover three types of Action Units (AUs) that can well constitute trainable micro-expressions. These AUs come from real-world micro-expressions, early frames of macro-expressions, and the relationship between AUs and expression labels defined by human knowledge. With these AUs, our protocol then employs large numbers of face images with various identities and an existing face generation method for micro-expression synthesis. Micro-expression recognition models are trained on the generated micro-expression datasets and evaluated on real-world test sets, where very competitive and stable performance is obtained. The experimental results not only validate the effectiveness of these AUs and our dataset synthesis protocol but also reveal some critical properties of micro-expressions: they generalize across faces, are close to early-stage macro-expressions, and can be manually defined.

INFORMS · TAP · state-of-the-art · SimPLe · SOTA ·

September 30, 2021

Deep Contextual Video Compression

Jiahao Li, Bin Li, Yan Lu

from arxiv, Accepted by NeurIPS 2021

Most of the existing neural video compression methods adopt the predictive coding framework, which first generates the predicted frame and then encodes its residue with the current frame. However, as for compression ratio, predictive coding is only a sub-optimal solution as it uses simple subtraction operation to remove the redundancy across frames. In this paper, we propose a deep contextual video compression framework to enable a paradigm shift from predictive coding to conditional coding. In particular, we try to answer the following questions: how to define, use, and learn condition under a deep video compression framework. To tap the potential of conditional coding, we propose using feature domain context as condition. This enables us to leverage the high dimension context to carry rich information to both the encoder and the decoder, which helps reconstruct the high-frequency contents for higher video quality. Our framework is also extensible, in which the condition can be flexibly designed. Experiments show that our method can significantly outperform the previous state-of-the-art (SOTA) deep video compression methods. When compared with x265 using veryslow preset, we can achieve 26.0% bitrate saving for 1080P standard test videos.
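The contrast between the two coding schemes can be written compactly. The notation below is an assumption chosen for illustration: $x_t$ is the current frame, $\tilde{x}_t$ a predicted frame, $\bar{x}_t$ a learned context, $\lfloor\cdot\rceil$ quantization, and $f_{\mathrm{enc}}$, $f_{\mathrm{dec}}$ the encoder and decoder.

```latex
% Predictive (residue) coding: subtract the prediction, code the residue,
% then add the prediction back at the decoder.
\hat{x}_t = f_{\mathrm{dec}}\big(\lfloor f_{\mathrm{enc}}(x_t - \tilde{x}_t) \rceil\big) + \tilde{x}_t

% Conditional coding: encoder and decoder both receive the context.
\hat{x}_t = f_{\mathrm{dec}}\big(\lfloor f_{\mathrm{enc}}(x_t \mid \bar{x}_t) \rceil \,\big|\, \bar{x}_t\big)

% Why subtraction is sub-optimal: conditioning never increases entropy.
H(x_t - \tilde{x}_t) \;\ge\; H(x_t - \tilde{x}_t \mid \tilde{x}_t) \;=\; H(x_t \mid \tilde{x}_t)
```

The last line is a standard information-theoretic inequality: coding the residue is one particular way of using the prediction, so it cannot need fewer bits than the best conditional code.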

Rescaling · SR · Extensibility · Learning · Loss ·

August 11, 2021

Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling

Jingyun Liang, Andreas Lugmayr, Kai Zhang, Martin Danelljan, Luc Van Gool, Radu Timofte

from arxiv, Accepted by ICCV 2021. Code: //github.com/JingyunLiang/HCFlow

Normalizing flows have recently demonstrated promising results for low-level vision tasks. For image super-resolution (SR), they learn to predict diverse photo-realistic high-resolution (HR) images from the low-resolution (LR) image rather than learning a deterministic mapping. For image rescaling, they achieve high accuracy by jointly modelling the downscaling and upscaling processes. While existing approaches employ specialized techniques for these two tasks, we set out to unify them in a single formulation. In this paper, we propose the hierarchical conditional flow (HCFlow) as a unified framework for image SR and image rescaling. More specifically, HCFlow learns a bijective mapping between HR and LR image pairs by modelling the distribution of the LR image and the remaining high-frequency component simultaneously. In particular, the high-frequency component is conditional on the LR image in a hierarchical manner. To further enhance the performance, other losses such as perceptual loss and GAN loss are combined with the commonly used negative log-likelihood loss in training. Extensive experiments on general image SR, face image SR and image rescaling have demonstrated that the proposed HCFlow achieves state-of-the-art performance in terms of both quantitative metrics and visual quality.
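The negative log-likelihood mentioned above follows from the change-of-variables formula for a bijective map. The symbols here are assumptions for illustration and are not taken from the paper: $x$ is the HR image, $y$ the LR image, $a$ the high-frequency component, and $f_\theta$ the flow with $f_\theta(x) = (y, a)$.

```latex
% Change of variables for a bijection f_\theta(x) = (y, a),
% with the joint density factorized as p(y, a) = p(y)\, p(a \mid y):
-\log p(x) \;=\; -\log p(y) \;-\; \log p(a \mid y)
            \;-\; \log \left| \det \frac{\partial f_\theta(x)}{\partial x} \right|
```

The factor $p(a \mid y)$ is where the conditioning of the high-frequency component on the LR image enters the objective.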

Video captioning (Video Caption) · GROUP · Decoding · Networking · contrastive ·

February 3, 2021

Semantic Grouping Network for Video Captioning

Hobin Ryu, Sunghun Kang, Haeyong Kang, Chang D. Yoo

from arxiv, AAAI 2021

This paper considers a video caption generating network referred to as Semantic Grouping Network (SGN) that attempts (1) to group video frames with discriminating word phrases of partially decoded caption and then (2) to decode those semantically aligned groups in predicting the next word. As consecutive frames are not likely to provide unique information, prior methods have focused on discarding or merging repetitive information based only on the input video. The SGN learns an algorithm to capture the most discriminating word phrases of the partially decoded caption and a mapping that associates each phrase to the relevant video frames - establishing this mapping allows semantically related frames to be clustered, which reduces redundancy. In contrast to the prior methods, the continuous feedback from decoded words enables the SGN to dynamically update the video representation that adapts to the partially decoded caption. Furthermore, a contrastive attention loss is proposed to facilitate accurate alignment between a word phrase and video frames without manual annotations. The SGN achieves state-of-the-art performances by outperforming runner-up methods by a margin of 2.1%p and 2.4%p in a CIDEr-D score on MSVD and MSR-VTT datasets, respectively. Extensive experiments demonstrate the effectiveness and interpretability of the SGN.

Deep learning · Taxonomy · INFORMS · Performer · state-of-the-art ·

December 20, 2020

Video Super Resolution Based on Deep Learning: A Comprehensive Survey

Hongying Liu, Zhubo Ruan, Peng Zhao, Chao Dong, Fanhua Shang, Yuanyuan Liu, Linlin Yang

In recent years, deep learning has made great progress in many fields such as image recognition, natural language processing, speech recognition and video super-resolution. In this survey, we comprehensively investigate 33 state-of-the-art video super-resolution (VSR) methods based on deep learning. It is well known that leveraging information within video frames is important for video super-resolution. Thus we propose a taxonomy and classify the methods into six sub-categories according to the ways of utilizing inter-frame information. Moreover, the architectures and implementation details of all the methods are depicted in detail. Finally, we summarize and compare the performance of the representative VSR methods on some benchmark datasets. We also discuss some challenges, which need to be further addressed by researchers in the VSR community. To the best of our knowledge, this work is the first systematic review on VSR tasks, and it is expected to make a contribution to the development of recent studies in this area and potentially deepen our understanding of the VSR techniques based on deep learning.

Learning · Networking · INFORMS · Performer · Neural Networks ·

February 27, 2020

Meta-Transfer Learning for Zero-Shot Super-Resolution

Jae Woong Soh, Sunwoo Cho, Nam Ik Cho

from arxiv, Will be presented in CVPR 2020

Convolutional neural networks (CNNs) have shown dramatic improvements in single image super-resolution (SISR) by using large-scale external samples. Despite their remarkable performance based on the external dataset, they cannot exploit internal information within a specific image. Another problem is that they are applicable only to the specific data condition under which they were supervised. For instance, the low-resolution (LR) image should be a "bicubic" downsampled noise-free image from a high-resolution (HR) one. To address both issues, zero-shot super-resolution (ZSSR) has been proposed for flexible internal learning. However, it requires thousands of gradient updates, i.e., long inference time. In this paper, we present Meta-Transfer Learning for Zero-Shot Super-Resolution (MZSR), which leverages ZSSR. Precisely, it is based on finding a generic initial parameter that is suitable for internal learning. Thus, we can exploit both external and internal information, where one single gradient update can yield quite considerable results. (See Figure 1). With our method, the network can quickly adapt to a given image condition. In this respect, our method can be applied to a large spectrum of image conditions within a fast adaptation process.
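The idea of adapting at test time with a single gradient update from a given initialization can be sketched on a toy problem. The example below is not MZSR itself: a one-parameter linear model stands in for the CNN, the `(inputs, targets)` pair stands in for the training pair built from the test image, and every name and value (`adapt_one_step`, `theta_init`, `lr`) is hypothetical.

```python
import numpy as np


def mse_loss(theta: np.ndarray, inputs: np.ndarray, targets: np.ndarray) -> float:
    """Mean squared error of the linear model inputs @ theta."""
    residual = inputs @ theta - targets
    return float(np.mean(residual ** 2))


def mse_grad(theta: np.ndarray, inputs: np.ndarray, targets: np.ndarray) -> np.ndarray:
    """Gradient of the mean squared error with respect to theta."""
    residual = inputs @ theta - targets
    return 2.0 * inputs.T @ residual / len(targets)


def adapt_one_step(
    theta_init: np.ndarray, inputs: np.ndarray, targets: np.ndarray, lr: float
) -> np.ndarray:
    """Take one gradient step from the initialization.

    In a meta-learning setting theta_init would be the learned generic
    initialization; here it is simply passed in.
    """
    return theta_init - lr * mse_grad(theta_init, inputs, targets)


if __name__ == "__main__":
    inputs = np.array([[1.0], [2.0], [3.0]])
    targets = np.array([2.0, 4.0, 6.0])
    theta_init = np.array([0.0])
    theta_adapted = adapt_one_step(theta_init, inputs, targets, lr=0.05)
    print(mse_loss(theta_init, inputs, targets))
    print(mse_loss(theta_adapted, inputs, targets))
```

How much a single step helps depends entirely on how good the initialization is, which is why the abstract stresses finding a generic initial parameter suited to internal learning.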

SRGAN · ESRGAN · Networking · Generative adversarial networks · Better ·

September 17, 2018

ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks

Xintao Wang, Ke Yu, Shixiang Wu, Jinjin Gu, Yihao Liu, Chao Dong, Chen Change Loy, Yu Qiao, Xiaoou Tang

from arxiv, To appear in ECCV 2018 workshop. Won Region 3 in the PIRM2018-SR Challenge. Code and models are at //github.com/xinntao/ESRGAN

The Super-Resolution Generative Adversarial Network (SRGAN) is a seminal work that is capable of generating realistic textures during single image super-resolution. However, the hallucinated details are often accompanied by unpleasant artifacts. To further enhance the visual quality, we thoroughly study three key components of SRGAN (network architecture, adversarial loss and perceptual loss) and improve each of them to derive an Enhanced SRGAN (ESRGAN). In particular, we introduce the Residual-in-Residual Dense Block (RRDB) without batch normalization as the basic network building unit. Moreover, we borrow the idea from relativistic GAN to let the discriminator predict relative realness instead of the absolute value. Finally, we improve the perceptual loss by using the features before activation, which could provide stronger supervision for brightness consistency and texture recovery. Benefiting from these improvements, the proposed ESRGAN achieves consistently better visual quality with more realistic and natural textures than SRGAN and won first place in the PIRM2018-SR Challenge. The code is available at //github.com/xinntao/ESRGAN .
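A discriminator that predicts relative rather than absolute realness can be illustrated with the relativistic average formulation, in which each raw discriminator score is compared against the mean score of the opposite class. The sketch below is a minimal NumPy version under that assumption; the function names and the `eps` stabilizer are hypothetical, and the released code at the repository above remains the reference.

```python
import numpy as np


def sigmoid(z: np.ndarray) -> np.ndarray:
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def relativistic_average_losses(
    c_real: np.ndarray, c_fake: np.ndarray, eps: float = 1e-12
) -> tuple[float, float]:
    """Return (discriminator_loss, generator_loss).

    c_real and c_fake are raw (pre-sigmoid) discriminator outputs for a
    batch of real and generated images.
    """
    # How much more realistic real samples look than the average fake,
    # and how much more realistic fakes look than the average real.
    d_real = sigmoid(c_real - c_fake.mean())
    d_fake = sigmoid(c_fake - c_real.mean())

    loss_d = -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))
    # The generator loss is symmetric: it uses both real and fake scores.
    loss_g = -np.mean(np.log(1.0 - d_real + eps)) - np.mean(np.log(d_fake + eps))
    return float(loss_d), float(loss_g)


if __name__ == "__main__":
    real_scores = np.array([2.0, 1.5, 3.0])
    fake_scores = np.array([-1.0, 0.5, -2.0])
    print(relativistic_average_losses(real_scores, fake_scores))
```

Because the generator loss includes the real-image term, the generator receives gradients from real data as well as generated data, unlike a standard GAN generator loss.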

state-of-the-art · Adversarial learning · Learning · Interpretability · Precision/accuracy ·

August 20, 2018

Video-to-Video Synthesis

Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Guilin Liu, Andrew Tao, Jan Kautz, Bryan Catanzaro

from arxiv, Code, models, and more results are available at //github.com/NVIDIA/vid2vid

We study the problem of video-to-video synthesis, whose goal is to learn a mapping function from an input source video (e.g., a sequence of semantic segmentation masks) to an output photorealistic video that precisely depicts the content of the source video. While its image counterpart, the image-to-image synthesis problem, is a popular topic, the video-to-video synthesis problem is less explored in the literature. Without understanding temporal dynamics, directly applying existing image synthesis approaches to an input video often results in temporally incoherent videos of low visual quality. In this paper, we propose a novel video-to-video synthesis approach under the generative adversarial learning framework. Through carefully-designed generator and discriminator architectures, coupled with a spatio-temporal adversarial objective, we achieve high-resolution, photorealistic, temporally coherent video results on a diverse set of input formats including segmentation masks, sketches, and poses. Experiments on multiple benchmarks show the advantage of our method compared to strong baselines. In particular, our model is capable of synthesizing 2K resolution videos of street scenes up to 30 seconds long, which significantly advances the state-of-the-art of video synthesis. Finally, we apply our approach to future video prediction, outperforming several state-of-the-art competing systems.

INFORMS · Learning · Re-ID · Extensibility · Performer ·

February 22, 2018

Video Person Re-identification by Temporal Residual Learning

Ju Dai, Pingping Zhang, Huchuan Lu, Hongyu Wang

from arxiv, Submitted to IEEE Transactions on Image Processing, including 5 figures and 4 tables. The first two authors contribute equally to this work

In this paper, we propose a novel feature learning framework for video person re-identification (re-ID). The proposed framework largely aims to exploit the adequate temporal information of video sequences and tackle the poor spatial alignment of moving pedestrians. More specifically, for exploiting the temporal information, we design a temporal residual learning (TRL) module to simultaneously extract the generic and specific features of consecutive frames. The TRL module is equipped with two bi-directional LSTMs (BiLSTMs), each responsible for describing a moving person in a different aspect, providing complementary information for better feature representations. To deal with the poor spatial alignment in video re-ID datasets, we propose a spatial-temporal transformer network (ST^2N) module. Transformation parameters in the ST^2N module are learned by leveraging the high-level semantic information of the current frame as well as the temporal context knowledge from other frames. The proposed ST^2N module with fewer learnable parameters allows effective person alignments under significant appearance changes. Extensive experimental results on the large-scale MARS, PRID2011, ILIDS-VID and SDU-VID datasets demonstrate that the proposed method achieves consistently superior performance and outperforms most of the very recent state-of-the-art methods.

Reducible · Networking · Inference · GAN · MoDELS ·

January 29, 2018

tempoGAN: A Temporally Coherent, Volumetric GAN for Super-resolution Fluid Flow

You Xie, Erik Franz, Mengyu Chu, Nils Thuerey

from arxiv, submitted to SIGGRAPH 2018

We propose a temporally coherent generative model addressing the super-resolution problem for fluid flows. Our work represents a first approach to synthesize four-dimensional physics fields with neural networks. Based on a conditional generative adversarial network that is designed for the inference of three-dimensional volumetric data, our model generates consistent and detailed results by using a novel temporal discriminator, in addition to the commonly used spatial one. Our experiments show that the generator is able to infer more realistic high-resolution details by using additional physical quantities, such as low-resolution velocities or vorticities. Besides improvements in the training process and in the generated outputs, these inputs offer means for artistic control as well. We additionally employ a physics-aware data augmentation step, which is crucial to avoid overfitting and to reduce memory requirements. In this way, our network learns to generate advected quantities with highly detailed, realistic, and temporally coherent features. Our method works instantaneously, using only a single time-step of low-resolution fluid data. We demonstrate the abilities of our method using a variety of complex inputs and applications in two and three dimensions.
