Robust Equivariant Imaging: a fully unsupervised framework for learning to image from noisy and partial measurements
● Paper abstract: Deep networks provide state-of-the-art performance in multiple imaging inverse problems, ranging from medical imaging to computational photography. However, most existing networks are trained with clean signals, which are often hard or impossible to obtain. Equivariant Imaging (EI) is a recent self-supervised learning framework that exploits the group invariance present in signal distributions to learn a reconstruction function from partial measurement data alone. While EI's results are impressive, its performance degrades with increasing noise. In this paper, we propose a Robust Equivariant Imaging (REI) framework that can learn to image from noisy partial measurements alone. The proposed method uses Stein's Unbiased Risk Estimator (SURE) to obtain a fully unsupervised training loss that is robust to noise. We show that REI brings considerable performance gains in both linear and nonlinear inverse problems, thereby paving the way for robust unsupervised imaging with deep networks.
● Paper link: //arxiv.org/abs/2111.12855
● Paper code:
● Author affiliation: University of Edinburgh
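The SURE term is what removes the need for clean training targets: for Gaussian noise with a known standard deviation it estimates the reconstruction error using only the noisy data. Below is a minimal, hypothetical PyTorch sketch of a Monte Carlo SURE loss for the Gaussian case; the function name `mc_sure_loss`, the single-probe divergence estimator, and the assumption that the network maps the noisy input directly to an estimate are illustrative choices, not the authors' released implementation.

```python
import torch

def mc_sure_loss(model, y, sigma, eps=1e-3):
    """Monte Carlo estimate of Stein's Unbiased Risk Estimator (SURE)
    for i.i.d. Gaussian noise with known standard deviation `sigma`.

    model: network mapping noisy input y -> estimate of the clean signal
    y:     noisy measurements, shape (B, ...)
    """
    b = torch.randn_like(y)                     # random Gaussian probe vector
    fy = model(y)
    # divergence estimate: (1/eps) * b^T (model(y + eps*b) - model(y))
    div = (b * (model(y + eps * b) - fy)).flatten(1).sum(1) / eps
    n = y.flatten(1).shape[1]                   # number of elements per sample
    data_fit = ((y - fy) ** 2).flatten(1).sum(1) / n
    # SURE ~ ||y - f(y)||^2 / n - sigma^2 + (2 sigma^2 / n) * div f(y)
    return (data_fit - sigma ** 2 + 2 * sigma ** 2 * div / n).mean()
```

Roughly speaking, a SURE estimate of this kind stands in for the noise-sensitive measurement-consistency term of the original EI framework and is trained jointly with the equivariance loss.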
Learning the Degradation Distribution for Blind Image Super-Resolution
Zhengxiong Luo, Yan Huang, Shang Li, Liang Wang, Tieniu Tan
Most current super-resolution methods train on synthetic paired high-resolution (HR) and low-resolution (LR) samples. To avoid the domain gap between synthetic and real data, most previous methods adopt learnable degradation models to adaptively generate the synthetic data. These degradation models are usually deterministic: a given HR image can only be used to synthesize a single LR sample. However, degradations in real scenes are typically stochastic, such as blur caused by camera shake and random noise, and a deterministic degradation model can hardly capture this randomness. To address this problem, this paper proposes a probabilistic degradation model. The model treats the degradation as a random variable and models its distribution by learning a mapping from a predefined random variable to the degradation. Compared with previous deterministic degradation models, our probabilistic model can simulate a more diverse range of degradations and thus generate richer HR-LR training pairs, helping to train more robust super-resolution models. Extensive experiments on different datasets show that our method helps super-resolution models achieve better results under complex degradations.
Architecture diagram of the blind super-resolution model based on the probabilistic degradation model
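The core mechanism, sampling a degradation rather than predicting a single deterministic one, can be sketched as follows. This is a hypothetical minimal example in PyTorch: the class name `ProbabilisticDegradation`, the MLP kernel generator, bicubic downsampling, and the uniformly sampled noise level are illustrative assumptions and not the paper's released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProbabilisticDegradation(nn.Module):
    """Sample a random blur kernel and noise level, then degrade an HR image."""

    def __init__(self, kernel_size=21, z_dim=16):
        super().__init__()
        self.kernel_size = kernel_size
        self.z_dim = z_dim
        # maps a latent code z to an (unnormalized) blur kernel
        self.kernel_net = nn.Sequential(
            nn.Linear(z_dim, 128), nn.ReLU(),
            nn.Linear(128, kernel_size * kernel_size),
        )

    def forward(self, hr, scale=4, max_sigma=0.1):
        b, c, h, w = hr.shape
        # 1) sample a blur kernel per image from a Gaussian latent code
        z = torch.randn(b, self.z_dim, device=hr.device)
        k = self.kernel_net(z).softmax(dim=-1)           # positive, sums to 1
        k = k.view(b, 1, self.kernel_size, self.kernel_size)
        # depthwise convolution so that each image/channel gets its own kernel
        x = hr.reshape(1, b * c, h, w)
        k = k.repeat_interleave(c, dim=0)                # (b*c, 1, ks, ks)
        x = F.conv2d(x, k, padding=self.kernel_size // 2, groups=b * c)
        x = x.reshape(b, c, h, w)
        # 2) downsample
        lr = F.interpolate(x, scale_factor=1.0 / scale, mode="bicubic",
                           align_corners=False)
        # 3) sample a noise level and add Gaussian noise
        sigma = torch.rand(b, 1, 1, 1, device=hr.device) * max_sigma
        return lr + sigma * torch.randn_like(lr)
```

Because the latent code `z` is resampled on every call, a single HR image yields many different LR counterparts, which is exactly the diversity that a deterministic degradation model cannot provide.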
Deep learning is usually described as an experiment-driven field under continuous criticism for lacking theoretical foundations. This problem has been partially addressed by a large volume of literature, which has so far not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized into six groups: (1) complexity- and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic differential equations and their dynamical systems for modelling stochastic gradient descent and its variants, which characterize the optimization and generalization of deep learning, partially inspired by Bayesian inference; (3) the geometrical structures of the loss landscape that drive the trajectories of the dynamical systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intensive concerns about ethics and security and their relationships with generalizability.
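For item (2), the usual starting point in this line of work is to view SGD with step size η as a discretization of a stochastic differential equation. A commonly used form (written here as a generic illustration, not a formula from any single surveyed paper) is:

```latex
% continuous-time approximation of SGD with step size \eta and
% gradient-noise covariance \Sigma(\theta)
\mathrm{d}\theta_t = -\nabla L(\theta_t)\,\mathrm{d}t
  + \sqrt{\eta\,\Sigma(\theta_t)}\,\mathrm{d}W_t
```

where W_t is a standard Wiener process; the noise term is what connects SGD's dynamics to Bayesian-style sampling arguments and to generalization analyses.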
The latest Self-Supervised Learning tutorial from UIUC.
Title: Online Deep Clustering for Unsupervised Representation Learning
Abstract:
Joint clustering and feature learning methods have shown remarkable performance in unsupervised representation learning. However, the training schedule that alternates between feature clustering and network-parameter updates leads to unstable learning of visual representations. To overcome this challenge, we propose Online Deep Clustering (ODC), which performs clustering and network updates simultaneously rather than alternately. The key insight is that the cluster centroids should evolve steadily so that the classifier can be updated stably. Specifically, we design and maintain two dynamic memory modules: a samples memory that stores sample labels and features, and a centroids memory for centroid evolution. We decompose the abrupt global clustering into steady memory updates and batch-wise label re-assignment, and integrate this process into the network update iterations. In this way, the labels and the network evolve hand in hand rather than alternately. Extensive experiments show that ODC stabilizes the training process and effectively boosts performance.
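The two memory modules and the batch-wise re-assignment are the heart of ODC. The NumPy sketch below is a toy illustration under stated assumptions (L2-normalized features, cosine similarity, a fixed momentum; the names `ODCMemory` and `update_batch` are invented for this example); the actual ODC implementation also handles details such as loss re-weighting and small-cluster processing that are omitted here.

```python
import numpy as np

class ODCMemory:
    """Toy sketch of ODC's two memories: a samples memory (features + labels)
    and a centroids memory, updated alongside the network."""

    def __init__(self, features, num_clusters, momentum=0.5, seed=0):
        self.rng = np.random.default_rng(seed)
        self.feats = features / np.linalg.norm(features, axis=1, keepdims=True)
        self.labels = self.rng.integers(num_clusters, size=len(features))
        self.momentum = momentum
        self.num_clusters = num_clusters
        self._update_centroids()

    def _update_centroids(self):
        cents = []
        for k in range(self.num_clusters):
            mask = self.labels == k
            if mask.any():
                cents.append(self.feats[mask].mean(0))
            else:  # re-seed an empty cluster with a random sample's feature
                cents.append(self.feats[self.rng.integers(len(self.feats))])
        self.centroids = np.stack(cents)
        self.centroids /= np.linalg.norm(self.centroids, axis=1, keepdims=True)

    def update_batch(self, idx, batch_feats):
        """Called after each network update on the mini-batch with indices `idx`."""
        batch_feats = batch_feats / np.linalg.norm(batch_feats, axis=1, keepdims=True)
        # 1) steady (momentum) update of the samples memory
        self.feats[idx] = (1 - self.momentum) * self.feats[idx] + self.momentum * batch_feats
        self.feats[idx] /= np.linalg.norm(self.feats[idx], axis=1, keepdims=True)
        # 2) batch-wise label re-assignment to the nearest centroid
        sims = self.feats[idx] @ self.centroids.T
        self.labels[idx] = sims.argmax(1)
```

The re-assigned labels serve as the classification targets for the next network update, and `_update_centroids()` is invoked every few iterations so that the centroids evolve steadily rather than being recomputed in one abrupt global step.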
Self-supervised VO methods have achieved great success in jointly estimating camera pose and depth from video. However, like most data-driven approaches, existing VO networks suffer a significant performance drop when facing scenes that differ from the training data, which makes them unsuitable for practical applications. In this paper, we propose an online meta-learning algorithm that enables a VO network to continuously adapt to new environments in a self-supervised manner. The method uses a convolutional long short-term memory (convLSTM) to aggregate rich spatio-temporal information from the past, so the network can memorize and learn from past experience for better estimation and fast adaptation to the current frame. To cope with changing environments when running VO in the open world, we further propose an online feature-alignment method that aligns feature distributions across different moments in time. As a result, our VO network can adapt seamlessly to different environments. Extensive experiments on unseen outdoor scenes, virtual-to-real-world, and outdoor-to-indoor settings show that our method consistently outperforms state-of-the-art self-supervised VO baselines.
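The online feature-alignment idea can be illustrated with a simple moment-matching loss: keep running first- and second-order statistics of the backbone features from past frames, and penalize the current frame's statistics for drifting away from them. The sketch below is an assumption-laden simplification (the class name, the channel-wise mean/variance statistics, and the momentum update are all illustrative); the paper's alignment mechanism may differ in detail.

```python
import torch

class OnlineFeatureAlign:
    """Minimal sketch: align the current frame's feature statistics with a
    running estimate accumulated from past frames."""

    def __init__(self, momentum=0.99):
        self.momentum = momentum
        self.mean = None
        self.var = None

    def loss(self, feats):
        # feats: (B, C, H, W) feature map from the VO backbone
        f = feats.permute(1, 0, 2, 3).flatten(1)       # (C, B*H*W)
        mu, var = f.mean(1), f.var(1)
        if self.mean is None:                          # first frame: just record
            self.mean, self.var = mu.detach(), var.detach()
            return torch.zeros((), device=feats.device)
        align = ((mu - self.mean) ** 2).sum() + ((var - self.var) ** 2).sum()
        # update the running statistics with the current (detached) moments
        self.mean = self.momentum * self.mean + (1 - self.momentum) * mu.detach()
        self.var = self.momentum * self.var + (1 - self.momentum) * var.detach()
        return align
```

During online adaptation, an alignment term of this kind would be added to the usual self-supervised photometric and depth-consistency losses.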
Title: Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation
Abstract: Image-level weakly supervised semantic segmentation is a challenging problem that has been studied intensively in recent years. Most advanced solutions exploit class activation maps (CAMs). However, CAMs can hardly serve as the object mask due to the gap between full and weak supervision. In this paper, we propose a Self-supervised Equivariant Attention Mechanism (SEAM) to discover additional supervision and narrow this gap. Our method builds on the observation that equivariance is an implicit constraint in fully supervised semantic segmentation, whose pixel-level labels undergo the same spatial transformation as the input images during data augmentation; this constraint, however, is lost for CAMs trained with image-level supervision. We therefore propose consistency regularization on CAMs predicted from differently transformed images, which provides self-supervision for network learning. In addition, we propose a pixel correlation module (PCM) that exploits contextual appearance information and refines the prediction of each pixel using its similar neighbors, further improving CAM consistency. Extensive experiments on the PASCAL VOC 2012 dataset show that our method outperforms state-of-the-art methods under the same level of supervision.
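The equivariant regularization can be stated compactly: for a spatial transform T (e.g., rescaling or flipping), the CAM predicted on the transformed image should match the transformed CAM of the original image. Below is a minimal sketch of that consistency term using a horizontal flip as T and an L1 penalty; both choices, and the function name, are illustrative rather than taken from the SEAM code, and the PCM refinement is omitted.

```python
import torch

def equivariant_consistency_loss(cam_fn, images):
    """cam_fn: network mapping images (B, 3, H, W) to CAMs (B, K, H, W).
    Enforces cam_fn(T(x)) ~ T(cam_fn(x)) for a horizontal flip T."""
    cams = cam_fn(images)
    flipped = torch.flip(images, dims=[3])   # T(x): horizontal flip
    cams_flipped = cam_fn(flipped)           # CAM of the transformed image
    target = torch.flip(cams, dims=[3])      # T(CAM(x))
    return (cams_flipped - target).abs().mean()
```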
Self-Supervised Learning is a new paradigm that sits between unsupervised and supervised learning and aims to reduce the challenging demand for large amounts of annotated data. It provides proxy supervision signals for feature learning by defining annotation-free pretext tasks. jason718 maintains a curated collection of recent papers on self-supervised learning, which is well worth a look!
Link: //github.com/jason718/awesome-self-supervised-learning
A curated list of awesome Self-Supervised Learning resources.
Self-Supervised Learning has become an exciting direction in the AI community.
Please help contribute to this list by contacting the maintainer or opening a pull request.
Markdown format:
- Paper Name.
  [[pdf]](link)
  [[code]](link)
  - Author 1, Author 2, and Author 3. *Conference Year*
FAIR Self-Supervision Benchmark: various benchmark (and legacy) tasks for evaluating the quality of visual representations learned by various self-supervision approaches.
Unsupervised Visual Representation Learning by Context Prediction.
Unsupervised Learning of Visual Representations using Videos.
Learning to See by Moving.
Learning image representations tied to ego-motion.
Joint Unsupervised Learning of Deep Representations and Image Clusters.
Unsupervised Deep Embedding for Clustering Analysis.
Slow and steady feature analysis: higher order temporal coherence in video.
Context Encoders: Feature Learning by Inpainting.
Colorful Image Colorization.
Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles.
Ambient Sound Provides Supervision for Visual Learning.
Learning Representations for Automatic Colorization.
Unsupervised Visual Representation Learning by Graph-based Consistent Constraints.
Adversarial Feature Learning.
Self-supervised learning of visual features through embedding images into text topic spaces.
Split-Brain Autoencoders: Unsupervised Learning by Cross-Channel Prediction.
Learning Features by Watching Objects Move.
Colorization as a Proxy Task for Visual Understanding.
DeepPermNet: Visual Permutation Learning.
Unsupervised Learning by Predicting Noise.
Multi-task Self-Supervised Visual Learning.
Representation Learning by Learning to Count.
Transitive Invariance for Self-supervised Visual Representation Learning.
Look, Listen and Learn.
Unsupervised Representation Learning by Sorting Sequences.
Unsupervised Feature Learning via Non-Parametric Instance Discrimination.
Learning Image Representations by Completing Damaged Jigsaw Puzzles.
Unsupervised Representation Learning by Predicting Image Rotations.
Learning Latent Representations in Neural Networks for Clustering through Pseudo Supervision and Graph-based Activity Regularization.
Improvements to context based self-supervised learning.
Self-Supervised Feature Learning by Learning to Spot Artifacts.
Boosting Self-Supervised Learning via Knowledge Transfer.
Cross-domain Self-supervised Multi-task Feature Learning Using Synthetic Imagery.
ShapeCodes: Self-Supervised Feature Learning by Lifting Views to Viewgrids.
Deep Clustering for Unsupervised Learning of Visual Features
Cross Pixel Optical-Flow Similarity for Self-Supervised Learning.
Representation Learning with Contrastive Predictive Coding.
Self-Supervised Learning via Conditional Motion Propagation.
Self-Supervised Representation Learning by Rotation Feature Decoupling.
Revisiting Self-Supervised Visual Representation Learning.
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transformations rather than Data.
Unsupervised Deep Learning by Neighbourhood Discovery.
Contrastive Multiview Coding.
Large Scale Adversarial Representation Learning.
Learning Representations by Maximizing Mutual Information Across Views.
Selfie: Self-supervised Pretraining for Image Embedding.
Data-Efficient Image Recognition with Contrastive Predictive Coding
Using Self-Supervised Learning Can Improve Model Robustness and Uncertainty
Boosting Few-Shot Visual Learning with Self-Supervision
Self-Supervised Generalisation with Meta Auxiliary Learning
Wasserstein Dependency Measure for Representation Learning
Scaling and Benchmarking Self-Supervised Visual Representation Learning
A critical analysis of self-supervision, or what we can learn from a single image
On Mutual Information Maximization for Representation Learning
Understanding the Limitations of Variational Mutual Information Estimators
Automatic Shortcut Removal for Self-Supervised Representation Learning
Momentum Contrast for Unsupervised Visual Representation Learning
A Simple Framework for Contrastive Learning of Visual Representations
ClusterFit: Improving Generalization of Visual Representations
Self-Supervised Learning of Pretext-Invariant Representations
Unsupervised Learning of Video Representations using LSTMs.
Shuffle and Learn: Unsupervised Learning using Temporal Order Verification.
LSTM Self-Supervision for Detailed Behavior Analysis
Self-Supervised Video Representation Learning With Odd-One-Out Networks.
Unsupervised Learning of Long-Term Motion Dynamics for Videos.
Geometry Guided Convolutional Neural Networks for Self-Supervised Video Representation Learning.
Improving Spatiotemporal Self-Supervision by Deep Reinforcement Learning.
Self-supervised learning of a facial attribute embedding from video.
Self-Supervised Video Representation Learning with Space-Time Cubic Puzzles.
Self-Supervised Spatio-Temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics.
DynamoNet: Dynamic Action and Motion Network.
Learning Correspondence from the Cycle-consistency of Time.
Joint-task Self-supervised Learning for Temporal Correspondence.
Self-supervised Learning of Motion Capture.
Unsupervised Learning of Depth and Ego-Motion from Video.
Active Stereo Net: End-to-End Self-Supervised Learning for Active Stereo Systems.
Self-Supervised Relative Depth Learning for Urban Scene Understanding.
Geometry-Aware Learning of Maps for Camera Localization.
Self-supervised Learning of Geometrically Stable Features Through Probabilistic Introspection.
Self-Supervised Learning of 3D Human Pose Using Multi-View Geometry.
SelFlow: Self-Supervised Learning of Optical Flow.
Unsupervised Learning of Landmarks by Descriptor Vector Exchange.
Audio-Visual Scene Analysis with Self-Supervised Multisensory Features.
Objects that Sound.
Learning to Separate Object Sounds by Watching Unlabeled Video.
The Sound of Pixels.
Learnable PINs: Cross-Modal Embeddings for Person Identity.
Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization.
Self-Supervised Generation of Spatial Audio for 360° Video.
TriCycle: Audio Representation Learning from Sensor Network Data Using Self-Supervision
Self-taught Learning: Transfer Learning from Unlabeled Data.
Representation Learning: A Review and New Perspectives.
Curiosity-driven Exploration by Self-supervised Prediction.
Large-Scale Study of Curiosity-Driven Learning.
Playing hard exploration games by watching YouTube.
Unsupervised State Representation Learning in Atari.
Improving Robot Navigation Through Self-Supervised Online Learning
Reverse Optical Flow for Self-Supervised Adaptive Autonomous Robot Navigation
Online self-supervised learning for dynamic object segmentation
Self-Supervised Online Learning of Basic Object Push Affordances
Self-supervised learning of grasp dependent tool affordances on the iCub Humanoid robot
Persistent self-supervised learning principle: from stereo to monocular vision for obstacle avoidance
The Curious Robot: Learning Visual Representations via Physical Interactions.
Learning to Poke by Poking: Experiential Learning of Intuitive Physics.
Supersizing Self-supervision: Learning to Grasp from 50K Tries and 700 Robot Hours.
Supervision via Competition: Robot Adversaries for Learning Tasks.
Multi-view Self-supervised Deep Learning for 6D Pose Estimation in the Amazon Picking Challenge.
Combining Self-Supervised Learning and Imitation for Vision-Based Rope Manipulation.
Learning to Fly by Crashing
Self-supervised learning as an enabling technology for future space exploration robots: ISS experiments on monocular distance learning
Unsupervised Perceptual Rewards for Imitation Learning.
Self-Supervised Visual Planning with Temporal Skip Connections.
CASSL: Curriculum Accelerated Self-Supervised Learning.
Time-Contrastive Networks: Self-Supervised Learning from Video.
Self-Supervised Deep Reinforcement Learning with Generalized Computation Graphs for Robot Navigation.
Learning Actionable Representations from Visual Observations.
Learning Synergies between Pushing and Grasping with Self-supervised Deep Reinforcement Learning.
Visual Reinforcement Learning with Imagined Goals.
Grasp2Vec: Learning Object Representations from Self-Supervised Grasping.
Robustness via Retrying: Closed-Loop Robotic Manipulation with Self-Supervised Learning.
Learning Long-Range Perception Using Self-Supervision from Short-Range Sensors and Odometry.
Learning Latent Plans from Play.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.
Self-Supervised Dialogue Learning
Self-Supervised Learning for Contextualized Extractive Summarization
A Mutual Information Maximization Perspective of Language Representation Learning
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
Learning Robust and Multilingual Speech Representations
Unsupervised pretraining transfers well across languages
wav2vec: Unsupervised Pre-Training for Speech Recognition
vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations
Effectiveness of self-supervised pre-training for speech recognition
Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning
Self-Training for End-to-End Speech Recognition
Generative Pre-Training for Speech with Autoregressive Predictive Coding
Paper title: Deep Learning for Image Super-resolution: A Survey
Paper abstract: Image super-resolution (SR) is an important class of image processing techniques for increasing the resolution of images and videos in computer vision. In recent years, remarkable progress has been made in image super-resolution using deep learning techniques. In this survey, we aim to give a systematic overview of recent advances in deep-learning-based image super-resolution. In general, existing SR research can be roughly grouped into three major categories: supervised SR, unsupervised SR, and domain-specific SR. In addition, we discuss several other important issues, such as publicly available benchmark datasets and performance evaluation metrics. Finally, we conclude the survey by highlighting several future directions and open problems that the community should further address.