Title: MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION
Abstract: Despite the growing interest in unsupervised learning, extracting meaningful knowledge from unlabeled audio remains an open challenge. To take a step in this direction, we recently proposed a problem-agnostic speech encoder (PASE), which combines a convolutional encoder with multiple neural networks, called workers, tasked with solving self-supervised problems that require no manually annotated ground truth. PASE was shown to capture relevant speech information, including speaker voice-print and phonemes. This paper proposes PASE+, an improved version of PASE for robust speech recognition in noisy and reverberant environments. To this end, we employ an online speech distortion module that contaminates the input signal with a variety of random disturbances. We then propose a revised encoder that better learns short- and long-term speech dynamics through an effective combination of recurrent and convolutional networks. Finally, we refine the workers used for self-supervision to encourage better cooperation.
Results on TIMIT, DIRHA, and CHiME-5 show that PASE+ significantly outperforms the previous version of PASE as well as common acoustic features. Interestingly, PASE+ learns transferable features that remain suitable under highly mismatched acoustic conditions.
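For intuition, here is a minimal sketch of what such an online distortion module could look like, assuming additive noise and reverberation are among the random disturbances (the abstract names noisy and reverberant conditions). The SNR range, impulse-response shape, and all names are illustrative, not PASE+'s actual implementation.

```python
import torch
import torch.nn.functional as F

def distort(wave, noise_bank, sample_rate=16000):
    """Minimal sketch of an online distortion module: randomly contaminate a
    mono waveform `wave` of shape (1, T) with additive noise and/or synthetic
    reverberation. `noise_bank` is an (N, T_noise) tensor of noise clips with
    T_noise >= T (assumed). PASE+'s actual disturbance set may differ."""
    T = wave.size(1)
    if torch.rand(1) < 0.5:                        # additive noise at random SNR
        noise = noise_bank[torch.randint(len(noise_bank), (1,))][:, :T]
        snr_db = 10 * torch.rand(1)                # assumed 0-10 dB range
        gain = wave.norm() / (noise.norm() * 10 ** (snr_db / 20) + 1e-8)
        wave = wave + gain * noise
    if torch.rand(1) < 0.5:                        # reverberation via a random
        ir = torch.randn(1, 1, sample_rate // 10)  # exponentially decaying IR
        ir *= torch.exp(-torch.linspace(0, 8, ir.size(-1)))
        wave = F.conv1d(wave.unsqueeze(0), ir, padding=ir.size(-1) // 2)[0][:, :T]
    return wave
```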
Self-supervised VO methods have achieved great success at jointly estimating camera pose and depth from videos. However, like most data-driven methods, existing VO networks suffer a significant performance drop when facing scenes that differ from the training data, making them unsuitable for practical applications. In this paper, we propose an online meta-learning algorithm that enables a VO network to continuously adapt to new environments in a self-supervised manner. The approach uses a convolutional long short-term memory (convLSTM) network to aggregate rich spatio-temporal information from the past, so that the network can memorize and learn from past experience in order to better estimate the current frame and adapt to it quickly. To cope with changing environments when running VO in the open world, we further propose an online feature alignment method that aligns feature distributions across different moments in time. Our VO network can thus adapt seamlessly to different environments. Extensive experiments on unseen outdoor scenes, virtual-to-real-world transfer, and outdoor-to-indoor transfer demonstrate that our method consistently outperforms state-of-the-art self-supervised VO baselines.
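The online feature alignment step can be sketched as keeping running statistics of past features and penalizing the current frame's drift away from them; the paper's exact alignment criterion may differ, and all names here are illustrative.

```python
import torch

class OnlineFeatureAlignment:
    """Minimal sketch of aligning feature distributions over time: keep running
    statistics of past features and penalize the drift of the current batch's
    statistics away from them. The paper's actual criterion may differ."""
    def __init__(self, dim, momentum=0.99):
        self.mean = torch.zeros(dim)
        self.var = torch.ones(dim)
        self.momentum = momentum

    def loss(self, feats):                        # feats: (batch, dim)
        cur_mean, cur_var = feats.mean(0), feats.var(0)
        align = ((cur_mean - self.mean) ** 2 + (cur_var - self.var) ** 2).sum()
        with torch.no_grad():                     # update the running statistics
            self.mean = self.momentum * self.mean + (1 - self.momentum) * cur_mean
            self.var = self.momentum * self.var + (1 - self.momentum) * cur_var
        return align
```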
Title: Learning Problem-agnostic Speech Representations from Multiple Self-supervised Tasks
Abstract: Learning representations without supervision remains an open problem in machine learning, and it is particularly challenging for speech signals, which are often characterized by long sequences and a complex hierarchical structure. Some recent work, however, has shown that useful speech representations can be obtained with a self-supervised encoder-discriminator approach. This paper proposes an improved self-supervised method in which a single neural encoder is followed by multiple workers that jointly solve different self-supervised tasks. The consensus required across the different tasks naturally imposes meaningful constraints on the encoder, helping it discover general representations and minimizing the risk of learning superficial ones. Experiments show that the proposed approach can learn transferable, robust, and problem-agnostic features that carry relevant information from the speech signal, such as speaker identity, phonemes, and even higher-level features such as emotional cues. In addition, a number of design choices make the encoder easily exportable, facilitating its direct use or adaptation to different problems.
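To make the encoder/worker setup concrete, here is a minimal sketch of one shared encoder followed by several workers whose self-supervised losses are summed; the layer sizes and the two worker targets are illustrative stand-ins, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

# One shared encoder; each worker is a small head solving its own
# self-supervised regression task on the encoded frames.
encoder = nn.Sequential(
    nn.Conv1d(1, 64, kernel_size=10, stride=5), nn.ReLU(),
    nn.Conv1d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
)
workers = nn.ModuleDict({
    "waveform": nn.Conv1d(64, 5, 1),   # e.g. regress 5 waveform samples per frame (assumed)
    "log_power": nn.Conv1d(64, 1, 1),  # e.g. regress per-frame log energy (assumed)
})

def total_loss(wave, targets):
    """wave: (batch, 1, T); targets: dict of per-task tensors whose shapes
    match each worker's output. The joint loss is simply the sum."""
    z = encoder(wave)
    return sum(nn.functional.l1_loss(head(z), targets[name])
               for name, head in workers.items())
```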
Self-supervised learning is a new paradigm that sits between unsupervised and supervised learning, aiming to reduce the demanding need for large amounts of annotated data. It provides proxy supervision signals for feature learning by defining annotation-free pretext tasks. jason718 maintains a curated collection of the latest papers on self-supervised learning that is well worth a look! (A minimal sketch of one classic pretext task follows the link below.)
Address: //github.com/jason718/awesome-self-supervised-learning
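As promised, here is a minimal sketch of one classic pretext task, rotation prediction (from "Unsupervised Representation Learning by Predicting Image Rotations", listed below): the rotation applied to an unlabeled image serves as a free label, giving exactly the kind of proxy supervision signal described above. The backbone here is a placeholder.

```python
import torch
import torch.nn as nn

backbone = nn.Sequential(                        # placeholder image encoder
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
rotation_head = nn.Linear(64, 4)                 # classify one of 4 rotations

def rotation_batch(images):
    """Rotate each image by 0/90/180/270 degrees; the rotation id is the label."""
    labels = torch.randint(0, 4, (images.size(0),))
    rotated = torch.stack([torch.rot90(img, k=int(k), dims=(1, 2))
                           for img, k in zip(images, labels)])
    return rotated, labels

images = torch.randn(8, 3, 32, 32)               # stand-in for unlabeled images
x, y = rotation_batch(images)
loss = nn.functional.cross_entropy(rotation_head(backbone(x)), y)
loss.backward()                                  # proxy signal trains the encoder
```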
A curated list of awesome Self-Supervised Learning resources.
Self-Supervised Learning has become an exciting direction in the AI community.
Please help contribute to this list by submitting a pull request using the Markdown format below.
Markdown format:
- Paper Name.
[[pdf]](link)
[[code]](link)
- Author 1, Author 2, and Author 3. *Conference Year*
FAIR Self-Supervision Benchmark: various benchmark (and legacy) tasks for evaluating the quality of visual representations learned by various self-supervision approaches.
Unsupervised Visual Representation Learning by Context Prediction.
Unsupervised Learning of Visual Representations using Videos.
Learning to See by Moving.
Learning image representations tied to ego-motion.
Joint Unsupervised Learning of Deep Representations and Image Clusters.
Unsupervised Deep Embedding for Clustering Analysis.
Slow and steady feature analysis: higher order temporal coherence in video.
Context Encoders: Feature Learning by Inpainting.
Colorful Image Colorization.
Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles.
Ambient Sound Provides Supervision for Visual Learning.
Learning Representations for Automatic Colorization.
Unsupervised Visual Representation Learning by Graph-based Consistent Constraints.
Adversarial Feature Learning.
Self-supervised learning of visual features through embedding images into text topic spaces.
Split-Brain Autoencoders: Unsupervised Learning by Cross-Channel Prediction.
Learning Features by Watching Objects Move.
Colorization as a Proxy Task for Visual Understanding.
DeepPermNet: Visual Permutation Learning.
Unsupervised Learning by Predicting Noise.
Multi-task Self-Supervised Visual Learning.
Representation Learning by Learning to Count.
Transitive Invariance for Self-supervised Visual Representation Learning.
Look, Listen and Learn.
Unsupervised Representation Learning by Sorting Sequences.
Unsupervised Feature Learning via Non-Parametric Instance Discrimination.
Learning Image Representations by Completing Damaged Jigsaw Puzzles.
Unsupervised Representation Learning by Predicting Image Rotations.
Learning Latent Representations in Neural Networks for Clustering through Pseudo Supervision and Graph-based Activity Regularization.
Improvements to context based self-supervised learning.
Self-Supervised Feature Learning by Learning to Spot Artifacts.
Boosting Self-Supervised Learning via Knowledge Transfer.
Cross-domain Self-supervised Multi-task Feature Learning Using Synthetic Imagery.
ShapeCodes: Self-Supervised Feature Learning by Lifting Views to Viewgrids.
Deep Clustering for Unsupervised Learning of Visual Features
Cross Pixel Optical-Flow Similarity for Self-Supervised Learning.
Representation Learning with Contrastive Predictive Coding.
Self-Supervised Learning via Conditional Motion Propagation.
Self-Supervised Representation Learning by Rotation Feature Decoupling.
Revisiting Self-Supervised Visual Representation Learning.
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transformations rather than Data.
Unsupervised Deep Learning by Neighbourhood Discovery.
Contrastive Multiview Coding.
Large Scale Adversarial Representation Learning.
Learning Representations by Maximizing Mutual Information Across Views.
Selfie: Self-supervised Pretraining for Image Embedding.
Data-Efficient Image Recognition with Contrastive Predictive Coding
Using Self-Supervised Learning Can Improve Model Robustness and Uncertainty
Boosting Few-Shot Visual Learning with Self-Supervision
Self-Supervised Generalisation with Meta Auxiliary Learning
Wasserstein Dependency Measure for Representation Learning
Scaling and Benchmarking Self-Supervised Visual Representation Learning
A critical analysis of self-supervision, or what we can learn from a single image
On Mutual Information Maximization for Representation Learning
Understanding the Limitations of Variational Mutual Information Estimators
Automatic Shortcut Removal for Self-Supervised Representation Learning
Momentum Contrast for Unsupervised Visual Representation Learning
A Simple Framework for Contrastive Learning of Visual Representations
ClusterFit: Improving Generalization of Visual Representations
Self-Supervised Learning of Pretext-Invariant Representations
Unsupervised Learning of Video Representations using LSTMs.
Shuffle and Learn: Unsupervised Learning using Temporal Order Verification.
LSTM Self-Supervision for Detailed Behavior Analysis
Self-Supervised Video Representation Learning With Odd-One-Out Networks.
Unsupervised Learning of Long-Term Motion Dynamics for Videos.
Geometry Guided Convolutional Neural Networks for Self-Supervised Video Representation Learning.
Improving Spatiotemporal Self-Supervision by Deep Reinforcement Learning.
Self-supervised learning of a facial attribute embedding from video.
Self-Supervised Video Representation Learning with Space-Time Cubic Puzzles.
Self-Supervised Spatio-Temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics.
DynamoNet: Dynamic Action and Motion Network.
Learning Correspondence from the Cycle-consistency of Time.
Joint-task Self-supervised Learning for Temporal Correspondence.
Self-supervised Learning of Motion Capture.
Unsupervised Learning of Depth and Ego-Motion from Video.
Active Stereo Net: End-to-End Self-Supervised Learning for Active Stereo Systems.
Self-Supervised Relative Depth Learning for Urban Scene Understanding.
Geometry-Aware Learning of Maps for Camera Localization.
Self-supervised Learning of Geometrically Stable Features Through Probabilistic Introspection.
Self-Supervised Learning of 3D Human Pose Using Multi-View Geometry.
SelFlow: Self-Supervised Learning of Optical Flow.
Unsupervised Learning of Landmarks by Descriptor Vector Exchange.
Audio-Visual Scene Analysis with Self-Supervised Multisensory Features.
Objects that Sound.
Learning to Separate Object Sounds by Watching Unlabeled Video.
The Sound of Pixels.
Learnable PINs: Cross-Modal Embeddings for Person Identity.
Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization.
Self-Supervised Generation of Spatial Audio for 360° Video.
TriCycle: Audio Representation Learning from Sensor Network Data Using Self-Supervision
Self-taught Learning: Transfer Learning from Unlabeled Data.
Representation Learning: A Review and New Perspectives.
Curiosity-driven Exploration by Self-supervised Prediction.
Large-Scale Study of Curiosity-Driven Learning.
Playing hard exploration games by watching YouTube.
Unsupervised State Representation Learning in Atari.
Improving Robot Navigation Through Self-Supervised Online Learning
Reverse Optical Flow for Self-Supervised Adaptive Autonomous Robot Navigation
Online self-supervised learning for dynamic object segmentation
Self-Supervised Online Learning of Basic Object Push Affordances
Self-supervised learning of grasp dependent tool affordances on the iCub Humanoid robot
Persistent self-supervised learning principle: from stereo to monocular vision for obstacle avoidance
The Curious Robot: Learning Visual Representations via Physical Interactions.
Learning to Poke by Poking: Experiential Learning of Intuitive Physics.
Supersizing Self-supervision: Learning to Grasp from 50K Tries and 700 Robot Hours.
Supervision via Competition: Robot Adversaries for Learning Tasks.
Multi-view Self-supervised Deep Learning for 6D Pose Estimation in the Amazon Picking Challenge.
Combining Self-Supervised Learning and Imitation for Vision-Based Rope Manipulation.
Learning to Fly by Crashing
Self-supervised learning as an enabling technology for future space exploration robots: ISS experiments on monocular distance learning
Unsupervised Perceptual Rewards for Imitation Learning.
Self-Supervised Visual Planning with Temporal Skip Connections.
CASSL: Curriculum Accelerated Self-Supervised Learning.
Time-Contrastive Networks: Self-Supervised Learning from Video.
Self-Supervised Deep Reinforcement Learning with Generalized Computation Graphs for Robot Navigation.
Learning Actionable Representations from Visual Observations.
Learning Synergies between Pushing and Grasping with Self-supervised Deep Reinforcement Learning.
Visual Reinforcement Learning with Imagined Goals.
Grasp2Vec: Learning Object Representations from Self-Supervised Grasping.
Robustness via Retrying: Closed-Loop Robotic Manipulation with Self-Supervised Learning.
Learning Long-Range Perception Using Self-Supervision from Short-Range Sensors and Odometry.
Learning Latent Plans from Play.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.
Self-Supervised Dialogue Learning
Self-Supervised Learning for Contextualized Extractive Summarization
A Mutual Information Maximization Perspective of Language Representation Learning
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
Learning Robust and Multilingual Speech Representations
Unsupervised pretraining transfers well across languages
wav2vec: Unsupervised Pre-Training for Speech Recognition
vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations
Effectiveness of self-supervised pre-training for speech recognition
Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning
Self-Training for End-to-End Speech Recognition
Generative Pre-Training for Speech with Autoregressive Predictive Coding
Title: Self-supervised Visual Feature Learning with Deep Neural Networks: A Survey
Abstract: Large-scale labeled data are usually required to train deep neural networks that achieve good visual feature learning performance from images or videos in computer vision applications. To avoid the heavy cost of collecting and annotating large-scale datasets, self-supervised learning methods, a subset of unsupervised learning methods, have been proposed to learn general image and video features from large-scale unlabeled data without any human-annotated labels. This paper provides an extensive review of deep-learning-based self-supervised general visual feature learning methods. First, the motivation, general pipeline, and terminology of this field are described. The common deep neural network architectures used for self-supervised learning are then summarized. Next, the schema and evaluation metrics of self-supervised learning methods are reviewed, followed by the commonly used image and video datasets and the existing self-supervised visual feature learning methods. Quantitative performance comparisons of the reviewed methods on benchmark datasets are then summarized and discussed for both image and video feature learning. Finally, the paper concludes with a set of promising future directions for self-supervised visual feature learning.
Title: Self-supervised learning for audio-visual speaker diarization
Abstract: Speaker diarization is a technique for finding the speech segments of a specific speaker, and it is widely used in human-centered applications such as video conferencing and human-computer interaction systems. In this paper, we propose a self-supervised audio-video synchronization learning method that addresses the speaker diarization problem without massive labeling effort. We improve on previous approaches by introducing two new loss functions: a dynamic triplet loss and a multinomial loss. We evaluate the method on a real-world human-computer interaction system, and the results show that our best model yields a remarkable gain of +8% in F1 score and also reduces the diarization error rate. Finally, we introduce a new large-scale audio-video corpus to fill the gap in Chinese audio-video datasets.
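The paper's dynamic triplet loss is not spelled out in this digest; below is a minimal sketch of an ordinary triplet loss for audio-visual synchronization, with in-sync audio as the positive and time-shifted audio as the negative. How the "dynamic" variant adapts the choice of negatives or margins is not reproduced here.

```python
import torch
import torch.nn.functional as F

def av_triplet_loss(audio_emb, video_emb, margin=0.2):
    """audio_emb, video_emb: (batch, dim) embeddings of aligned audio-video
    pairs. Each video embedding's positive is its in-sync audio embedding;
    a batch-rolled (out-of-sync) audio embedding serves as the negative."""
    positive = audio_emb
    negative = audio_emb.roll(shifts=1, dims=0)  # misaligned audio as negatives
    d_pos = F.pairwise_distance(video_emb, positive)
    d_neg = F.pairwise_distance(video_emb, negative)
    return F.relu(d_pos - d_neg + margin).mean()
```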
Title: Adversarial-Learned Loss for Domain Adaptation
Abstract: In recent years, remarkable progress has been made in learning transferable representations across domains. Previous work on domain adaptation is mainly based on two techniques: domain-adversarial learning and self-training. However, domain-adversarial learning only aligns the feature distributions between domains, without considering whether the target features are discriminative. Self-training, on the other hand, exploits the model's predictions to enhance the discrimination of target features, but it cannot explicitly align the domain distributions. To combine the strengths of both methods, we propose a novel Adversarial-Learned Loss for Domain Adaptation (ALDA), starting from an analysis of pseudo-labeling, a typical self-training method. There is, however, a gap between the pseudo-labels and the ground truth, which can lead to incorrect training. We therefore introduce a confusion matrix, learned adversarially in ALDA, to reduce this gap and align the feature distributions. Finally, a new loss function is automatically constructed from the learned confusion matrix and serves as the loss for the unlabeled target samples. Our ALDA outperforms state-of-the-art approaches on four standard domain adaptation datasets.
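To illustrate the idea of a loss constructed from a learned confusion matrix, here is a minimal sketch in which noisy pseudo-labels are corrected by a row-stochastic confusion matrix before a soft cross-entropy is applied. The adversarial training of the matrix itself, which is the core of ALDA, is omitted, and all names are illustrative.

```python
import torch
import torch.nn.functional as F

def corrected_target_loss(logits, confusion, num_classes):
    """Sketch of a target-sample loss built from a learned confusion matrix.
    logits: (batch, C) classifier outputs on unlabeled target samples;
    confusion: (C, C) row-stochastic matrix (assumed already learned).
    The pseudo-label one-hot is multiplied by the confusion matrix to
    correct likely label noise, then used as a soft cross-entropy target."""
    pseudo = F.one_hot(logits.argmax(1), num_classes).float()  # noisy pseudo-labels
    corrected = pseudo @ confusion                   # corrected label distribution
    log_prob = F.log_softmax(logits, dim=1)
    return -(corrected * log_prob).sum(1).mean()     # soft cross-entropy
```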
Author bio: Haifeng Liu, Ph.D., is an associate professor in the College of Computer Science at Zhejiang University. Homepage: //person.zju.edu.cn/en/hfliu
Title: Unsupervised pre-training for sequence to sequence speech recognition
Abstract: This paper proposes a novel approach to pre-training an encoder-decoder sequence-to-sequence (seq2seq) model. Our pre-training method consists of two stages: acoustic pre-training and linguistic pre-training. In the acoustic pre-training stage, we use a large amount of speech to pre-train the encoder by predicting masked speech feature chunks from their context. In the linguistic pre-training stage, we generate synthesized speech from a large amount of text using a single-speaker text-to-speech (TTS) system and pre-train the decoder on the synthesized paired data. This two-stage pre-training approach integrates rich acoustic and linguistic knowledge into the seq2seq model, which benefits the downstream automatic speech recognition (ASR) task. The unsupervised pre-training is done on AISHELL-2, and we apply the pre-trained model to AISHELL-1 and HKUST with various ratios of paired data. Our relative error rate reductions range from 38.24% to 7.88% on AISHELL-1 and from 12.00% to 1.20% on HKUST. We also apply the pre-trained model to a cross-lingual case with the CALLHOME dataset; for all six languages in CALLHOME, our pre-training method makes the model consistently outperform the baseline.
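The acoustic pre-training stage lends itself to a short sketch: mask random blocks of the input feature sequence and train the encoder to reconstruct the masked frames from their context. The block lengths, counts, and the L1 criterion below are assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn as nn

def masked_feature_loss(encoder, feats, mask_len=8, n_masks=2):
    """Sketch of masked-chunk acoustic pre-training. feats: (batch, time, dim)
    speech features with time > mask_len; encoder must map (batch, time, dim)
    to (batch, time, dim). Random blocks are zeroed out and the encoder is
    trained to predict the original frames at the masked positions."""
    masked = feats.clone()
    mask = torch.zeros(feats.shape[:2], dtype=torch.bool)
    for _ in range(n_masks):
        start = torch.randint(0, feats.size(1) - mask_len, (1,)).item()
        masked[:, start:start + mask_len] = 0.0
        mask[:, start:start + mask_len] = True
    pred = encoder(masked)
    return nn.functional.l1_loss(pred[mask], feats[mask])
```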
Author: Bo Xu, professor, graduated from Zhejiang University in 1988 and is currently the director of the Institute of Automation, Chinese Academy of Sciences. His research areas include multilingual speech recognition and machine translation, intelligent processing of multimedia network content, and interactive immersive 3D internet.
Title: Adversarial Cross-Domain Action Recognition with Co-Attention
Abstract: Action recognition has been a widely studied topic, with a focus on supervised learning over a sufficient number of videos. However, the problem of cross-domain action recognition, where training and test videos are drawn from different underlying distributions, remains largely under-explored. Previous methods directly apply techniques for cross-domain image recognition and easily suffer from a severe temporal misalignment problem. We propose a Temporal Co-attention Network (TCoN), which matches the distributions of temporally aligned action features between the source and target domains using a novel cross-domain co-attention mechanism. Experimental results on three cross-domain action recognition datasets show that TCoN significantly improves over both previous single-domain and cross-domain methods in the cross-domain setting.
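As a rough illustration of co-attention over time across domains, here is a minimal sketch in which each target-video segment attends over source-video segments, so that temporally relevant source segments dominate the alignment. TCoN's actual mechanism, and the adversarial matching built on top of it, is more involved.

```python
import torch
import torch.nn.functional as F

def temporal_co_attention(src_feats, tgt_feats):
    """src_feats: (T_s, dim) per-segment features of a source video;
    tgt_feats: (T_t, dim) per-segment features of a target video.
    Returns (T_t, dim) source features re-weighted toward the segments
    most relevant to each target segment (scaled dot-product attention)."""
    scores = tgt_feats @ src_feats.t() / src_feats.size(1) ** 0.5  # (T_t, T_s)
    attn = F.softmax(scores, dim=1)
    return attn @ src_feats
```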
Author bios: Boxiao Pan is a master's student in the Stanford Vision and Learning Lab. He is fascinated by building intelligent systems that can interpret and understand human-centered behaviors, scenes, and events, especially from video input. //cs.stanford.edu/~bxpan/
Zhangjie Cao is a Ph.D. student in the Computer Science Department at Stanford University.
Abstract: Most existing approaches to disfluency detection rely heavily on human-annotated data, which is expensive to obtain in practice. To tackle the training-data bottleneck, we investigate methods for combining multiple self-supervised tasks, in which the supervised data can be collected without human labeling. First, we construct large-scale pseudo training data by randomly adding or deleting words from unlabeled news data, and we propose two self-supervised pre-training tasks: (i) a tagging task to detect the added noise words, and (ii) sentence classification to distinguish original sentences from grammatically incorrect ones. We then combine the two tasks to jointly train a single network. The pre-trained network is subsequently fine-tuned on human-annotated disfluency detection training data. Experimental results on the commonly used English Switchboard test set show that, using less than 1% (1,000 sentences) of the training data, our approach achieves performance competitive with previous systems trained on the full dataset. Trained on the full dataset, our method significantly outperforms previous methods, reducing the error rate by 21% on English Switchboard.
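The pseudo-data construction is concrete enough to sketch: randomly insert noise words (drawn here from the sentence itself, an assumption) and randomly delete words, with inserted words tagged 1 for the tagging task, original words tagged 0, and any edited sentence serving as a "grammatically incorrect" example for the classification task.

```python
import random

def make_pseudo_example(words, p_add=0.1, p_del=0.1):
    """Sketch of pseudo training data for disfluency detection: insert noise
    words (sampled from the sentence itself here, an assumption) and delete
    words at random. Inserted words get tag 1 ("added noise"), kept original
    words get tag 0, so the tagging task needs no human labels; deletions
    yield grammatically broken sentences for the classification task."""
    out, tags = [], []
    for w in words:
        if random.random() < p_add:          # insert a random noise word
            out.append(random.choice(words))
            tags.append(1)
        if random.random() >= p_del:         # keep the original word (or drop it)
            out.append(w)
            tags.append(0)
    return out, tags

sentence = "the market rallied after the announcement".split()
noisy_words, noise_tags = make_pseudo_example(sentence)
```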
Paper title: A Divergence Minimization Perspective on Imitation Learning Methods
Paper abstract: In many settings, it is desirable to learn decision-making and control policies by learning from, or bootstrapping off, expert demonstrations. The most common approaches under this imitation learning (IL) framework are behavioral cloning (BC) and inverse reinforcement learning (IRL). Recent IRL methods have demonstrated the ability to learn effective policies from a very limited set of demonstrations, a setting in which BC methods often fail. Unfortunately, directly comparing these methods does not provide adequate intuition for understanding this difference in performance, owing to the many factors of variation involved. In this work, we present a unified probabilistic perspective on IL algorithms based on divergence minimization. We present f-MAX, a generalization of AIRL, a state-of-the-art IRL method. f-MAX enables us to relate previous IRL methods such as GAIL and AIRL and to understand their algorithmic properties. Through the lens of divergence minimization, we can tease apart the differences between BC and successful IRL methods and empirically evaluate these nuances on simulated high-dimensional continuous control domains. Our findings conclusively identify IRL's state-marginal matching objective as the largest contributor to its superior performance. Lastly, we apply our new understanding of IL methods to the problem of state-marginal matching, where we demonstrate that, in a simulated arm-pushing environment, we can teach agents a diverse range of behaviors using simple hand-specified state distributions, with no reward functions or expert demonstrations.
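In symbols, the unifying view can be written as matching state-action occupancy measures under an f-divergence (standard notation, not copied from the paper):

```latex
% Unified imitation-learning objective: make the policy's state-action
% occupancy measure \rho^{\pi} match the expert's \rho^{\mathrm{exp}}
% under some f-divergence D_f.
\min_{\pi} \; D_f\!\left(\rho^{\mathrm{exp}}(s, a) \,\middle\|\, \rho^{\pi}(s, a)\right)
```

Different choices of f recover different IL methods; GAIL, for instance, corresponds to a Jensen-Shannon-style divergence between the expert's and the policy's occupancies.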
Paper author: Richard Zemel is a co-founder and the research director of the Vector Institute for Artificial Intelligence, an industrial research chair in machine learning at the University of Toronto, and a senior fellow of the Canadian Institute for Advanced Research. His research interests include generative models of images and text, graph-based machine learning, learning from few examples, words and pictures, and fairness.
GitHub link: //github.com/KamyarGh/rl_swiss/blob/master/reproducing/fmax_paper.md