| Masked Latent Prediction and Classification for Self-Supervised Audio Representation Learning | Feb 17, 2025 | Audio Classification, Audio Tagging | Code Available | 1 |
| ATST: Audio Representation Learning with Teacher-Student Transformer | Apr 26, 2022 | Audio Classification, Instrument Recognition | Code Available | 1 |
| Self-Supervised Audio-Visual Representation Learning with Relaxed Cross-Modal Synchronicity | Nov 9, 2021 | Audio Classification, Retrieval | Code Available | 1 |
| Broaden Your Views for Self-Supervised Video Learning | Mar 30, 2021 | Audio Classification, Optical Flow Estimation | Code Available | 1 |
| Self-Supervised MultiModal Versatile Networks | Jun 29, 2020 | Action Recognition In Videos, Audio Classification | Code Available | 0 |
| Audio-Visual Instance Discrimination with Cross-Modal Agreement | Apr 27, 2020 | Action Recognition, Audio Classification | Code Available | 1 |
| Self-Supervised Learning by Cross-Modal Audio-Video Clustering | Nov 28, 2019 | Action Recognition, Audio Classification | Code Available | 0 |
| Putting An End to End-to-End: Gradient-Isolated Learning of Representations | May 28, 2019 | Representation Learning, Self-Supervised Audio Classification | Code Available | 0 |
| Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization | Jun 30, 2018 | Action Recognition, Audio Classification | —Unverified | 0 |