wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations Jun 20, 2020 Quantization Self-Supervised Learning
SSLAM: Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes Jun 13, 2025 Linear evaluation Self-Supervised Learning
Urban1960SatSeg: Unsupervised Semantic Segmentation of Mid-20th century Urban Landscapes with Satellite Imageries Jun 11, 2025 Segmentation Self-Supervised Learning
VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models May 29, 2025 Self-Supervised Learning Video Generation
RoMA: Scaling up Mamba-based Foundation Models for Remote Sensing Mar 13, 2025 Computational Efficiency Mamba
PointOBB-v3: Expanding Performance Boundaries of Single Point-Supervised Oriented Object Detection Jan 23, 2025 object-detection Object Detection
A generalizable 3D framework and model for self-supervised learning in medical imaging Jan 20, 2025 Medical Image Segmentation Self-Supervised Learning
Scaling up self-supervised learning for improved surgical foundation models Jan 16, 2025 Self-Supervised Learning Semantic Segmentation
An OpenMind for 3D medical vision self-supervised learning Dec 22, 2024 Benchmarking Self-Supervised Learning
FSFM: A Generalizable Face Security Foundation Model via Self-Supervised Facial Representation Learning Dec 16, 2024 DeepFake Detection diffusion-generated faces detection
GaussianPretrain: A Simple Unified 3D Gaussian Representation for Visual Pre-training in Autonomous Driving Nov 19, 2024 3D Object Detection Autonomous Driving
Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks Oct 30, 2024 General Reinforcement Learning Reinforcement Learning (RL)
PaPaGei: Open Foundation Models for Optical Physiological Signals Oct 27, 2024 Contrastive Learning Domain Generalization
TabDPT: Scaling Tabular Foundation Models Oct 23, 2024 In-Context Learning Self-Supervised Learning
TIPS: Text-Image Pretraining with Spatial Awareness Oct 21, 2024 Depth Estimation Image Captioning
DM-Codec: Distilling Multimodal Representations for Speech Tokenization Oct 19, 2024 Self-Supervised Learning Speech Tokenization
A Multimodal Vision Foundation Model for Clinical Dermatology Oct 19, 2024 Diagnostic Lesion Segmentation
Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective Oct 16, 2024 Conditional Image Generation Image Generation
Sylber: Syllabic Embedding Representation of Speech from Raw Audio Oct 9, 2024 Language Modeling Language Modelling
A Survey of Spatio-Temporal EEG data Analysis: from Models to Applications Sep 26, 2024 EEG Self-Supervised Learning
Prototype based Masked Audio Model for Self-Supervised Learning of Sound Event Detection Sep 26, 2024 Event Detection Representation Learning
DetailCLIP: Detail-Oriented CLIP for Fine-Grained Tasks Sep 10, 2024 Contrastive Learning Image Reconstruction
A Survey on Mixup Augmentations and Beyond Sep 8, 2024 Image Classification Self-Supervised Learning
LLMs as Zero-shot Graph Learners: Alignment of GNN Representations with LLM Token Embeddings Aug 25, 2024 Language Modelling Link Prediction
PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders Aug 16, 2024 3D Object Classification 3D Point Cloud Classification
Snuffy: Efficient Whole Slide Image Classifier Aug 15, 2024 Breast Cancer Detection Lung Cancer Diagnosis
Multistain Pretraining for Slide Representation Learning in Pathology Aug 5, 2024 Representation Learning Self-Supervised Learning
Stem-JEPA: A Joint-Embedding Predictive Architecture for Musical Stem Compatibility Estimation Aug 5, 2024 Rhythm Self-Supervised Learning
Exploring the Effect of Dataset Diversity in Self-Supervised Learning for Surgical Computer Vision Jul 25, 2024 Diversity Medical Image Analysis
Mono-ViFI: A Unified Learning Framework for Self-supervised Single- and Multi-frame Monocular Depth Estimation Jul 19, 2024 Data Augmentation Depth Estimation
TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data Jul 10, 2024 Contrastive Learning multimodal interaction
Diffusion Models and Representation Learning: A Survey Jun 30, 2024 Denoising Representation Learning
DiffMM: Multi-Modal Diffusion Model for Recommendation Jun 17, 2024 Contrastive Learning model
An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios Jun 13, 2024 Language Identification Self-Supervised Learning
Attentive Merging of Hidden Embeddings from Pre-trained Speech Model for Anti-spoofing Detection Jun 12, 2024 Computational Efficiency Self-Supervised Learning
XRec: Large Language Models for Explainable Recommendation Jun 4, 2024 Collaborative Filtering Decision Making
SelfGNN: Self-Supervised Graph Neural Networks for Sequential Recommendation May 31, 2024 Graph Neural Network Recommendation Systems
Transcriptomics-guided Slide Representation Learning in Computational Pathology May 19, 2024 Contrastive Learning Representation Learning
Self-Supervised Learning of Time Series Representation via Diffusion Process and Imputation-Interpolation-Forecasting Mask May 9, 2024 Anomaly Detection Imputation
The Entropy Enigma: Success and Failure of Entropy Minimization May 8, 2024 Self-Supervised Learning
Self-Supervised Learning for Real-World Super-Resolution from Dual and Multiple Zoomed Observations May 3, 2024 Optical Flow Estimation Reference-based Super-Resolution
TFPred: Learning Discriminative Representations from Unlabeled Data for Few-Label Rotating Machinery Fault Diagnosis May 1, 2024 Fault Detection Fault Diagnosis
Vim4Path: Self-Supervised Vision Mamba for Histopathology Images Apr 20, 2024 Diagnostic Mamba
Masked Autoencoders for Microscopy are Scalable Learners of Cellular Biology Apr 16, 2024 Drug Discovery Self-Supervised Learning
MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild Apr 13, 2024 cross-modal alignment Dynamic Facial Expression Recognition
OmniSat: Self-Supervised Modality Fusion for Earth Observation Apr 12, 2024 Diversity Earth Observation
NeuroNet: A Novel Hybrid Self-Supervised Learning Framework for Sleep Stage Classification Using Single-Channel EEG Apr 10, 2024 Contrastive Learning EEG
Test-Time Zero-Shot Temporal Action Localization Apr 8, 2024 Action Localization Language Modelling
MedIAnomaly: A comparative study of anomaly detection in medical images Apr 6, 2024 Anomaly Classification Anomaly Detection
A Comprehensive Survey on Self-Supervised Learning for Recommendation Apr 4, 2024 Contrastive Learning Recommendation Systems