| Dynamic data sampler for cross-language transfer learning in large language models | May 17, 2024 | Language Modeling, Language Modelling | Code Available | 7 |
| Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models | Jun 5, 2025 | Reranking, Retrieval | Code Available | 5 |
| DepthSplat: Connecting Gaussian Splatting and Depth | Oct 17, 2024 | Depth Estimation, Novel View Synthesis | Code Available | 5 |
| Large Brain Model for Learning Generic Representations with Tremendous EEG Data in BCI | May 29, 2024 | EEG, Electroencephalogram (EEG) | Code Available | 4 |
| A Survey on Data Selection for Language Models | Feb 26, 2024 | Survey, Unsupervised Pre-training | Code Available | 3 |
| CrystalFormer-RL: Reinforcement Fine-Tuning for Materials Design | Apr 3, 2025 | Band Gap, Dielectric Constant | Code Available | 2 |
| FSFM: A Generalizable Face Security Foundation Model via Self-Supervised Facial Representation Learning | Dec 16, 2024 | DeepFake Detection, diffusion-generated faces detection | Code Available | 2 |
| Foundation Policies with Hilbert Representations | Feb 23, 2024 | Reinforcement Learning (RL), Unsupervised Pre-training | Code Available | 2 |
| SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery | Jul 17, 2022 | Land Cover Classification, Semantic Segmentation | Code Available | 2 |
| Large-Scale Pre-training for Person Re-identification with Noisy Labels | Mar 30, 2022 | Contrastive Learning, Multi-Object Tracking | Code Available | 2 |
| SPACE: Your Genomic Profile Predictor is a Powerful DNA Foundation Model | Jun 2, 2025 | Mixture-of-Experts, Unsupervised Pre-training | Code Available | 1 |
| PersonViT: Large-scale Self-supervised Vision Transformer for Person Re-Identification | Aug 10, 2024 | Contrastive Learning, Person Re-Identification | Code Available | 1 |
| ConStyle v2: A Strong Prompter for All-in-One Image Restoration | Jun 26, 2024 | All, GPU | Code Available | 1 |
| PEAC: Unsupervised Pre-training for Cross-Embodiment Reinforcement Learning | May 23, 2024 | reinforcement-learning, Reinforcement Learning | Code Available | 1 |
| BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers | Apr 29, 2024 | Retrieval, Unsupervised Pre-training | Code Available | 1 |
| Drop your Decoder: Pre-training with Bag-of-Word Prediction for Dense Passage Retrieval | Jan 20, 2024 | Decoder, Passage Retrieval | Code Available | 1 |
| Unified Multi-modal Unsupervised Representation Learning for Skeleton-based Action Understanding | Nov 6, 2023 | Action Understanding, Representation Learning | Code Available | 1 |
| METRA: Scalable Unsupervised RL with Metric-Aware Abstraction | Oct 13, 2023 | Reinforcement Learning (RL), Unsupervised Pre-training | Code Available | 1 |
| HIQL: Offline Goal-Conditioned RL with Latent States as Actions | Jul 22, 2023 | Reinforcement Learning (RL), Unsupervised Pre-training | Code Available | 1 |
| Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning | May 29, 2023 | Autonomous Driving, Decoder | Code Available | 1 |
| Rethinking Semi-supervised Learning with Language Models | May 22, 2023 | Pseudo Label, Semi-Supervised Text Classification | Code Available | 1 |
| PTGB: Pre-Train Graph Neural Networks for Brain Network Analysis | May 20, 2023 | Transfer Learning, Unsupervised Pre-training | Code Available | 1 |
| FreePoint: Unsupervised Point Cloud Instance Segmentation | May 11, 2023 | Instance Segmentation, Segmentation | Code Available | 1 |
| Don't Stop Pretraining? Make Prompt-based Fine-tuning Powerful Learner | May 2, 2023 | Sentence, Unsupervised Pre-training | Code Available | 1 |
| Unsupervised Pre-Training For Data-Efficient Text-to-Speech On Low Resource Languages | Mar 28, 2023 | Data Augmentation, text-to-speech | Code Available | 1 |
| MultiTalent: A Multi-Dataset Approach to Medical Image Segmentation | Mar 25, 2023 | Image Segmentation, Lesion Segmentation | Code Available | 1 |
| DocILE Benchmark for Document Information Localization and Extraction | Feb 11, 2023 | Key Information Extraction, Unsupervised Pre-training | Code Available | 1 |
| ProposalContrast: Unsupervised Pre-training for LiDAR-based 3D Object Detection | Jul 26, 2022 | 3D Object Detection, object-detection | Code Available | 1 |
| Unsupervised pre-training of graph transformers on patient population graphs | Jul 21, 2022 | Language Modeling, Language Modelling | Code Available | 1 |
| CARLANE: A Lane Detection Benchmark for Unsupervised Domain Adaptation from Simulation to multiple Real-World Domains | Jun 16, 2022 | 2D Semantic Segmentation, Autonomous Driving | Code Available | 1 |
| Self-Supervised Visual Representation Learning with Semantic Grouping | May 30, 2022 | Contrastive Learning, Instance Segmentation | Code Available | 1 |
| Semi-supervised 3D shape segmentation with multilevel consistency and part substitution | Apr 19, 2022 | Segmentation, Semantic Segmentation | Code Available | 1 |
| ELECTRIcity: An Efficient Transformer for Non-Intrusive Load Monitoring | Apr 11, 2022 | Non-Intrusive Load Monitoring, Unsupervised Pre-training | Code Available | 1 |
| Unsupervised Pre-training for Temporal Action Localization Tasks | Mar 25, 2022 | Action Localization, Contrastive Learning | Code Available | 1 |
| Reinforcement Learning with Action-Free Pre-Training from Videos | Mar 25, 2022 | Prediction, reinforcement-learning | Code Available | 1 |
| Unsupervised Pre-Training on Patient Population Graphs for Patient-Level Predictions | Mar 23, 2022 | Disease Prediction, Imputation | Code Available | 1 |
| Korean-Specific Dataset for Table Question Answering | Jan 17, 2022 | Language Modeling, Language Modelling | Code Available | 1 |
| The CLEAR Benchmark: Continual LEArning on Real-World Imagery | Jan 17, 2022 | Continual Learning, image-classification | Code Available | 1 |
| AI-Bind: Improving Binding Predictions for Novel Protein Targets and Ligands | Dec 25, 2021 | Drug Discovery, Unsupervised Pre-training | Code Available | 1 |
| SimIPU: Simple 2D Image and 3D Point Cloud Unsupervised Pre-Training for Spatial-Aware Visual Representations | Dec 9, 2021 | Contrastive Learning, Unsupervised Pre-training | Code Available | 1 |
| Bag of Tricks and A Strong baseline for Image Copy Detection | Nov 13, 2021 | Copy Detection, Unsupervised Pre-training | Code Available | 1 |
| D^2LV: A Data-Driven and Local-Verification Approach for Image Copy Detection | Nov 13, 2021 | Copy Detection, Unsupervised Pre-training | Code Available | 1 |
| Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition | Sep 14, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection | Aug 17, 2021 | 2D Object Detection, 3D Object Detection | Code Available | 1 |
| PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training | Jun 9, 2021 | reinforcement-learning, Reinforcement Learning (RL) | Code Available | 1 |
| Exploring the Limits of Out-of-Distribution Detection | Jun 6, 2021 | Out-of-Distribution Detection, Out of Distribution (OOD) Detection | Code Available | 1 |
| Initialization and Regularization of Factorized Neural Layers | May 3, 2021 | Knowledge Distillation, Model Compression | Code Available | 1 |
| Patient Contrastive Learning: a Performant, Expressive, and Practical Approach to ECG Modeling | Apr 9, 2021 | Contrastive Learning, Electrocardiography (ECG) | Code Available | 1 |
| Pre-training strategies and datasets for facial representation learning | Mar 30, 2021 | 3D Face Reconstruction, 3D Facial Landmark Localization | Code Available | 1 |
| Seasonal Contrast: Unsupervised Pre-Training from Uncurated Remote Sensing Data | Mar 30, 2021 | Change Detection, Self-Supervised Learning | Code Available | 1 |