| Global birdsong embeddings enable superior transfer learning for bioacoustic classification | Jul 12, 2023 | Audio Classification, Decision Making | Code Available | 2 |
| Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture | Jan 19, 2023 | Depth Estimation, Depth Prediction | Code Available | 2 |
| MI-DETR: An Object Detection Model with Multi-time Inquiries Mechanism | Mar 3, 2025 | Object Detection | Code Available | 2 |
| TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks | Sep 16, 2020 | Anomaly Detection, Benchmarking | Code Available | 2 |
| When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset | Jul 14, 2024 | 3D Object Detection, Multispectral Object Detection | Code Available | 2 |
| Differentiable All-pole Filters for Time-varying Audio Systems | Apr 11, 2024 | All, Audio Effects Modeling | Code Available | 2 |
| TorchOpt: An Efficient Library for Differentiable Optimization | Nov 13, 2022 | CPU, GPU | Code Available | 2 |
| Progressive Growing of GANs for Improved Quality, Stability, and Variation | Oct 27, 2017 | Face Generation, Image Generation | Code Available | 2 |
| CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities | Aug 23, 2024 | Denoising, Motion Generation | Code Available | 2 |
| ForensicHub: A Unified Benchmark & Codebase for All-Domain Fake Image Detection and Localization | May 16, 2025 | All, DeepFake Detection | Code Available | 2 |
| Cross-Task Generalization via Natural Language Crowdsourcing Instructions | Apr 18, 2021 | Question Answering | Code Available | 2 |
| MagicFace: High-Fidelity Facial Expression Editing with Action-Unit Control | Jan 4, 2025 | Attribute, Denoising | Code Available | 2 |
| From Poses to Identity: Training-Free Person Re-Identification via Feature Centralization | Mar 2, 2025 | Cross-Modal Person Re-Identification, Person Re-Identification | Code Available | 2 |
| Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions | Jun 9, 2025 | Large Language Model, Reinforcement Learning (RL) | Code Available | 2 |
| UAV-DETR: Efficient End-to-End Object Detection for Unmanned Aerial Vehicle Imagery | Jan 3, 2025 | object-detection, Object Detection | Code Available | 2 |
| RWKV-UNet: Improving UNet with Long-Range Cooperation for Effective Medical Image Segmentation | Jan 14, 2025 | Computational Efficiency, Image Segmentation | Code Available | 2 |
| Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning | Oct 18, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| You Only Segment Once: Towards Real-Time Panoptic Segmentation | Mar 26, 2023 | Decoder, Panoptic Segmentation | Code Available | 2 |
| GaussianHead: High-fidelity Head Avatars with Learnable Gaussian Derivation | Dec 4, 2023 | Novel View Synthesis | Code Available | 2 |
| Supervised Learning for Analog and RF Circuit Design: Benchmarks and Comparative Insights | Jan 21, 2025 | | Code Available | 2 |
| Synchformer: Efficient Synchronization from Sparse Cues | Jan 29, 2024 | Audio-Visual Synchronization | Code Available | 2 |
| GeoDream: Disentangling 2D and Geometric Priors for High-Fidelity and Consistent 3D Generation | Nov 29, 2023 | 3D Generation, Text to 3D | Code Available | 2 |
| Implicit Diffusion Models for Continuous Super-Resolution | Mar 29, 2023 | Denoising, Image Super-Resolution | Code Available | 2 |
| Spherical Transformer for LiDAR-based 3D Recognition | Mar 22, 2023 | 3D Object Detection, 3D Semantic Segmentation | Code Available | 2 |
| The Sound Demixing Challenge 2023 – Music Demixing Track | Aug 14, 2023 | Music Source Separation | Code Available | 2 |
| Lenia and Expanded Universe | May 7, 2020 | Artificial Life | Code Available | 2 |
| Flow Matching on General Geometries | Feb 7, 2023 | | Code Available | 2 |
| Revisiting Weak-to-Strong Consistency in Semi-Supervised Semantic Segmentation | Aug 21, 2022 | Medical Image Analysis, Semantic Segmentation | Code Available | 2 |
| Language Models Can See: Plugging Visual Controls in Text Generation | May 5, 2022 | Image Captioning, Image-text matching | Code Available | 2 |
| Diffusion-GAN: Training GANs with Diffusion | Jun 5, 2022 | Image Generation | Code Available | 2 |
| WebCPM: Interactive Web Search for Chinese Long-form Question Answering | May 11, 2023 | Form, Information Retrieval | Code Available | 2 |
| STD: Stable Triangle Descriptor for 3D place recognition | Sep 26, 2022 | 3D Place Recognition | Code Available | 2 |
| A Transformer approach for Electricity Price Forecasting | Mar 24, 2024 | | Code Available | 2 |
| MiniDrive: More Efficient Vision-Language Models with Multi-Level 2D Features as Text Tokens for Autonomous Driving | Sep 11, 2024 | Autonomous Driving, Feature Engineering | Code Available | 2 |
| DoctorGLM: Fine-tuning your Chinese Doctor is not a Herculean Task | Apr 3, 2023 | | Code Available | 2 |
| Can LLMs Learn New Concepts Incrementally without Forgetting? | Feb 13, 2024 | In-Context Learning, Incremental Learning | Code Available | 2 |
| Many-MobileNet: Multi-Model Augmentation for Robust Retinal Disease Classification | Dec 3, 2024 | Computational Efficiency, Data Augmentation | Code Available | 2 |
| A Practitioner's Guide to Continual Multimodal Pretraining | Aug 26, 2024 | Continual Learning, Continual Pretraining | Code Available | 2 |
| SimCSE: Simple Contrastive Learning of Sentence Embeddings | Apr 18, 2021 | Contrastive Learning, Data Augmentation | Code Available | 2 |
| Q-VLM: Post-training Quantization for Large Vision-Language Models | Oct 10, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Seeing Beyond the Brain: Conditional Diffusion Model with Sparse Masked Modeling for Vision Decoding | Nov 13, 2022 | Brain Computer Interface | Code Available | 2 |
| Wavelet Latent Diffusion (Wala): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings | Nov 12, 2024 | Attribute, Computational Efficiency | Code Available | 2 |
| Toward Robust Early Detection of Alzheimer's Disease via an Integrated Multimodal Learning Approach | Aug 29, 2024 | Diagnostic, EEG | Code Available | 2 |
| Sharpness-Aware Minimization for Efficiently Improving Generalization | Oct 3, 2020 | Fine-Grained Image Classification, Image Classification | Code Available | 2 |
| Task Attribute Distance for Few-Shot Learning: Theoretical Analysis and Applications | Mar 6, 2024 | Attribute, Data Augmentation | Code Available | 2 |
| In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation | Aug 9, 2024 | Image to text, Object | Code Available | 2 |
| TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition | Jul 24, 2023 | Image-Guided Composition, Text-to-Image Generation | Code Available | 2 |
| LEAD: Iterative Data Selection for Efficient LLM Instruction Tuning | May 12, 2025 | | Code Available | 2 |
| OCRBench: On the Hidden Mystery of OCR in Large Multimodal Models | May 13, 2023 | Key Information Extraction, Nutrition | Code Available | 2 |
| A Call to Reflect on Evaluation Practices for Age Estimation: Comparative Analysis of the State-of-the-Art and a Unified Benchmark | Jan 1, 2024 | Age Estimation, Benchmarking | Code Available | 2 |