| Mixture of Tokens: Continuous MoE through Cross-Example Aggregation | Oct 24, 2023 | Language Modelling, Large Language Model | Code Available | 2 | 5 |
| Seeing the roads through the trees: A benchmark for modeling spatial dependencies with aerial imagery | Jan 12, 2024 | Object Recognition, Road Segmentation | Code Available | 2 | 5 |
| TeethDreamer: 3D Teeth Reconstruction from Five Intra-oral Photographs | Jul 16, 2024 | Surface Reconstruction | Code Available | 2 | 5 |
| ViM-UNet: Vision Mamba for Biomedical Segmentation | Apr 11, 2024 | Instance Segmentation, Mamba | Code Available | 2 | 5 |
| ScribeAgent: Towards Specialized Web Agents Using Production-Scale Workflow Data | Nov 22, 2024 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| Progressive Focused Transformer for Single Image Super-Resolution | Mar 26, 2025 | Image Super-Resolution, Super-Resolution | Code Available | 2 | 5 |
| Dynamic Diffusion Transformer | Oct 4, 2024 | Image Generation | Code Available | 2 | 5 |
| Guided Real Image Dehazing using YCbCr Color Space | Dec 23, 2024 | Image Dehazing | Code Available | 2 | 5 |
| Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models | Mar 30, 2023 | Video Alignment, Video Editing | Code Available | 2 | 5 |
| ChangeDiff: A Multi-Temporal Change Detection Data Generator with Flexible Text Prompts via Diffusion Model | Dec 20, 2024 | Change Detection | Code Available | 2 | 5 |
| RORem: Training a Robust Object Remover with Human-in-the-Loop | Jan 1, 2025 | Object | Code Available | 2 | 5 |
| CorrCLIP: Reconstructing Correlations in CLIP with Off-the-Shelf Foundation Models for Open-Vocabulary Semantic Segmentation | Nov 15, 2024 | Open Vocabulary Semantic Segmentation, Open-Vocabulary Semantic Segmentation | Code Available | 2 | 5 |
| DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR | Jan 28, 2022 | 2D Object Detection, Object Detection | Code Available | 2 | 5 |
| What Can Transformers Learn In-Context? A Case Study of Simple Function Classes | Aug 1, 2022 | In-Context Learning | Code Available | 2 | 5 |
| StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets | Feb 1, 2022 | Image Generation | Code Available | 2 | 5 |
| DRAGIN: Dynamic Retrieval Augmented Generation based on the Information Needs of Large Language Models | Mar 15, 2024 | RAG, Retrieval | Code Available | 2 | 5 |
| RecGPT: A Foundation Model for Sequential Recommendation | Jun 6, 2025 | Decoder, model | Code Available | 2 | 5 |
| A Plug-and-Play Bregman ADMM Module for Inferring Event Branches in Temporal Point Processes | Jan 8, 2025 | Point Processes | Code Available | 2 | 5 |
| Style-Based Global Appearance Flow for Virtual Try-On | Apr 3, 2022 | Virtual Try-on | Code Available | 2 | 5 |
| KAN or MLP: A Fairer Comparison | Jul 23, 2024 | Continual Learning | Code Available | 2 | 5 |
| Dropout Reduces Underfitting | Mar 2, 2023 | | Code Available | 2 | 5 |
| Rethinking Semantic Segmentation: A Prototype View | Mar 28, 2022 | Segmentation, Semantic Segmentation | Code Available | 2 | 5 |
| SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding | Jul 3, 2024 | object-detection, Object Detection | Code Available | 2 | 5 |
| OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMs | Sep 8, 2024 | Entity Linking, RAG | Code Available | 2 | 5 |
| Rethinking Mobile Block for Efficient Attention-based Models | Jan 3, 2023 | Unity | Code Available | 2 | 5 |
| Voxel Set Transformer: A Set-to-Set Approach to 3D Object Detection from Point Clouds | Mar 19, 2022 | 3D Object Detection, object-detection | Code Available | 2 | 5 |
| LambdaKG: A Library for Pre-trained Language Model-Based Knowledge Graph Embeddings | Oct 1, 2022 | Graph Representation Learning, Knowledge Graph Completion | Code Available | 2 | 5 |
| CenterNet++ for Object Detection | Apr 18, 2022 | Object, object-detection | Code Available | 2 | 5 |
| EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI | Dec 26, 2023 | Scene Understanding | Code Available | 2 | 5 |
| SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction | Oct 17, 2024 | Quantization | Code Available | 2 | 5 |
| GAMI-Net: An Explainable Neural Network based on Generalized Additive Models with Structured Interactions | Mar 16, 2020 | Additive models | Code Available | 2 | 5 |
| ClassWise-SAM-Adapter: Parameter Efficient Fine-tuning Adapts Segment Anything to SAR Domain for Semantic Segmentation | Jan 4, 2024 | Decoder, parameter-efficient fine-tuning | Code Available | 2 | 5 |
| A self-supervised CNN for image watermark removal | Mar 9, 2024 | | Code Available | 2 | 5 |
| A Consistency-Aware Spot-Guided Transformer for Versatile and Hierarchical Point Cloud Registration | Oct 14, 2024 | Point Cloud Registration | Code Available | 2 | 5 |
| NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields | Apr 1, 2024 | 3D Object Detection, NeRF | Code Available | 2 | 5 |
| Query2CAD: Generating CAD models using natural language queries | May 31, 2024 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| Magic Mirror: ID-Preserved Video Generation in Video Diffusion Transformers | Jan 7, 2025 | Diversity, Text-to-Video Generation | Code Available | 2 | 5 |
| What is the Role of Small Models in the LLM Era: A Survey | Sep 10, 2024 | | Code Available | 2 | 5 |
| Methods for Detoxification of Texts for the Russian Language | May 19, 2021 | Style Transfer | Code Available | 2 | 5 |
| NeW CRFs: Neural Window Fully-connected CRFs for Monocular Depth Estimation | Mar 3, 2022 | Decoder, Depth Estimation | Code Available | 2 | 5 |
| GTA: A Benchmark for General Tool Agents | Jul 11, 2024 | | Code Available | 2 | 5 |
| Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers | Dec 13, 2023 | 3D Question Answering (3D-QA), Attribute | Code Available | 2 | 5 |
| Massive Values in Self-Attention Modules are the Key to Contextual Knowledge Understanding | Feb 3, 2025 | Quantization | Code Available | 2 | 5 |
| Sketch and Refine: Towards Fast and Accurate Lane Detection | Jan 26, 2024 | Lane Detection | Code Available | 2 | 5 |
| Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On | Apr 1, 2024 | Denoising, Image Generation | Code Available | 2 | 5 |
| DEA-Net: Single image dehazing based on detail-enhanced convolution and content-guided attention | Jan 12, 2023 | Image Dehazing | Code Available | 2 | 5 |
| FairyGen: Storied Cartoon Video from a Single Child-Drawn Character | Jun 26, 2025 | | Code Available | 2 | 5 |
| MonoSplat: Generalizable 3D Gaussian Splatting from Monocular Depth Foundation Models | May 21, 2025 | Computational Efficiency | Code Available | 2 | 5 |
| Pair-VPR: Place-Aware Pre-training and Contrastive Pair Classification for Visual Place Recognition with Vision Transformers | Oct 9, 2024 | Decoder, Re-Ranking | Code Available | 2 | 5 |
| DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization | May 18, 2025 | Mathematical Reasoning | Code Available | 2 | 5 |