| Efficient Teacher: Semi-Supervised Object Detection for YOLOv5 | Feb 15, 2023 | Object, object-detection | Code Available | 2 | 5 |
| Neural Prompt Search | Jun 9, 2022 | Few-Shot Learning, Image Classification | Code Available | 2 | 5 |
| InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding | Jun 8, 2023 | Decoder, Multi-Task Learning | Code Available | 2 | 5 |
| PERF: Panoramic Neural Radiance Field from a Single Panorama | Oct 25, 2023 | NeRF, Novel View Synthesis | Code Available | 2 | 5 |
| Diffusion Transformer Policy | Oct 21, 2024 | Denoising, Vision-Language-Action | Code Available | 2 | 5 |
| Geometric Transformer with Interatomic Positional Encoding | Sep 21, 2023 | | Code Available | 2 | 5 |
| NeoBERT: A Next-Generation BERT | Feb 26, 2025 | In-Context Learning, MTEB Benchmark | Code Available | 2 | 5 |
| ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image | Dec 12, 2023 | Image Segmentation, Interactive Segmentation | Code Available | 2 | 5 |
| MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization | Jul 14, 2025 | 2k, Image Generation | Code Available | 2 | 5 |
| The pitfalls of next-token prediction | Mar 11, 2024 | Mamba, Misconceptions | Code Available | 2 | 5 |
| Token-level Direct Preference Optimization | Apr 18, 2024 | Diversity | Code Available | 2 | 5 |
| HTR-VT: Handwritten Text Recognition with Vision Transformer | Sep 13, 2024 | Handwritten Text Recognition, HTR | Code Available | 2 | 5 |
| INT-FlashAttention: Enabling Flash Attention for INT8 Quantization | Sep 25, 2024 | GPU, Quantization | Code Available | 2 | 5 |
| KV-Compress: Paged KV-Cache Compression with Variable Compression Rates per Attention Head | Sep 30, 2024 | | Code Available | 2 | 5 |
| TinyFusion: Diffusion Transformers Learned Shallow | Dec 2, 2024 | Image Generation | Code Available | 2 | 5 |
| CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation | Jan 16, 2025 | 3D Generation, 4k | Code Available | 2 | 5 |
| IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages | Apr 25, 2024 | Cross-Lingual Question Answering, Diversity | Code Available | 2 | 5 |
| Just read twice: closing the recall gap for recurrent language models | Jul 7, 2024 | In-Context Learning, Language Modeling | Code Available | 2 | 5 |
| Masked Visual Pre-training for Motor Control | Mar 11, 2022 | Robot Manipulation Generalization, State Estimation | Code Available | 2 | 5 |
| LPCNet: Improving Neural Speech Synthesis Through Linear Prediction | Oct 28, 2018 | Prediction, Speech Synthesis | Code Available | 2 | 5 |
| REaLTabFormer: Generating Realistic Relational and Tabular Data using Transformers | Feb 4, 2023 | Synthetic Data Generation | Code Available | 2 | 5 |
| Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis | Sep 21, 2024 | Model Editing, Prediction | Code Available | 2 | 5 |
| Prediction-Powered Inference | Jan 23, 2023 | Astronomy, Prediction | Code Available | 2 | 5 |
| RoboFusion: Towards Robust Multi-Modal 3D Object Detection via SAM | Jan 8, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 | 5 |
| Towards Evaluating and Building Versatile Large Language Models for Medicine | Aug 22, 2024 | Multiple-choice, named-entity-recognition | Code Available | 2 | 5 |
| Reconstructing the Mind's Eye: fMRI-to-Image with Contrastive Learning and Diffusion Priors | May 29, 2023 | Contrastive Learning, Image Reconstruction | Code Available | 2 | 5 |
| Generalization-Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tuning | Jun 6, 2024 | Multi-Task Learning, Vulnerability Detection | Code Available | 2 | 5 |
| LangProp: A code optimization framework using Large Language Models applied to driving | Jan 18, 2024 | Autonomous Driving, Code Generation | Code Available | 2 | 5 |
| g2pW: A Conditional Weighted Softmax BERT for Polyphone Disambiguation in Mandarin | Mar 20, 2022 | Part-Of-Speech Tagging, Polyphone disambiguation | Code Available | 2 | 5 |
| DualBEV: Unifying Dual View Transformation with Probabilistic Correspondences | Mar 8, 2024 | | Code Available | 2 | 5 |
| Optimal Flow Matching: Learning Straight Trajectories in Just One Step | Mar 19, 2024 | | Code Available | 2 | 5 |
| MonoWAD: Weather-Adaptive Diffusion Model for Robust Monocular 3D Object Detection | Jul 23, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 | 5 |
| Scaling Diffusion Transformers Efficiently via μP | May 21, 2025 | Image Generation, Text to Image Generation | Code Available | 2 | 5 |
| Ontology Embedding: A Survey of Methods, Applications and Resources | Jun 16, 2024 | Logical Reasoning, Ontology Embedding | Code Available | 2 | 5 |
| Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data | Mar 27, 2025 | Text to 3D | Code Available | 2 | 5 |
| Segment and Caption Anything | Dec 1, 2023 | Caption Generation, object-detection | Code Available | 2 | 5 |
| Multi-Scale Representations by Varying Window Attention for Semantic Segmentation | Apr 25, 2024 | Decoder, Semantic Segmentation | Code Available | 2 | 5 |
| Heterogeneous Multi-Robot Reinforcement Learning | Jan 17, 2023 | Graph Neural Network, Multi-agent Reinforcement Learning | Code Available | 2 | 5 |
| FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference | Feb 28, 2025 | | Code Available | 2 | 5 |
| TRADES: Generating Realistic Market Simulations with Diffusion Models | Jan 31, 2025 | Denoising | Code Available | 2 | 5 |
| Learning to Compress Prompts with Gist Tokens | Apr 17, 2023 | Decoder | Code Available | 2 | 5 |
| Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models | Oct 6, 2023 | Code Generation, Decision Making | Code Available | 2 | 5 |
| Scenimefy: Learning to Craft Anime Scene via Semi-Supervised Image-to-Image Translation | Aug 24, 2023 | Image-to-Image Translation | Code Available | 2 | 5 |
| Towards Trustworthy Retrieval Augmented Generation for Large Language Models: A Survey | Feb 8, 2025 | Fairness, RAG | Code Available | 2 | 5 |
| LongReward: Improving Long-context Large Language Models with AI Feedback | Oct 28, 2024 | Offline RL, Reinforcement Learning (RL) | Code Available | 2 | 5 |
| LiteTransformerSearch: Training-free Neural Architecture Search for Efficient Language Models | Mar 4, 2022 | Decoder, GPU | Code Available | 2 | 5 |
| Conformal Symplectic Optimization for Stable Reinforcement Learning | Dec 3, 2024 | Atari Games, Deep Reinforcement Learning | Code Available | 2 | 5 |
| SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs | Jun 5, 2025 | | Code Available | 2 | 5 |
| A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation | Oct 2, 2024 | Image Generation, Quantization | Code Available | 2 | 5 |
| Learning Embeddings with Centroid Triplet Loss for Object Identification in Robotic Grasping | Apr 9, 2024 | Image Retrieval, Object | Code Available | 2 | 5 |