| AI-powered virtual tissues from spatial proteomics for clinical diagnostics and biomedical discovery | Jan 10, 2025 | | Code Available | 2 |
| Diffusion Time-step Curriculum for One Image to 3D Generation | Apr 6, 2024 | 3D Generation, Image to 3D | Code Available | 2 |
| Real-time High-fidelity Gaussian Human Avatars with Position-based Interpolation of Spatially Distributed MLPs | Apr 17, 2025 | Position | Code Available | 2 |
| FORA: Fast-Forward Caching in Diffusion Transformer Acceleration | Jul 1, 2024 | Denoising | Code Available | 2 |
| Arabic-Nougat: Fine-Tuning Vision Transformers for Arabic OCR and Markdown Extraction | Nov 19, 2024 | document understanding, Optical Character Recognition (OCR) | Code Available | 2 |
| Combinatorial Optimization with Automated Graph Neural Networks | Jun 5, 2024 | Combinatorial Optimization, Graph Embedding | Code Available | 2 |
| PIGEON: Predicting Image Geolocations | Jul 11, 2023 | Photo geolocation estimation | Code Available | 2 |
| JailbreakRadar: Comprehensive Assessment of Jailbreak Attacks Against LLMs | Feb 8, 2024 | Ethics | Code Available | 2 |
| Powerful and Flexible: Personalized Text-to-Image Generation via Reinforcement Learning | Jul 9, 2024 | Image Generation, Sentence | Code Available | 2 |
| cAST: Enhancing Code Retrieval-Augmented Generation with Structural Chunking via Abstract Syntax Tree | Jun 18, 2025 | Chunking, Code Generation | Code Available | 2 |
| MASSIVE: A 1M-Example Multilingual Natural Language Understanding Dataset with 51 Typologically-Diverse Languages | Apr 18, 2022 | intent-classification, Intent Classification | Code Available | 2 |
| Exponentially Faster Language Modelling | Nov 15, 2023 | Benchmarking, CPU | Code Available | 2 |
| Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation | Feb 18, 2025 | Decoder, GPU | Code Available | 2 |
| Q-Diffusion: Quantizing Diffusion Models | Feb 8, 2023 | Image Generation, Noise Estimation | Code Available | 2 |
| Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models | Oct 2, 2024 | Mixture-of-Experts, Navigate | Code Available | 2 |
| LinkBERT: Pretraining Language Models with Document Links | Mar 29, 2022 | Document Classification, Language Modeling | Code Available | 2 |
| Efficient Teacher: Semi-Supervised Object Detection for YOLOv5 | Feb 15, 2023 | Object, object-detection | Code Available | 2 |
| Neural Prompt Search | Jun 9, 2022 | Few-Shot Learning, Image Classification | Code Available | 2 |
| InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding | Jun 8, 2023 | Decoder, Multi-Task Learning | Code Available | 2 |
| PERF: Panoramic Neural Radiance Field from a Single Panorama | Oct 25, 2023 | NeRF, Novel View Synthesis | Code Available | 2 |
| Diffusion Transformer Policy | Oct 21, 2024 | Denoising, Vision-Language-Action | Code Available | 2 |
| Geometric Transformer with Interatomic Positional Encoding | Sep 21, 2023 | | Code Available | 2 |
| NeoBERT: A Next-Generation BERT | Feb 26, 2025 | In-Context Learning, MTEB Benchmark | Code Available | 2 |
| ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image | Dec 12, 2023 | Image Segmentation, Interactive Segmentation | Code Available | 2 |
| MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization | Jul 14, 2025 | 2k, Image Generation | Code Available | 2 |
| The pitfalls of next-token prediction | Mar 11, 2024 | Mamba, Misconceptions | Code Available | 2 |
| Token-level Direct Preference Optimization | Apr 18, 2024 | Diversity | Code Available | 2 |
| HTR-VT: Handwritten Text Recognition with Vision Transformer | Sep 13, 2024 | Handwritten Text Recognition, HTR | Code Available | 2 |
| INT-FlashAttention: Enabling Flash Attention for INT8 Quantization | Sep 25, 2024 | GPU, Quantization | Code Available | 2 |
| KV-Compress: Paged KV-Cache Compression with Variable Compression Rates per Attention Head | Sep 30, 2024 | | Code Available | 2 |
| TinyFusion: Diffusion Transformers Learned Shallow | Dec 2, 2024 | Image Generation | Code Available | 2 |
| CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation | Jan 16, 2025 | 3D Generation, 4k | Code Available | 2 |
| IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages | Apr 25, 2024 | Cross-Lingual Question Answering, Diversity | Code Available | 2 |
| Just read twice: closing the recall gap for recurrent language models | Jul 7, 2024 | In-Context Learning, Language Modeling | Code Available | 2 |
| Masked Visual Pre-training for Motor Control | Mar 11, 2022 | Robot Manipulation Generalization, State Estimation | Code Available | 2 |
| LPCNet: Improving Neural Speech Synthesis Through Linear Prediction | Oct 28, 2018 | Prediction, Speech Synthesis | Code Available | 2 |
| REaLTabFormer: Generating Realistic Relational and Tabular Data using Transformers | Feb 4, 2023 | Synthetic Data Generation | Code Available | 2 |
| Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis | Sep 21, 2024 | Model Editing, Prediction | Code Available | 2 |
| Prediction-Powered Inference | Jan 23, 2023 | Astronomy, Prediction | Code Available | 2 |
| RoboFusion: Towards Robust Multi-Modal 3D Object Detection via SAM | Jan 8, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| Towards Evaluating and Building Versatile Large Language Models for Medicine | Aug 22, 2024 | Multiple-choice, named-entity-recognition | Code Available | 2 |
| Reconstructing the Mind's Eye: fMRI-to-Image with Contrastive Learning and Diffusion Priors | May 29, 2023 | Contrastive Learning, Image Reconstruction | Code Available | 2 |
| Generalization-Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tuning | Jun 6, 2024 | Multi-Task Learning, Vulnerability Detection | Code Available | 2 |
| LangProp: A code optimization framework using Large Language Models applied to driving | Jan 18, 2024 | Autonomous Driving, Code Generation | Code Available | 2 |
| g2pW: A Conditional Weighted Softmax BERT for Polyphone Disambiguation in Mandarin | Mar 20, 2022 | Part-Of-Speech Tagging, Polyphone disambiguation | Code Available | 2 |
| DualBEV: Unifying Dual View Transformation with Probabilistic Correspondences | Mar 8, 2024 | | Code Available | 2 |
| Optimal Flow Matching: Learning Straight Trajectories in Just One Step | Mar 19, 2024 | | Code Available | 2 |
| MonoWAD: Weather-Adaptive Diffusion Model for Robust Monocular 3D Object Detection | Jul 23, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| Scaling Diffusion Transformers Efficiently via μP | May 21, 2025 | Image Generation, Text to Image Generation | Code Available | 2 |
| Ontology Embedding: A Survey of Methods, Applications and Resources | Jun 16, 2024 | Logical Reasoning, Ontology Embedding | Code Available | 2 |