| DualBEV: Unifying Dual View Transformation with Probabilistic Correspondences | Mar 8, 2024 | | Code Available | 2 |
| Generalized Few-Shot Meets Remote Sensing: Discovering Novel Classes in Land Cover Mapping via Hybrid Semantic Segmentation Framework | Apr 19, 2024 | Earth Observation, Segmentation | Code Available | 2 |
| g2pW: A Conditional Weighted Softmax BERT for Polyphone Disambiguation in Mandarin | Mar 20, 2022 | Part-Of-Speech Tagging, Polyphone disambiguation | Code Available | 2 |
| LangProp: A code optimization framework using Large Language Models applied to driving | Jan 18, 2024 | Autonomous Driving, Code Generation | Code Available | 2 |
| LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding | Feb 28, 2022 | Document Image Classification, document understanding | Code Available | 2 |
| GrootVL: Tree Topology is All You Need in State Space Model | Jun 4, 2024 | All, image-classification | Code Available | 2 |
| Generalization-Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tuning | Jun 6, 2024 | Multi-Task Learning, Vulnerability Detection | Code Available | 2 |
| Reconstructing the Mind's Eye: fMRI-to-Image with Contrastive Learning and Diffusion Priors | May 29, 2023 | Contrastive Learning, Image Reconstruction | Code Available | 2 |
| CharacterGLM: Customizing Chinese Conversational AI Characters with Large Language Models | Nov 28, 2023 | Dialogue Generation | Code Available | 2 |
| Towards Evaluating and Building Versatile Large Language Models for Medicine | Aug 22, 2024 | Multiple-choice, named-entity-recognition | Code Available | 2 |
| RoboFusion: Towards Robust Multi-Modal 3D Object Detection via SAM | Jan 8, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| Practical Blind Image Denoising via Swin-Conv-UNet and Data Synthesis | Mar 24, 2022 | Denoising, Image Denoising | Code Available | 2 |
| AttentionEngine: A Versatile Framework for Efficient Attention Mechanisms on Diverse Hardware Platforms | Feb 21, 2025 | Scheduling | Code Available | 2 |
| GuardReasoner: Towards Reasoning-based LLM Safeguards | Jan 30, 2025 | | Code Available | 2 |
| Prediction-Powered Inference | Jan 23, 2023 | Astronomy, Prediction | Code Available | 2 |
| Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis | Sep 21, 2024 | Model Editing, Prediction | Code Available | 2 |
| Toward General Instruction-Following Alignment for Retrieval-Augmented Generation | Oct 12, 2024 | Instruction Following, RAG | Code Available | 2 |
| REaLTabFormer: Generating Realistic Relational and Tabular Data using Transformers | Feb 4, 2023 | Synthetic Data Generation | Code Available | 2 |
| LPCNet: Improving Neural Speech Synthesis Through Linear Prediction | Oct 28, 2018 | Prediction, Speech Synthesis | Code Available | 2 |
| Agent Lumos: Unified and Modular Training for Open-Source Language Agents | Nov 9, 2023 | Math, Question Answering | Code Available | 2 |
| Measuring and Narrowing the Compositionality Gap in Language Models | Oct 7, 2022 | Question Answering | Code Available | 2 |
| Masked Visual Pre-training for Motor Control | Mar 11, 2022 | Robot Manipulation Generalization, State Estimation | Code Available | 2 |
| Just read twice: closing the recall gap for recurrent language models | Jul 7, 2024 | In-Context Learning, Language Modeling | Code Available | 2 |
| A Graph-Based Approach for Category-Agnostic Pose Estimation | Nov 29, 2023 | 2D Pose Estimation, Animal Pose Estimation | Code Available | 2 |
| IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages | Apr 25, 2024 | Cross-Lingual Question Answering, Diversity | Code Available | 2 |
| CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation | Jan 16, 2025 | 3D Generation, 4k | Code Available | 2 |
| Motion Mamba: Efficient and Long Sequence Motion Generation | Mar 12, 2024 | Mamba, Motion Generation | Code Available | 2 |
| PlanT: Explainable Planning Transformers via Object-Level Representations | Oct 25, 2022 | CARLA longest6, Decision Making | Code Available | 2 |
| Emulating Self-attention with Convolution for Efficient Image Super-Resolution | Mar 9, 2025 | Computational Efficiency, Image Super-Resolution | Code Available | 2 |
| Voxel Set Transformer: A Set-to-Set Approach to 3D Object Detection from Point Clouds | Mar 19, 2022 | 3D Object Detection, object-detection | Code Available | 2 |
| Guided Real Image Dehazing using YCbCr Color Space | Dec 23, 2024 | Image Dehazing | Code Available | 2 |
| Blockwise Parallel Transformer for Large Context Models | May 30, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| TinyFusion: Diffusion Transformers Learned Shallow | Dec 2, 2024 | Image Generation | Code Available | 2 |
| KV-Compress: Paged KV-Cache Compression with Variable Compression Rates per Attention Head | Sep 30, 2024 | | Code Available | 2 |
| Guess What I Think: Streamlined EEG-to-Image Generation with Latent Diffusion Models | Sep 17, 2024 | Brain Computer Interface, EEG | Code Available | 2 |
| INT-FlashAttention: Enabling Flash Attention for INT8 Quantization | Sep 25, 2024 | GPU, Quantization | Code Available | 2 |
| HTR-VT: Handwritten Text Recognition with Vision Transformer | Sep 13, 2024 | Handwritten Text Recognition, HTR | Code Available | 2 |
| AST: Audio Spectrogram Transformer | Apr 5, 2021 | Audio Classification, Audio Tagging | Code Available | 2 |
| LangCoop: Collaborative Driving with Language | Apr 18, 2025 | Autonomous Driving | Code Available | 2 |
| Token-level Direct Preference Optimization | Apr 18, 2024 | Diversity | Code Available | 2 |
| The pitfalls of next-token prediction | Mar 11, 2024 | Mamba, Misconceptions | Code Available | 2 |
| Exploring the Benefit of Activation Sparsity in Pre-training | Oct 4, 2024 | | Code Available | 2 |
| MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization | Jul 14, 2025 | 2k, Image Generation | Code Available | 2 |
| ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image | Dec 12, 2023 | Image Segmentation, Interactive Segmentation | Code Available | 2 |
| ProbTalk3D: Non-Deterministic Emotion Controllable Speech-Driven 3D Facial Animation Synthesis Using VQ-VAE | Sep 12, 2024 | | Code Available | 2 |
| Occupancy as Set of Points | Jul 4, 2024 | | Code Available | 2 |
| True Knowledge Comes from Practice: Aligning LLMs with Embodied Environments via Reinforcement Learning | Jan 25, 2024 | Decision Making, Reinforcement Learning (RL) | Code Available | 2 |
| NeoBERT: A Next-Generation BERT | Feb 26, 2025 | In-Context Learning, MTEB Benchmark | Code Available | 2 |
| Geometric Transformer with Interatomic Positional Encoding | Sep 21, 2023 | | Code Available | 2 |
| Learning A Spiking Neural Network for Efficient Image Deraining | May 10, 2024 | Image Reconstruction, Rain Removal | Code Available | 2 |