| GPT Understands, Too | Mar 18, 2021 | Knowledge Probing, Language Modeling | Code Available | 2 | 5 |
| TransTab: Learning Transferable Tabular Transformers Across Tables | May 19, 2022 | Incremental Learning, Transfer Learning | Code Available | 2 | 5 |
| VMRNN: Integrating Vision Mamba and LSTM for Efficient and Accurate Spatiotemporal Forecasting | Mar 25, 2024 | Mamba | Code Available | 2 | 5 |
| Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models | Jun 5, 2024 | Few-Shot Learning, Language Modeling | Code Available | 2 | 5 |
| MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models | Oct 15, 2024 | | Code Available | 2 | 5 |
| GPD-1: Generative Pre-training for Driving | Dec 11, 2024 | Autonomous Driving, Decision Making | Code Available | 2 | 5 |
| Attacking and Defending Machine Learning Applications of Public Cloud | Jul 27, 2020 | Adversarial Attack, BIG-bench Machine Learning | Code Available | 2 | 5 |
| ClinicalGPT-R1: Pushing reasoning capability of generalist disease diagnosis with large language model | Apr 13, 2025 | Diagnostic, Language Modeling | Code Available | 2 | 5 |
| DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model | Oct 22, 2024 | Decoder, Instance Segmentation | Code Available | 2 | 5 |
| PyKEEN 1.0: A Python Library for Training and Evaluating Knowledge Graph Embeddings | Jul 28, 2020 | Graph Embedding, Knowledge Graph Embedding | Code Available | 2 | 5 |
| Dense Connector for MLLMs | May 22, 2024 | Video Understanding | Code Available | 2 | 5 |
| Functional-Group-Based Diffusion for Pocket-Specific Molecule Generation and Elaboration | May 30, 2023 | Drug Design | Code Available | 2 | 5 |
| Rethinking Bradley-Terry Models in Preference-Based Reward Modeling: Foundations, Theory, and Alternatives | Nov 7, 2024 | Large Language Model | Code Available | 2 | 5 |
| Causal Diffusion Transformers for Generative Modeling | Dec 16, 2024 | Decoder, Image Generation | Code Available | 2 | 5 |
| Hardware-Aware Parallel Prompt Decoding for Memory-Efficient Acceleration of LLM Inference | May 28, 2024 | GPU, Text Generation | Code Available | 2 | 5 |
| Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement | Mar 11, 2024 | Clinical Knowledge, Descriptive | Code Available | 2 | 5 |
| Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation | Jul 3, 2025 | Diversity, Video Generation | Code Available | 2 | 5 |
| When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning | Mar 10, 2025 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| When Do LLMs Help With Node Classification? A Comprehensive Analysis | Feb 2, 2025 | Node Classification | Code Available | 2 | 5 |
| GhostNetV2: Enhance Cheap Operation with Long-Range Attention | Nov 23, 2022 | | Code Available | 2 | 5 |
| A Unified Transformer Framework for Group-based Segmentation: Co-Segmentation, Co-Saliency Detection and Video Salient Object Detection | Mar 9, 2022 | Co-Salient Object Detection, object-detection | Code Available | 2 | 5 |
| Atlas: End-to-End 3D Scene Reconstruction from Posed Images | Mar 23, 2020 | 3D Reconstruction, 3D Scene Reconstruction | Code Available | 2 | 5 |
| Federated Learning in Mobile Networks: A Comprehensive Case Study on Traffic Forecasting | Dec 5, 2024 | Federated Learning, Management | Code Available | 2 | 5 |
| Toward Automated Algorithm Design: A Survey and Practical Guide to Meta-Black-Box-Optimization | Nov 1, 2024 | Computational Efficiency, In-Context Learning | Code Available | 2 | 5 |
| MMRL++: Parameter-Efficient and Interaction-Aware Representation Learning for Vision-Language Models | May 15, 2025 | General Knowledge, Prompt Engineering | Code Available | 2 | 5 |
| SpecDETR: A Transformer-based Hyperspectral Point Object Detection Network | May 16, 2024 | Binary Classification, Decoder | Code Available | 2 | 5 |
| Nix-TTS: Lightweight and End-to-End Text-to-Speech via Module-wise Distillation | Mar 29, 2022 | CPU, Decoder | Code Available | 2 | 5 |
| Are Self-Attentions Effective for Time Series Forecasting? | May 27, 2024 | Time Series, Time Series Forecasting | Code Available | 2 | 5 |
| Autoformalizing Euclidean Geometry | May 27, 2024 | Math | Code Available | 2 | 5 |
| HyperGAN-CLIP: A Unified Framework for Domain Adaptation, Image Synthesis and Manipulation | Nov 19, 2024 | Domain Adaptation, Image Generation | Code Available | 2 | 5 |
| Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction | Apr 21, 2025 | Math | Code Available | 2 | 5 |
| HybridNets: End-to-End Perception Network | Mar 17, 2022 | Autonomous Driving, Drivable Area Detection | Code Available | 2 | 5 |
| Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training | May 23, 2023 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| Recommendation as Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) | Mar 24, 2022 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| D-Flow: Differentiating through Flows for Controlled Generation | Feb 21, 2024 | | Code Available | 2 | 5 |
| REACT: Real-time Efficiency and Accuracy Compromise for Tradeoffs in Scene Graph Generation | May 25, 2024 | Graph Generation, Object | Code Available | 2 | 5 |
| Med3DVLM: An Efficient Vision-Language Model for 3D Medical Image Analysis | Mar 25, 2025 | Contrastive Learning, Image-text Retrieval | Code Available | 2 | 5 |
| MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning | Jun 5, 2025 | Math, Mathematical Reasoning | Code Available | 2 | 5 |
| Cross-video Identity Correlating for Person Re-identification Pre-training | Sep 27, 2024 | Denoising, Person Re-Identification | Code Available | 2 | 5 |
| Spatial-Mamba: Effective Visual State Space Models via Structure-Aware State Fusion | Oct 19, 2024 | image-classification, Image Classification | Code Available | 2 | 5 |
| FCN: Fusing Exponential and Linear Cross Network for Click-Through Rate Prediction | Jul 18, 2024 | Click-Through Rate Prediction | Code Available | 2 | 5 |
| SMPLer: Taming Transformers for Monocular 3D Human Shape and Pose Estimation | Apr 23, 2024 | 3D Human Pose Estimation, Pose Estimation | Code Available | 2 | 5 |
| Wavelet-based Mamba with Fourier Adjustment for Low-light Image Enhancement | Oct 27, 2024 | Decoder, Image Enhancement | Code Available | 2 | 5 |
| Learning Vision from Models Rivals Learning Vision from Data | Dec 28, 2023 | Contrastive Learning, Image Captioning | Code Available | 2 | 5 |
| Enhancing Retrieval-Augmented Generation: A Study of Best Practices | Jan 13, 2025 | In-Context Learning, RAG | Code Available | 2 | 5 |
| A Stem-Agnostic Single-Decoder System for Music Source Separation Beyond Four Stems | Jun 26, 2024 | Audio Source Separation, Decoder | Code Available | 2 | 5 |
| ReCLIP++: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation | Aug 13, 2024 | Segmentation, Semantic Segmentation | Code Available | 2 | 5 |
| MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search | Mar 26, 2025 | Decision Making, RAG | Code Available | 2 | 5 |
| Correlation Matching Transformation Transformers for UHD Image Restoration | Jun 2, 2024 | Deblurring, Image Deblurring | Code Available | 2 | 5 |
| Me LLaMA: Foundation Large Language Models for Medical Applications | Feb 20, 2024 | Few-Shot Learning, GPU | Code Available | 2 | 5 |