| CrossOver: 3D Scene Cross-Modal Alignment | Feb 20, 2025 | cross-modal alignment, Object | Code Available | 3 | 5 |
| Harnessing Multiple Large Language Models: A Survey on LLM Ensemble | Feb 25, 2025 | Survey | Code Available | 3 | 5 |
| BatteryLife: A Comprehensive Dataset and Benchmark for Battery Life Prediction | Feb 26, 2025 | Benchmarking, Time Series | Code Available | 3 | 5 |
| GoalFlow: Goal-Driven Flow Matching for Multimodal Trajectories Generation in End-to-End Autonomous Driving | Mar 7, 2025 | Autonomous Driving, Denoising | Code Available | 3 | 5 |
| Reinforcement Learning Outperforms Supervised Fine-Tuning: A Case Study on Audio Question Answering | Mar 14, 2025 | Audio Question Answering, Question Answering | Code Available | 3 | 5 |
| Falcon: A Remote Sensing Vision-Language Foundation Model | Mar 14, 2025 | Image Captioning, image-classification | Code Available | 3 | 5 |
| A Survey on Latent Reasoning | Jul 8, 2025 | Survey | Code Available | 3 | 5 |
| Vision-Speech Models: Teaching Speech Models to Converse about Images | Mar 19, 2025 | parameter-efficient fine-tuning | Code Available | 3 | 5 |
| Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency | Mar 26, 2025 | Denoising, Scene Generation | Code Available | 3 | 5 |
| Vision-to-Music Generation: A Survey | Mar 27, 2025 | multimodal generation, Music Generation | Code Available | 3 | 5 |
| A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond | Mar 27, 2025 | Survey | Code Available | 3 | 5 |
| AI2Agent: An End-to-End Framework for Deploying AI Projects as Autonomous Agents | Mar 31, 2025 | Image Generation, Text to Image Generation | Code Available | 3 | 5 |
| Perception-R1: Pioneering Perception Policy with Reinforcement Learning | Apr 10, 2025 | reinforcement-learning, Reinforcement Learning | Code Available | 3 | 5 |
| Learning to Reason under Off-Policy Guidance | Apr 21, 2025 | Math, Reinforcement Learning (RL) | Code Available | 3 | 5 |
| RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation | Aug 21, 2024 | RAG, Retrieval | Code Available | 3 | 5 |
| DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning | Feb 27, 2024 | Code Generation | Code Available | 3 | 5 |
| Causal-learn: Causal Discovery in Python | Jul 31, 2023 | Causal Discovery | Code Available | 3 | 5 |
| Memory Layers at Scale | Dec 12, 2024 | | Code Available | 3 | 5 |
| MoE-Infinity: Efficient MoE Inference on Personal Machines with Sparsity-Aware Expert Cache | Jan 25, 2024 | GPU, model | Code Available | 3 | 5 |
| Addressing the Abstraction and Reasoning Corpus via Procedural Example Generation | Apr 10, 2024 | ARC, Diversity | Code Available | 3 | 5 |
| A Unified Framework for Rank-based Evaluation Metrics for Link Prediction in Knowledge Graphs | Mar 14, 2022 | Benchmarking, Graph Embedding | Code Available | 3 | 5 |
| Emergent World Models and Latent Variable Estimation in Chess-Playing Language Models | Mar 21, 2024 | | Code Available | 3 | 5 |
| GiT: Towards Generalist Vision Transformer through Universal Language Interface | Mar 14, 2024 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| Champion Solution for the WSDM2023 Toloka VQA Challenge | Jan 22, 2023 | Question Answering, Visual Grounding | Code Available | 3 | 5 |
| EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars | Apr 29, 2024 | | Code Available | 3 | 5 |
| On Noise Injection in Generative Adversarial Networks | Jun 10, 2020 | Image Generation | Code Available | 3 | 5 |
| When Large Language Models Meet Vector Databases: A Survey | Jan 30, 2024 | Hallucination, Information Retrieval | Code Available | 3 | 5 |
| PyText: A Seamless Path from NLP research to production | Dec 12, 2018 | | Code Available | 3 | 5 |
| Non-Autoregressive Semantic Parsing for Compositional Task-Oriented Dialog | Apr 11, 2021 | Semantic Parsing | Code Available | 3 | 5 |
| Breaking reCAPTCHAv2 | Sep 13, 2024 | Image Segmentation, Semantic Segmentation | Code Available | 3 | 5 |
| AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning | Jun 16, 2025 | Action Generation, Autonomous Driving | Code Available | 3 | 5 |
| BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks and Defenses on Large Language Models | Aug 23, 2024 | Data Poisoning, text-classification | Code Available | 3 | 5 |
| Deep learning in motion deblurring: current status, benchmarks and future prospects | Jan 10, 2024 | Deblurring, Deep Learning | Code Available | 3 | 5 |
| LightM-UNet: Mamba Assists in Lightweight UNet for Medical Image Segmentation | Mar 8, 2024 | Image Segmentation, Mamba | Code Available | 3 | 5 |
| RT-1: Robotics Transformer for Real-World Control at Scale | Dec 13, 2022 | Diversity, Robot Manipulation | Code Available | 3 | 5 |
| AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs | Oct 3, 2024 | Red Teaming | Code Available | 3 | 5 |
| SMPLer-X: Scaling Up Expressive Human Pose and Shape Estimation | Sep 29, 2023 | 3D Human Pose Estimation, 3D Human Reconstruction | Code Available | 3 | 5 |
| Elucidating the Design Space of Multimodal Protein Language Models | Apr 15, 2025 | Diversity, Representation Learning | Code Available | 3 | 5 |
| Locate 3D: Real-World Object Localization via Self-Supervised Learning in 3D | Apr 19, 2025 | Decoder, Object Localization | Code Available | 3 | 5 |
| Generalized Robot 3D Vision-Language Model with Fast Rendering and Pre-Training Vision-Language Alignment | Dec 1, 2023 | Contrastive Learning, Few-Shot Learning | Code Available | 3 | 5 |
| Golden Gemini is All You Need: Finding the Sweet Spots for Speaker Verification | Dec 6, 2023 | All, Speaker Verification | Code Available | 3 | 5 |
| CausalML: Python Package for Causal Machine Learning | Feb 25, 2020 | BIG-bench Machine Learning, Causal Inference | Code Available | 3 | 5 |
| Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model | Mar 12, 2024 | Image Generation, Text to Image Generation | Code Available | 3 | 5 |
| Evolve Cost-aware Acquisition Functions Using Large Language Models | Apr 25, 2024 | Bayesian Optimization, Decision Making | Code Available | 3 | 5 |
| SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models | Jul 22, 2024 | Language Modeling | Code Available | 3 | 5 |
| MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models | Oct 16, 2024 | Diagnostic, Hallucination | Code Available | 3 | 5 |
| Atomic Convolutional Networks for Predicting Protein-Ligand Binding Affinity | Mar 30, 2017 | Drug Discovery, Molecular Docking | Code Available | 3 | 5 |
| Personalize Segment Anything Model with One Shot | May 4, 2023 | Image Generation, model | Code Available | 3 | 5 |
| SimpleRecon: 3D Reconstruction Without 3D Convolutions | Aug 31, 2022 | 3D Reconstruction, Depth Estimation | Code Available | 3 | 5 |
| Cyber-Attack Technique Classification Using Two-Stage Trained Large Language Models | Nov 27, 2024 | Classification, Sentence | Code Available | 3 | 5 |