| TerraTorch: The Geospatial Foundation Models Toolkit | Mar 26, 2025 | Benchmarking, Decoder | Code Available | 4 |
| Video-R1: Reinforcing Video Reasoning in MLLMs | Mar 27, 2025 | MVBench, Reinforcement Learning (RL) | Code Available | 4 |
| SongBloom: Coherent Song Generation via Interleaved Autoregressive Sketching and Diffusion Refinement | Jun 9, 2025 | Music Generation | Code Available | 4 |
| SpatialTrackerV2: 3D Point Tracking Made Easy | Jul 16, 2025 | 3D Reconstruction, Camera Pose Estimation | Code Available | 4 |
| Proactive Detection of Voice Cloning with Localized Watermarking | Jan 30, 2024 | Voice Cloning | Code Available | 4 |
| Eliciting Latent Predictions from Transformers with the Tuned Lens | Mar 14, 2023 | Language Modelling | Code Available | 4 |
| REFINE: Inversion-Free Backdoor Defense via Model Reprogramming | Feb 22, 2025 | backdoor defense | Code Available | 4 |
| Relationships are Complicated! An Analysis of Relationships Between Datasets on the Web | Aug 26, 2024 | Decision Making, Multi-class Classification | Code Available | 4 |
| Benchmarking Graphormer on Large-Scale Molecular Modeling Datasets | Mar 9, 2022 | Benchmarking, Graph Regression | Code Available | 4 |
| LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment | Oct 3, 2023 | Audio Classification, Contrastive Learning | Code Available | 4 |
| SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement | Oct 26, 2024 | Large Language Model | Code Available | 4 |
| R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization | Mar 13, 2025 | Multimodal Reasoning | Code Available | 4 |
| Recurrent Partial Kernel Network for Efficient Optical Flow Estimation | Feb 1, 2024 | Optical Flow Estimation | Code Available | 4 |
| DeXtreme: Transfer of Agile In-hand Manipulation from Simulation to Reality | Oct 25, 2022 | Deep Reinforcement Learning, GPU | Code Available | 4 |
| Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning | Mar 20, 2025 | Decision Making, Language Modeling | Code Available | 4 |
| Are Transformers Effective for Time Series Forecasting? | May 26, 2022 | Anomaly Detection, Relation Extraction | Code Available | 4 |
| Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models | May 8, 2025 | Multimodal Reasoning | Code Available | 4 |
| Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation | Dec 4, 2023 | Depth Estimation, GPU | Code Available | 4 |
| SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot | Jan 2, 2023 | Common Sense Reasoning, Language Modelling | Code Available | 4 |
| AlignScore: Evaluating Factual Consistency with a Unified Alignment Function | May 26, 2023 | Fact Verification, Information Retrieval | Code Available | 4 |
| TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos | Mar 26, 2024 | 3D Human Pose Estimation | Code Available | 4 |
| TableGPT2: A Large Multimodal Model with Tabular Data Integration | Nov 4, 2024 | Benchmarking, Data Integration | Code Available | 4 |
| Human-Humanoid Robots Cross-Embodiment Behavior-Skill Transfer Using Decomposed Adversarial Learning from Demonstration | Dec 19, 2024 | Human-Object Interaction Detection, motion retargeting | Code Available | 4 |
| FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds | Jul 1, 2024 | Audio Generation, Video Alignment | Code Available | 4 |
| MovieChat+: Question-aware Sparse Memory for Long Video Question Answering | Apr 26, 2024 | 2k, Question Answering | Code Available | 4 |
| Knowledge Fusion of Chat LLMs: A Preliminary Technical Report | Feb 25, 2024 | | Code Available | 4 |
| R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning | May 22, 2025 | Memorization, RAG | Code Available | 4 |
| The case for 4-bit precision: k-bit Inference Scaling Laws | Dec 19, 2022 | Quantization | Code Available | 4 |
| ActiveAnno3D -- An Active Learning Framework for Multi-Modal 3D Object Detection | Feb 5, 2024 | 3D Object Detection, Active Learning | Code Available | 4 |
| DepGraph: Towards Any Structural Pruning | Jan 30, 2023 | Network Pruning, Neural Network Compression | Code Available | 4 |
| Improving Training Stability for Multitask Ranking Models in Recommender Systems | Feb 17, 2023 | Recommendation Systems | Code Available | 4 |
| High-Resolution Image Synthesis with Latent Diffusion Models | Dec 20, 2021 | Denoising, GPU | Code Available | 4 |
| How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources | Jun 7, 2023 | Instruction Following | Code Available | 4 |
| Decoder Tuning: Efficient Language Understanding as Decoding | Dec 16, 2022 | Decoder, Natural Language Understanding | Code Available | 4 |
| Programming Is Hard -- Or at Least It Used to Be: Educational Opportunities And Challenges of AI Code Generation | Dec 2, 2022 | Code Generation, Position | Code Available | 4 |
| The CLRS-Text Algorithmic Reasoning Language Benchmark | Jun 6, 2024 | | Code Available | 4 |
| PointMamba: A Simple State Space Model for Point Cloud Analysis | Feb 16, 2024 | GPU, Mamba | Code Available | 4 |
| Reducing Activation Recomputation in Large Transformer Models | May 10, 2022 | | Code Available | 4 |
| Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation | Feb 28, 2024 | Attribute, Extractive Question-Answering | Code Available | 4 |
| ReChorus2.0: A Modular and Task-Flexible Recommendation Library | May 28, 2024 | Click-Through Rate Prediction, Recommendation Systems | Code Available | 4 |
| FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects | Dec 13, 2023 | 3D Object Detection, 3D Object Tracking | Code Available | 4 |
| ChangeMamba: Remote Sensing Change Detection With Spatiotemporal State Space Model | Apr 4, 2024 | 2D Semantic Segmentation, Attribute | Code Available | 4 |
| Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach | Feb 7, 2025 | Language Modeling, Language Modelling | Code Available | 4 |
| SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection | Mar 11, 2024 | 2D Object Detection, 2k | Code Available | 4 |
| SNAC: Multi-Scale Neural Audio Codec | Oct 18, 2024 | Audio Compression, Audio Generation | Code Available | 4 |
| BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions | Jun 22, 2024 | Benchmarking, Code Generation | Code Available | 4 |
| Boximator: Generating Rich and Controllable Motions for Video Synthesis | Feb 2, 2024 | | Code Available | 4 |
| Phoenix: Democratizing ChatGPT across Languages | Apr 20, 2023 | Language Modeling, Language Modelling | Code Available | 4 |
| Blendify -- Python rendering framework for Blender | Oct 23, 2024 | 10-shot image generation | Code Available | 4 |
| Benchmarking Retrieval-Augmented Generation for Medicine | Feb 20, 2024 | Benchmarking, Information Retrieval | Code Available | 4 |