| Accelerating Neural Network Training: An Analysis of the AlgoPerf Competition | Feb 20, 2025 | | Code Available | 3 |
| UniVS: Unified and Universal Video Segmentation with Prompts as Queries | Feb 28, 2024 | Decoder; Referring Expression Segmentation | Code Available | 3 |
| PsyDT: Using LLMs to Construct the Digital Twin of Psychological Counselor with Personalized Counseling Style for Psychological Counseling | Dec 18, 2024 | One-Shot Learning | Code Available | 3 |
| U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection | May 18, 2020 | Dichotomous Image Segmentation; GPU | Code Available | 3 |
| Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve | Mar 4, 2024 | GPU; Scheduling | Code Available | 3 |
| Late Chunking: Contextual Chunk Embeddings Using Long-Context Embedding Models | Sep 7, 2024 | Chunking; Retrieval | Code Available | 3 |
| Robust High-Resolution Video Matting with Temporal Guidance | Aug 25, 2021 | 4k; GPU | Code Available | 3 |
| DeepInteraction++: Multi-Modality Interaction for Autonomous Driving | Aug 9, 2024 | 3D Object Detection; Autonomous Driving | Code Available | 3 |
| A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models | Jul 2, 2024 | Navigate | Code Available | 3 |
| CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents | Jul 1, 2024 | Language Modeling; Language Modelling | Code Available | 3 |
| TristouNet: Triplet Loss for Speaker Turn Embedding | Sep 14, 2016 | Change Detection; Triplet | Code Available | 3 |
| A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild | Aug 23, 2020 | All; MORPH | Code Available | 3 |
| LION: Linear Group RNN for 3D Object Detection in Point Clouds | Jul 25, 2024 | 3D Object Detection; Long-range modeling | Code Available | 3 |
| Pandora3D: A Comprehensive Framework for High-Quality 3D Shape and Texture Generation | Feb 20, 2025 | 3D Shape Generation; Texture Synthesis | Code Available | 3 |
| Robust and Accurate Object Detection via Adversarial Learning | Mar 23, 2021 | AutoML; Data Augmentation | Code Available | 3 |
| LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via a Hybrid Architecture | Sep 4, 2024 | GPU; Mamba | Code Available | 3 |
| MultiModal-GPT: A Vision and Language Model for Dialogue with Humans | May 8, 2023 | Instruction Following; Language Modeling | Code Available | 3 |
| Behavior Generation with Latent Actions | Mar 5, 2024 | Autonomous Driving; Decision Making | Code Available | 3 |
| OmniAudio: Generating Spatial Audio from 360-Degree Video | Apr 21, 2025 | Audio Generation | Code Available | 3 |
| TaskGen: A Task-Based, Memory-Infused Agentic Framework using StrictJSON | Jul 22, 2024 | Language Modeling; Language Modelling | Code Available | 3 |
| Ludwig: a type-based declarative deep learning toolbox | Sep 17, 2019 | Decoder; Deep Learning | Code Available | 3 |
| Comparison of Syntactic and Semantic Representations of Programs in Neural Embeddings | Jan 24, 2020 | Program Synthesis | Code Available | 3 |
| WantWords: An Open-source Online Reverse Dictionary System | Oct 1, 2020 | Reverse Dictionary | Code Available | 3 |
| Implicit Style-Content Separation using B-LoRA | Mar 21, 2024 | Image Stylization; Style Transfer | Code Available | 3 |
| EfficientQAT: Efficient Quantization-Aware Training for Large Language Models | Jul 10, 2024 | GPU; Quantization | Code Available | 3 |
| N-LTP: An Open-source Neural Language Technology Platform for Chinese | Sep 24, 2020 | Chinese Word Segmentation; Dependency Parsing | Code Available | 3 |
| AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents | Jan 24, 2024 | Benchmarking | Code Available | 3 |
| EfficientDet: Scalable and Efficient Object Detection | Nov 20, 2019 | AutoML; Object | Code Available | 3 |
| Auto-Sklearn 2.0: Hands-free AutoML via Meta-Learning | Jul 8, 2020 | AutoML; BIG-bench Machine Learning | Code Available | 3 |
| LongRoPE2: Near-Lossless LLM Context Window Scaling | Feb 27, 2025 | | Code Available | 3 |
| A Novel Non-population-based Meta-heuristic Optimizer Inspired by the Philosophy of Yi Jing | Apr 17, 2021 | Philosophy | Code Available | 3 |
| CarDreamer: Open-Source Learning Platform for World Model based Autonomous Driving | May 15, 2024 | Autonomous Driving; Autonomous Vehicles | Code Available | 3 |
| Practical Video Object Detection via Feature Selection and Aggregation | Jul 29, 2024 | feature selection; GPU | Code Available | 3 |
| LIMR: Less is More for RL Scaling | Feb 17, 2025 | | Code Available | 3 |
| WebCanvas: Benchmarking Web Agents in Online Environments | Jun 18, 2024 | AI Agent; Benchmarking | Code Available | 3 |
| ptwt - The PyTorch Wavelet Toolbox | Mar 1, 2024 | | Code Available | 3 |
| UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling | Aug 9, 2024 | GPU; Language Modeling | Code Available | 3 |
| DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions | Mar 2, 2024 | Neural Architecture Search | Code Available | 3 |
| Syzygy of Thoughts: Improving LLM CoT with the Minimal Free Resolution | Apr 13, 2025 | GSM8K; Math | Code Available | 3 |
| MTVCrafter: 4D Motion Tokenization for Open-World Human Image Animation | May 15, 2025 | Image Animation; Video Generation | Code Available | 3 |
| mlpack 3: a fast, flexible machine learning library | Jun 18, 2018 | Benchmarking; BIG-bench Machine Learning | Code Available | 3 |
| Large Language Model-Brained GUI Agents: A Survey | Nov 27, 2024 | Code Generation; Language Modeling | Code Available | 3 |
| Rethinking Histology Slide Digitization Workflows for Low-Resource Settings | May 13, 2024 | Deblurring; whole slide images | Code Available | 3 |
| Allo: A Programming Model for Composable Accelerator Design | Apr 7, 2024 | GPU; High-Level Synthesis | Code Available | 3 |
| CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations | Feb 6, 2024 | Visual Reasoning | Code Available | 3 |
| Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context | Mar 8, 2024 | 1 Image, 2*2 Stitching; Code Generation | Code Available | 3 |
| ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference | Oct 28, 2024 | CPU | Code Available | 3 |
| PyTorch Metric Learning | Aug 20, 2020 | Metric Learning | Code Available | 3 |
| ReasonIR: Training Retrievers for Reasoning Tasks | Apr 29, 2025 | Information Retrieval; MMLU | Code Available | 3 |
| OCR-free Document Understanding Transformer | Nov 30, 2021 | Document Image Classification; document understanding | Code Available | 3 |