| Fast On-device LLM Inference with NPUs | Jul 8, 2024 | CPU, GPU | Code Available | 5 |
| XFeat: Accelerated Features for Lightweight Image Matching | Apr 30, 2024 | CPU, Keypoint detection and image matching | Code Available | 5 |
| Extreme Compression of Large Language Models via Additive Quantization | Jan 11, 2024 | CPU, GPU | Code Available | 5 |
| PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU | Dec 16, 2023 | CPU, GPU | Code Available | 5 |
| Faster Segment Anything: Towards Lightweight SAM for Mobile Applications | Jun 25, 2023 | CPU, Decoder | Code Available | 5 |
| FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU | Mar 13, 2023 | CPU, GPU | Code Available | 5 |
| Vectorized and performance-portable Quicksort | May 12, 2022 | CPU | Code Available | 5 |
| 70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float | Apr 15, 2025 | CPU, GPU | Code Available | 4 |
| SocialED: A Python Library for Social Event Detection | Dec 18, 2024 | CPU, Event Detection | Code Available | 4 |
| InternLM2.5-StepProver: Advancing Automated Theorem Proving via Expert Iteration on Large-Scale LEAN Problems | Oct 21, 2024 | Automated Theorem Proving, CPU | Code Available | 4 |