| SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications | Nov 7, 2024 | Code Generation, Language Modeling | Code Available | 3 |
| COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training | Oct 25, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory | Oct 14, 2024 | Benchmarking, Large Language Model | Code Available | 3 |
| Baichuan-Omni Technical Report | Oct 11, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| Towards Next-Generation LLM-based Recommender Systems: A Survey and Beyond | Oct 10, 2024 | Large Language Model, Recommendation Systems | Code Available | 3 |
| LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management | Oct 1, 2024 | GPU, Language Modeling | Code Available | 3 |
| Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale | Sep 25, 2024 | Large Language Model | Code Available | 3 |
| LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale | Aug 10, 2024 | GPU, Language Modelling | Code Available | 3 |
| OptiMUS-0.3: Using Large Language Models to Model and Solve Optimization Problems at Scale | Jul 29, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| Odyssey: Empowering Minecraft Agents with Open-World Skills | Jul 22, 2024 | Language Modelling, Large Language Model | Code Available | 3 |