| PostEdit: Posterior Sampling for Efficient Zero-Shot Image Editing | Oct 7, 2024 | GPU | Code Available | 1 |
| CTC-GMM: CTC guided modality matching for fast and accurate streaming speech translation | Oct 7, 2024 | GPU, Machine Translation | Unverified | 0 |
| Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective | Oct 6, 2024 | CPU, GPU | Code Available | 1 |
| PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms | Oct 5, 2024 | Benchmarking, GPU | Unverified | 0 |
| High-Speed Stereo Visual SLAM for Low-Powered Computing Devices | Oct 5, 2024 | GPU | Code Available | 3 |
| Fast Object Detection with a Machine Learning Edge Device | Oct 5, 2024 | Autonomous Navigation, CPU | Unverified | 0 |
| Dolphin: A Programmable Framework for Scalable Neurosymbolic Learning | Oct 4, 2024 | CPU, Deep Learning | Unverified | 0 |
| Compute Or Load KV Cache? Why Not Both? | Oct 4, 2024 | GPU | Unverified | 0 |
| SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation | Oct 4, 2024 | 16k, Code Generation | Code Available | 3 |
| LoRC: Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy | Oct 4, 2024 | GPU, Low-rank compression | Unverified | 0 |