| Title | Date | Tags | Code |
| --- | --- | --- | --- |
| QUIK: Towards End-to-End 4-Bit Inference on Generative Large Language Models | Oct 13, 2023 | Computational Efficiency, GPU | Code Available |
| QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models | Oct 12, 2023 | GPU, Quantization | Code Available |
| No Privacy Left Outside: On the (In-)Security of TEE-Shielded DNN Partition for On-Device ML | Oct 11, 2023 | GPU, Inference Attack | Code Available |
| Sparse Fine-tuning for Inference Acceleration of Large Language Models | Oct 10, 2023 | CPU, GPU | Code Available |
| Persis: A Persian Font Recognition Pipeline Using Convolutional Neural Networks | Oct 8, 2023 | Binarization, CPU | Code Available |
| GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models | Oct 8, 2023 | GPU, Reinforcement Learning (RL) | Code Available |
| Surgical Gym: A high-performance GPU-based platform for reinforcement learning with surgical robots | Oct 7, 2023 | Deep Reinforcement Learning, GPU | Code Available |
| Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs | Oct 3, 2023 | GPU | Code Available |
| Label Supervised LLaMA Finetuning | Oct 2, 2023 | GPU, Named-Entity Recognition | Code Available |
| Training a Large Video Model on a Single Machine in a Day | Sep 28, 2023 | Action Recognition, CPU | Code Available |