| InstInfer: In-Storage Attention Offloading for Cost-Effective Long-Context LLM Inference | Sep 8, 2024 | Edge-computing, GPU | Unverified | 0 |
| From Computation to Consumption: Exploring the Compute-Energy Link for Training and Testing Neural Networks for SED Systems | Sep 8, 2024 | Audio Tagging, Event Detection | Unverified | 0 |
| MultiCounter: Multiple Action Agnostic Repetition Counting in Untrimmed Videos | Sep 6, 2024 | GPU, Repetitive Action Counting | Unverified | 0 |
| Confidential Computing on NVIDIA Hopper GPUs: A Performance Benchmark Study | Sep 6, 2024 | CPU, GPU | Unverified | 0 |
| Differentiable Discrete Event Simulation for Queuing Network Control | Sep 5, 2024 | GPU, Reinforcement Learning (RL) | Unverified | 0 |
| LMLT: Low-to-high Multi-Level Vision Transformer for Image Super-Resolution | Sep 5, 2024 | GPU, Image Super-Resolution | Code Available | 1 |
| mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding | Sep 5, 2024 | Document Understanding, GPU | Unverified | 0 |
| Hardware Acceleration of LLMs: A comprehensive survey and comparison | Sep 5, 2024 | GPU, Survey | Unverified | 0 |
| LowFormer: Hardware Efficient Design for Convolutional Transformer Backbones | Sep 5, 2024 | CPU, GPU | Code Available | 1 |
| ISO: Overlap of Computation and Communication within Seqenence For LLM Inference | Sep 4, 2024 | GPU, Language Modeling | Unverified | 0 |