| InstInfer: In-Storage Attention Offloading for Cost-Effective Long-Context LLM Inference | Sep 8, 2024 | Edge-computing, GPU | Unverified | 0 |
| From Computation to Consumption: Exploring the Compute-Energy Link for Training and Testing Neural Networks for SED Systems | Sep 8, 2024 | Audio Tagging, Event Detection | Unverified | 0 |
| MultiCounter: Multiple Action Agnostic Repetition Counting in Untrimmed Videos | Sep 6, 2024 | GPU, Repetitive Action Counting | Unverified | 0 |
| Confidential Computing on NVIDIA Hopper GPUs: A Performance Benchmark Study | Sep 6, 2024 | CPU, GPU | Unverified | 0 |
| mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding | Sep 5, 2024 | document understanding, GPU | Unverified | 0 |
| Differentiable Discrete Event Simulation for Queuing Network Control | Sep 5, 2024 | GPU, Reinforcement Learning (RL) | Unverified | 0 |
| Hardware Acceleration of LLMs: A comprehensive survey and comparison | Sep 5, 2024 | GPU, Survey | Unverified | 0 |
| Hallucination Detection in LLMs: Fast and Memory-Efficient Fine-Tuned Models | Sep 4, 2024 | GPU, Hallucination | Code Available | 0 |
| ISO: Overlap of Computation and Communication within Seqenence For LLM Inference | Sep 4, 2024 | GPU, Language Modeling | Unverified | 0 |
| AdvSecureNet: A Python Toolkit for Adversarial Machine Learning | Sep 4, 2024 | GPU | Code Available | 0 |