| FlashAttention on a Napkin: A Diagrammatic Approach to Deep Learning IO-Awareness | Dec 4, 2024 | GPU, Quantization | Unverified | 0 |
| Diffusion-VLA: Generalizable and Interpretable Robot Foundation Model via Self-Generated Reasoning | Dec 4, 2024 | GPU | Unverified | 0 |
| Unifying KV Cache Compression for Large Language Models with LeanKV | Dec 4, 2024 | GPU, Quantization | Unverified | 0 |
| CLAP: Unsupervised 3D Representation Learning for Fusion 3D Perception via Curvature Sampling and Prototype Learning | Dec 4, 2024 | GPU, Representation Learning | Unverified | 0 |
| SJTU:Spatial judgments in multimodal models towards unified segmentation through coordinate detection | Dec 3, 2024 | GPU, Image Segmentation | Code Available | 0 |
| Can't Slow me Down: Learning Robust and Hardware-Adaptive Object Detectors against Latency Attacks for Edge Devices | Dec 3, 2024 | Autonomous Driving, GPU | Unverified | 0 |
| Improving feature interactions at Pinterest under industry constraints | Dec 2, 2024 | GPU, Recommendation Systems | Unverified | 0 |
| Quantization-Aware Imitation-Learning for Resource-Efficient Robotic Control | Dec 2, 2024 | Autonomous Driving, Decision Making | Unverified | 0 |
| MamKPD: A Simple Mamba Baseline for Real-Time 2D Keypoint Detection | Dec 2, 2024 | Animal Pose Estimation, GPU | Unverified | 0 |
| Memory-Efficient Training for Deep Speaker Embedding Learning in Speaker Verification | Dec 2, 2024 | GPU, Quantization | Unverified | 0 |