| DHIL-GT: Scalable Graph Transformer with Decoupled Hierarchy Labeling | Dec 6, 2024 | GPU | Unverified | 0 |
| Flash Communication: Reducing Tensor Parallelization Bottleneck for Fast Large Language Model Inference | Dec 6, 2024 | GPU, Language Modeling | Unverified | 0 |
| SKIM: Any-bit Quantization Pushing The Limits of Post-Training Quantization | Dec 5, 2024 | Clustering, GPU | Unverified | 0 |
| Assessing and Learning Alignment of Unimodal Vision and Language Models | Dec 5, 2024 | GPU, Semantic Segmentation | Unverified | 0 |
| CLAP: Unsupervised 3D Representation Learning for Fusion 3D Perception via Curvature Sampling and Prototype Learning | Dec 4, 2024 | GPU, Representation Learning | Unverified | 0 |
| Unifying KV Cache Compression for Large Language Models with LeanKV | Dec 4, 2024 | GPU, Quantization | Unverified | 0 |
| Diffusion-VLA: Generalizable and Interpretable Robot Foundation Model via Self-Generated Reasoning | Dec 4, 2024 | GPU | Unverified | 0 |
| FlashAttention on a Napkin: A Diagrammatic Approach to Deep Learning IO-Awareness | Dec 4, 2024 | GPU, Quantization | Unverified | 0 |
| SJTU:Spatial judgments in multimodal models towards unified segmentation through coordinate detection | Dec 3, 2024 | GPU, Image Segmentation | Code Available | 0 |
| Can't Slow me Down: Learning Robust and Hardware-Adaptive Object Detectors against Latency Attacks for Edge Devices | Dec 3, 2024 | Autonomous Driving, GPU | Unverified | 0 |
| Improving feature interactions at Pinterest under industry constraints | Dec 2, 2024 | GPU, Recommendation Systems | Unverified | 0 |
| MamKPD: A Simple Mamba Baseline for Real-Time 2D Keypoint Detection | Dec 2, 2024 | Animal Pose Estimation, GPU | Unverified | 0 |
| Memory-Efficient Training for Deep Speaker Embedding Learning in Speaker Verification | Dec 2, 2024 | GPU, Quantization | Unverified | 0 |
| Quantization-Aware Imitation-Learning for Resource-Efficient Robotic Control | Dec 2, 2024 | Autonomous Driving, Decision Making | Unverified | 0 |
| BlendPCR: Seamless and Efficient Rendering of Dynamic Point Clouds captured by Multiple RGB-D Cameras | Dec 1, 2024 | GPU, NeRF | Code Available | 0 |
| HT-HEDL: High-Throughput Hypothesis Evaluation in Description Logic | Dec 1, 2024 | CPU, GPU | Unverified | 0 |
| SPILDL: A Scalable and Parallel Inductive Learner in Description Logic | Dec 1, 2024 | CPU, GPU | Unverified | 0 |
| PAL -- Parallel active learning for machine-learned potentials | Nov 30, 2024 | Active Learning, CPU | Code Available | 0 |
| Open source Differentiable ODE Solving Infrastructure | Nov 29, 2024 | GPU | Unverified | 0 |
| Look Every Frame All at Once: Video-Ma^2mba for Efficient Long-form Video Understanding with Multi-Axis Gradient Checkpointing | Nov 29, 2024 | All, Form | Unverified | 0 |
| BatchLLM: Optimizing Large Batched LLM Inference with Global Prefix Sharing and Throughput-oriented Token Batching | Nov 29, 2024 | GPU, Management | Unverified | 0 |
| A Simple Sparse Matrix Vector Multiplication Approach to Padded Convolution | Nov 29, 2024 | CPU, GPU | Code Available | 0 |
| An Integrated Artificial Intelligence Operating System for Advanced Low-Altitude Aviation Applications | Nov 28, 2024 | Computational Efficiency, CPU | Unverified | 0 |
| Puzzle: Distillation-Based NAS for Inference-Optimized LLMs | Nov 28, 2024 | GPU, Knowledge Distillation | Unverified | 0 |
| Differentiable Topology Estimating from Curvatures for 3D Shapes | Nov 28, 2024 | GPU, Topological Data Analysis | Unverified | 0 |