| Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient | Nov 26, 2024 | GPU, Image Generation | Code Available | 2 |
| k2SSL: A Faster and Better Framework for Self-Supervised Speech Representation Learning | Nov 26, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Pushing the Limits of Large Language Model Quantization via the Linearity Theorem | Nov 26, 2024 | GPU, Language Modeling | Code Available | 3 |
| Automatic Skull Reconstruction by Deep Learnable Symmetry Enforcement | Nov 26, 2024 | GPU | Unverified | 0 |
| Knowledge-aware Evolutionary Graph Neural Architecture Search | Nov 26, 2024 | GPU, Graph Neural Network | Code Available | 0 |
| KVPR: Efficient LLM Inference with I/O-Aware KV Cache Partial Recomputation | Nov 26, 2024 | CPU, GPU | Code Available | 0 |
| ADAF: An Artificial Intelligence Data Assimilation Framework for Weather Forecasting | Nov 25, 2024 | GPU, Weather Forecasting | Code Available | 1 |
| SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE | Nov 25, 2024 | 3D Generation, GPU | Unverified | 0 |
| A Data-Driven Approach to Dataflow-Aware Online Scheduling for Graph Neural Network Inference | Nov 25, 2024 | CPU, GPU | Unverified | 0 |
| Plastic Arbor: a modern simulation framework for synaptic plasticity – from single synapses to networks of morphological neurons | Nov 25, 2024 | CPU, GPU | Code Available | 0 |
| MambaTrack: Exploiting Dual-Enhancement for Night UAV Tracking | Nov 24, 2024 | GPU, Image Enhancement | Unverified | 0 |
| MobileMamba: Lightweight Multi-Receptive Visual Mamba Network | Nov 24, 2024 | GPU, Mamba | Code Available | 3 |
| Anda: Unlocking Efficient LLM Inference with a Variable-Length Grouped Activation Data Format | Nov 24, 2024 | GPU | Unverified | 0 |
| Enabling Efficient Serverless Inference Serving for LLM (Large Language Model) in the Cloud | Nov 23, 2024 | GPU, Language Modeling | Unverified | 0 |
| Multi-scale Cascaded Large-Model for Whole-body ROI Segmentation | Nov 23, 2024 | Computational Efficiency, GPU | Code Available | 0 |
| Reassessing Layer Pruning in LLMs: New Insights and Methods | Nov 23, 2024 | Benchmarking, GPU | Code Available | 0 |
| Nd-BiMamba2: A Unified Bidirectional Architecture for Multi-Dimensional Data Processing | Nov 22, 2024 | Computational Efficiency, CPU | Code Available | 3 |
| XGrammar: Flexible and Efficient Structured Generation Engine for Large Language Models | Nov 22, 2024 | GPU | Code Available | 5 |
| Simplifying CLIP: Unleashing the Power of Large-Scale Models on Consumer-level Computers | Nov 22, 2024 | Data Augmentation, GPU | Unverified | 0 |
| Deep operator network models for predicting post-burn contraction | Nov 21, 2024 | CPU, GPU | Unverified | 0 |
| Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation and Reconstruction | Nov 21, 2024 | 3D Generation, GPU | Unverified | 0 |
| Spatiotemporal Decoupling for Efficient Vision-Based Occupancy Forecasting | Nov 21, 2024 | GPU | Unverified | 0 |
| Quantization without Tears | Nov 21, 2024 | GPU, Quantization | Code Available | 1 |
| FAST-Splat: Fast, Ambiguity-Free Semantics Transfer in Gaussian Splatting | Nov 20, 2024 | Dimensionality Reduction, GPU | Unverified | 0 |
| Hardware Scaling Trends and Diminishing Returns in Large-Scale Distributed Training | Nov 20, 2024 | GPU | Unverified | 0 |