| TimeRL: Efficient Deep Reinforcement Learning with Polyhedral Dependence Graphs | Jan 9, 2025 | CPU, Deep Reinforcement Learning | —Unverified | 0 |
| Decentralized Diffusion Models | Jan 9, 2025 | GPU | —Unverified | 0 |
| iServe: An Intent-based Serving System for LLMs | Jan 8, 2025 | GPU | —Unverified | 0 |
| Scaling Large Language Model Training on Frontier with Low-Bandwidth Partitioning | Jan 8, 2025 | GPU, Language Modeling | —Unverified | 0 |
| asanAI: In-Browser, No-Code, Offline-First Machine Learning Toolkit | Jan 7, 2025 | GPU | —Unverified | 0 |
| LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token | Jan 7, 2025 | GPU, Visual Question Answering (VQA) | Code Available | 4 |
| A GPU Implementation of Multi-Guiding Spark Fireworks Algorithm for Efficient Black-Box Neural Network Optimization | Jan 7, 2025 | Computational Efficiency, CPU | Code Available | 0 |
| mFabric: An Efficient and Scalable Fabric for Mixture-of-Experts Training | Jan 7, 2025 | Blocking, GPU | —Unverified | 0 |
| The Artificial Scientist -- in-transit Machine Learning of Plasma Simulations | Jan 6, 2025 | GPU | —Unverified | 0 |
| Single-Channel Distance-Based Source Separation for Mobile GPU in Outdoor and Indoor Environments | Jan 6, 2025 | GPU | —Unverified | 0 |
| TAPAS: Thermal- and Power-Aware Scheduling for LLM Inference in Cloud Platforms | Jan 5, 2025 | GPU, Quantization | —Unverified | 0 |
| LeetDecoding: A PyTorch Library for Exponentially Decaying Causal Linear Attention with CUDA Implementations | Jan 5, 2025 | GPU | Code Available | 1 |
| DeServe: Towards Affordable Offline LLM Inference via Decentralization | Jan 4, 2025 | GPU, Language Modeling | —Unverified | 0 |
| RadarNeXt: Real-Time and Reliable 3D Object Detector Based On 4D mmWave Imaging Radar | Jan 4, 2025 | 3D Object Detection, 3D Object Detection (RoI) | Code Available | 1 |
| The Race to Efficiency: A New Perspective on AI Scaling Laws | Jan 4, 2025 | GPU | —Unverified | 0 |
| Operator Learning for Reconstructing Flow Fields from Sparse Measurements: an Energy Transformer Approach | Jan 2, 2025 | Geophysics, GPU | —Unverified | 0 |
| FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving | Jan 2, 2025 | GPU, Scheduling | Code Available | 9 |
| FED: Fast and Efficient Dataset Deduplication Framework with GPU Acceleration | Jan 2, 2025 | CPU, GPU | Code Available | 0 |
| Mamba4D: Efficient 4D Point Cloud Video Understanding with Disentangled Spatial-Temporal State Space Models | Jan 1, 2025 | Action Recognition, Action Segmentation | Code Available | 1 |
| FlexGS: Train Once, Deploy Everywhere with Many-in-One Flexible 3D Gaussian Splatting | Jan 1, 2025 | 3DGS, GPU | —Unverified | 0 |
| DaCapo: Score Distillation as Stacked Bridge for Fast and High-quality 3D Editing | Jan 1, 2025 | 3D scene Editing, Attribute | —Unverified | 0 |
| Efficient Video Super-Resolution for Real-time Rendering with Decoupled G-buffer Guidance | Jan 1, 2025 | GPU, Super-Resolution | —Unverified | 0 |
| AdaCM^2: On Understanding Extremely Long-Term Video with Adaptive Cross-Modality Memory Reduction | Jan 1, 2025 | GPU, Question Answering | —Unverified | 0 |
| Building Vision Models upon Heat Conduction | Jan 1, 2025 | GPU | —Unverified | 0 |
| Breaking the Memory Barrier of Contrastive Loss via Tile-Based Strategy | Jan 1, 2025 | GPU, Representation Learning | —Unverified | 0 |
| Dataset Distillation with Neural Characteristic Function: A Minmax Perspective | Jan 1, 2025 | Computational Efficiency, Dataset Distillation | Code Available | 3 |
| Higher-Order Ratio Cycles for Fast and Globally Optimal Shape Matching | Jan 1, 2025 | GPU, Image Segmentation | Code Available | 0 |
| ICP: Immediate Compensation Pruning for Mid-to-high Sparsity | Jan 1, 2025 | GPU | —Unverified | 0 |
| Minimal Interaction Seperated Tuning: A New Paradigm for Visual Adaptation | Jan 1, 2025 | CPU, GPU | —Unverified | 0 |
| IGC: Integrating a Gated Calculator into an LLM to Solve Arithmetic Tasks Reliably and Efficiently | Jan 1, 2025 | GPU | —Unverified | 0 |
| AttriReBoost: A Gradient-Free Propagation Optimization Method for Cold Start Mitigation in Attribute Missing Graphs | Jan 1, 2025 | Attribute, Computational Efficiency | Code Available | 0 |
| Adjoint sharding for very long context training of state space models | Jan 1, 2025 | GPU, Large Language Model | —Unverified | 0 |
| Towards Sustainable Large Language Model Serving | Dec 31, 2024 | GPU, Language Modeling | —Unverified | 0 |
| Lightweight G-YOLOv11: Advancing Efficient Fracture Detection in Pediatric Wrist X-rays | Dec 31, 2024 | Fracture detection, GPU | Code Available | 1 |
| Debunking the CUDA Myth Towards GPU-based AI Systems | Dec 31, 2024 | GPU | —Unverified | 0 |
| LTX-Video: Realtime Video Latent Diffusion | Dec 30, 2024 | Denoising, GPU | Code Available | 9 |
| FastCHGNet: Training one Universal Interatomic Potential to 1.5 Hours with 32 GPUs | Dec 30, 2024 | GPU, Graph Neural Network | —Unverified | 0 |
| Efficient Multi-Task Inferencing with a Shared Backbone and Lightweight Task-Specific Adapters for Automatic Scoring | Dec 30, 2024 | Fairness, GPU | —Unverified | 0 |
| TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization | Dec 30, 2024 | Audio Generation, GPU | Code Available | 4 |
| FPGA-based Acceleration of Neural Network for Image Classification using Vitis AI | Dec 30, 2024 | 3D Reconstruction, CPU | —Unverified | 0 |
| IMSSA: Deploying modern state-space models on memristive in-memory compute hardware | Dec 28, 2024 | GPU, Quantization | —Unverified | 0 |
| MambaVO: Deep Visual Odometry Based on Sequential Matching Refinement and Training Smoothing | Dec 28, 2024 | GPU, Mamba | —Unverified | 0 |
| Towards Ideal Temporal Graph Neural Networks: Evaluations and Conclusions after 10,000 GPU Hours | Dec 28, 2024 | Benchmarking, GPU | —Unverified | 0 |
| Pushing the Envelope of Low-Bit LLM via Dynamic Error Compensation | Dec 28, 2024 | CPU, GPU | —Unverified | 0 |
| LoL-PIM: Long-Context LLM Decoding with Scalable DRAM-PIM System | Dec 28, 2024 | GPU, Management | —Unverified | 0 |
| Assessing Text Classification Methods for Cyberbullying Detection on Social Media Platforms | Dec 27, 2024 | CPU, GPU | —Unverified | 0 |
| Learning to Forget: Bayesian Time Series Forecasting using Recurrent Sparse Spectrum Signature Gaussian Processes | Dec 27, 2024 | Gaussian Processes, GPU | —Unverified | 0 |
| Paleoinspired Vision: From Exploring Colour Vision Evolution to Inspiring Camera Design | Dec 27, 2024 | GPU | —Unverified | 0 |
| RAIN: Real-time Animation of Infinite Video Stream | Dec 27, 2024 | Denoising, GPU | —Unverified | 0 |
| MBQ: Modality-Balanced Quantization for Large Vision-Language Models | Dec 27, 2024 | GPU, Quantization | Code Available | 2 |