| GraphFM: A Comprehensive Benchmark for Graph Foundation Model | Jun 12, 2024 | GPU, Graph Neural Network | Code Available | 0 |
| ProTrain: Efficient LLM Training via Memory-Aware Techniques | Jun 12, 2024 | CPU, GPU | Unverified | 0 |
| PreSto: An In-Storage Data Preprocessing System for Training Recommendation Models | Jun 11, 2024 | CPU, GPU | Unverified | 0 |
| Sustainable self-supervised learning for speech representations | Jun 11, 2024 | GPU, Self-Supervised Learning | Unverified | 0 |
| Is One GPU Enough? Pushing Image Generation at Higher-Resolutions with Foundation Models | Jun 11, 2024 | Diversity, GPU | Code Available | 2 |
| Benchmarking and Boosting Radiology Report Generation for 3D High-Resolution Medical Images | Jun 11, 2024 | Benchmarking, GPU | Unverified | 0 |
| FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion | Jun 11, 2024 | GPU | Code Available | 5 |
| VoxNeuS: Enhancing Voxel-Based Neural Surface Reconstruction via Gradient Interpolation | Jun 11, 2024 | GPU, Surface Reconstruction | Unverified | 0 |
| Low-Rank Quantization-Aware Training for LLMs | Jun 10, 2024 | GPU, parameter-efficient fine-tuning | Code Available | 2 |
| Label-Looping: Highly Efficient Decoding for Transducers | Jun 10, 2024 | GPU, speech-recognition | Unverified | 0 |
| Merlin: A Vision Language Foundation Model for 3D Computed Tomography | Jun 10, 2024 | 3D Semantic Segmentation, Computed Tomography (CT) | Code Available | 3 |
| Mamba YOLO: A Simple Baseline for Object Detection with State Space Model | Jun 9, 2024 | GPU, Mamba | Code Available | 4 |
| TLCM: Training-efficient Latent Consistency Model for Image Generation with 2-8 Steps | Jun 9, 2024 | GPU, Image Generation | Code Available | 1 |
| Spectrum: Targeted Training on Signal to Noise Ratio | Jun 7, 2024 | GPU | Code Available | 2 |
| MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter | Jun 7, 2024 | CPU, GPU | Code Available | 1 |
| Enhancing Large-Scale AI Training Efficiency: The C4 Solution for Real-Time Anomaly Detection and Communication Optimization | Jun 7, 2024 | Anomaly Detection, GPU | Unverified | 0 |
| Quality-Diversity with Limited Resources | Jun 6, 2024 | Diversity, GPU | Code Available | 0 |
| ReDistill: Residual Encoded Distillation for Peak Memory Reduction | Jun 6, 2024 | Denoising, GPU | Unverified | 0 |
| Global Parameterization-based Texture Space Optimization | Jun 6, 2024 | GPU | Unverified | 0 |
| Speed of Light Exact Greedy Decoding for RNN-T Speech Recognition Models on GPU | Jun 6, 2024 | GPU, speech-recognition | Unverified | 0 |
| Latent Neural Operator for Solving Forward and Inverse PDE Problems | Jun 6, 2024 | Computational Efficiency, GPU | Code Available | 2 |
| Flash3D: Feed-Forward Generalisable 3D Scene Reconstruction from a Single Image | Jun 6, 2024 | 3D Scene Reconstruction, Depth Estimation | Code Available | 3 |
| Queue management for slo-oriented large language model serving | Jun 5, 2024 | Blocking, GPU | Code Available | 1 |
| Zeroth-Order Fine-Tuning of LLMs with Extreme Sparsity | Jun 5, 2024 | GPU, Quantization | Unverified | 0 |
| A Flexible Recursive Network for Video Stereo Matching Based on Residual Estimation | Jun 5, 2024 | GPU, Stereo Matching | Code Available | 0 |
| Searching Priors Makes Text-to-Video Synthesis Better | Jun 5, 2024 | GPU | Unverified | 0 |
| A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection | Jun 5, 2024 | Anomaly Detection, Benchmarking | Unverified | 0 |
| Multi-Stage Speech Bandwidth Extension with Flexible Sampling Rate Control | Jun 4, 2024 | Bandwidth Extension, CPU | Code Available | 2 |
| Scalable MatMul-free Language Modeling | Jun 4, 2024 | GPU, Language Modeling | Code Available | 7 |
| Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning | Jun 4, 2024 | document understanding, GPU | Code Available | 1 |
| A Study of Optimizations for Fine-tuning Large Language Models | Jun 4, 2024 | GPU | Unverified | 0 |
| Speeding up Policy Simulation in Supply Chain RL | Jun 4, 2024 | GPU | Unverified | 0 |
| Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation | Jun 4, 2024 | Face Swapping, GPU | Code Available | 4 |
| LlamaCare: A Large Medical Language Model for Enhancing Healthcare Knowledge Sharing | Jun 4, 2024 | Classification, GPU | Code Available | 1 |
| SUBLLM: A Novel Efficient Architecture with Token Sequence Subsampling for LLM | Jun 3, 2024 | Decoder, GPU | Code Available | 2 |
| ACCO: Accumulate While You Communicate for Communication-Overlapped Sharded LLM Training | Jun 3, 2024 | Distributed Optimization, Federated Learning | Code Available | 1 |
| GPU-Accelerated Rule Evaluation and Evolution | Jun 3, 2024 | Explainable artificial intelligence, GPU | Unverified | 0 |
| OLoRA: Orthonormal Low-Rank Adaptation of Large Language Models | Jun 3, 2024 | GPU, Language Modeling | Unverified | 0 |
| D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models | Jun 3, 2024 | GPU, Math | Unverified | 0 |
| CE-NAS: An End-to-End Carbon-Efficient Neural Architecture Search Framework | Jun 3, 2024 | GPU, Neural Architecture Search | Unverified | 0 |
| Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow | Jun 3, 2024 | GPU, Language Modeling | Code Available | 2 |
| ZeroSmooth: Training-free Diffuser Adaptation for High Frame Rate Video Generation | Jun 3, 2024 | GPU, Video Generation | Code Available | 2 |
| RGFN: Synthesizable Molecular Generation Using GFlowNets | Jun 1, 2024 | GPU | Code Available | 1 |
| Multi-Objective Neural Architecture Search by Learning Search Space Partitions | Jun 1, 2024 | Bayesian Optimization, GPU | Unverified | 0 |
| AudioLCM: Text-to-Audio Generation with Latent Consistency Models | Jun 1, 2024 | Audio Generation, Audio Synthesis | Code Available | 5 |
| Advancing Supervised Local Learning Beyond Classification with Long-term Feature Bank | Jun 1, 2024 | GPU, image-classification | Unverified | 0 |
| μLO: Compute-Efficient Meta-Generalization of Learned Optimizers | May 31, 2024 | GPU, Zero-shot Generalization | Code Available | 1 |
| S3D: A Simple and Cost-Effective Self-Speculative Decoding Scheme for Low-Memory GPUs | May 30, 2024 | GPU, Quantization | Unverified | 0 |
| MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion | May 30, 2024 | Denoising, GPU | Code Available | 3 |
| Knowledge Graph Tuning: Real-time Large Language Model Personalization based on Human Feedback | May 30, 2024 | GPU, Knowledge Graphs | Unverified | 0 |