GPU

Papers

Showing 76–100 of 5629 papers

Title	Date	Tasks	Status	Hype
Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints	Apr 15, 2025	GPU, Inference Optimization	Code Available	4
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float	Apr 15, 2025	CPU, GPU	Code Available	4
LettuceDetect: A Hallucination Detection Framework for RAG Applications	Feb 24, 2025	8k, GPU	Code Available	4
Building reliable sim driving agents by scaling self-play	Feb 20, 2025	Autonomous Vehicles, Benchmarking	Code Available	4
KernelBench: Can LLMs Write Efficient GPU Kernels?	Feb 14, 2025	GPU	Code Available	4
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token	Jan 7, 2025	GPU, Visual Question Answering (VQA)	Code Available	4
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization	Dec 30, 2024	Audio Generation, GPU	Code Available	4
SocialED: A Python Library for Social Event Detection	Dec 18, 2024	CPU, Event Detection	Code Available	4
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models	Nov 7, 2024	GPU, Quantization	Code Available	4
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads	Oct 14, 2024	GPU, Quantization	Code Available	4
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts	Oct 9, 2024	GPU, Mixture-of-Experts	Code Available	4
Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding	Sep 22, 2024	Anomaly Detection, GPU	Code Available	4
EmbodiedSAM: Online Segment Any 3D Thing in Real Time	Aug 21, 2024	3D Instance Segmentation, GPU	Code Available	4
Deep Patch Visual SLAM	Aug 3, 2024	GPU, Visual Odometry	Code Available	4
GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS	Aug 2, 2024	GPU, Navigate	Code Available	4
NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals	Jul 18, 2024	Experimental Design, GPU	Code Available	4
fVDB: A Deep-Learning Framework for Sparse, Large-Scale, and High-Performance Spatial Intelligence	Jul 1, 2024	GPU, Point cloud reconstruction	Code Available	4
On Scaling Up 3D Gaussian Splatting Training	Jun 26, 2024	3DGS, 3D Reconstruction	Code Available	4
Mamba YOLO: A Simple Baseline for Object Detection with State Space Model	Jun 9, 2024	GPU, Mamba	Code Available	4
Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation	Jun 4, 2024	Face Swapping, GPU	Code Available	4
Looking Backward: Streaming Video-to-Video Translation with Feature Banks	May 24, 2024	GPU, Translation	Code Available	4
Vidur: A Large-Scale Simulation Framework For LLM Inference	May 8, 2024	CPU, GPU	Code Available	4
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving	May 7, 2024	GPU, Language Modelling	Code Available	4
Mamba-FETrack: Frame-Event Tracking via State Space Model	Apr 28, 2024	GPU, Mamba	Code Available	4
JetMoE: Reaching Llama2 Performance with 0.1M Dollars	Apr 11, 2024	GPU, Mixture-of-Experts	Code Available	4
