SOTAVerified

GPU Papers

Showing 51–100 of 5,629 papers

Title | Status | Hype
XGrammar: Flexible and Efficient Structured Generation Engine for Large Language Models | Code | 5
FlashAudio: Rectified Flows for Fast and High-Fidelity Text-to-Audio Generation | Code | 5
KBLaM: Knowledge Base augmented Language Model | Code | 5
MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models | Code | 5
Fast On-device LLM Inference with NPUs | Code | 5
FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion | Code | 5
AudioLCM: Text-to-Audio Generation with Latent Consistency Models | Code | 5
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning | Code | 5
Extreme Compression of Large Language Models via Additive Quantization | Code | 5
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU | Code | 5
LongQLoRA: Efficient and Effective Method to Extend Context Length of Large Language Models | Code | 5
ReLoRA: High-Rank Training Through Low-Rank Updates | Code | 5
Faster Segment Anything: Towards Lightweight SAM for Mobile Applications | Code | 5
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU | Code | 5
EfficientRep: An Efficient Repvgg-style ConvNets with Hardware-aware Neural Network Design | Code | 5
YOLOv6 v3.0: A Full-Scale Reloading | Code | 5
Orbit: A Unified Simulation Framework for Interactive Robot Learning Environments | Code | 5
Point-E: A System for Generating 3D Point Clouds from Complex Prompts | Code | 5
Deep Lake: a Lakehouse for Deep Learning | Code | 5
YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications | Code | 5
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale | Code | 5
TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second | Code | 5
Multi-head Temporal Latent Attention | Code | 4
Accelerating Visual-Policy Learning through Parallel Differentiable Simulation | Code | 4
OnPrem.LLM: A Privacy-Conscious Document Intelligence Toolkit | Code | 4
Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints | Code | 4
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float | Code | 4
LettuceDetect: A Hallucination Detection Framework for RAG Applications | Code | 4
Building reliable sim driving agents by scaling self-play | Code | 4
KernelBench: Can LLMs Write Efficient GPU Kernels? | Code | 4
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token | Code | 4
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization | Code | 4
SocialED: A Python Library for Social Event Detection | Code | 4
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models | Code | 4
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads | Code | 4
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts | Code | 4
Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding | Code | 4
EmbodiedSAM: Online Segment Any 3D Thing in Real Time | Code | 4
Deep Patch Visual SLAM | Code | 4
GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS | Code | 4
NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals | Code | 4
fVDB: A Deep-Learning Framework for Sparse, Large-Scale, and High-Performance Spatial Intelligence | Code | 4
On Scaling Up 3D Gaussian Splatting Training | Code | 4
Mamba YOLO: A Simple Baseline for Object Detection with State Space Model | Code | 4
Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation | Code | 4
Looking Backward: Streaming Video-to-Video Translation with Feature Banks | Code | 4
Vidur: A Large-Scale Simulation Framework For LLM Inference | Code | 4
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving | Code | 4
Mamba-FETrack: Frame-Event Tracking via State Space Model | Code | 4
JetMoE: Reaching Llama2 Performance with 0.1M Dollars | Code | 4
Page 2 of 113
