SOTAVerified

GPU

Papers

Showing 226–250 of 5629 papers

| Title | Status | Hype |
|---|---|---|
| MagicPIG: LSH Sampling for Efficient LLM Generation | Code | 3 |
| Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence | Code | 3 |
| MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding | Code | 3 |
| A GPU-specialized Inference Parameter Server for Large-Scale Deep Recommendation Models | Code | 3 |
| GPU-accelerated Evolutionary Many-objective Optimization Using Tensorized NSGA-III | Code | 3 |
| BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models | Code | 3 |
| Arctic Inference with Shift Parallelism: Fast and Efficient Open Source Inference System for Enterprise AI | Code | 3 |
| Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy | Code | 3 |
| LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via a Hybrid Architecture | Code | 3 |
| Dataset Distillation with Neural Characteristic Function: A Minmax Perspective | Code | 3 |
| Arctic Long Sequence Training: Scalable And Efficient Training For Multi-Million Token Sequences | Code | 3 |
| GraphNeuralNetworks.jl: Deep Learning on Graphs with Julia | Code | 3 |
| LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale | Code | 3 |
| Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models | Code | 3 |
| CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation | Code | 3 |
| LinFusion: 1 GPU, 1 Minute, 16K Image | Code | 3 |
| HadaCore: Tensor Core Accelerated Hadamard Transform Kernel | Code | 3 |
| Cramming: Training a Language Model on a Single GPU in One Day | Code | 3 |
| ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters | Code | 3 |
| High-Speed Stereo Visual SLAM for Low-Powered Computing Devices | Code | 3 |
| Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning | Code | 3 |
| KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization | Code | 3 |
| How Well Do Supervised 3D Models Transfer to Medical Imaging Tasks? | Code | 3 |
| Transformers Can Do Arithmetic with the Right Embeddings | Code | 3 |
| LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management | Code | 3 |
Page 10 of 226

No leaderboard results yet.