SOTAVerified

GPU

Papers

Showing 301-350 of 5629 papers

Title | Status | Hype
Scaling Down Text Encoders of Text-to-Image Diffusion Models | Code | 2
BitDecoding: Unlocking Tensor Cores for Long-Context LLMs Decoding with Low-Bit KV Cache | Code | 2
Splat-LOAM: Gaussian Splatting LiDAR Odometry and Mapping | Code | 2
DynamicVis: An Efficient and General Visual Foundation Model for Remote Sensing Image Understanding | Code | 2
Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels | Code | 2
MaTVLM: Hybrid Mamba-Transformer for Efficient Vision-Language Modeling | Code | 2
RENO: Real-Time Neural Compression for 3D LiDAR Point Clouds | Code | 2
OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models | Code | 2
LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization | Code | 2
X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distillation | Code | 2
Real-time Spatial-temporal Traversability Assessment via Feature-based Sparse Gaussian Process | Code | 2
DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models | Code | 2
Streaming Video Question-Answering with In-context Video KV-Cache Retrieval | Code | 2
KAD: No More FAD! An Effective and Efficient Evaluation Metric for Audio Generation | Code | 2
TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators | Code | 2
Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models | Code | 2
HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading | Code | 2
Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation | Code | 2
Saving 77% of the Parameters in Large Language Models Technical Report | Code | 2
QuEST: Stable Training of LLMs with 1-Bit Weights and Activations | Code | 2
WaferLLM: Large Language Model Inference at Wafer Scale | Code | 2
An Efficient Sparse Kernel Generator for O(3)-Equivariant Deep Networks | Code | 2
Recurrent Diffusion for Large-Scale Parameter Generation | Code | 2
A User's Guide to KSig: GPU-Accelerated Computation of the Signature Kernel | Code | 2
Generalized and Efficient 2D Gaussian Splatting for Arbitrary-scale Super-Resolution | Code | 2
TakuNet: an Energy-Efficient CNN for Real-Time Inference on Embedded UAV systems in Emergency Response Scenarios | Code | 2
MBQ: Modality-Balanced Quantization for Large Vision-Language Models | Code | 2
ArchesWeather & ArchesWeatherGen: a deterministic and generative model for efficient ML weather forecasting | Code | 2
FlashRNN: Optimizing Traditional RNNs on Modern Hardware | Code | 2
ManiSkill-HAB: A Benchmark for Low-Level Manipulation in Home Rearrangement Tasks | Code | 2
Momentum-GS: Momentum Gaussian Self-Distillation for High-Quality Large Scene Reconstruction | Code | 2
Playable Game Generation | Code | 2
Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Context Sparsification | Code | 2
Real-Time Metric-Semantic Mapping for Autonomous Navigation in Outdoor Environments | Code | 2
Stochastic Taylor Derivative Estimator: Efficient amortization for arbitrary differential operators | Code | 2
Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient | Code | 2
GaussianPretrain: A Simple Unified 3D Gaussian Representation for Visual Pre-training in Autonomous Driving | Code | 2
AEROMamba: An efficient architecture for audio super-resolution using generative adversarial networks and state space models | Code | 2
Brain Tumour Removing and Missing Modality Generation using 3D WDM | Code | 2
Real-Time Polygonal Semantic Mapping for Humanoid Robot Stair Climbing | Code | 2
DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution | Code | 2
RAGViz: Diagnose and Visualize Retrieval-Augmented Generation | Code | 2
The Importance of Being Scalable: Improving the Speed and Accuracy of Neural Network Interatomic Potentials Across Chemical Domains | Code | 2
Very fast Bayesian Additive Regression Trees on GPU | Code | 2
$100K or 100 Days: Trade-offs when Pre-Training with Academic Resources | Code | 2
LoRANN: Low-Rank Matrix Factorization for Approximate Nearest Neighbor Search | Code | 2
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step | Code | 2
nvTorchCam: An Open-source Library for Camera-Agnostic Differentiable Geometric Vision | Code | 2
GS^3: Efficient Relighting with Triple Gaussian Splatting | Code | 2
Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction | Code | 2
