SOTAVerified

GPU

Papers

Showing 651-700 of 5629 papers

Title | Status | Hype
Kozax: Flexible and Scalable Genetic Programming in JAX | Code | 1
Work-Efficient Parallel Non-Maximum Suppression Kernels | Code | 1
Cache Me If You Must: Adaptive Key-Value Quantization for Large Language Models | Code | 1
Rethinking Diffusion Posterior Sampling: From Conditional Score Estimator to Maximizing a Posterior | Code | 1
Return of the Encoder: Maximizing Parameter Efficiency for SLMs | Code | 1
CuAsmRL: Optimizing GPU SASS Schedules via Deep Reinforcement Learning | Code | 1
Ladder-residual: parallelism-aware architecture for accelerating large model inference with communication overlapping | Code | 1
MS-Temba : Multi-Scale Temporal Mamba for Efficient Temporal Action Detection | Code | 1
LeetDecoding: A PyTorch Library for Exponentially Decaying Causal Linear Attention with CUDA Implementations | Code | 1
RadarNeXt: Real-Time and Reliable 3D Object Detector Based On 4D mmWave Imaging Radar | Code | 1
Mamba4D: Efficient 4D Point Cloud Video Understanding with Disentangled Spatial-Temporal State Space Models | Code | 1
Lightweight G-YOLOv11: Advancing Efficient Fracture Detection in Pediatric Wrist X-rays | Code | 1
GIMS: Image Matching System Based on Adaptive Graph Construction and Graph Neural Network | Code | 1
Resource-Aware Arabic LLM Creation: Model Adaptation, Integration, and Multi-Domain Testing | Code | 1
Flash3D: Super-scaling Point Transformers through Joint Hardware-Geometry Locality | Code | 1
Crabs: Consuming Resource via Auto-generation for LLM-DoS Attack under Black-box Settings | Code | 1
NITRO: LLM Inference on Intel Laptop NPUs | Code | 1
Light-T2M: A Lightweight and Fast Model for Text-to-motion Generation | Code | 1
Real-time Identity Defenses against Malicious Personalization of Diffusion Models | Code | 1
EOV-Seg: Efficient Open-Vocabulary Panoptic Segmentation | Code | 1
MCP-MedSAM: A Powerful Lightweight Medical Segment Anything Model Trained with a Single GPU in Just One Day | Code | 1
Transformers Can Navigate Mazes With Multi-Step Prediction | Code | 1
p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay | Code | 1
Beyond [cls]: Exploring the true potential of Masked Image Modeling representations | Code | 1
VISION-XL: High Definition Video Inverse Problem Solver using Latent Image Diffusion Models | Code | 1
Act Now: A Novel Online Forecasting Framework for Large-Scale Streaming Data | Code | 1
Global Tensor Motion Planning | Code | 1
ADAF: An Artificial Intelligence Data Assimilation Framework for Weather Forecasting | Code | 1
Quantization without Tears | Code | 1
ITER: Iterative Transformer-based Entity Recognition and Relation Extraction | Code | 1
GPU-Accelerated Inverse Lithography Towards High Quality Curvy Mask Generation | Code | 1
Diffusion Sampling Correction via Approximately 10 Parameters | Code | 1
HRDecoder: High-Resolution Decoder Network for Fundus Image Lesion Segmentation | Code | 1
LiVOS: Light Video Object Segmentation with Gated Linear Matching | Code | 1
Fast and Memory-Efficient Video Diffusion Using Streamlined Inference | Code | 1
KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation | Code | 1
LOGO -- Long cOntext aliGnment via efficient preference Optimization | Code | 1
KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing | Code | 1
syren-new: Precise formulae for the linear and nonlinear matter power spectra with massive neutrinos and dynamical dark energy | Code | 1
xPerT: Extended Persistence Transformer | Code | 1
EP-SAM: Weakly Supervised Histopathology Segmentation via Enhanced Prompt with Segment Anything | Code | 1
SPA: 3D Spatial-Awareness Enables Effective Embodied Representation | Code | 1
Neural Reasoning Networks: Efficient Interpretable Neural Networks With Automatic Textual Explanations | Code | 1
PostEdit: Posterior Sampling for Efficient Zero-Shot Image Editing | Code | 1
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective | Code | 1
LLM-Pilot: Characterize and Optimize Performance of your LLM Inference Services | Code | 1
Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads on Consumer-Grade Devices | Code | 1
TorchSISSO: A PyTorch-Based Implementation of the Sure Independence Screening and Sparsifying Operator for Efficient and Interpretable Model Discovery | Code | 1
STGformer: Efficient Spatiotemporal Graph Transformer for Traffic Forecasting | Code | 1
Analog In-Memory Computing Attention Mechanism for Fast and Energy-Efficient Large Language Models | Code | 1

No leaderboard results yet.