SOTAVerified

GPU

Papers

Showing 401–450 of 5629 papers

Title | Status | Hype
LoongServe: Efficiently Serving Long-Context Large Language Models with Elastic Sequence Parallelism | Code | 2
Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic | Code | 2
OmniGS: Fast Radiance Field Reconstruction using Omnidirectional Gaussian Splatting | Code | 2
Skeleton Recall Loss for Connectivity Conserving and Resource Efficient Segmentation of Thin Tubular Structures | Code | 2
Accelerating Transformer Pre-training with 2:4 Sparsity | Code | 2
Dynamic Pre-training: Towards Efficient and Scalable All-in-One Image Restoration | Code | 2
Efficient Modulation for Vision Networks | Code | 2
Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction | Code | 2
Fully-fused Multi-Layer Perceptrons on Intel Data Center GPUs | Code | 2
Efficient Video Object Segmentation via Modulated Cross-Attention Memory | Code | 2
Invertible Diffusion Models for Compressed Sensing | Code | 2
YOLOv5-6D: Advancing 6-DoF Instrument Pose Estimation in Variable X-Ray Imaging Geometries | Code | 2
Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation | Code | 2
Characterization of Large Language Model Development in the Datacenter | Code | 2
Scalable Spatiotemporal Prediction with Bayesian Neural Fields | Code | 2
Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System | Code | 2
Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance | Code | 2
RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction | Code | 2
Birbal: An efficient 7B instruct-model fine-tuned with curated datasets | Code | 2
MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection | Code | 2
Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis | Code | 2
WDM: 3D Wavelet Diffusion Models for High-Resolution Medical Image Synthesis | Code | 2
DEYO: DETR with YOLO for End-to-End Object Detection | Code | 2
Fast Adversarial Attacks on Language Models In One GPU Minute | Code | 2
Me LLaMA: Foundation Large Language Models for Medical Applications | Code | 2
QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference | Code | 2
On the Efficacy of Eviction Policy for Key-Value Constrained Generative Language Model Inference | Code | 2
λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space | Code | 2
4D-Rotor Gaussian Splatting: Towards Efficient Novel View Synthesis for Dynamic Scenes | Code | 2
Cross-Scale MAE: A Tale of Multi-Scale Exploitation in Remote Sensing | Code | 2
SHViT: Single-Head Vision Transformer with Memory Efficient Macro Design | Code | 2
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model | Code | 2
Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction | Code | 2
Low-resource finetuning of foundation models beats state-of-the-art in histopathology | Code | 2
WidthFormer: Toward Efficient Transformer-based BEV View Transformation | Code | 2
Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks | Code | 2
CoMoSVC: Consistency Model-based Singing Voice Conversion | Code | 2
MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining | Code | 2
Spacetime Gaussian Feature Splatting for Real-Time Dynamic View Synthesis | Code | 2
Understanding the Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference | Code | 2
A Case Study in CUDA Kernel Fusion: Implementing FlashAttention-2 on NVIDIA Hopper Architecture using the CUTLASS Library | Code | 2
XLand-MiniGrid: Scalable Meta-Reinforcement Learning Environments in JAX | Code | 2
mLoRA: Fine-Tuning LoRA Adapters via Highly-Efficient Pipeline Parallelism in Multiple GPUs | Code | 2
CoLLiE: Collaborative Training of Large Language Models in an Efficient Way | Code | 2
XLB: A differentiable massively parallel lattice Boltzmann library in Python | Code | 2
Learning to Fly in Seconds | Code | 2
Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model | Code | 2
JaxMARL: Multi-Agent RL Environments and Algorithms in JAX | Code | 2
Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster | Code | 2
Black-Box Prompt Optimization: Aligning Large Language Models without Model Training | Code | 2

No leaderboard results yet.