SOTAVerified

GPU

Papers

Showing 2101-2125 of 5629 papers

| Title | Status | Hype |
| --- | --- | --- |
| Fast Sampling of Cosmological Initial Conditions with Gaussian Neural Posterior Estimation | | 0 |
| Robust Autonomy Emerges from Self-Play | | 0 |
| Comparative Analysis of FPGA and GPU Performance for Machine Learning-Based Track Reconstruction at LHCb | Code | 0 |
| Brief analysis of DeepSeek R1 and it's implications for Generative AI | | 0 |
| EasySpec: Layer-Parallel Speculative Decoding for Efficient Multi-GPU Utilization | | 0 |
| Ilargi: a GPU Compatible Factorized ML Model Training Framework | | 0 |
| LV-XAttn: Distributed Cross-Attention for Long Visual Inputs in Multimodal Large Language Models | | 0 |
| Accelerating Linear Recurrent Neural Networks for the Edge with Unstructured Sparsity | | 0 |
| ModServe: Scalable and Resource-Efficient Large Multimodal Model Serving | | 0 |
| ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference | | 0 |
| Recursive generalized type-2 fuzzy radial basis function neural networks for joint position estimation and adaptive EMG-based impedance control of lower limb exoskeletons | Code | 0 |
| TeZO: Empowering the Low-Rankness on the Temporal Dimension in the Zeroth-Order Optimization for Fine-tuning LLMs | | 0 |
| Pivoting Factorization: A Compact Meta Low-Rank Representation of Sparsity for Efficient Inference in Large Language Models | | 0 |
| Longer Attention Span: Increasing Transformer Context Length with Sparse Graph Processing Techniques | Code | 0 |
| Brain-inspired sparse training enables Transformers and LLMs to perform as fully connected | | 0 |
| LLM-based Affective Text Generation Quality Based on Different Quantization Values | | 0 |
| adabmDCA 2.0 -- a flexible but easy-to-use package for Direct Coupling Analysis | Code | 0 |
| Scaling Policy Gradient Quality-Diversity with Massive Parallelization via Behavioral Variations | | 0 |
| CrowdSplat: Exploring Gaussian Splatting For Crowd Rendering | Code | 0 |
| Assessing the Capability of YOLO- and Transformer-based Object Detectors for Real-time Weed Detection | | 0 |
| One Head Eight Arms: Block Matrix based Low Rank Adaptation for CLIP-based Few-Shot Learning | | 0 |
| Static Batching of Irregular Workloads on GPUs: Framework and Application to Efficient MoE Model Inference | | 0 |
| PISCO: Pretty Simple Compression for Retrieval-Augmented Generation | | 0 |
| Towards Scalable Topological Regularizers | | 0 |
| GANQ: GPU-Adaptive Non-Uniform Quantization for Large Language Models | Code | 0 |

No leaderboard results yet.