SOTAVerified

GPU Papers

Showing 576–600 of 5629 papers

| Title | Status | Hype |
|-------|--------|------|
| QuEST: Stable Training of LLMs with 1-Bit Weights and Activations | Code | 2 |
| WaferLLM: Large Language Model Inference at Wafer Scale | Code | 2 |
| InfiniteHBD: Building Datacenter-Scale High-Bandwidth Domain for LLM with Optical Circuit Switching Transceivers | | 0 |
| QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache | | 0 |
| Kozax: Flexible and Scalable Genetic Programming in JAX | Code | 1 |
| SyMANTIC: An Efficient Symbolic Regression Method for Interpretable and Parsimonious Model Discovery in Science and Beyond | Code | 1 |
| Fast Sampling of Cosmological Initial Conditions with Gaussian Neural Posterior Estimation | | 0 |
| Robust Autonomy Emerges from Self-Play | | 0 |
| Unrealized Expectations: Comparing AI Methods vs Classical Algorithms for Maximum Independent Set | | 0 |
| Transolver++: An Accurate Neural Solver for PDEs on Million-Scale Geometries | Code | 3 |
| EasySpec: Layer-Parallel Speculative Decoding for Efficient Multi-GPU Utilization | | 0 |
| Brief analysis of DeepSeek R1 and it's implications for Generative AI | | 0 |
| LV-XAttn: Distributed Cross-Attention for Long Visual Inputs in Multimodal Large Language Models | | 0 |
| Ilargi: a GPU Compatible Factorized ML Model Training Framework | | 0 |
| Comparative Analysis of FPGA and GPU Performance for Machine Learning-Based Track Reconstruction at LHCb | Code | 0 |
| Accelerating Linear Recurrent Neural Networks for the Edge with Unstructured Sparsity | | 0 |
| ModServe: Scalable and Resource-Efficient Large Multimodal Model Serving | | 0 |
| Recursive generalized type-2 fuzzy radial basis function neural networks for joint position estimation and adaptive EMG-based impedance control of lower limb exoskeletons | Code | 0 |
| M+: Extending MemoryLLM with Scalable Long-Term Memory | Code | 3 |
| ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference | | 0 |
| Work-Efficient Parallel Non-Maximum Suppression Kernels | Code | 1 |
| Longer Attention Span: Increasing Transformer Context Length with Sparse Graph Processing Techniques | Code | 0 |
| TeZO: Empowering the Low-Rankness on the Temporal Dimension in the Zeroth-Order Optimization for Fine-tuning LLMs | | 0 |
| LLM-based Affective Text Generation Quality Based on Different Quantization Values | | 0 |
| Pivoting Factorization: A Compact Meta Low-Rank Representation of Sparsity for Efficient Inference in Large Language Models | | 0 |
Page 24 of 226

No leaderboard results yet.