SOTAVerified

GPU

Papers

Showing 2101-2150 of 5629 papers

Title | Status | Hype
Unrealized Expectations: Comparing AI Methods vs Classical Algorithms for Maximum Independent Set | - | 0
Fast Sampling of Cosmological Initial Conditions with Gaussian Neural Posterior Estimation | - | 0
Comparative Analysis of FPGA and GPU Performance for Machine Learning-Based Track Reconstruction at LHCb | Code | 0
Ilargi: a GPU Compatible Factorized ML Model Training Framework | - | 0
Brief analysis of DeepSeek R1 and it's implications for Generative AI | - | 0
EasySpec: Layer-Parallel Speculative Decoding for Efficient Multi-GPU Utilization | - | 0
LV-XAttn: Distributed Cross-Attention for Long Visual Inputs in Multimodal Large Language Models | - | 0
Accelerating Linear Recurrent Neural Networks for the Edge with Unstructured Sparsity | - | 0
ModServe: Scalable and Resource-Efficient Large Multimodal Model Serving | - | 0
ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference | - | 0
Recursive generalized type-2 fuzzy radial basis function neural networks for joint position estimation and adaptive EMG-based impedance control of lower limb exoskeletons | Code | 0
Longer Attention Span: Increasing Transformer Context Length with Sparse Graph Processing Techniques | Code | 0
Pivoting Factorization: A Compact Meta Low-Rank Representation of Sparsity for Efficient Inference in Large Language Models | - | 0
Brain-inspired sparse training enables Transformers and LLMs to perform as fully connected | - | 0
LLM-based Affective Text Generation Quality Based on Different Quantization Values | - | 0
TeZO: Empowering the Low-Rankness on the Temporal Dimension in the Zeroth-Order Optimization for Fine-tuning LLMs | - | 0
Scaling Policy Gradient Quality-Diversity with Massive Parallelization via Behavioral Variations | - | 0
adabmDCA 2.0 -- a flexible but easy-to-use package for Direct Coupling Analysis | Code | 0
CrowdSplat: Exploring Gaussian Splatting For Crowd Rendering | Code | 0
Assessing the Capability of YOLO- and Transformer-based Object Detectors for Real-time Weed Detection | - | 0
One Head Eight Arms: Block Matrix based Low Rank Adaptation for CLIP-based Few-Shot Learning | - | 0
PISCO: Pretty Simple Compression for Retrieval-Augmented Generation | - | 0
Static Batching of Irregular Workloads on GPUs: Framework and Application to Efficient MoE Model Inference | - | 0
Towards Scalable Topological Regularizers | - | 0
3DGS^2: Near Second-order Converging 3D Gaussian Splatting | - | 0
GANQ: GPU-Adaptive Non-Uniform Quantization for Large Language Models | Code | 0
HEPPO: Hardware-Efficient Proximal Policy Optimization -- A Universal Pipelined Architecture for Generalized Advantage Estimation | - | 0
Learning Versatile Optimizers on a Compute Diet | Code | 0
Irrational Complex Rotations Empower Low-bit Optimizers | - | 0
TOFFE -- Temporally-binned Object Flow from Events for High-speed and Energy-Efficient Object Detection and Tracking | - | 0
Pushing the Limits of BFP on Narrow Precision LLM Inference | - | 0
Revisiting Ensemble Methods for Stock Trading and Crypto Trading Tasks at ACM ICAIF FinRL Contest 2023-2024 | - | 0
MOFA: Discovering Materials for Carbon Capture with a GenAI- and Simulation-Based Workflow | - | 0
No More Sliding Window: Efficient 3D Medical Image Segmentation with Differentiable Top-k Patch Sampling | - | 0
FSMoE: A Flexible and Scalable Training System for Sparse Mixture-of-Experts Models | - | 0
Good things come in small packages: Should we build AI clusters with Lite-GPUs? | - | 0
FASP: Fast and Accurate Structured Pruning of Large Language Models | - | 0
PixelBrax: Learning Continuous Control from Pixels End-to-End on the GPU | Code | 0
The Streaming Batch Model for Efficient and Fault-Tolerant Heterogeneous Execution | - | 0
Towards Lightweight and Stable Zero-shot TTS with Self-distilled Representation Disentanglement | - | 0
Resource-Constrained Federated Continual Learning: What Does Matter? | - | 0
GS-LIVO: Real-Time LiDAR, Inertial, and Visual Multi-sensor Fused Odometry with Gaussian Mapping | - | 0
Hierarchical Autoscaling for Large Language Model Serving with Chiron | - | 0
Physics-Informed Latent Neural Operator for Real-time Predictions of Complex Physical Systems | - | 0
Towards Lightweight Time Series Forecasting: a Patch-wise Transformer with Weak Data Enriching | - | 0
Keras Sig: Efficient Path Signature Computation on GPU in Keras 3 | - | 0
Ultra Memory-Efficient On-FPGA Training of Transformers via Tensor-Compressed Optimization | - | 0
EXION: Exploiting Inter- and Intra-Iteration Output Sparsity for Diffusion Models | - | 0
Benchmarking Rotary Position Embeddings for Automatic Speech Recognition | - | 0
Towards Early Prediction of Self-Supervised Speech Model Performance | - | 0
