SOTAVerified

GPU

Papers

Showing 251–300 of 5629 papers

| Title | Status | Hype |
|---|---|---|
| Sparfels: Fast Reconstruction from Sparse Unposed Imagery | - | 0 |
| Feature Optimization for Time Series Forecasting via Novel Randomized Uphill Climbing | - | 0 |
| Phantora: Live GPU Cluster Simulation for Machine Learning System Performance Estimation | - | 0 |
| Aggregating empirical evidence from data strategy studies: a case on model quantization | - | 0 |
| Efficient On-Chip Implementation of 4D Radar-Based 3D Object Detection on Hailo-8L | - | 0 |
| GPU Performance Portability needs Autotuning | Code | 2 |
| Sionna RT: Technical Report | - | 0 |
| Towards Easy and Realistic Network Infrastructure Testing for Large-scale Machine Learning | - | 0 |
| TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open Language Models | Code | 0 |
| STCOcc: Sparse Spatial-Temporal Cascade Renovation for 3D Occupancy and Scene Flow Prediction | Code | 2 |
| Mesh-Learner: Texturing Mesh with Spherical Harmonics | Code | 1 |
| Efficient Domain-adaptive Continual Pretraining for the Process Industry in the German Language | - | 0 |
| semi-PD: Towards Efficient LLM Serving via Phase-Wise Disaggregated Computation and Unified Storage | - | 0 |
| Taming the Titans: A Survey of Efficient LLM Inference Serving | Code | 1 |
| FlashOverlap: A Lightweight Design for Efficiently Overlapping Communication and Computation | - | 0 |
| Accelerating Mixture-of-Experts Training with Adaptive Expert Replication | - | 0 |
| NSFlow: An End-to-End FPGA Framework with Scalable Dataflow Architecture for Neuro-Symbolic AI | - | 0 |
| Generative Models for Fast Simulation of Cherenkov Detectors at the Electron-Ion Collider | Code | 0 |
| GPU accelerated program synthesis: Enumerate semantics, not syntax! | - | 0 |
| The Big Send-off: High Performance Collectives on GPU-based Supercomputers | - | 0 |
| CaRL: Learning Scalable Planning Policies with Simple Rewards | Code | 2 |
| L3: DIMM-PIM Integrated Architecture and Coordination for Scalable Long-Context LLM Inference | - | 0 |
| Emo Pillars: Knowledge Distillation to Support Fine-Grained Context-Aware and Context-Less Emotion Classification | - | 0 |
| Fried Parameter Estimation from Single Wavefront Sensor Image with Artificial Neural Networks | - | 0 |
| Democracy of AI Numerical Weather Models: An Example of Global Forecasting with FourCastNetv2 Made by a University Research Lab Using GPU | - | 0 |
| MMInference: Accelerating Pre-filling for Long-Context VLMs via Modality-Aware Permutation Sparse Attention | Code | 5 |
| Hexcute: A Tile-based Programming Language with Automatic Layout and Task-Mapping Synthesis | - | 0 |
| Scalable APT Malware Classification via Parallel Feature Extraction and GPU-Accelerated Learning | - | 0 |
| A LoRA-Based Approach to Fine-Tuning LLMs for Educational Guidance in Resource-Constrained Settings | Code | 0 |
| Splitwiser: Efficient LM inference with constrained resources | Code | 0 |
| LithOS: An Operating System for Efficient Machine Learning on GPUs | - | 0 |
| Distribution-aware Dataset Distillation for Efficient Image Restoration | - | 0 |
| Robust and Real-time Surface Normal Estimation from Stereo Disparities using Affine Transformations | - | 0 |
| Beyond Terabit/s Integrated Neuromorphic Photonic Processor for DSP-Free Optical Interconnects | - | 0 |
| SG-Reg: Generalizable and Efficient Scene Graph Registration | Code | 2 |
| AlphaZero-Edu: Making AlphaZero Accessible to Everyone | Code | 0 |
| HPU: High-Bandwidth Processing Unit for Scalable, Cost-effective LLM Inference via GPU Co-processing | - | 0 |
| Quantum Walks-Based Adaptive Distribution Generation with Efficient CUDA-Q Acceleration | - | 0 |
| Lightweight LiDAR-Camera 3D Dynamic Object Detection and Multi-Class Trajectory Prediction | Code | 1 |
| NNTile: a machine learning framework capable of training extremely large GPT language models on a single node | - | 0 |
| Mask Image Watermarking | Code | 1 |
| Second-order Optimization of Gaussian Splats with Importance Sampling | - | 0 |
| ARAP-GS: Drag-driven As-Rigid-As-Possible 3D Gaussian Splatting Editing with Diffusion Prior | - | 0 |
| Tilus: A Virtual Machine for Arbitrary Low-Precision GPGPU Computation in LLM Serving | - | 0 |
| Data-efficient LLM Fine-tuning for Code Generation | Code | 1 |
| Taccel: Scaling Up Vision-based Tactile Robotics via High-performance GPU Simulation | Code | 2 |
| BitNet b1.58 2B4T Technical Report | - | 0 |
| Accelerating Clinical NLP at Scale with a Hybrid Framework with Reduced GPU Demands: A Case Study in Dementia Identification | - | 0 |
| MOM: Memory-Efficient Offloaded Mini-Sequence Inference for Long Context Language Models | - | 0 |
| Characterizing and Optimizing LLM Inference Workloads on CPU-GPU Coupled Architectures | - | 0 |

No leaderboard results yet.