SOTAVerified

CPU

Papers

Showing 126-150 of 2231 papers

| Title | Status | Hype |
| --- | --- | --- |
| CoServe: Efficient Collaboration-of-Experts (CoE) Model Inference with Limited Memory | | 0 |
| Evaluation of adaptive sampling methods in scenario generation for virtual safety impact assessment of pre-crash safety systems | | 0 |
| DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting | Code | 1 |
| Two-stream Beats One-stream: Asymmetric Siamese Network for Efficient Visual Tracking | Code | 1 |
| AMPLE: Event-Driven Accelerator for Mixed-Precision Inference of Graph Neural Networks | | 0 |
| TeleRAG: Efficient Retrieval-Augmented Generation Inference with Lookahead Retrieval | | 0 |
| AutoHete: An Automatic and Efficient Heterogeneous Training System for LLMs | | 0 |
| LLMs Have Rhythm: Fingerprinting Large Language Models Using Inter-Token Times and Network Traffic Analysis | | 0 |
| Striving for Faster and Better: A One-Layer Architecture with Auto Re-parameterization for Low-Light Image Enhancement | Code | 0 |
| LightFC-X: Lightweight Convolutional Tracker for RGB-X Tracking | Code | 1 |
| SparseTransX: Efficient Training of Translation-Based Knowledge Graph Embeddings Using Sparse Matrix Operations | Code | 0 |
| A Universal Framework for Compressing Embeddings in CTR Prediction | Code | 0 |
| Distributed U-net model and Image Segmentation for Lung Cancer Detection | | 0 |
| Dynamic Low-Rank Sparse Adaptation for Large Language Models | Code | 1 |
| Determining Layer-wise Sparsity for Large Language Models Through a Theoretical Perspective | | 0 |
| Safe Beyond the Horizon: Efficient Sampling-based MPC with Neural Control Barrier Functions | | 0 |
| Object-Pose Estimation With Neural Population Codes | | 0 |
| On-Device LLMs for Home Assistant: Dual Role in Intent Detection and Response Generation | | 0 |
| A^2ATS: Retrieval-Based KV Cache Reduction via Windowed Rotary Position Embedding and Query-Aware Vector Quantization | | 0 |
| HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading | Code | 2 |
| Robust 6DoF Pose Tracking Considering Contour and Interior Correspondence Uncertainty for AR Assembly Guidance | | 0 |
| Representation Learning on Out of Distribution in Tabular Data | | 0 |
| Habitizing Diffusion Planning for Efficient and Effective Decision Making | Code | 1 |
| Weighted-Sum Energy Efficiency Maximization in User-Centric Uplink Cell-Free Massive MIMO | | 0 |
| DVFS-Aware DNN Inference on GPUs: Latency Modeling and Performance Analysis | | 0 |

No leaderboard results yet.