SOTAVerified

CPU

Papers

Showing 101-150 of 2231 papers

| Title | Status | Hype |
|---|---|---|
| Robust DNN Partitioning and Resource Allocation Under Uncertain Inference Time | | 0 |
| Bridging Evolutionary Multiobjective Optimization and GPU Acceleration via Tensorization | Code | 7 |
| PyGraph: Robust Compiler Support for CUDA Graphs in PyTorch | | 0 |
| Adaptive Machine Learning for Resource-Constrained Environments | Code | 0 |
| PP-DocLayout: A Unified Document Layout Detection Model to Accelerate Large-Scale Data Construction | Code | 9 |
| V-Seek: Accelerating LLM Reasoning on Open-hardware Server-class RISC-V Platforms | | 0 |
| SpeCache: Speculative Key-Value Caching for Efficient Generation of LLMs | | 0 |
| Design and Implementation of an FPGA-Based Hardware Accelerator for Transformer | Code | 1 |
| BurTorch: Revisiting Training from First Principles by Coupling Autodiff, Math Optimization, and Systems | Code | 0 |
| Audio Compression using Periodic Gabor with Biorthogonal Exchange: Implementation Using the Zak Transform | | 0 |
| ZO2: Scalable Zeroth-Order Fine-Tuning for Extremely Large Language Models with Limited GPU Memory | Code | 3 |
| Robust Learning-Based Sparse Recovery for Device Activity Detection in Grant-Free Random Access Cell-Free Massive MIMO: Enhancing Resilience to Impairments | | 0 |
| Sometimes Painful but Certainly Promising: Feasibility and Trade-offs of Language Model Inference at the Edge | | 0 |
| Are We There Yet? A Measurement Study of Efficiency for LLM Applications on Mobile Devices | | 0 |
| Efficient Neural Clause-Selection Reinforcement | | 0 |
| HGO-YOLO: Advancing Anomaly Behavior Detection with Hierarchical Features and Lightweight Optimized Detection | | 0 |
| Coordinated Energy-Trajectory Economic Model Predictive Control for Autonomous Surface Vehicles under Disturbances | | 0 |
| Spillover effects between climate policy uncertainty, energy markets, and food markets: A time-frequency analysis | | 0 |
| The impact of external uncertainties on the extreme return connectedness between food, fossil energy, and clean energy markets | | 0 |
| LapSum -- One Method to Differentiate Them All: Ranking, Sorting and Top-k Selection | | 0 |
| Real-Time Semantic Segmentation of Aerial Images Using an Embedded U-Net: A Comparison of CPU, GPU, and FPGA Workflows | | 0 |
| Deterministic Global Optimization of the Acquisition Function in Bayesian Optimization: To Do or Not To Do? | | 0 |
| Partial Convolution Meets Visual Attention | | 0 |
| Benchmarking Dynamic SLO Compliance in Distributed Computing Continuum Systems | Code | 0 |
| DQO-MAP: Dual Quadrics Multi-Object mapping with Gaussian Splatting | Code | 1 |
| CoServe: Efficient Collaboration-of-Experts (CoE) Model Inference with Limited Memory | | 0 |
| Evaluation of adaptive sampling methods in scenario generation for virtual safety impact assessment of pre-crash safety systems | | 0 |
| DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting | Code | 1 |
| Two-stream Beats One-stream: Asymmetric Siamese Network for Efficient Visual Tracking | Code | 1 |
| AMPLE: Event-Driven Accelerator for Mixed-Precision Inference of Graph Neural Networks | | 0 |
| TeleRAG: Efficient Retrieval-Augmented Generation Inference with Lookahead Retrieval | | 0 |
| AutoHete: An Automatic and Efficient Heterogeneous Training System for LLMs | Code | 0 |
| LLMs Have Rhythm: Fingerprinting Large Language Models Using Inter-Token Times and Network Traffic Analysis | | 0 |
| Striving for Faster and Better: A One-Layer Architecture with Auto Re-parameterization for Low-Light Image Enhancement | Code | 0 |
| LightFC-X: Lightweight Convolutional Tracker for RGB-X Tracking | Code | 1 |
| SparseTransX: Efficient Training of Translation-Based Knowledge Graph Embeddings Using Sparse Matrix Operations | Code | 0 |
| A Universal Framework for Compressing Embeddings in CTR Prediction | Code | 0 |
| Distributed U-net model and Image Segmentation for Lung Cancer Detection | | 0 |
| Dynamic Low-Rank Sparse Adaptation for Large Language Models | Code | 1 |
| Determining Layer-wise Sparsity for Large Language Models Through a Theoretical Perspective | | 0 |
| Safe Beyond the Horizon: Efficient Sampling-based MPC with Neural Control Barrier Functions | | 0 |
| Object-Pose Estimation With Neural Population Codes | | 0 |
| On-Device LLMs for Home Assistant: Dual Role in Intent Detection and Response Generation | | 0 |
| A^2ATS: Retrieval-Based KV Cache Reduction via Windowed Rotary Position Embedding and Query-Aware Vector Quantization | | 0 |
| HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading | Code | 2 |
| Robust 6DoF Pose Tracking Considering Contour and Interior Correspondence Uncertainty for AR Assembly Guidance | | 0 |
| Representation Learning on Out of Distribution in Tabular Data | | 0 |
| Habitizing Diffusion Planning for Efficient and Effective Decision Making | Code | 1 |
| Weighted-Sum Energy Efficiency Maximization in User-Centric Uplink Cell-Free Massive MIMO | | 0 |
| DVFS-Aware DNN Inference on GPUs: Latency Modeling and Performance Analysis | | 0 |
Page 3 of 45

No leaderboard results yet.