SOTAVerified

CPU

Papers

Showing 150 of 2231 papers

Title | Status | Hype
Hear Your Code Fail, Voice-Assisted Debugging for Python | - | 0
3C-FBI: A Combinatorial method using Convolutions for Circle Fitting in Blurry Images | - | 0
Inter2Former: Dynamic Hybrid Attention for Efficient High-Precision Interactive | - | 0
MathOptAI.jl: Embed trained machine learning predictors into JuMP models | Code | 2
LoRA Fine-Tuning Without GPUs: A CPU-Efficient Meta-Generation Framework for LLMs | - | 0
AUTOMATIC ROOM LIGHT CONTROLLER MANAGEMENT SYSTEM. | - | 0
Causal-Aware Intelligent QoE Optimization for VR Interaction with Adaptive Keyframe Extraction | - | 0
MNN-AECS: Energy Optimization for LLM Decoding on Mobile Devices via Adaptive Core Selection | - | 0
Variational Bayesian Channel Estimation and Data Detection for Cell-Free Massive MIMO with Low-Resolution Quantized Fronthaul Links | - | 0
LIGHTHOUSE: Fast and precise distance to shoreline calculations from anywhere on earth | Code | 1
ConsumerBench: Benchmarking Generative AI Applications on End-User Devices | Code | 1
Speeding up Local Optimization in Vehicle Routing with Tensor-based GPU Acceleration | - | 0
Wavelet-based Global Orientation and Surface Reconstruction for Point Clouds | - | 0
Distributed Activity Detection for Cell-Free Hybrid Near-Far Field Communications | - | 0
Parallel Branch Model Predictive Control on GPUs | - | 0
Versatile and Fast Location-Based Private Information Retrieval with Fully Homomorphic Encryption over the Torus | Code | 0
SecONNds: Secure Outsourced Neural Network Inference on ImageNet | Code | 0
HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration | - | 0
RT-VC: Real-Time Zero-Shot Voice Conversion with Speech Articulatory Coding | - | 0
MNN-LLM: A Generic Inference Engine for Fast Large Language Model Deployment on Mobile Devices | - | 0
GPU-accelerated Modeling of Biological Regulatory Networks | - | 0
Plug-and-Play Linear Attention for Pre-trained Image and Video Restoration Models | Code | 0
Implementing Keyword Spotting on the MCUX947 Microcontroller with Integrated NPU | - | 0
JavelinGuard: Low-Cost Transformer Architectures for LLM Security | - | 0
Cost-Efficient LLM Training with Lifetime-Aware Tensor Offloading via GPUDirect Storage | - | 0
BestServe: Serving Strategies with Optimal Goodput in Collocation and Disaggregation Architectures | - | 0
FlashDMoE: Fast Distributed MoE in a Single Kernel | Code | 3
Memory Access Characterization of Large Language Models in CPU Environment and its Potential Impacts | - | 0
PointODE: Lightweight Point Cloud Learning with Neural Ordinary Differential Equations on Edge | - | 0
Fast Feature Matching of UAV Images via Matrix Band Reduction-based GPU Data Schedule | - | 0
Improving QA Efficiency with DistilBERT: Fine-Tuning and Inference on mobile Intel CPUs | - | 0
CPINN-ABPI: Physics-Informed Neural Networks for Accurate Power Estimation in MPSoCs | - | 0
TailorKV: A Hybrid Framework for Long-Context Inference via Tailored KV Cache Optimization | Code | 1
TextDiffuser-RL: Efficient and Robust Text Layout Optimization for High-Fidelity Text-to-Image Synthesis | - | 0
FastMamba: A High-Speed and Efficient Mamba Accelerator on FPGA with Accurate Quantization | - | 0
Why Not Replace? Sustaining Long-Term Visual Localization via Handcrafted-Learned Feature Collaboration on CPU | Code | 1
QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design | Code | 2
Not All Models Suit Expert Offloading: On Local Routing Consistency of Mixture-of-Expert Models | Code | 0
KernelOracle: Predicting the Linux Scheduler's Next Move with Deep Learning | Code | 0
Harnessing Large Language Models Locally: Empirical Results and Implications for AI PC | Code | 0
Machine Learning for Consistency Violation Faults Analysis | - | 0
FreeKV: Boosting KV Cache Retrieval for Efficient LLM Inference | - | 0
ZenFlow: Enabling Stall-Free Offloading Training via Asynchronous Updates | - | 0
MPRM: A Markov Path-based Rule Miner for Efficient and Interpretable Knowledge Graph Reasoning | - | 0
A Heuristic Algorithm Based on Beam Search and Iterated Local Search for the Maritime Inventory Routing Problem | - | 0
Scalability of Reinforcement Learning Methods for Dispatching in Semiconductor Frontend Fabs: A Comparison of Open-Source Models with Real Industry Datasets | - | 0
From Embeddings to Accuracy: Comparing Foundation Models for Radiographic Classification | - | 0
SpecOffload: Unlocking Latent GPU Capacity for LLM Inference on Resource-Constrained Devices | Code | 1
Lossless Compression for LLM Tensor Incremental Snapshots | - | 0
Single-shot prediction of parametric partial differential equations | - | 0

No leaderboard results yet.