SOTAVerified

CPU

Papers

Showing 201–250 of 2231 papers

Title | Status | Hype
Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference | Code | 1
Towards Lightweight Data Integration using Multi-workflow Provenance and Data Observability | Code | 1
When Monte-Carlo Dropout Meets Multi-Exit: Optimizing Bayesian Neural Networks on FPGA | Code | 1
High-performance Data Management for Whole Slide Image Analysis in Digital Pathology | Code | 1
QUANT: A Minimalist Interval Method for Time Series Classification | Code | 1
BearingPGA-Net: A Lightweight and Deployable Bearing Fault Diagnosis Network via Decoupled Knowledge Distillation and FPGA Acceleration | Code | 1
Mitigating Communications Threats in Decentralized Federated Learning through Moving Target Defense | Code | 1
Implementation of a perception system for autonomous vehicles using a detection-segmentation network in SoC FPGA | Code | 1
Fast model inference and training on-board of Satellites | Code | 1
QIGen: Generating Efficient Kernels for Quantized Inference on Large Language Models | Code | 1
An open-source deep learning algorithm for efficient and fully-automatic analysis of the choroid in optical coherence tomography | Code | 1
SRL: Scaling Distributed Reinforcement Learning to Over Ten Thousand Cores | Code | 1
Accelerating Sampling and Aggregation Operations in GNN Frameworks with GPU Initiated Direct Storage Accesses | Code | 1
Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference | Code | 1
Implementing contextual biasing in GPU decoder for online ASR | Code | 1
Dynamic Perceiver for Efficient Visual Recognition | Code | 1
Co-design Hardware and Algorithm for Vector Search | Code | 1
ExoMDN: Rapid characterization of exoplanet interior structures with Mixture Density Networks | Code | 1
Audio Tagging on an Embedded Hardware Platform | Code | 1
EfficientBioAI: Making Bioimaging AI Models Efficient in Energy, Latency and Representation | Code | 1
The Information Retrieval Experiment Platform | Code | 1
Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts | Code | 1
Search-Based Regular Expression Inference on a GPU | Code | 1
EfficientSpeech: An On-Device Text to Speech Model | Code | 1
Fast and Attributed Change Detection on Dynamic Graphs with Density of States | Code | 1
Dynamic Sparse Training with Structured Sparsity | Code | 1
Harnessing Deep Learning and HPC Kernels via High-Level Loop and Tensor Abstractions on CPU Architectures | Code | 1
FindVehicle and VehicleFinder: A NER dataset for natural language-based vehicle retrieval and a keyword-based cross-modal vehicle retrieval system | Code | 1
Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value | Code | 1
DGNN-Booster: A Generic FPGA Accelerator Framework For Dynamic Graph Neural Network Inference | Code | 1
InterFormer: Real-time Interactive Image Segmentation | Code | 1
Real-Time Dense 3D Mapping of Underwater Environments | Code | 1
TransPimLib: A Library for Efficient Transcendental Functions on Processing-in-Memory Systems | Code | 1
GNNBuilder: An Automated Framework for Generic Graph Neural Network Accelerator Generation, Simulation, and Optimization | Code | 1
FAStEN: An Efficient Adaptive Method for Feature Selection and Estimation in High-Dimensional Functional Regressions | Code | 1
Practically Solving LPN in High Noise Regimes Faster Using Neural Networks | Code | 1
Fourier-MIONet: Fourier-enhanced multiple-input neural operators for multiphase modeling of geological carbon sequestration | Code | 1
Efficient subtyping of ovarian cancer histopathology whole slide images using active sampling in multiple instance learning | Code | 1
GPU-based Private Information Retrieval for On-Device Machine Learning Inference | Code | 1
FemtoDet: An Object Detection Baseline for Energy Versus Performance Tradeoffs | Code | 1
Distributed Deep Neural-Network-Based Middleware for Cyber-Attacks Detection in Smart IoT Ecosystem: A Novel Framework and Performance Evaluation Approach | Code | 1
Autothrottle: A Practical Bi-Level Approach to Resource Management for SLO-Targeted Microservices | Code | 1
GPU-accelerated Guided Source Separation for Meeting Transcription | Code | 1
A Practical Stereo Depth System for Smart Glasses | Code | 1
ParticleGrid: Enabling Deep Learning using 3D Representation of Materials | Code | 1
WindowSHAP: An Efficient Framework for Explaining Time-series Classifiers based on Shapley Values | Code | 1
TLP: A Deep Learning-based Cost Model for Tensor Program Tuning | Code | 1
SLOPT: Bandit Optimization Framework for Mutation-Based Fuzzing | Code | 1
Frequency Cam: Imaging Periodic Signals in Real-Time | Code | 1
AutoMoE: Heterogeneous Mixture-of-Experts with Adaptive Computation for Efficient Neural Machine Translation | Code | 1

No leaderboard results yet.