SOTAVerified

GPU

Papers

Showing 451-500 of 5629 papers

| Title | Status | Hype |
| --- | --- | --- |
| X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distillation | Code | 2 |
| Real-Time Semantic Segmentation of Aerial Images Using an Embedded U-Net: A Comparison of CPU, GPU, and FPGA Workflows | | 0 |
| Symbolic Mixture-of-Experts: Adaptive Skill-based Routing for Heterogeneous Reasoning | | 0 |
| Training and Inference Efficiency of Encoder-Decoder Speech Models | | 0 |
| Wanda++: Pruning Large Language Models via Regional Gradients | Code | 0 |
| Fine-Tuning Florence2 for Enhanced Object Detection in Un-constructed Environments: Vision-Language Model Approach | | 0 |
| Predictable Scale: Part I -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining | | 0 |
| Real-time Spatial-temporal Traversability Assessment via Feature-based Sparse Gaussian Process | Code | 2 |
| Toward Lightweight and Fast Decoders for Diffusion Models in Image and Video Generation | Code | 1 |
| Eventprop training for efficient neuromorphic applications | | 0 |
| JamMa: Ultra-lightweight Local Feature Matching with Joint Mamba | | 0 |
| Partial Convolution Meets Visual Attention | | 0 |
| Memory and Bandwidth are All You Need for Fully Sharded Data Parallel | | 0 |
| DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models | Code | 2 |
| DQO-MAP: Dual Quadrics Multi-Object mapping with Gaussian Splatting | Code | 1 |
| CoServe: Efficient Collaboration-of-Experts (CoE) Model Inference with Limited Memory | | 0 |
| Open-source framework for detecting bias and overfitting for large pathology images | Code | 0 |
| KurTail : Kurtosis-based LLM Quantization | | 0 |
| LiteGS: A High-Performance Modular Framework for Gaussian Splatting Training | Code | 3 |
| OceanSim: A GPU-Accelerated Underwater Robot Perception Simulation Framework | | 0 |
| A Reconfigurable Stream-Based FPGA Accelerator for Bayesian Confidence Propagation Neural Networks | | 0 |
| Nature-Inspired Population-Based Evolution of Large Language Models | Code | 1 |
| Category-level Meta-learned NeRF Priors for Efficient Object Mapping | | 0 |
| DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting | Code | 1 |
| Cauchy Random Features for Operator Learning in Sobolev Space | Code | 0 |
| Two-stream Beats One-stream: Asymmetric Siamese Network for Efficient Visual Tracking | Code | 1 |
| Floorplan-SLAM: A Real-Time, High-Accuracy, and Long-Term Multi-Session Point-Plane SLAM for Efficient Floorplan Reconstruction | | 0 |
| Streaming Video Question-Answering with In-context Video KV-Cache Retrieval | Code | 2 |
| Timing-Driven Global Placement by Efficient Critical Path Extraction | | 0 |
| TeleRAG: Efficient Retrieval-Augmented Generation Inference with Lookahead Retrieval | | 0 |
| Efficient Jailbreaking of Large Models by Freeze Training: Lower Layers Exhibit Greater Sensitivity to Harmful Content | | 0 |
| Supporting the development of Machine Learning for fundamental science in a federated Cloud with the AI_INFN platform | | 0 |
| AMPLE: Event-Driven Accelerator for Mixed-Precision Inference of Graph Neural Networks | | 0 |
| Oscillation-Reduced MXFP4 Training for Vision Transformers | Code | 1 |
| S4ConvD: Adaptive Scaling and Frequency Adjustment for Energy-Efficient Sensor Networks in Smart Buildings | Code | 0 |
| AutoHete: An Automatic and Efficient Heterogeneous Training System for LLMs | | 0 |
| SEKI: Self-Evolution and Knowledge Inspiration based Neural Architecture Search via Large Language Models | | 0 |
| LLMs Have Rhythm: Fingerprinting Large Language Models Using Inter-Token Times and Network Traffic Analysis | | 0 |
| Scalable Signature Kernel Computations for Long Time Series via Local Neumann Series Expansions | Code | 1 |
| FPGA-Accelerated SpeckleNN with SNL for Real-time X-ray Single-Particle Imaging | | 0 |
| Striving for Faster and Better: A One-Layer Architecture with Auto Re-parameterization for Low-Light Image Enhancement | Code | 0 |
| Accurate and Scalable Graph Neural Networks via Message Invariance | Code | 0 |
| WaveGAS: Waveform Relaxation for Scaling Graph Neural Networks | | 0 |
| QORT-Former: Query-optimized Real-time Transformer for Understanding Two Hands Manipulating Objects | | 0 |
| Comet: Fine-grained Computation-communication Overlapping for Mixture-of-Experts | Code | 5 |
| Mechanistic PDE Networks for Discovery of Governing Equations | | 0 |
| Accelerated Training on Low-Power Edge Devices | | 0 |
| Software implemented fault diagnosis of natural gas pumping unit based on feedforward neural network | | 0 |
| The Power of Graph Signal Processing for Chip Placement Acceleration | | 0 |
| Lean and Mean: Decoupled Value Policy Optimization with Global Value Guidance | | 0 |