SOTAVerified

GPU

Papers

Showing 301–325 of 5629 papers

| Title | Status | Hype |
| --- | --- | --- |
| Characterizing and Optimizing LLM Inference Workloads on CPU-GPU Coupled Architectures | | 0 |
| Bringing together invertible UNets with invertible attention modules for memory-efficient diffusion models | | 0 |
| 70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float | Code | 4 |
| ConvShareViT: Enhancing Vision Transformers with Convolutional Attention Mechanisms for Free-Space Optical Accelerators | | 0 |
| PatrolVision: Automated License Plate Recognition in the wild | | 0 |
| Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints | Code | 4 |
| CAT: A Conditional Adaptation Tailor for Efficient and Effective Instance-Specific Pansharpening on Real-World Data | | 0 |
| Anchors no more: Using peculiar velocities to constrain H_0 and the primordial Universe without calibrators | Code | 0 |
| Frozen Layers: Memory-efficient Many-fidelity Hyperparameter Optimization | | 0 |
| Tokenize Image Patches: Global Context Fusion for Effective Haze Removal in Large Images | Code | 2 |
| aweSOM: a CPU/GPU-accelerated Self-organizing Map and Statistically Combined Ensemble Framework for Machine-learning Clustering Analysis | | 0 |
| Towards On-Device Learning and Reconfigurable Hardware Implementation for Encoded Single-Photon Signal Processing | | 0 |
| MoE-Lens: Towards the Hardware Limit of High-Throughput MoE LLM Serving Under Resource Constraints | | 0 |
| Parameter-Free Fine-tuning via Redundancy Elimination for Vision Foundation Models | | 0 |
| Spectral Normalization for Lipschitz-Constrained Policies on Learning Humanoid Locomotion | | 0 |
| SpecEE: Accelerating Large Language Model Inference with Speculative Early Exiting | | 0 |
| TensorNEAT: A GPU-accelerated Library for NeuroEvolution of Augmenting Topologies | Code | 3 |
| Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model | | 0 |
| MSCCL++: Rethinking GPU Communication Abstractions for Cutting-edge AI Applications | Code | 3 |
| TorchFX: A modern approach to Audio DSP with PyTorch and GPU acceleration | Code | 2 |
| EO-VLM: VLM-Guided Energy Overload Attacks on Vision Models | | 0 |
| Search-contempt: a hybrid MCTS algorithm for training AlphaZero-like engines with better computational efficiency | | 0 |
| PoGO: A Scalable Proof of Useful Work via Quantized Gradient Descent and Merkle Proofs | | 0 |
| DGOcc: Depth-aware Global Query-based Network for Monocular 3D Occupancy Prediction | | 0 |
| Apt-Serve: Adaptive Request Scheduling on Hybrid Cache for Scalable LLM Inference Serving | Code | 1 |

No leaderboard results yet.