| AtMan: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation | Jan 19, 2023 | GPU | Code Available | 1 | 5 |
| CuAsmRL: Optimizing GPU SASS Schedules via Deep Reinforcement Learning | Jan 14, 2025 | Deep Reinforcement Learning, GPU | Code Available | 1 | 5 |
| Effective Batching for Recurrent Neural Network Grammars | May 31, 2021 | GPU, Language Modeling | Code Available | 1 | 5 |
| LightViT: Towards Light-Weight Convolution-Free Vision Transformers | Jul 12, 2022 | GPU, image-classification | Code Available | 1 | 5 |
| LiVOS: Light Video Object Segmentation with Gated Linear Matching | Nov 5, 2024 | GPU, Semantic Segmentation | Code Available | 1 | 5 |
| LightAvatar: Efficient Head Avatar as Dynamic Neural Light Field | Sep 26, 2024 | GPU, NeRF | Code Available | 1 | 5 |
| A flexible and fast PyTorch toolkit for simulating training and inference on analog crossbar arrays | Apr 5, 2021 | GPU | Code Available | 1 | 5 |
| Efficient Lifelong Model Evaluation in an Era of Rapid Progress | Feb 29, 2024 | Benchmarking, GPU | Code Available | 1 | 5 |
| Easy and Efficient Transformer: Scalable Inference Solution For large NLP model | Apr 26, 2021 | Decoder, GPU | Code Available | 1 | 5 |
| EAGAN: Efficient Two-stage Evolutionary Architecture Search for GANs | Nov 30, 2021 | GPU, Image Generation | Code Available | 1 | 5 |
| Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts | May 30, 2023 | CPU, GPU | Code Available | 1 | 5 |
| Lettuce: PyTorch-based Lattice Boltzmann Framework | Jun 24, 2021 | BIG-bench Machine Learning, GPU | Code Available | 1 | 5 |
| AFDet: Anchor Free One Stage 3D Object Detection | Jun 23, 2020 | 3D Object Detection, Autonomous Driving | Code Available | 1 | 5 |
| CryptGPU: Fast Privacy-Preserving Machine Learning on the GPU | Apr 22, 2021 | BIG-bench Machine Learning, CPU | Code Available | 1 | 5 |
| Asyncval: A Toolkit for Asynchronously Validating Dense Retriever Checkpoints during Training | Feb 25, 2022 | GPU, Natural Questions | Code Available | 1 | 5 |
| LeetDecoding: A PyTorch Library for Exponentially Decaying Causal Linear Attention with CUDA Implementations | Jan 5, 2025 | GPU | Code Available | 1 | 5 |
| EdgeNAT: Transformer for Efficient Edge Detection | Aug 20, 2024 | Edge Detection, GPU | Code Available | 1 | 5 |
| Learning Tracking Representations via Dual-Branch Fully Transformer Networks | Dec 5, 2021 | GPU, Object Tracking | Code Available | 1 | 5 |
| A Fast Post-Training Pruning Framework for Transformers | Mar 29, 2022 | GPU | Code Available | 1 | 5 |
| Dynamic Sparse Training with Structured Sparsity | May 3, 2023 | CPU, GPU | Code Available | 1 | 5 |
| Learning Universal Shape Dictionary for Realtime Instance Segmentation | Dec 2, 2020 | Explainable Models, GPU | Code Available | 1 | 5 |
| LeRF: Learning Resampling Function for Adaptive and Efficient Image Interpolation | Jul 13, 2024 | GPU | Code Available | 1 | 5 |
| LightRetriever: A LLM-based Hybrid Retrieval Architecture with 1000x Faster Query Inference | May 18, 2025 | GPU, Retrieval | Code Available | 1 | 5 |
| Dynamic Structure Pruning for Compressing CNNs | Mar 17, 2023 | GPU | Code Available | 1 | 5 |
| Dynamic Low-Rank Sparse Adaptation for Large Language Models | Feb 20, 2025 | CPU, GPU | Code Available | 1 | 5 |
| Asynchronous Methods for Deep Reinforcement Learning | Feb 4, 2016 | Atari Games, CPU | Code Available | 1 | 5 |
| Dynamic Mesh-Aware Radiance Fields | Sep 8, 2023 | GPU, NeRF | Code Available | 1 | 5 |
| Dynamic-OFA: Runtime DNN Architecture Switching for Performance Scaling on Heterogeneous Embedded Platforms | May 8, 2021 | CPU, GPU | Code Available | 1 | 5 |
| Learning to Generate Wasserstein Barycenters | Feb 24, 2021 | GPU | Code Available | 1 | 5 |
| Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning | Jun 4, 2024 | document understanding, GPU | Code Available | 1 | 5 |
| Edge and Identity Preserving Network for Face Super-Resolution | Aug 27, 2020 | GPU, Super-Resolution | Code Available | 1 | 5 |
| Dynamic GPU Energy Optimization for Machine Learning Training Workloads | Jan 5, 2022 | BIG-bench Machine Learning, GPU | Code Available | 1 | 5 |
| Efficient Video Compression via Content-Adaptive Super-Resolution | Apr 6, 2021 | GPU, Super-Resolution | Code Available | 1 | 5 |
| Dynamic Perceiver for Efficient Visual Recognition | Jun 20, 2023 | Action Recognition, Classification | Code Available | 1 | 5 |
| Learning to Enhance Low-Light Image via Zero-Reference Deep Curve Estimation | Mar 1, 2021 | CPU, Face Detection | Code Available | 1 | 5 |
| Learning to Upsample by Learning to Sample | Aug 29, 2023 | Depth Estimation, Feature Upsampling | Code Available | 1 | 5 |
| Cross-Camera Convolutional Color Constancy | Nov 24, 2020 | Color Constancy, CPU | Code Available | 1 | 5 |
| CUDA-Optimized real-time rendering of a Foveated Visual System | Dec 15, 2020 | Foveation, GPU | Code Available | 1 | 5 |
| Dynamic Pooling Improves Nanopore Base Calling Accuracy | May 16, 2021 | GPU | Code Available | 1 | 5 |
| Cross-Batch Memory for Embedding Learning | Dec 14, 2019 | GPU, Image Retrieval | Code Available | 1 | 5 |
| Dyna-DM: Dynamic Object-aware Self-supervised Monocular Depth Maps | Jun 8, 2022 | Autonomous Driving, Depth Estimation | Code Available | 1 | 5 |
| Efficient Classification of Very Large Images with Tiny Objects | Jun 4, 2021 | Classification, GPU | Code Available | 1 | 5 |
| Dynamic DNNs and Runtime Management for Efficient Inference on Mobile/Embedded Devices | Jan 17, 2024 | Dynamic neural networks, GPU | Code Available | 1 | 5 |
| EEEA-Net: An Early Exit Evolutionary Neural Architecture Search | Aug 13, 2021 | GPU, Image Classification | Code Available | 1 | 5 |
| Lightweight Neural Architecture Search for Temporal Convolutional Networks at the Edge | Jan 24, 2023 | GPU, image-classification | Code Available | 1 | 5 |
| CutDiffusion: A Simple, Fast, Cheap, and Strong Diffusion Extrapolation Method | Apr 23, 2024 | Denoising, GPU | Code Available | 1 | 5 |
| CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs | Sep 19, 2024 | GPU | Code Available | 1 | 5 |
| EfficientBioAI: Making Bioimaging AI Models Efficient in Energy, Latency and Representation | Jun 9, 2023 | CPU, GPU | Code Available | 1 | 5 |
| DVIS: Decoupled Video Instance Segmentation Framework | Jun 6, 2023 | Autonomous Driving, GPU | Code Available | 1 | 5 |
| Transformer Tracking | Mar 29, 2021 | GPU, Object Tracking | Code Available | 1 | 5 |