| Kozax: Flexible and Scalable Genetic Programming in JAX | Feb 5, 2025 | GPU | Code Available | 1 |
| Work-Efficient Parallel Non-Maximum Suppression Kernels | Feb 1, 2025 | GPU, object-detection | Code Available | 1 |
| Cache Me If You Must: Adaptive Key-Value Quantization for Large Language Models | Jan 31, 2025 | GPU, Quantization | Code Available | 1 |
| Rethinking Diffusion Posterior Sampling: From Conditional Score Estimator to Maximizing a Posterior | Jan 31, 2025 | GPU | Code Available | 1 |
| Return of the Encoder: Maximizing Parameter Efficiency for SLMs | Jan 27, 2025 | Computational Efficiency, CPU | Code Available | 1 |
| CuAsmRL: Optimizing GPU SASS Schedules via Deep Reinforcement Learning | Jan 14, 2025 | Deep Reinforcement Learning, GPU | Code Available | 1 |
| Ladder-residual: parallelism-aware architecture for accelerating large model inference with communication overlapping | Jan 11, 2025 | GPU, Large Language Model | Code Available | 1 |
| MS-Temba: Multi-Scale Temporal Mamba for Efficient Temporal Action Detection | Jan 10, 2025 | Action Detection, GPU | Code Available | 1 |
| LeetDecoding: A PyTorch Library for Exponentially Decaying Causal Linear Attention with CUDA Implementations | Jan 5, 2025 | GPU | Code Available | 1 |
| RadarNeXt: Real-Time and Reliable 3D Object Detector Based On 4D mmWave Imaging Radar | Jan 4, 2025 | 3D Object Detection, 3D Object Detection (RoI) | Code Available | 1 |
| Mamba4D: Efficient 4D Point Cloud Video Understanding with Disentangled Spatial-Temporal State Space Models | Jan 1, 2025 | Action Recognition, Action Segmentation | Code Available | 1 |
| Lightweight G-YOLOv11: Advancing Efficient Fracture Detection in Pediatric Wrist X-rays | Dec 31, 2024 | Fracture detection, GPU | Code Available | 1 |
| GIMS: Image Matching System Based on Adaptive Graph Construction and Graph Neural Network | Dec 24, 2024 | GPU, graph construction | Code Available | 1 |
| Resource-Aware Arabic LLM Creation: Model Adaptation, Integration, and Multi-Domain Testing | Dec 23, 2024 | ArabicMMLU, Dialect Identification | Code Available | 1 |
| Flash3D: Super-scaling Point Transformers through Joint Hardware-Geometry Locality | Dec 21, 2024 | GPU | Code Available | 1 |
| Crabs: Consuming Resource via Auto-generation for LLM-DoS Attack under Black-box Settings | Dec 18, 2024 | GPU | Code Available | 1 |
| NITRO: LLM Inference on Intel Laptop NPUs | Dec 15, 2024 | CPU, GPU | Code Available | 1 |
| Light-T2M: A Lightweight and Fast Model for Text-to-motion Generation | Dec 15, 2024 | GPU, Mamba | Code Available | 1 |
| Real-time Identity Defenses against Malicious Personalization of Diffusion Models | Dec 13, 2024 | CPU, GPU | Code Available | 1 |
| EOV-Seg: Efficient Open-Vocabulary Panoptic Segmentation | Dec 11, 2024 | Decoder, GPU | Code Available | 1 |
| MCP-MedSAM: A Powerful Lightweight Medical Segment Anything Model Trained with a Single GPU in Just One Day | Dec 8, 2024 | GPU, Image Segmentation | Code Available | 1 |
| Transformers Can Navigate Mazes With Multi-Step Prediction | Dec 6, 2024 | GPU, Language Modeling | Code Available | 1 |
| p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay | Dec 5, 2024 | Decoder, GPU | Code Available | 1 |
| Beyond [cls]: Exploring the true potential of Masked Image Modeling representations | Dec 4, 2024 | GPU, Self-Supervised Learning | Code Available | 1 |
| VISION-XL: High Definition Video Inverse Problem Solver using Latent Image Diffusion Models | Nov 29, 2024 | Deblurring, GPU | Code Available | 1 |
| Act Now: A Novel Online Forecasting Framework for Large-Scale Streaming Data | Nov 28, 2024 | GPU | Code Available | 1 |
| Global Tensor Motion Planning | Nov 28, 2024 | Dataset Generation, Diversity | Code Available | 1 |
| ADAF: An Artificial Intelligence Data Assimilation Framework for Weather Forecasting | Nov 25, 2024 | GPU, Weather Forecasting | Code Available | 1 |
| Quantization without Tears | Nov 21, 2024 | GPU, Quantization | Code Available | 1 |
| ITER: Iterative Transformer-based Entity Recognition and Relation Extraction | Nov 11, 2024 | GPU, Language Modeling | Code Available | 1 |
| GPU-Accelerated Inverse Lithography Towards High Quality Curvy Mask Generation | Nov 11, 2024 | GPU | Code Available | 1 |
| Diffusion Sampling Correction via Approximately 10 Parameters | Nov 10, 2024 | GPU | Code Available | 1 |
| HRDecoder: High-Resolution Decoder Network for Fundus Image Lesion Segmentation | Nov 6, 2024 | Decoder, GPU | Code Available | 1 |
| LiVOS: Light Video Object Segmentation with Gated Linear Matching | Nov 5, 2024 | GPU, Semantic Segmentation | Code Available | 1 |
| Fast and Memory-Efficient Video Diffusion Using Streamlined Inference | Nov 2, 2024 | GPU, Video Generation | Code Available | 1 |
| KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation | Oct 28, 2024 | GPU, Knowledge Distillation | Code Available | 1 |
| LOGO -- Long cOntext aliGnment via efficient preference Optimization | Oct 24, 2024 | GPU, Language Modeling | Code Available | 1 |
| KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing | Oct 24, 2024 | GPU | Code Available | 1 |
| syren-new: Precise formulae for the linear and nonlinear matter power spectra with massive neutrinos and dynamical dark energy | Oct 18, 2024 | CPU, GPU | Code Available | 1 |
| xPerT: Extended Persistence Transformer | Oct 18, 2024 | GPU | Code Available | 1 |
| EP-SAM: Weakly Supervised Histopathology Segmentation via Enhanced Prompt with Segment Anything | Oct 17, 2024 | Diagnostic, GPU | Code Available | 1 |
| SPA: 3D Spatial-Awareness Enables Effective Embodied Representation | Oct 10, 2024 | GPU, Neural Rendering | Code Available | 1 |
| Neural Reasoning Networks: Efficient Interpretable Neural Networks With Automatic Textual Explanations | Oct 10, 2024 | Fairness, Feature Importance | Code Available | 1 |
| PostEdit: Posterior Sampling for Efficient Zero-Shot Image Editing | Oct 7, 2024 | GPU | Code Available | 1 |
| Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective | Oct 6, 2024 | CPU, GPU | Code Available | 1 |
| LLM-Pilot: Characterize and Optimize Performance of your LLM Inference Services | Oct 3, 2024 | Benchmarking, GPU | Code Available | 1 |
| Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads on Consumer-Grade Devices | Oct 2, 2024 | GPU, Language Modeling | Code Available | 1 |
| TorchSISSO: A PyTorch-Based Implementation of the Sure Independence Screening and Sparsifying Operator for Efficient and Interpretable Model Discovery | Oct 2, 2024 | GPU, Model Discovery | Code Available | 1 |
| STGformer: Efficient Spatiotemporal Graph Transformer for Traffic Forecasting | Oct 1, 2024 | GPU | Code Available | 1 |
| Analog In-Memory Computing Attention Mechanism for Fast and Energy-Efficient Large Language Models | Sep 28, 2024 | GPU | Code Available | 1 |