| HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models | May 16, 2024 | GPU, Language Modelling | Code Available | 1 |
| No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding | May 14, 2024 | Action Detection, GPU | Code Available | 1 |
| Computation-Aware Kalman Filtering and Smoothing | May 14, 2024 | GPU | Code Available | 1 |
| The Developing Human Connectome Project: A Fast Deep Learning-based Pipeline for Neonatal Cortical Surface Reconstruction | May 14, 2024 | GPU, Surface Reconstruction | Code Available | 1 |
| Differentiable Model Scaling using Differentiable Topk | May 12, 2024 | GPU, image-classification | Code Available | 1 |
| CoSense3D: an Agent-based Efficient Learning Framework for Collective Perception | Apr 29, 2024 | Data Visualization, Decision Making | Code Available | 1 |
| LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report | Apr 29, 2024 | GPU, parameter-efficient fine-tuning | Code Available | 1 |
| CutDiffusion: A Simple, Fast, Cheap, and Strong Diffusion Extrapolation Method | Apr 23, 2024 | Denoising, GPU | Code Available | 1 |
| Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity | Apr 22, 2024 | GPU, Language Modeling | Code Available | 1 |
| Evaluating Retrieval Quality in Retrieval-Augmented Generation | Apr 21, 2024 | GPU, Language Modeling | Code Available | 1 |
| LLMem: Estimating GPU Memory Usage for Fine-Tuning Pre-Trained LLMs | Apr 16, 2024 | Decoder, GPU | Code Available | 1 |
| Interpolating neural network: A novel unification of machine learning and interpolation theory | Apr 16, 2024 | GPU, Physical Simulations | Code Available | 1 |
| CATS: Contextually-Aware Thresholding for Sparsity in Large Language Models | Apr 12, 2024 | GPU | Code Available | 1 |
| Lossless Acceleration of Large Language Model via Adaptive N-gram Parallel Decoding | Apr 10, 2024 | GPU, Language Modeling | Code Available | 1 |
| FreeEval: A Modular Framework for Trustworthy and Efficient Evaluation of Large Language Models | Apr 9, 2024 | Fairness, GPU | Code Available | 1 |
| LIPT: Latency-aware Image Processing Transformer | Apr 9, 2024 | Denoising, GPU | Code Available | 1 |
| Tensorized Ant Colony Optimization for GPU Acceleration | Apr 7, 2024 | CPU, GPU | Code Available | 1 |
| GeoT: Tensor Centric Library for Graph Neural Network via Efficient Segment Reduction on GPU | Apr 3, 2024 | GPU, Graph Neural Network | Code Available | 1 |
| IISAN: Efficiently Adapting Multimodal Representation for Sequential Recommendation with Decoupled PEFT | Apr 2, 2024 | GPU, parameter-efficient fine-tuning | Code Available | 1 |
| Taming Lookup Tables for Efficient Image Retouching | Mar 28, 2024 | CPU, GPU | Code Available | 1 |
| Siamese Vision Transformers are Scalable Audio-visual Learners | Mar 28, 2024 | Contrastive Learning, GPU | Code Available | 1 |
| ModeTv2: GPU-accelerated Motion Decomposition Transformer for Pairwise Optimization in Medical Image Registration | Mar 25, 2024 | Computational Efficiency, GPU | Code Available | 1 |
| MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models | Mar 25, 2024 | GPU, In-Context Learning | Code Available | 1 |
| MEDDAP: Medical Dataset Enhancement via Diversified Augmentation Pipeline | Mar 25, 2024 | GPU | Code Available | 1 |
| Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression | Mar 23, 2024 | Dimensionality Reduction, GPU | Code Available | 1 |