| A Case Study in CUDA Kernel Fusion: Implementing FlashAttention-2 on NVIDIA Hopper Architecture using the CUTLASS Library | Dec 19, 2023 | GPU | Code Available | 2 |
| XLand-MiniGrid: Scalable Meta-Reinforcement Learning Environments in JAX | Dec 19, 2023 | Diversity, GPU | Code Available | 2 |
| Tuning LayerNorm in Attention: Towards Efficient Multi-Modal LLM Finetuning | Dec 18, 2023 | Domain Adaptation, GPU | Unverified | 0 |
| GauFRe: Gaussian Deformation Fields for Real-time Dynamic Novel View Synthesis | Dec 18, 2023 | GPU, Inductive Bias | Unverified | 0 |
| PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU | Dec 16, 2023 | CPU, GPU | Code Available | 5 |
| Opara: Exploiting Operator Parallelism for Expediting DNN Inference on GPUs | Dec 16, 2023 | GPU, Scheduling | Code Available | 1 |
| RetailKLIP : Finetuning OpenCLIP backbone using metric learning on a single GPU for Zero-shot retail product image classification | Dec 16, 2023 | GPU, image-classification | Unverified | 0 |
| FastSR-NeRF: Improving NeRF Efficiency on Consumer Devices with A Simple Super-Resolution Pipeline | Dec 15, 2023 | GPU, Knowledge Distillation | Unverified | 0 |
| Binary Code Summarization: Benchmarking ChatGPT/GPT-4 and Other Large Language Models | Dec 15, 2023 | Benchmarking, Code Summarization | Code Available | 1 |
| Data-Efficient Multimodal Fusion on a Single GPU | Dec 15, 2023 | GPU, Image Retrieval | Code Available | 1 |