| Atom: Low-bit Quantization for Efficient and Accurate LLM Serving | Oct 29, 2023 | GPU, Quantization | Code Available | 2 |
| FP8-LM: Training FP8 Large Language Models | Oct 27, 2023 | GPU | Code Available | 2 |
| QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models | Oct 25, 2023 | GPU, Mixture-of-Experts | Code Available | 2 |
| LAMP: Learn A Motion Pattern for Few-Shot-Based Video Generation | Oct 16, 2023 | GPU, Image Animation | Code Available | 2 |
| ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models | Oct 16, 2023 | General Reinforcement Learning, GPU | Code Available | 2 |
| Im4D: High-Fidelity and Real-Time Novel View Synthesis for Dynamic Scenes | Oct 12, 2023 | GPU, Novel View Synthesis | Code Available | 2 |
| GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models | Oct 12, 2023 | GPU, Text to 3D | Code Available | 2 |
| DISTFLASHATTN: Distributed Memory-efficient Attention for Long-context LLMs Training | Oct 5, 2023 | GPU | Code Available | 2 |
| MEM: Multi-Modal Elevation Mapping for Robotics and Learning | Sep 28, 2023 | Colorization, GPU | Code Available | 2 |
| ModuLoRA: Finetuning 2-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers | Sep 28, 2023 | GPU, Instruction Following | Code Available | 2 |
| OmniDrones: An Efficient and Flexible Platform for Reinforcement Learning in Drone Control | Sep 22, 2023 | GPU, reinforcement-learning | Code Available | 2 |
| Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity | Sep 19, 2023 | GPU | Code Available | 2 |
| CoLA: Exploiting Compositional Structure for Automatic and Efficient Numerical Linear Algebra | Sep 6, 2023 | CoLA, Gaussian Processes | Code Available | 2 |
| CAGRA: Highly Parallel Graph Construction and Approximate Nearest Neighbor Search for GPUs | Aug 29, 2023 | CPU, GPU | Code Available | 2 |
| OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models | Aug 25, 2023 | Common Sense Reasoning, Computational Efficiency | Code Available | 2 |
| FastSurfer-HypVINN: Automated sub-segmentation of the hypothalamus and adjacent structures on high-resolutional brain MRI | Aug 24, 2023 | GPU, Segmentation | Code Available | 2 |
| Platypus: Quick, Cheap, and Powerful Refinement of LLMs | Aug 14, 2023 | GPU | Code Available | 2 |
| Machine-learned molecular mechanics force field for the simulation of protein-ligand systems and beyond | Jul 13, 2023 | Drug Design, Drug Discovery | Code Available | 2 |
| Differentiable Forward Projector for X-ray Computed Tomography | Jul 11, 2023 | CT Reconstruction, Deep Learning | Code Available | 2 |
| InPars Toolkit: A Unified and Reproducible Synthetic Data Generation Pipeline for Neural Information Retrieval | Jul 10, 2023 | GPU, Information Retrieval | Code Available | 2 |
| cuSLINK: Single-linkage Agglomerative Clustering on the GPU | Jun 28, 2023 | Clustering, GPU | Code Available | 2 |
| LeanDojo: Theorem Proving with Retrieval-Augmented Language Models | Jun 27, 2023 | Automated Theorem Proving, GPU | Code Available | 2 |
| DNABERT-2: Efficient Foundation Model and Benchmark For Multi-Species Genome | Jun 26, 2023 | Computational Efficiency, Core Promoter Detection | Code Available | 2 |
| H_2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models | Jun 24, 2023 | GPU | Code Available | 2 |
| RoMe: Towards Large Scale Road Surface Reconstruction via Mesh Representation | Jun 20, 2023 | Autonomous Driving, Computational Efficiency | Code Available | 2 |