| Reinforcement learning with learned gadgets to tackle hard quantum problems on real hardware | Oct 31, 2024 | GPU, Program Synthesis | Code Available | 0 |
| A Novel Breast Ultrasound Image Augmentation Method Using Advanced Neural Style Transfer: An Efficient and Explainable Approach | Oct 31, 2024 | GPU, Image Augmentation | Unverified | 0 |
| The Importance of Being Scalable: Improving the Speed and Accuracy of Neural Network Interatomic Potentials Across Chemical Domains | Oct 31, 2024 | GPU, Philosophy | Code Available | 2 |
| Context-Aware Token Selection and Packing for Enhanced Vision Transformer | Oct 31, 2024 | GPU, object-detection | Unverified | 0 |
| Cycle-Constrained Adversarial Denoising Convolutional Network for PET Image Denoising: Multi-Dimensional Validation on Large Datasets with Reader Study and Real Low-Dose Data | Oct 31, 2024 | Denoising, GPU | Unverified | 0 |
| Very fast Bayesian Additive Regression Trees on GPU | Oct 30, 2024 | CPU, GPU | Code Available | 2 |
| $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources | Oct 30, 2024 | GPU | Code Available | 2 |
| A Message Passing Neural Network Surrogate Model for Bond-Associated Peridynamic Material Correspondence Formulation | Oct 29, 2024 | GPU | Unverified | 0 |
| AI-assisted Agile Propagation Modeling for Real-time Digital Twin Wireless Networks | Oct 29, 2024 | Computational Efficiency, CPU | Unverified | 0 |
| Application of Audio Fingerprinting Techniques for Real-Time Scalable Speech Retrieval and Speech Clusterization | Oct 29, 2024 | GPU, Retrieval | Unverified | 0 |
| VL-Cache: Sparsity and Modality-Aware KV Cache Compression for Vision-Language Model Inference Acceleration | Oct 29, 2024 | GPU, Language Modeling | Unverified | 0 |
| Motion Graph Unleashed: A Novel Approach to Video Prediction | Oct 29, 2024 | GPU, Optical Flow Estimation | Code Available | 0 |
| Memory-Efficient Point Cloud Registration via Overlapping Region Sampling | Oct 29, 2024 | GPU, Point Cloud Registration | Unverified | 0 |
| Revisiting Reliability in Large-Scale Machine Learning Research Clusters | Oct 29, 2024 | GPU | Unverified | 0 |
| Pushing the Performance Envelope of DNN-based Recommendation Systems Inference on GPUs | Oct 29, 2024 | GPU, Recommendation Systems | Code Available | 0 |
| Data Generation for Hardware-Friendly Post-Training Quantization | Oct 29, 2024 | Data Augmentation, GPU | Code Available | 3 |
| ProMoE: Fast MoE-based LLM Serving using Proactive Caching | Oct 29, 2024 | GPU, Mixture-of-Experts | Unverified | 0 |
| ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference | Oct 28, 2024 | CPU | Code Available | 3 |
| Accelerated Bayesian parameter estimation and model selection for gravitational waves with normalizing flows | Oct 28, 2024 | CPU, GPU | Unverified | 0 |
| FusedInf: Efficient Swapping of DNN Models for On-Demand Serverless Inference Services on the Edge | Oct 28, 2024 | GPU | Code Available | 0 |
| Modular Duality in Deep Learning | Oct 28, 2024 | Deep Learning, GPU | Code Available | 3 |
| KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation | Oct 28, 2024 | GPU, Knowledge Distillation | Code Available | 1 |
| ThunderKittens: Simple, Fast, and Adorable AI Kernels | Oct 27, 2024 | GPU, State Space Models | Code Available | 7 |
| Deep Optimizer States: Towards Scalable Training of Transformer Models Using Interleaved Offloading | Oct 26, 2024 | CPU, GPU | Code Available | 0 |
| Computational Bottlenecks of Training Small-scale Large Language Models | Oct 25, 2024 | GPU, Language Modeling | Unverified | 0 |
| Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies | Oct 24, 2024 | GPU, parameter-efficient fine-tuning | Unverified | 0 |
| KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing | Oct 24, 2024 | GPU | Code Available | 1 |
| Sort-free Gaussian Splatting via Weighted Sum Rendering | Oct 24, 2024 | 3DGS, 3D Scene Reconstruction | Unverified | 0 |
| LOGO -- Long cOntext aliGnment via efficient preference Optimization | Oct 24, 2024 | GPU, Language Modeling | Code Available | 1 |
| LoRANN: Low-Rank Matrix Factorization for Approximate Nearest Neighbor Search | Oct 24, 2024 | Clustering, GPU | Code Available | 2 |
| Trajectory Optimization for Spatial Microstructure Control in Electron Beam Metal Additive Manufacturing | Oct 23, 2024 | GPU | Unverified | 0 |
| CoreInfer: Accelerating Large Language Model Inference with Semantics-Inspired Adaptive Sparse Activation | Oct 23, 2024 | GPU, Language Modeling | Unverified | 0 |
| Is the GPU Half-Empty or Half-Full? Practical Scheduling Techniques for LLMs | Oct 23, 2024 | GPU, Scheduling | Unverified | 0 |
| POD-Attention: Unlocking Full Prefill-Decode Overlap for Faster LLM Inference | Oct 23, 2024 | GPU | Code Available | 0 |
| ExpertFlow: Optimized Expert Activation and Token Allocation for Efficient Mixture-of-Experts Inference | Oct 23, 2024 | Computational Efficiency, CPU | Unverified | 0 |
| AI-focused HPC Data Centers Can Provide More Power Grid Flexibility and at Lower Cost | Oct 22, 2024 | CPU, GPU | Unverified | 0 |
| Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss | Oct 22, 2024 | GPU, Representation Learning | Code Available | 3 |
| Optimizing Mixture-of-Experts Inference Time Combining Model Deployment and Communication Scheduling | Oct 22, 2024 | All, GPU | Unverified | 0 |
| Semantic-guided Search for Efficient Program Repair with Large Language Models | Oct 22, 2024 | GPU, HumanEval | Unverified | 0 |
| FastAttention: Extend FlashAttention2 to NPUs and Low-resource GPUs | Oct 22, 2024 | CPU, GPU | Unverified | 0 |
| MagicPIG: LSH Sampling for Efficient LLM Generation | Oct 21, 2024 | CPU, GPU | Code Available | 3 |
| Mean-Field Simulation-Based Inference for Cosmological Initial Conditions | Oct 21, 2024 | GPU, Navigate | Unverified | 0 |
| Enabling Energy-Efficient Deployment of Large Language Models on Memristor Crossbar: A Synergy of Large and Small | Oct 21, 2024 | GPU | Unverified | 0 |
| Fully Explicit Dynamic Gaussian Splatting | Oct 21, 2024 | GPU, Novel View Synthesis | Unverified | 0 |
| CompAct: Compressed Activations for Memory-Efficient LLM Training | Oct 20, 2024 | GPU | Unverified | 0 |
| A Remedy to Compute-in-Memory with Dynamic Random Access Memory: 1FeFET-1C Technology for Neuro-Symbolic AI | Oct 20, 2024 | GPU | Unverified | 0 |
| SemiHVision: Enhancing Medical Multimodal Models with a Semi-Human Annotated Dataset and Fine-Tuned Instruction Generation | Oct 19, 2024 | Diagnostic, GPU | Code Available | 0 |
| Accelerate Coastal Ocean Circulation Model with AI Surrogate | Oct 19, 2024 | CPU, Disaster Response | Unverified | 0 |
| Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step | Oct 19, 2024 | Conditional Image Generation, GPU | Code Available | 2 |
| AC-Mix: Self-Supervised Adaptation for Low-Resource Automatic Speech Recognition using Agnostic Contrastive Mixup | Oct 18, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |