| FusionANNS: An Efficient CPU/GPU Cooperative Processing Architecture for Billion-scale Approximate Nearest Neighbor Search | Sep 25, 2024 | Collaborative Filtering, CPU | Unverified | 0 |
| CNN Mixture-of-Depths | Sep 25, 2024 | Computational Efficiency, CPU | Unverified | 0 |
| INT-FlashAttention: Enabling Flash Attention for INT8 Quantization | Sep 25, 2024 | GPU, Quantization | Code Available | 2 |
| Textless NLP -- Zero Resource Challenge with Low Resource Compute | Sep 24, 2024 | Acoustic Unit Discovery, GPU | Unverified | 0 |
| CAD: Memory Efficient Convolutional Adapter for Segment Anything | Sep 24, 2024 | Decoder, GPU | Code Available | 1 |
| A Modular-based Strategy for Mitigating Gradient Conflicts in Simultaneous Speech Translation | Sep 24, 2024 | GPU, Multi-Task Learning | Unverified | 0 |
| Efficient Motion Prediction: A Lightweight & Accurate Trajectory Prediction Model With Fast Training and Inference Speed | Sep 24, 2024 | Autonomous Driving, Autonomous Vehicles | Code Available | 1 |
| dnaGrinder: a lightweight and high-capacity genomic foundation model | Sep 24, 2024 | Decoder, GPU | Unverified | 0 |
| PipeFill: Using GPUs During Bubbles in Pipeline-parallel LLM Training | Sep 23, 2024 | 8k, GPU | Unverified | 0 |
| TextToon: Real-Time Text Toonify Head Avatar from Single Video | Sep 23, 2024 | Contrastive Learning, GPU | Unverified | 0 |
| Efficient Tabular Data Preprocessing of ML Pipelines | Sep 23, 2024 | CPU, GPU | Unverified | 0 |
| Benchmarking Edge AI Platforms for High-Performance ML Inference | Sep 23, 2024 | Benchmarking, CPU | Unverified | 0 |
| FastGL: A GPU-Efficient Framework for Accelerating Sampling-Based GNN Training at Large Scale | Sep 23, 2024 | GPU | Code Available | 1 |
| A Realistic Simulation Framework for Analog/Digital Neuromorphic Architectures | Sep 23, 2024 | Edge-computing, GPU | Unverified | 0 |
| Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding | Sep 22, 2024 | Anomaly Detection, GPU | Code Available | 4 |
| ProTEA: Programmable Transformer Encoder Acceleration on FPGA | Sep 21, 2024 | GPU, Machine Translation | Unverified | 0 |
| FAMOUS: Flexible Accelerator for the Attention Mechanism of Transformer on UltraScale+ FPGAs | Sep 21, 2024 | CPU, GPU | Unverified | 0 |
| Drift to Remember | Sep 21, 2024 | GPU, image-classification | Unverified | 0 |
| On Importance of Pruning and Distillation for Efficient Low Resource NLP | Sep 21, 2024 | Document Classification, GPU | Unverified | 0 |
| Optimizing RLHF Training for Large Language Models with Stage Fusion | Sep 20, 2024 | GPU | Unverified | 0 |
| Occupancy-Based Dual Contouring | Sep 20, 2024 | 3D Reconstruction, GPU | Code Available | 2 |
| Enhancing Performance and Scalability of Large-Scale Recommendation Systems with Jagged Flash Attention | Sep 19, 2024 | GPU, Recommendation Systems | Unverified | 0 |
| 3DGS-LM: Faster Gaussian-Splatting Optimization with Levenberg-Marquardt | Sep 19, 2024 | 3DGS, GPU | Code Available | 3 |
| CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs | Sep 19, 2024 | GPU | Code Available | 1 |
| Graph Convolutional Neural Networks as Surrogate Models for Climate Simulation | Sep 19, 2024 | GPU, Uncertainty Quantification | Unverified | 0 |