| LoongServe: Efficiently Serving Long-Context Large Language Models with Elastic Sequence Parallelism | Apr 15, 2024 | GPU | Code Available | 2 |
| LoopAnimate: Loopable Salient Object Animation | Apr 14, 2024 | GPU, Object | Unverified | 0 |
| Reducing the Barriers to Entry for Foundation Model Training | Apr 12, 2024 | GPU | Unverified | 0 |
| CATS: Contextually-Aware Thresholding for Sparsity in Large Language Models | Apr 12, 2024 | GPU | Code Available | 1 |
| Detecting AI-Generated Images via CLIP | Apr 12, 2024 | GPU | Unverified | 0 |
| Mitigating Challenges of the Space Environment for Onboard Artificial Intelligence: Design Overview of the Imaging Payload on SpIRIT | Apr 12, 2024 | Edge-computing, GPU | Unverified | 0 |
| Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models | Apr 11, 2024 | GPU, In-Context Learning | Unverified | 0 |
| JetMoE: Reaching Llama2 Performance with 0.1M Dollars | Apr 11, 2024 | GPU, Mixture-of-Experts | Code Available | 4 |
| Lossless Acceleration of Large Language Model via Adaptive N-gram Parallel Decoding | Apr 10, 2024 | GPU, Language Modeling | Code Available | 1 |
| Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic | Apr 10, 2024 | GPU | Code Available | 2 |
| GCV-Turbo: End-to-end Acceleration of GNN-based Computer Vision Tasks on FPGA | Apr 10, 2024 | CPU, GPU | Unverified | 0 |
| PIM-Opt: Demystifying Distributed Optimization Algorithms on a Real-World Processing-In-Memory System | Apr 10, 2024 | CPU, Distributed Optimization | Code Available | 0 |
| YOLO based Ocean Eddy Localization with AWS SageMaker | Apr 10, 2024 | GPU, Management | Unverified | 0 |
| FreeEval: A Modular Framework for Trustworthy and Efficient Evaluation of Large Language Models | Apr 9, 2024 | Fairness, GPU | Code Available | 1 |
| LIPT: Latency-aware Image Processing Transformer | Apr 9, 2024 | Denoising, GPU | Code Available | 1 |
| LATUP-Net: A Lightweight 3D Attention U-Net with Parallel Convolutions for Brain Tumor Segmentation | Apr 9, 2024 | Brain Tumor Segmentation, GPU | Unverified | 0 |
| ApproxDARTS: Differentiable Neural Architecture Search with Approximate Multipliers | Apr 8, 2024 | GPU, Neural Architecture Search | Unverified | 0 |
| MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding | Apr 8, 2024 | GPU, Multiple-choice | Code Available | 3 |
| Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models | Apr 8, 2024 | GPU, Mixture-of-Experts | Unverified | 0 |
| Allo: A Programming Model for Composable Accelerator Design | Apr 7, 2024 | GPU, High-Level Synthesis | Code Available | 3 |
| Tensorized Ant Colony Optimization for GPU Acceleration | Apr 7, 2024 | CPU, GPU | Code Available | 1 |
| Data Stream Sampling with Fuzzy Task Boundaries and Noisy Labels | Apr 7, 2024 | Continual Learning, Fairness | Code Available | 0 |
| GNNBENCH: Fair and Productive Benchmarking for Single-GPU GNN System | Apr 5, 2024 | Benchmarking, GPU | Unverified | 0 |
| OmniGS: Fast Radiance Field Reconstruction using Omnidirectional Gaussian Splatting | Apr 4, 2024 | GPU | Code Available | 2 |
| Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization | Apr 4, 2024 | GPU, Language Modeling | Code Available | 0 |