| Enabling Efficient Serverless Inference Serving for LLM (Large Language Model) in the Cloud | Nov 23, 2024 | GPU, Language Modeling | —Unverified | 0 |
| Reassessing Layer Pruning in LLMs: New Insights and Methods | Nov 23, 2024 | Benchmarking, GPU | Code Available | 0 |
| Multi-scale Cascaded Large-Model for Whole-body ROI Segmentation | Nov 23, 2024 | Computational Efficiency, GPU | Code Available | 0 |
| Simplifying CLIP: Unleashing the Power of Large-Scale Models on Consumer-level Computers | Nov 22, 2024 | Data Augmentation, GPU | —Unverified | 0 |
| Spatiotemporal Decoupling for Efficient Vision-Based Occupancy Forecasting | Nov 21, 2024 | GPU | —Unverified | 0 |
| Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation and Reconstruction | Nov 21, 2024 | 3D Generation, GPU | —Unverified | 0 |
| Deep operator network models for predicting post-burn contraction | Nov 21, 2024 | CPU, GPU | —Unverified | 0 |
| Hardware Scaling Trends and Diminishing Returns in Large-Scale Distributed Training | Nov 20, 2024 | GPU | —Unverified | 0 |
| FAST-Splat: Fast, Ambiguity-Free Semantics Transfer in Gaussian Splatting | Nov 20, 2024 | Dimensionality Reduction, GPU | —Unverified | 0 |
| Faster Multi-GPU Training with PPLL: A Pipeline Parallelism Framework Leveraging Local Learning | Nov 19, 2024 | GPU | —Unverified | 0 |