| Revisiting Ensemble Methods for Stock Trading and Crypto Trading Tasks at ACM ICAIF FinRL Contest 2023-2024 | Jan 18, 2025 | Computational Efficiency, Decision Making | Unverified | 0 |
| No More Sliding Window: Efficient 3D Medical Image Segmentation with Differentiable Top-k Patch Sampling | Jan 18, 2025 | CPU, GPU | Unverified | 0 |
| FSMoE: A Flexible and Scalable Training System for Sparse Mixture-of-Experts Models | Jan 18, 2025 | GPU, Mixture-of-Experts | Unverified | 0 |
| Good things come in small packages: Should we build AI clusters with Lite-GPUs? | Jan 17, 2025 | GPU, Management | Unverified | 0 |
| PixelBrax: Learning Continuous Control from Pixels End-to-End on the GPU | Jan 16, 2025 | Benchmarking, continuous-control | Code Available | 0 |
| The Streaming Batch Model for Efficient and Fault-Tolerant Heterogeneous Execution | Jan 16, 2025 | CPU, GPU | Unverified | 0 |
| FASP: Fast and Accurate Structured Pruning of Large Language Models | Jan 16, 2025 | GPU, Model Compression | Unverified | 0 |
| Resource-Constrained Federated Continual Learning: What Does Matter? | Jan 15, 2025 | Continual Learning, GPU | Unverified | 0 |
| GS-LIVO: Real-Time LiDAR, Inertial, and Visual Multi-sensor Fused Odometry with Gaussian Mapping | Jan 15, 2025 | GPU, Sensor Fusion | Unverified | 0 |
| Towards Lightweight and Stable Zero-shot TTS with Self-distilled Representation Disentanglement | Jan 15, 2025 | Computational Efficiency, CPU | Unverified | 0 |
| Towards Lightweight Time Series Forecasting: a Patch-wise Transformer with Weak Data Enriching | Jan 14, 2025 | GPU, Time Series | Unverified | 0 |
| Keras Sig: Efficient Path Signature Computation on GPU in Keras 3 | Jan 14, 2025 | Benchmarking, C++ code | Unverified | 0 |
| Physics-Informed Latent Neural Operator for Real-time Predictions of Complex Physical Systems | Jan 14, 2025 | GPU, Operator learning | Unverified | 0 |
| CuAsmRL: Optimizing GPU SASS Schedules via Deep Reinforcement Learning | Jan 14, 2025 | Deep Reinforcement Learning, GPU | Code Available | 1 |
| Hierarchical Autoscaling for Large Language Model Serving with Chiron | Jan 14, 2025 | GPU, Language Modeling | Unverified | 0 |
| A User's Guide to KSig: GPU-Accelerated Computation of the Signature Kernel | Jan 13, 2025 | GPU | Code Available | 2 |
| Generalized and Efficient 2D Gaussian Splatting for Arbitrary-scale Super-Resolution | Jan 12, 2025 | Computational Efficiency, GPU | Code Available | 2 |
| Ladder-residual: parallelism-aware architecture for accelerating large model inference with communication overlapping | Jan 11, 2025 | GPU, Large Language Model | Code Available | 1 |
| Ultra Memory-Efficient On-FPGA Training of Transformers via Tensor-Compressed Optimization | Jan 11, 2025 | Domain Adaptation, GPU | Unverified | 0 |
| Towards Early Prediction of Self-Supervised Speech Model Performance | Jan 10, 2025 | GPU, Self-Supervised Learning | Unverified | 0 |
| TakuNet: an Energy-Efficient CNN for Real-Time Inference on Embedded UAV systems in Emergency Response Scenarios | Jan 10, 2025 | Aerial Scene Classification, CPU | Code Available | 2 |
| MS-Temba : Multi-Scale Temporal Mamba for Efficient Temporal Action Detection | Jan 10, 2025 | Action Detection, GPU | Code Available | 1 |
| Benchmarking Rotary Position Embeddings for Automatic Speech Recognition | Jan 10, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| EXION: Exploiting Inter- and Intra-Iteration Output Sparsity for Diffusion Models | Jan 10, 2025 | GPU | Unverified | 0 |
| Prediction-Assisted Online Distributed Deep Learning Workload Scheduling in GPU Clusters | Jan 9, 2025 | GPU, Scheduling | Unverified | 0 |