| CAT: A Conditional Adaptation Tailor for Efficient and Effective Instance-Specific Pansharpening on Real-World Data | Apr 14, 2025 | Computational Efficiency, GPU | Unverified | 0 |
| Anchors no more: Using peculiar velocities to constrain H_0 and the primordial Universe without calibrators | Apr 14, 2025 | GPU | Code Available | 0 |
| Frozen Layers: Memory-efficient Many-fidelity Hyperparameter Optimization | Apr 14, 2025 | GPU, Hyperparameter Optimization | Unverified | 0 |
| aweSOM: a CPU/GPU-accelerated Self-organizing Map and Statistically Combined Ensemble Framework for Machine-learning Clustering Analysis | Apr 13, 2025 | CPU, GPU | Unverified | 0 |
| MoE-Lens: Towards the Hardware Limit of High-Throughput MoE LLM Serving Under Resource Constraints | Apr 12, 2025 | CPU, GPU | Unverified | 0 |
| Towards On-Device Learning and Reconfigurable Hardware Implementation for Encoded Single-Photon Signal Processing | Apr 12, 2025 | CPU, GPU | Unverified | 0 |
| EO-VLM: VLM-Guided Energy Overload Attacks on Vision Models | Apr 11, 2025 | Autonomous Driving, GPU | Unverified | 0 |
| Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model | Apr 11, 2025 | GPU, Video Generation | Unverified | 0 |
| SpecEE: Accelerating Large Language Model Inference with Speculative Early Exiting | Apr 11, 2025 | GPU, Language Modeling | Unverified | 0 |
| Spectral Normalization for Lipschitz-Constrained Policies on Learning Humanoid Locomotion | Apr 11, 2025 | GPU, Reinforcement Learning (RL) | Unverified | 0 |
| Parameter-Free Fine-tuning via Redundancy Elimination for Vision Foundation Models | Apr 11, 2025 | channel selection, GPU | Unverified | 0 |
| DGOcc: Depth-aware Global Query-based Network for Monocular 3D Occupancy Prediction | Apr 10, 2025 | GPU, Prediction | Unverified | 0 |
| GPT Carry-On: Training Foundation Model for Customization Could Be Simple, Scalable and Affordable | Apr 10, 2025 | GPU, Math | Unverified | 0 |
| PoGO: A Scalable Proof of Useful Work via Quantized Gradient Descent and Merkle Proofs | Apr 10, 2025 | GPU, Quantization | Unverified | 0 |
| Search-contempt: a hybrid MCTS algorithm for training AlphaZero-like engines with better computational efficiency | Apr 10, 2025 | Computational Efficiency, GPU | Unverified | 0 |
| A Comparison of Deep Learning Methods for Cell Detection in Digital Cytology | Apr 9, 2025 | Cell Detection, Computational Efficiency | Code Available | 0 |
| CRYSIM: Prediction of Symmetric Structures of Large Crystals with GPU-based Ising Machines | Apr 9, 2025 | Bayesian Optimization, GPU | Code Available | 0 |
| Nonuniform-Tensor-Parallelism: Mitigating GPU failure impact for Scaled-up LLM Training | Apr 8, 2025 | GPU | Unverified | 0 |
| Accelerating LLM Inference Throughput via Asynchronous KV Cache Prefetching | Apr 8, 2025 | GPU, Scheduling | Unverified | 0 |
| PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters | Apr 7, 2025 | CPU, GPU | Code Available | 0 |
| SmolVLM: Redefining small and efficient multimodal models | Apr 7, 2025 | GPU | Unverified | 0 |
| Leveraging State Space Models in Long Range Genomics | Apr 7, 2025 | Benchmarking, GPU | Unverified | 0 |
| Stereo-LiDAR Fusion by Semi-Global Matching With Discrete Disparity-Matching Cost and Semidensification | Apr 7, 2025 | Depth Estimation, GPU | Code Available | 0 |
| Multimodal Cinematic Video Synthesis Using Text-to-Image and Audio Generation Models | Apr 6, 2025 | Audio Generation, GPU | Unverified | 0 |
| SLOs-Serve: Optimized Serving of Multi-SLO LLMs | Apr 5, 2025 | Chatbot, GPU | Unverified | 0 |
| HeterMoE: Efficient Training of Mixture-of-Experts Models on Heterogeneous GPUs | Apr 4, 2025 | GPU, Mixture-of-Experts | Unverified | 0 |
| Accurate GPU Memory Prediction for Deep Learning Jobs through Dynamic Analysis | Apr 4, 2025 | CPU, GPU | Unverified | 0 |
| DeepOHeat-v1: Efficient Operator Learning for Fast and Trustworthy Thermal Simulation and Optimization in 3D-IC Design | Apr 4, 2025 | GPU, Kolmogorov-Arnold Networks | Code Available | 0 |
| MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism | Apr 3, 2025 | CPU, GPU | Unverified | 0 |
| Incorporating the ChEES Criterion into Sequential Monte Carlo Samplers | Apr 3, 2025 | Bayesian Inference, GPU | Unverified | 0 |
| A Truncated Newton Method for Optimal Transport | Apr 2, 2025 | GPU | Code Available | 0 |
| Accelerating IoV Intrusion Detection: Benchmarking GPU-Accelerated vs CPU-Based ML Libraries | Apr 2, 2025 | Benchmarking, Computational Efficiency | Unverified | 0 |
| FlowR: Flowing from Sparse to Dense 3D Reconstructions | Apr 2, 2025 | GPU, Novel View Synthesis | Unverified | 0 |
| SentenceKV: Efficient LLM Inference via Sentence-Level Semantic KV Caching | Apr 1, 2025 | Computational Efficiency, CPU | Unverified | 0 |
| SCRec: A Scalable Computational Storage System with Statistical Sharding and Tensor-train Decomposition for Recommendation Models | Apr 1, 2025 | CPU, GPU | Unverified | 0 |
| Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources | Apr 1, 2025 | GPU, Large Language Model | Unverified | 0 |
| Adapting Vision Foundation Models for Real-time Ultrasound Image Segmentation | Mar 31, 2025 | GPU, Image Segmentation | Unverified | 0 |
| GPU-centric Communication Schemes for HPC and ML Applications | Mar 31, 2025 | CPU, GPU | Unverified | 0 |
| StochasticSplats: Stochastic Rasterization for Sorting-Free 3D Gaussian Splatting | Mar 31, 2025 | 3DGS, GPU | Unverified | 0 |
| Deep Learning Model Deployment in Multiple Cloud Providers: an Exploratory Study Using Low Computing Power Environments | Mar 31, 2025 | CPU, GPU | Unverified | 0 |
| Pan-LUT: Efficient Pan-sharpening via Learnable Look-Up Tables | Mar 31, 2025 | 8k, Computational Efficiency | Unverified | 0 |
| Orchestrate Multimodal Data with Batch Post-Balancing to Accelerate Multimodal Large Language Model Training | Mar 31, 2025 | GPU, Language Modeling | Unverified | 0 |
| Cocktail: Chunk-Adaptive Mixed-Precision Quantization for Long-Context LLM Inference | Mar 30, 2025 | GPU, Quantization | Unverified | 0 |
| CityGS-X: A Scalable Architecture for Efficient and Geometrically Accurate Large-Scale Scene Reconstruction | Mar 29, 2025 | GPU | Unverified | 0 |
| PartialLoading: User Scheduling and Bandwidth Allocation for Parameter-sharing Edge Inference | Mar 29, 2025 | GPU, Scheduling | Unverified | 0 |
| Disentangled 4D Gaussian Splatting: Towards Faster and More Efficient Dynamic Scene Rendering | Mar 28, 2025 | 3DGS, GPU | Unverified | 0 |
| Scenario Dreamer: Vectorized Latent Diffusion for Generating Driving Simulation Environments | Mar 28, 2025 | GPU, Scene Generation | Unverified | 0 |
| ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model | Mar 27, 2025 | GPU, Video Generation | Unverified | 0 |
| FACETS: Efficient Once-for-all Object Detection via Constrained Iterative Search | Mar 27, 2025 | All, GPU | Unverified | 0 |
| Lobster: A GPU-Accelerated Framework for Neurosymbolic Programming | Mar 27, 2025 | GPU | Unverified | 0 |