| Characterizing and Optimizing LLM Inference Workloads on CPU-GPU Coupled Architectures | Apr 16, 2025 | CPU, GPU | Unverified | 0 |
| Bringing together invertible UNets with invertible attention modules for memory-efficient diffusion models | Apr 15, 2025 | Denoising, GPU | Unverified | 0 |
| 70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float | Apr 15, 2025 | CPU, GPU | Code Available | 4 |
| ConvShareViT: Enhancing Vision Transformers with Convolutional Attention Mechanisms for Free-Space Optical Accelerators | Apr 15, 2025 | GPU | Unverified | 0 |
| PatrolVision: Automated License Plate Recognition in the wild | Apr 15, 2025 | Autonomous Driving, GPU | Unverified | 0 |
| Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints | Apr 15, 2025 | GPU, Inference Optimization | Code Available | 4 |
| CAT: A Conditional Adaptation Tailor for Efficient and Effective Instance-Specific Pansharpening on Real-World Data | Apr 14, 2025 | Computational Efficiency, GPU | Unverified | 0 |
| Anchors no more: Using peculiar velocities to constrain H_0 and the primordial Universe without calibrators | Apr 14, 2025 | GPU | Code Available | 0 |
| Frozen Layers: Memory-efficient Many-fidelity Hyperparameter Optimization | Apr 14, 2025 | GPU, Hyperparameter Optimization | Unverified | 0 |
| Tokenize Image Patches: Global Context Fusion for Effective Haze Removal in Large Images | Apr 13, 2025 | GPU | Code Available | 2 |
| aweSOM: a CPU/GPU-accelerated Self-organizing Map and Statistically Combined Ensemble Framework for Machine-learning Clustering Analysis | Apr 13, 2025 | CPU, GPU | Unverified | 0 |
| Towards On-Device Learning and Reconfigurable Hardware Implementation for Encoded Single-Photon Signal Processing | Apr 12, 2025 | CPU, GPU | Unverified | 0 |
| MoE-Lens: Towards the Hardware Limit of High-Throughput MoE LLM Serving Under Resource Constraints | Apr 12, 2025 | CPU, GPU | Unverified | 0 |
| Parameter-Free Fine-tuning via Redundancy Elimination for Vision Foundation Models | Apr 11, 2025 | channel selection, GPU | Unverified | 0 |
| Spectral Normalization for Lipschitz-Constrained Policies on Learning Humanoid Locomotion | Apr 11, 2025 | GPU, Reinforcement Learning (RL) | Unverified | 0 |
| SpecEE: Accelerating Large Language Model Inference with Speculative Early Exiting | Apr 11, 2025 | GPU, Language Modeling | Unverified | 0 |
| TensorNEAT: A GPU-accelerated Library for NeuroEvolution of Augmenting Topologies | Apr 11, 2025 | Computational Efficiency, GPU | Code Available | 3 |
| Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model | Apr 11, 2025 | GPU, Video Generation | Unverified | 0 |
| MSCCL++: Rethinking GPU Communication Abstractions for Cutting-edge AI Applications | Apr 11, 2025 | GPU | Code Available | 3 |
| TorchFX: A modern approach to Audio DSP with PyTorch and GPU acceleration | Apr 11, 2025 | Audio Signal Processing, Benchmarking | Code Available | 2 |
| EO-VLM: VLM-Guided Energy Overload Attacks on Vision Models | Apr 11, 2025 | Autonomous Driving, GPU | Unverified | 0 |
| Search-contempt: a hybrid MCTS algorithm for training AlphaZero-like engines with better computational efficiency | Apr 10, 2025 | Computational Efficiency, GPU | Unverified | 0 |
| PoGO: A Scalable Proof of Useful Work via Quantized Gradient Descent and Merkle Proofs | Apr 10, 2025 | GPU, Quantization | Unverified | 0 |
| DGOcc: Depth-aware Global Query-based Network for Monocular 3D Occupancy Prediction | Apr 10, 2025 | GPU, Prediction | Unverified | 0 |
| Apt-Serve: Adaptive Request Scheduling on Hybrid Cache for Scalable LLM Inference Serving | Apr 10, 2025 | GPU, Large Language Model | Code Available | 1 |