| Unrealized Expectations: Comparing AI Methods vs Classical Algorithms for Maximum Independent Set | Feb 5, 2025 | Combinatorial Optimization, CPU | Unverified | 0 |
| Fast Sampling of Cosmological Initial Conditions with Gaussian Neural Posterior Estimation | Feb 5, 2025 | GPU | Unverified | 0 |
| Comparative Analysis of FPGA and GPU Performance for Machine Learning-Based Track Reconstruction at LHCb | Feb 4, 2025 | GPU, Graph Neural Network | Code Available | 0 |
| Ilargi: a GPU Compatible Factorized ML Model Training Framework | Feb 4, 2025 | Computational Efficiency, CPU | Unverified | 0 |
| Brief analysis of DeepSeek R1 and it's implications for Generative AI | Feb 4, 2025 | GPU, Mixture-of-Experts | Unverified | 0 |
| EasySpec: Layer-Parallel Speculative Decoding for Efficient Multi-GPU Utilization | Feb 4, 2025 | GPU, Large Language Model | Unverified | 0 |
| LV-XAttn: Distributed Cross-Attention for Long Visual Inputs in Multimodal Large Language Models | Feb 4, 2025 | GPU, Video Understanding | Unverified | 0 |
| Accelerating Linear Recurrent Neural Networks for the Edge with Unstructured Sparsity | Feb 3, 2025 | Audio Denoising, Denoising | Unverified | 0 |
| ModServe: Scalable and Resource-Efficient Large Multimodal Model Serving | Feb 2, 2025 | Decoder, GPU | Unverified | 0 |
| ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference | Feb 1, 2025 | GPU, GSM8K | Unverified | 0 |
| Recursive generalized type-2 fuzzy radial basis function neural networks for joint position estimation and adaptive EMG-based impedance control of lower limb exoskeletons | Feb 1, 2025 | Electromyography (EMG), GPU | Code Available | 0 |
| Longer Attention Span: Increasing Transformer Context Length with Sparse Graph Processing Techniques | Jan 31, 2025 | GPU | Code Available | 0 |
| Pivoting Factorization: A Compact Meta Low-Rank Representation of Sparsity for Efficient Inference in Large Language Models | Jan 31, 2025 | GPU, Model Compression | Unverified | 0 |
| Brain-inspired sparse training enables Transformers and LLMs to perform as fully connected | Jan 31, 2025 | GPU, Language Modeling | Unverified | 0 |
| LLM-based Affective Text Generation Quality Based on Different Quantization Values | Jan 31, 2025 | GPU, Quantization | Unverified | 0 |
| TeZO: Empowering the Low-Rankness on the Temporal Dimension in the Zeroth-Order Optimization for Fine-tuning LLMs | Jan 31, 2025 | GPU | Unverified | 0 |
| Scaling Policy Gradient Quality-Diversity with Massive Parallelization via Behavioral Variations | Jan 30, 2025 | Diversity, Evolutionary Algorithms | Unverified | 0 |
| adabmDCA 2.0 -- a flexible but easy-to-use package for Direct Coupling Analysis | Jan 30, 2025 | CPU, GPU | Code Available | 0 |
| CrowdSplat: Exploring Gaussian Splatting For Crowd Rendering | Jan 29, 2025 | Computational Efficiency, GPU | Code Available | 0 |
| Assessing the Capability of YOLO- and Transformer-based Object Detectors for Real-time Weed Detection | Jan 29, 2025 | GPU | Unverified | 0 |
| One Head Eight Arms: Block Matrix based Low Rank Adaptation for CLIP-based Few-Shot Learning | Jan 28, 2025 | Few-Shot Learning, GPU | Unverified | 0 |
| PISCO: Pretty Simple Compression for Retrieval-Augmented Generation | Jan 27, 2025 | GPU, Knowledge Distillation | Unverified | 0 |
| Static Batching of Irregular Workloads on GPUs: Framework and Application to Efficient MoE Model Inference | Jan 27, 2025 | GPU, Mixture-of-Experts | Unverified | 0 |
| Towards Scalable Topological Regularizers | Jan 24, 2025 | Domain Adaptation, GPU | Unverified | 0 |
| 3DGS^2: Near Second-order Converging 3D Gaussian Splatting | Jan 22, 2025 | 3DGS, 3D Reconstruction | Unverified | 0 |
| GANQ: GPU-Adaptive Non-Uniform Quantization for Large Language Models | Jan 22, 2025 | GPU, Quantization | Code Available | 0 |
| HEPPO: Hardware-Efficient Proximal Policy Optimization -- A Universal Pipelined Architecture for Generalized Advantage Estimation | Jan 22, 2025 | CPU, GPU | Unverified | 0 |
| Learning Versatile Optimizers on a Compute Diet | Jan 22, 2025 | GPU | Code Available | 0 |
| Irrational Complex Rotations Empower Low-bit Optimizers | Jan 22, 2025 | GPU, Quantization | Unverified | 0 |
| TOFFE -- Temporally-binned Object Flow from Events for High-speed and Energy-Efficient Object Detection and Tracking | Jan 21, 2025 | Autonomous Navigation, GPU | Unverified | 0 |
| Pushing the Limits of BFP on Narrow Precision LLM Inference | Jan 21, 2025 | GPU | Unverified | 0 |
| Revisiting Ensemble Methods for Stock Trading and Crypto Trading Tasks at ACM ICAIF FinRL Contest 2023-2024 | Jan 18, 2025 | Computational Efficiency, Decision Making | Unverified | 0 |
| MOFA: Discovering Materials for Carbon Capture with a GenAI- and Simulation-Based Workflow | Jan 18, 2025 | CPU, GPU | Unverified | 0 |
| No More Sliding Window: Efficient 3D Medical Image Segmentation with Differentiable Top-k Patch Sampling | Jan 18, 2025 | CPU, GPU | Unverified | 0 |
| FSMoE: A Flexible and Scalable Training System for Sparse Mixture-of-Experts Models | Jan 18, 2025 | GPU, Mixture-of-Experts | Unverified | 0 |
| Good things come in small packages: Should we build AI clusters with Lite-GPUs? | Jan 17, 2025 | GPU, Management | Unverified | 0 |
| FASP: Fast and Accurate Structured Pruning of Large Language Models | Jan 16, 2025 | GPU, Model Compression | Unverified | 0 |
| PixelBrax: Learning Continuous Control from Pixels End-to-End on the GPU | Jan 16, 2025 | Benchmarking, continuous-control | Code Available | 0 |
| The Streaming Batch Model for Efficient and Fault-Tolerant Heterogeneous Execution | Jan 16, 2025 | CPU, GPU | Unverified | 0 |
| Towards Lightweight and Stable Zero-shot TTS with Self-distilled Representation Disentanglement | Jan 15, 2025 | Computational Efficiency, CPU | Unverified | 0 |
| Resource-Constrained Federated Continual Learning: What Does Matter? | Jan 15, 2025 | Continual Learning, GPU | Unverified | 0 |
| GS-LIVO: Real-Time LiDAR, Inertial, and Visual Multi-sensor Fused Odometry with Gaussian Mapping | Jan 15, 2025 | GPU, Sensor Fusion | Unverified | 0 |
| Hierarchical Autoscaling for Large Language Model Serving with Chiron | Jan 14, 2025 | GPU, Language Modeling | Unverified | 0 |
| Physics-Informed Latent Neural Operator for Real-time Predictions of Complex Physical Systems | Jan 14, 2025 | GPU, Operator learning | Unverified | 0 |
| Towards Lightweight Time Series Forecasting: a Patch-wise Transformer with Weak Data Enriching | Jan 14, 2025 | GPU, Time Series | Unverified | 0 |
| Keras Sig: Efficient Path Signature Computation on GPU in Keras 3 | Jan 14, 2025 | Benchmarking, C++ code | Unverified | 0 |
| Ultra Memory-Efficient On-FPGA Training of Transformers via Tensor-Compressed Optimization | Jan 11, 2025 | Domain Adaptation, GPU | Unverified | 0 |
| EXION: Exploiting Inter- and Intra-Iteration Output Sparsity for Diffusion Models | Jan 10, 2025 | GPU | Unverified | 0 |
| Benchmarking Rotary Position Embeddings for Automatic Speech Recognition | Jan 10, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Towards Early Prediction of Self-Supervised Speech Model Performance | Jan 10, 2025 | GPU, Self-Supervised Learning | Unverified | 0 |