| FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design | Jan 25, 2024 | GPU, Quantization | Code Available | 3 |
| MoE-Infinity: Efficient MoE Inference on Personal Machines with Sparsity-Aware Expert Cache | Jan 25, 2024 | GPU, model | Code Available | 3 |
| CNN architecture extraction on edge GPU | Jan 24, 2024 | GPU, image-classification | Unverified | 0 |
| Automated Root Causing of Cloud Incidents using In-Context Learning with GPT-4 | Jan 24, 2024 | GPU, In-Context Learning | Unverified | 0 |
| InverseMatrixVT3D: An Efficient Projection Matrix-Based Approach for 3D Occupancy Prediction | Jan 23, 2024 | 3D Semantic Occupancy Prediction, Autonomous Driving | Code Available | 1 |
| Edge-Enabled Real-time Railway Track Segmentation | Jan 21, 2024 | GPU, Quantization | Unverified | 0 |
| immrax: A Parallelizable and Differentiable Toolbox for Interval Analysis and Mixed Monotone Reachability in JAX | Jan 21, 2024 | Computational Efficiency, GPU | Code Available | 1 |
| A Lightweight FPGA-based IDS-ECU Architecture for Automotive CAN | Jan 19, 2024 | GPU, Intrusion Detection | Unverified | 0 |
| Enhancing Scalability in Recommender Systems through Lottery Ticket Hypothesis and Knowledge Distillation-based Neural Network Pruning | Jan 19, 2024 | GPU, Knowledge Distillation | Unverified | 0 |
| Exact analytical algorithm for solvent accessible surface area and derivatives in implicit solvent molecular simulations on GPUs | Jan 19, 2024 | CPU, GPU | Unverified | 0 |