| The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation | Apr 7, 2025 | Inference Optimization, Referring Video Object Segmentation | Code Available | 5 |
| Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints | Apr 15, 2025 | GPU, Inference Optimization | Code Available | 4 |
| SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL | Apr 15, 2025 | Inference Optimization | Code Available | 3 |
| A Survey on Inference Optimization Techniques for Mixture of Experts Models | Dec 18, 2024 | Computational Efficiency, Distributed Computing | Code Available | 3 |
| Inference Performance Optimization for Large Language Models on CPUs | Jul 10, 2024 | CPU, GPU | Code Available | 3 |
| CycleBNN: Cyclic Precision Training in Binary Neural Networks | Sep 28, 2024 | Inference Optimization | Code Available | 2 |
| Painterly Image Harmonization using Diffusion Model | Aug 4, 2023 | Generative Adversarial Network, Image Harmonization | Code Available | 1 |
| A Novel 1D State Space for Efficient Music Rhythmic Analysis | Nov 1, 2021 | Inference Optimization, Online Beat Tracking | Code Available | 1 |
| Adaptive Deep Neural Network Inference Optimization with EENet | Jan 15, 2023 | Inference Optimization, Scheduling | Code Available | 1 |
| Easy and Efficient Transformer: Scalable Inference Solution For large NLP model | Apr 26, 2021 | Decoder, GPU | Code Available | 1 |
| ADJUST: A Dictionary-Based Joint Reconstruction and Unmixing Method for Spectral Tomography | Dec 21, 2021 | 3D Reconstruction, Computed Tomography (CT) | Code Available | 1 |
| DVFS-Aware DNN Inference on GPUs: Latency Modeling and Performance Analysis | Feb 10, 2025 | CPU, Inference Optimization | Unverified | 0 |
| Easy and Efficient Transformer: Scalable Inference Solution For Large NLP Model | Jul 1, 2022 | Decoder, GPU | Unverified | 0 |
| EdgeRL: Reinforcement Learning-driven Deep Learning Model Inference Optimization at Edge | Oct 16, 2024 | Deep Learning, Inference Optimization | Unverified | 0 |
| Efficiency optimization of large-scale language models based on deep learning in natural language processing tasks | May 20, 2024 | Inference Optimization, Knowledge Distillation | Unverified | 0 |
| Energy-Efficient Transformer Inference: Optimization Strategies for Time Series Classification | Feb 23, 2025 | Classification, Inference Optimization | Unverified | 0 |
| Faster MoE LLM Inference for Extremely Large Models | May 6, 2025 | Inference Optimization, Mixture-of-Experts | Unverified | 0 |
| Federated Learning While Providing Model as a Service: Joint Training and Inference Optimization | Dec 20, 2023 | Federated Learning, Inference Optimization | Unverified | 0 |
| FluidML: Fast and Memory Efficient Inference Optimization | Nov 14, 2024 | Autonomous Vehicles, Inference Optimization | Unverified | 0 |
| Hellinger-Kantorovich Gradient Flows: Global Exponential Decay of Entropy Functionals | Jan 28, 2025 | Inference Optimization | Unverified | 0 |
| Hybrid Offline-online Scheduling Method for Large Language Model Inference Optimization | Feb 14, 2025 | GSM8K, Inference Optimization | Unverified | 0 |
| Inference Optimization of Foundation Models on AI Accelerators | Jul 12, 2024 | Inference Optimization, Model Compression | Unverified | 0 |
| The Ultimate Guide to Fine-Tuning LLMs from Basics to Breakthroughs: An Exhaustive Review of Technologies, Research, Best Practices, Applied Research Challenges and Opportunities | Aug 23, 2024 | Computational Efficiency, Inference Optimization | Unverified | 0 |
| An approach to optimize inference of the DIART speaker diarization pipeline | Aug 5, 2024 | Inference Optimization, Knowledge Distillation | Unverified | 0 |
| A bi-partite generative model framework for analyzing and simulating large scale multiple discrete-continuous travel behaviour data | Jan 18, 2019 | Bayesian Inference, BIG-bench Machine Learning | Unverified | 0 |
| Advances and Open Challenges in Federated Foundation Models | Apr 23, 2024 | Computational Efficiency, Federated Learning | Unverified | 0 |
| Integrating Pre-Trained Speech and Language Models for End-to-End Speech Recognition | Dec 6, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Bifocal Neural ASR: Exploiting Keyword Spotting for Inference Optimization | Aug 3, 2021 | Inference Optimization, Keyword Spotting | Unverified | 0 |
| Residual-Based Error Corrector Operator to Enhance Accuracy and Reliability of Neural Operator Surrogates of Nonlinear Variational Boundary-Value Problems | Jun 21, 2023 | Inference Optimization | Unverified | 0 |
| CRVI: Convex Relaxation for Variational Inference | Jul 1, 2018 | Inference Optimization, regression | Unverified | 0 |
| Deep Signal Recovery with One-Bit Quantization | Nov 30, 2018 | BIG-bench Machine Learning, Computational Efficiency | Unverified | 0 |
| Developing efficient transfer learning strategies for robust scene recognition in mobile robotics using pre-trained convolutional neural networks | Jul 23, 2021 | Data Augmentation, Inference Optimization | Unverified | 0 |
| DSMentor: Enhancing Data Science Agents with Curriculum Learning and Online Knowledge Accumulation | May 20, 2025 | In-Context Learning, Inference Optimization | Unverified | 0 |
| Learning to Infer | Jan 1, 2018 | Inference Optimization | Unverified | 0 |
| Networked Signal and Information Processing | Oct 25, 2022 | Decision Making, Inference Optimization | Unverified | 0 |
| Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning | Sep 2, 2024 | Inference Optimization, Language Modeling | Unverified | 0 |
| Robust Zero-Shot Text-to-Speech Synthesis with Reverse Inference Optimization | Jul 2, 2024 | Inference Optimization, Speech Synthesis | Unverified | 0 |
| SBbadger: Biochemical Reaction Networks with Definable Degree Distributions | Feb 25, 2022 | Inference Optimization | Unverified | 0 |
| Scaling the Vocabulary of Non-autoregressive Models for Efficient Generative Retrieval | Jun 10, 2024 | Inference Optimization, Information Retrieval | Unverified | 0 |
| Self-Constrained Inference Optimization on Structural Groups for Human Pose Estimation | Jul 6, 2022 | Inference Optimization, Multi-Person Pose Estimation | Unverified | 0 |
| SNDCNN: Self-normalizing deep CNNs with scaled exponential linear units for speech recognition | Oct 4, 2019 | Inference Optimization, speech-recognition | Unverified | 0 |
| SySMOL: Co-designing Algorithms and Hardware for Neural Networks with Heterogeneous Precisions | Nov 23, 2023 | CPU, GPU | Unverified | 0 |
| The Foundation Cracks: A Comprehensive Study on Bugs and Testing Practices in LLM Libraries | Jun 14, 2025 | Bug fixing, Inference Optimization | Unverified | 0 |
| Bayesian Active Learning in the Presence of Nuisance Parameters | Oct 23, 2023 | Active Learning, Experimental Design | Unverified | 0 |
| Investigations on the inference optimization techniques and their impact on multiple hardware platforms for Semantic Segmentation | Nov 29, 2019 | Inference Optimization, Semantic Segmentation | Unverified | 0 |
| Input Convex Neural Networks | Sep 22, 2016 | Imputation, Inference Optimization | Code Available | 0 |
| A General Method for Amortizing Variational Filtering | Nov 13, 2018 | Inference Optimization, Variational Inference | Code Available | 0 |
| Iterative Amortized Inference | Jul 24, 2018 | Inference Optimization, Variational Inference | Code Available | 0 |
| Brevity is the soul of sustainability: Characterizing LLM response lengths | Jun 10, 2025 | Decoder, Inference Optimization | Code Available | 0 |
| LLM-Rank: A Graph Theoretical Approach to Pruning Large Language Models | Oct 17, 2024 | Inference Optimization, Network Pruning | Code Available | 0 |