SOTAVerified

Inference Optimization

Papers

Showing 1–50 of 56 papers

Title | Status | Hype
The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation | Code | 5
Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints | Code | 4
SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL | Code | 3
A Survey on Inference Optimization Techniques for Mixture of Experts Models | Code | 3
Inference Performance Optimization for Large Language Models on CPUs | Code | 3
CycleBNN: Cyclic Precision Training in Binary Neural Networks | Code | 2
Painterly Image Harmonization using Diffusion Model | Code | 1
A Novel 1D State Space for Efficient Music Rhythmic Analysis | Code | 1
Adaptive Deep Neural Network Inference Optimization with EENet | Code | 1
Easy and Efficient Transformer : Scalable Inference Solution For large NLP model | Code | 1
ADJUST: A Dictionary-Based Joint Reconstruction and Unmixing Method for Spectral Tomography | Code | 1
DVFS-Aware DNN Inference on GPUs: Latency Modeling and Performance Analysis | | 0
Easy and Efficient Transformer: Scalable Inference Solution For Large NLP Model | | 0
EdgeRL: Reinforcement Learning-driven Deep Learning Model Inference Optimization at Edge | | 0
Efficiency optimization of large-scale language models based on deep learning in natural language processing tasks | | 0
Energy-Efficient Transformer Inference: Optimization Strategies for Time Series Classification | | 0
Faster MoE LLM Inference for Extremely Large Models | | 0
Federated Learning While Providing Model as a Service: Joint Training and Inference Optimization | | 0
FluidML: Fast and Memory Efficient Inference Optimization | | 0
Hellinger-Kantorovich Gradient Flows: Global Exponential Decay of Entropy Functionals | | 0
Hybrid Offline-online Scheduling Method for Large Language Model Inference Optimization | | 0
Inference Optimization of Foundation Models on AI Accelerators | | 0
The Ultimate Guide to Fine-Tuning LLMs from Basics to Breakthroughs: An Exhaustive Review of Technologies, Research, Best Practices, Applied Research Challenges and Opportunities | | 0
An approach to optimize inference of the DIART speaker diarization pipeline | | 0
A bi-partite generative model framework for analyzing and simulating large scale multiple discrete-continuous travel behaviour data | | 0
Advances and Open Challenges in Federated Foundation Models | | 0
Integrating Pre-Trained Speech and Language Models for End-to-End Speech Recognition | | 0
Bifocal Neural ASR: Exploiting Keyword Spotting for Inference Optimization | | 0
Residual-Based Error Corrector Operator to Enhance Accuracy and Reliability of Neural Operator Surrogates of Nonlinear Variational Boundary-Value Problems | | 0
CRVI: Convex Relaxation for Variational Inference | | 0
Deep Signal Recovery with One-Bit Quantization | | 0
Developing efficient transfer learning strategies for robust scene recognition in mobile robotics using pre-trained convolutional neural networks | | 0
DSMentor: Enhancing Data Science Agents with Curriculum Learning and Online Knowledge Accumulation | | 0
Learning to Infer | | 0
Networked Signal and Information Processing | | 0
Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning | | 0
Robust Zero-Shot Text-to-Speech Synthesis with Reverse Inference Optimization | | 0
SBbadger: Biochemical Reaction Networks with Definable Degree Distributions | | 0
Scaling the Vocabulary of Non-autoregressive Models for Efficient Generative Retrieval | | 0
Self-Constrained Inference Optimization on Structural Groups for Human Pose Estimation | | 0
SNDCNN: Self-normalizing deep CNNs with scaled exponential linear units for speech recognition | | 0
SySMOL: Co-designing Algorithms and Hardware for Neural Networks with Heterogeneous Precisions | | 0
The Foundation Cracks: A Comprehensive Study on Bugs and Testing Practices in LLM Libraries | | 0
Bayesian Active Learning in the Presence of Nuisance Parameters | | 0
Investigations on the inference optimization techniques and their impact on multiple hardware platforms for Semantic Segmentation | | 0
Input Convex Neural Networks | Code | 0
A General Method for Amortizing Variational Filtering | Code | 0
Iterative Amortized Inference | Code | 0
Brevity is the soul of sustainability: Characterizing LLM response lengths | Code | 0
LLM-Rank: A Graph Theoretical Approach to Pruning Large Language Models | Code | 0

No leaderboard results yet.