SOTAVerified

Mixture-of-Experts

Papers

Showing 301-325 of 1,312 papers

Title | Status | Hype
Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts | Code | 1
Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark | Code | 1
Multi-Task Reinforcement Learning with Mixture of Orthogonal Experts | Code | 1
Navigating Spatio-Temporal Heterogeneity: A Graph Transformer Approach for Traffic Forecasting | Code | 1
Emergent Modularity in Pre-trained Transformers | Code | 1
AutoMoE: Heterogeneous Mixture-of-Experts with Adaptive Computation for Efficient Neural Machine Translation | Code | 1
Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding | Code | 1
XMoE: Sparse Models with Fine-grained and Adaptive Expert Selection | Code | 1
Multi-Head Mixture-of-Experts | Code | 1
Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization | Code | 1
Go Wider Instead of Deeper | Code | 1
Efficient Dictionary Learning with Switch Sparse Autoencoders | Code | 1
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs | Code | 1
Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters | Code | 1
Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach | Code | 1
Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss | Code | 1
Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis | Code | 1
JanusDNA: A Powerful Bi-directional Hybrid DNA Foundation Model | Code | 1
EvoMoE: An Evolutional Mixture-of-Experts Training Framework via Dense-To-Sparse Gate | Code | 1
Dense Backpropagation Improves Training for Sparse Mixture-of-Experts | Code | 1
Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node | Code | 1
Multimodal Clinical Trial Outcome Prediction with Large Language Models | Code | 1
Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks | Code | 1
Layerwise Recurrent Router for Mixture-of-Experts | Code | 1
Sequence-level Semantic Representation Fusion for Recommender Systems | Code | 1
Page 13 of 53

No leaderboard results yet.