SOTAVerified

Mixture-of-Experts

Papers

Showing 126–150 of 1312 papers

Title | Status | Hype
MDFEND: Multi-domain Fake News Detection | Code | 2
MC-MoE: Mixture Compressor for Mixture-of-Experts LLMs Gains More | Code | 2
MiniDrive: More Efficient Vision-Language Models with Multi-Level 2D Features as Text Tokens for Autonomous Driving | Code | 2
Mixture of Lookup Experts | Code | 2
Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models | Code | 2
Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models | Code | 2
M4: Multi-Proxy Multi-Gate Mixture of Experts Network for Multiple Instance Learning in Histopathology Image Analysis | Code | 1
M^4oE: A Foundation Model for Medical Multimodal Image Segmentation with Mixture of Experts | Code | 1
Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis | Code | 1
Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference | Code | 1
M3-Jepa: Multimodal Alignment via Multi-directional MoE based on the JEPA framework | Code | 1
M3oE: Multi-Domain Multi-Task Mixture-of Experts Recommendation Framework | Code | 1
Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark | Code | 1
Specialized federated learning using a mixture of experts | Code | 1
M^3ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design | Code | 1
Making Neural Networks Interpretable with Attribution: Application to Implicit Signals Prediction | Code | 1
Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts | Code | 1
LLMCarbon: Modeling the end-to-end Carbon Footprint of Large Language Models | Code | 1
LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze Dataset | Code | 1
AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality | Code | 1
A Time Series is Worth Five Experts: Heterogeneous Mixture of Experts for Traffic Flow Prediction | Code | 1
Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node | Code | 1
LOLA -- An Open-Source Massively Multilingual Large Language Model | Code | 1
EWMoE: An effective model for global weather forecasting with mixture-of-experts | Code | 1
Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding | Code | 1
Page 6 of 53

No leaderboard results yet.