
Mixture-of-Experts

Papers

Showing 201-225 of 1312 papers

Title | Status | Hype
--- | --- | ---
Norface: Improving Facial Expression Analysis by Identity Normalization | Code | 1
Swin SMT: Global Sequential Modeling in 3D Medical Image Segmentation | Code | 1
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs | Code | 1
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model | Code | 1
AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models | Code | 1
Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts | Code | 1
MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts | Code | 1
Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion | Code | 1
DeepUnifiedMom: Unified Time-series Momentum Portfolio Construction via Multi-Task Learning with Multi-Gate Mixture of Experts | Code | 1
Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark | Code | 1
MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter | Code | 1
Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node | Code | 1
Graph Sparsification via Mixture of Graphs | Code | 1
Mixture of Experts Meets Prompt-Based Continual Learning | Code | 1
Unchosen Experts Can Contribute Too: Unleashing MoE Models' Power by Self-Contrast | Code | 1
DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis | Code | 1
MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models | Code | 1
M^4oE: A Foundation Model for Medical Multimodal Image Segmentation with Mixture of Experts | Code | 1
EWMoE: An effective model for global weather forecasting with mixture-of-experts | Code | 1
Revisiting RGBT Tracking Benchmarks from the Perspective of Modality Validity: A New Benchmark, Problem, and Method | Code | 1
M3oE: Multi-Domain Multi-Task Mixture-of-Experts Recommendation Framework | Code | 1
Swin2-MoSE: A New Single Image Super-Resolution Model for Remote Sensing | Code | 1
Large Multi-modality Model Assisted AI-Generated Image Quality Assessment | Code | 1
Multi-Head Mixture-of-Experts | Code | 1
XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts | Code | 1
Page 9 of 53