SOTAVerified

Mixture-of-Experts

Papers

Showing 101-150 of 1312 papers

Title | Status | Hype
Demystifying the Compression of Mixture-of-Experts Through a Unified Framework | Code | 2
Learning Robust Stereo Matching in the Wild with Selective Mixture-of-Experts | Code | 2
Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks | Code | 2
Delta Decompression for MoE-based LLMs Compression | Code | 2
LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes | Code | 2
Text2Human: Text-Driven Controllable Human Image Generation | Code | 2
Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptation | Code | 2
Towards a Multimodal Large Language Model with Pixel-Level Insight for Biomedicine | Code | 2
I2MoE: Interpretable Multimodal Interaction-aware Mixture-of-Experts | Code | 2
No Language Left Behind: Scaling Human-Centered Machine Translation | Code | 2
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models | Code | 2
Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs | Code | 2
Monet: Mixture of Monosemantic Experts for Transformers | Code | 2
MoFE-Time: Mixture of Frequency Domain Experts for Time-Series Forecasting Models | Code | 2
Motion In-Betweening with Phase Manifolds | Code | 2
Higher Layers Need More LoRA Experts | Code | 2
Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models | Code | 2
MoEUT: Mixture-of-Experts Universal Transformers | Code | 2
Multi-Task Dense Prediction via Mixture of Low-Rank Experts | Code | 2
Object Detection using Event Camera: A MoE Heat Conduction based Detector and A New Benchmark Dataset | Code | 2
MoE-FFD: Mixture of Experts for Generalized and Parameter-Efficient Face Forgery Detection | Code | 2
ModuleFormer: Modularity Emerges from Mixture-of-Experts | Code | 2
MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks | Code | 2
Aurora:Activating Chinese chat capability for Mixtral-8x7B sparse Mixture-of-Experts through Instruction-Tuning | Code | 2
Mixture of Tokens: Continuous MoE through Cross-Example Aggregation | Code | 2
CNMBERT: A Model for Converting Hanyu Pinyin Abbreviations to Chinese Characters | Code | 2
CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling | Code | 2
MiniDrive: More Efficient Vision-Language Models with Multi-Level 2D Features as Text Tokens for Autonomous Driving | Code | 2
MC-MoE: Mixture Compressor for Mixture-of-Experts LLMs Gains More | Code | 2
MDFEND: Multi-domain Fake News Detection | Code | 2
SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models | Code | 2
Few-Shot and Continual Learning with Attentive Independent Mechanisms | Code | 1
FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing | Code | 1
M4: Multi-Proxy Multi-Gate Mixture of Experts Network for Multiple Instance Learning in Histopathology Image Analysis | Code | 1
M^4oE: A Foundation Model for Medical Multimodal Image Segmentation with Mixture of Experts | Code | 1
M3-Jepa: Multimodal Alignment via Multi-directional MoE based on the JEPA framework | Code | 1
M3oE: Multi-Domain Multi-Task Mixture-of Experts Recommendation Framework | Code | 1
Specialized federated learning using a mixture of experts | Code | 1
FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models | Code | 1
M^3ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design | Code | 1
Making Neural Networks Interpretable with Attribution: Application to Implicit Signals Prediction | Code | 1
Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference | Code | 1
LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze Dataset | Code | 1
LOLA -- An Open-Source Massively Multilingual Large Language Model | Code | 1
AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality | Code | 1
A Time Series is Worth Five Experts: Heterogeneous Mixture of Experts for Traffic Flow Prediction | Code | 1
LLMCarbon: Modeling the end-to-end Carbon Footprint of Large Language Models | Code | 1
Long-Tailed Visual Recognition via Self-Heterogeneous Integration with Knowledge Excavation | Code | 1
EWMoE: An effective model for global weather forecasting with mixture-of-experts | Code | 1
Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark | Code | 1
Page 3 of 27

No leaderboard results yet.