SOTAVerified

Mixture-of-Experts

Papers

Showing 51–100 of 1312 papers

| Title | Status | Hype |
| --- | --- | --- |
| Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models | Code | 3 |
| BlackMamba: Mixture of Experts for State-Space Models | Code | 3 |
| Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters | Code | 3 |
| LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation | Code | 3 |
| Learning Heterogeneous Mixture of Scene Experts for Large-scale Neural Radiance Fields | Code | 3 |
| YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation | Code | 3 |
| A Survey on Mixture of Experts | Code | 3 |
| SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling | Code | 3 |
| Reservoir History Matching of the Norne field with generative exotic priors and a coupled Mixture of Experts -- Physics Informed Neural Operator Forward Model | Code | 3 |
| A Survey on Inference Optimization Techniques for Mixture of Experts Models | Code | 3 |
| MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts | Code | 3 |
| Demystifying the Compression of Mixture-of-Experts Through a Unified Framework | Code | 2 |
| DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification | Code | 2 |
| QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models | Code | 2 |
| Delta Decompression for MoE-based LLMs Compression | Code | 2 |
| Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning | Code | 2 |
| Harder Tasks Need More Experts: Dynamic Routing in MoE Models | Code | 2 |
| Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer | Code | 2 |
| Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models | Code | 2 |
| Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks | Code | 2 |
| Object Detection using Event Camera: A MoE Heat Conduction based Detector and A New Benchmark Dataset | Code | 2 |
| No Language Left Behind: Scaling Human-Centered Machine Translation | Code | 2 |
| Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models | Code | 2 |
| Motion In-Betweening with Phase Manifolds | Code | 2 |
| Multi-Task Dense Prediction via Mixture of Low-Rank Experts | Code | 2 |
| CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts | Code | 2 |
| Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptation | Code | 2 |
| Monet: Mixture of Monosemantic Experts for Transformers | Code | 2 |
| ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing | Code | 2 |
| Flex-MoE: Modeling Arbitrary Modality Combination via the Flexible Mixture-of-Experts | Code | 2 |
| MoE-FFD: Mixture of Experts for Generalized and Parameter-Efficient Face Forgery Detection | Code | 2 |
| CNMBERT: A Model for Converting Hanyu Pinyin Abbreviations to Chinese Characters | Code | 2 |
| ModuleFormer: Modularity Emerges from Mixture-of-Experts | Code | 2 |
| Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models | Code | 2 |
| CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling | Code | 2 |
| Mixture of A Million Experts | Code | 2 |
| Mixture of Lookup Experts | Code | 2 |
| MDFEND: Multi-domain Fake News Detection | Code | 2 |
| Fast Feedforward Networks | Code | 2 |
| MC-MoE: Mixture Compressor for Mixture-of-Experts LLMs Gains More | Code | 2 |
| MiniDrive: More Efficient Vision-Language Models with Multi-Level 2D Features as Text Tokens for Autonomous Driving | Code | 2 |
| Mixture of Tokens: Continuous MoE through Cross-Example Aggregation | Code | 2 |
| MoEUT: Mixture-of-Experts Universal Transformers | Code | 2 |
| MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks | Code | 2 |
| LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training | Code | 2 |
| Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts | Code | 2 |
| Learning Robust Stereo Matching in the Wild with Selective Mixture-of-Experts | Code | 2 |
| LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes | Code | 2 |
| Learning A Sparse Transformer Network for Effective Image Deraining | Code | 2 |
| A Closer Look into Mixture-of-Experts in Large Language Models | Code | 2 |
Page 2 of 27
