SOTAVerified

Mixture-of-Experts

Papers

Showing 51–100 of 1,312 papers

Title | Status | Hype
MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts | Code | 3
MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | Code | 3
Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters | Code | 3
MoAI: Mixture of All Intelligence for Large Language and Vision Models | Code | 3
Scaling Laws for Fine-Grained Mixture of Experts | Code | 3
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models | Code | 3
BlackMamba: Mixture of Experts for State-Space Models | Code | 3
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts | Code | 3
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling | Code | 3
MegaBlocks: Efficient Sparse Training with Mixture-of-Experts | Code | 3
ST-MoE: Designing Stable and Transferable Sparse Expert Models | Code | 3
MoFE-Time: Mixture of Frequency Domain Experts for Time-Series Forecasting Models | Code | 2
Learning Robust Stereo Matching in the Wild with Selective Mixture-of-Experts | Code | 2
WINA: Weight Informed Neuron Activation for Accelerating Large Language Model Inference | Code | 2
I2MoE: Interpretable Multimodal Interaction-aware Mixture-of-Experts | Code | 2
HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference | Code | 2
Mixture of Lookup Experts | Code | 2
Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts | Code | 2
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment | Code | 2
Delta Decompression for MoE-based LLMs Compression | Code | 2
LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes | Code | 2
Superposition in Transformers: A Novel Way of Building Mixture of Experts | Code | 2
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing | Code | 2
DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification | Code | 2
Towards a Multimodal Large Language Model with Pixel-Level Insight for Biomedicine | Code | 2
Object Detection using Event Camera: A MoE Heat Conduction based Detector and A New Benchmark Dataset | Code | 2
Monet: Mixture of Monosemantic Experts for Transformers | Code | 2
LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training | Code | 2
CNMBERT: A Model for Converting Hanyu Pinyin Abbreviations to Chinese Characters | Code | 2
SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models | Code | 2
LoRA-IR: Taming Low-Rank Experts for Efficient All-in-One Image Restoration | Code | 2
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts | Code | 2
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free | Code | 2
Flex-MoE: Modeling Arbitrary Modality Combination via the Flexible Mixture-of-Experts | Code | 2
MC-MoE: Mixture Compressor for Mixture-of-Experts LLMs Gains More | Code | 2
Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models | Code | 2
CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling | Code | 2
MiniDrive: More Efficient Vision-Language Models with Multi-Level 2D Features as Text Tokens for Autonomous Driving | Code | 2
KAN4TSF: Are KAN and KAN-based models Effective for Time Series Forecasting? | Code | 2
Mixture of A Million Experts | Code | 2
A Closer Look into Mixture-of-Experts in Large Language Models | Code | 2
MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks | Code | 2
Demystifying the Compression of Mixture-of-Experts Through a Unified Framework | Code | 2
Yuan 2.0-M32: Mixture of Experts with Attention Router | Code | 2
XTrack: Multimodal Training Boosts RGB-X Video Object Trackers | Code | 2
Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptation | Code | 2
MoEUT: Mixture-of-Experts Universal Transformers | Code | 2
Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models | Code | 2
xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token | Code | 2
CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts | Code | 2
Page 2 of 27