
Mixture-of-Experts

Papers

Showing 351–400 of 1312 papers

Title | Status | Hype
MoTE: Mixture of Ternary Experts for Memory-efficient Large Multimodal Models | - | 0
NeuroMoE: A Transformer-Based Mixture-of-Experts Framework for Multi-Modal Neurological Disorder Classification | - | 0
Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs | - | 0
Load Balancing Mixture of Experts with Similarity Preserving Routers | - | 0
EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization | Code | 0
Serving Large Language Models on Huawei CloudMatrix384 | - | 0
Optimus-3: Towards Generalist Multimodal Minecraft Agents with Scalable Task Experts | - | 0
GigaChat Family: Efficient Russian Language Modeling Through Mixture of Experts Architecture | - | 0
MedMoE: Modality-Specialized Mixture of Experts for Medical Vision-Language Understanding | - | 0
A Two-Phase Deep Learning Framework for Adaptive Time-Stepping in High-Speed Flow Modeling | - | 0
MoE-MLoRA for Multi-Domain CTR Prediction: Efficient Adaptation with Expert Specialization | Code | 0
STAMImputer: Spatio-Temporal Attention MoE for Traffic Data Imputation | Code | 0
MIRA: Medical Time Series Foundation Model for Real-World Health Data | - | 0
M2Restore: Mixture-of-Experts-based Mamba-CNN Fusion Framework for All-in-One Image Restoration | - | 0
MoE-GPS: Guidlines for Prediction Strategy for Dynamic Expert Duplication in MoE Load Balancing | - | 0
Breaking Data Silos: Towards Open and Scalable Mobility Foundation Models via Generative Continual Learning | - | 0
SMAR: Soft Modality-Aware Routing Strategy for MoE-based Multimodal Large Language Models Preserving Language Capabilities | - | 0
Lifelong Evolution: Collaborative Learning between Large and Small Language Models for Continuous Emergent Fake News Detection | - | 0
Brain-Like Processing Pathways Form in Models With Heterogeneous Experts | - | 0
Enhancing Multimodal Continual Instruction Tuning with BranchLoRA | - | 0
Decoding Knowledge Attribution in Mixture-of-Experts: A Framework of Basic-Refinement Collaboration and Efficiency Analysis | - | 0
Mixture-of-Experts for Personalized and Semantic-Aware Next Location Prediction | - | 0
GradPower: Powering Gradients for Faster Language Model Pre-Training | - | 0
On the Expressive Power of Mixture-of-Experts for Structured Complex Tasks | - | 0
From Knowledge to Noise: CTIM-Rover and the Pitfalls of Episodic Memory in Software Engineering Agents | Code | 0
Two Is Better Than One: Rotations Scale LoRAs | - | 0
Revisiting Uncertainty Estimation and Calibration of Large Language Models | - | 0
Noise-Robustness Through Noise: Asymmetric LoRA Adaption with Poisoning Expert | - | 0
A Survey of Generative Categories and Techniques in Multimodal Large Language Models | - | 0
Point-MoE: Towards Cross-Domain Generalization in 3D Semantic Segmentation via Mixture-of-Experts | - | 0
A Human-Centric Approach to Explainable AI for Personalized Education | Code | 0
Advancing Expert Specialization for Better MoE | - | 0
EvoMoE: Expert Evolution in Mixture of Experts for Multimodal Large Language Models | - | 0
ForceVLA: Enhancing VLA Models with a Force-aware MoE for Contact-rich Manipulation | - | 0
MoE-Gyro: Self-Supervised Over-Range Reconstruction and Denoising for MEMS Gyroscopes | - | 0
Mosaic: Data-Free Knowledge Distillation via Mixture-of-Experts for Heterogeneous Distributed Environments | Code | 0
MoESD: Unveil Speculative Decoding's Potential for Accelerating Sparse MoE | - | 0
NEXT: Multi-Grained Mixture of Experts via Text-Modulation for Multi-Modal Object Re-ID | - | 0
Rethinking Gating Mechanism in Sparse MoE: Handling Arbitrary Modality Inputs with Confidence-Guided Gate | Code | 0
RankLLM: A Python Package for Reranking with LLMs | - | 0
Integrating Dynamical Systems Learning with Foundational Models: A Meta-Evolutionary AI Framework for Clinical Trials | - | 0
μ-MoE: Test-Time Pruning as Micro-Grained Mixture-of-Experts | - | 0
On Minimax Estimation of Parameters in Softmax-Contaminated Mixture of Experts | - | 0
Guiding the Experts: Semantic Priors for Efficient and Focused MoE Routing | Code | 0
Mod-Adapter: Tuning-Free and Versatile Multi-concept Personalization via Modulation Adapter | - | 0
TrajMoE: Spatially-Aware Mixture of Experts for Unified Human Mobility Modeling | - | 0
EvidenceMoE: A Physics-Guided Mixture-of-Experts with Evidential Critics for Advancing Fluorescence Light Detection and Ranging in Scattering Media | - | 0
DualComp: End-to-End Learning of a Unified Dual-Modality Lossless Compressor | - | 0
DriveMoE: Mixture-of-Experts for Vision-Language-Action Model in End-to-End Autonomous Driving | - | 0
Time Tracker: Mixture-of-Experts-Enhanced Foundation Time Series Forecasting Model with Decoupled Training Pipelines | - | 0
Page 8 of 27
