SOTAVerified

Mixture-of-Experts

Papers

Showing 451-500 of 1312 papers

| Title | Status | Hype |
| --- | --- | --- |
| MoxE: Mixture of xLSTM Experts with Entropy-Aware Routing for Efficient Language Modeling | | 0 |
| CICADA: Cross-Domain Interpretable Coding for Anomaly Detection and Adaptation in Multivariate Time Series | | 0 |
| MicarVLMoE: A Modern Gated Cross-Aligned Vision-Language Mixture of Experts Model for Medical Image Captioning and Report Generation | Code | 0 |
| Accelerating Mixture-of-Experts Training with Adaptive Expert Replication | | 0 |
| PICO: Secure Transformers via Robust Prompt Isolation and Cybersecurity Oversight | | 0 |
| NoEsis: Differentially Private Knowledge Transfer in Modular LLM Adaptation | | 0 |
| Unveiling the Hidden: Movie Genre and User Bias in Spoiler Detection | Code | 0 |
| BadMoE: Backdooring Mixture-of-Experts LLMs via Optimizing Routing Triggers and Infecting Dormant Experts | | 0 |
| MoE Parallel Folding: Heterogeneous Parallelism Mappings for Efficient Large-Scale MoE Model Training with Megatron Core | | 0 |
| Multi-Type Context-Aware Conversational Recommender Systems via Mixture-of-Experts | | 0 |
| HAECcity: Open-Vocabulary Scene Understanding of City-Scale Point Clouds with Superpoint Graph Clustering | | 0 |
| D^2MoE: Dual Routing and Dynamic Scheduling for Efficient On-Device MoE-based LLM Serving | | 0 |
| Trend Filtered Mixture of Experts for Automated Gating of High-Frequency Flow Cytometry Data | | 0 |
| Unveiling Hidden Collaboration within Mixture-of-Experts in Large Language Models | | 0 |
| Plasticity-Aware Mixture of Experts for Learning Under QoE Shifts in Adaptive Video Streaming | | 0 |
| Mixture-of-Shape-Experts (MoSE): End-to-End Shape Dictionary Framework to Prompt SAM for Generalizable Medical Segmentation | | 0 |
| MoE-Lens: Towards the Hardware Limit of High-Throughput MoE LLM Serving Under Resource Constraints | | 0 |
| RouterKT: Mixture-of-Experts for Knowledge Tracing | Code | 0 |
| Regularized infill criteria for multi-objective Bayesian optimization with application to aircraft design | | 0 |
| Adaptive Detection of Fast Moving Celestial Objects Using a Mixture of Experts and Physical-Inspired Neural Network | | 0 |
| Scaling Laws for Native Multimodal Models | | 0 |
| Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning | | 0 |
| Cluster-Driven Expert Pruning for Mixture-of-Experts Large Language Models | Code | 0 |
| Holistic Capability Preservation: Towards Compact Yet Comprehensive Reasoning Models | | 0 |
| FedMerge: Federated Personalization via Model Merging | | 0 |
| Finding Fantastic Experts in MoEs: A Unified Study for Expert Dropping Strategies and Observations | | 0 |
| HeterMoE: Efficient Training of Mixture-of-Experts Models on Heterogeneous GPUs | | 0 |
| RingMoE: Mixture-of-Modality-Experts Multi-Modal Foundation Models for Universal Remote Sensing Image Interpretation | | 0 |
| MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism | | 0 |
| Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert Parallelism Design | | 0 |
| A Unified Virtual Mixture-of-Experts Framework: Enhanced Inference and Hallucination Mitigation in Single-Model System | | 0 |
| DynMoLE: Boosting Mixture of LoRA Experts Fine-Tuning with a Hybrid Routing Mechanism | Code | 0 |
| Detecting Financial Fraud with Hybrid Deep Learning: A Mix-of-Experts Approach to Sequential and Anomalous Patterns | | 0 |
| Unimodal-driven Distillation in Multimodal Emotion Recognition with Dynamic Fusion | | 0 |
| Mixture of Routers | | 0 |
| S2MoE: Robust Sparse Mixture of Experts via Stochastic Learning | | 0 |
| Sparse Mixture of Experts as Unified Competitive Learning | | 0 |
| Beyond Standard MoE: Mixture of Latent Experts for Resource-Efficient Language Models | | 0 |
| Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities | | 0 |
| RocketPPA: Code-Level Power, Performance, and Area Prediction via LLM and Mixture of Experts | | 0 |
| LLaVA-CMoE: Towards Continual Mixture of Experts for Large Vision-Language Models | | 0 |
| iMedImage Technical Report | | 0 |
| Reasoning Beyond Limits: Advances and Open Problems for LLMs | | 0 |
| MoLe-VLA: Dynamic Layer-skipping Vision Language Action Model via Mixture-of-Layers for Efficient Robot Manipulation | | 0 |
| Modality-Independent Brain Lesion Segmentation with Privacy-aware Continual Learning | Code | 0 |
| Optimal Scaling Laws for Efficiency Gains in a Theoretical Transformer-Augmented Sectional MoE Framework | | 0 |
| Enhancing Multi-modal Models with Heterogeneous MoE Adapters for Fine-tuning | | 0 |
| A multi-scale lithium-ion battery capacity prediction using mixture of experts and patch-based MLP | Code | 0 |
| M^2CD: A Unified MultiModal Framework for Optical-SAR Change Detection with Mixture of Experts and Self-Distillation | | 0 |
| Resilient Sensor Fusion under Adverse Sensor Failures via Multi-Modal Expert Fusion | | 0 |

No leaderboard results yet.