SOTAVerified

Mixture-of-Experts

Papers

Showing 501-550 of 1312 papers

| Title | Status | Hype |
| --- | --- | --- |
| BiPrompt-SAM: Enhancing Image Segmentation via Explicit Selection between Point and Text Prompts | | 0 |
| Galaxy Walker: Geometry-aware VLMs For Galaxy-scale Understanding | | 0 |
| ExpertRAG: Efficient RAG with Mixture of Experts -- Optimizing Context Retrieval for Adaptive LLM Responses | | 0 |
| Every Sample Matters: Leveraging Mixture-of-Experts and High-Quality Data for Efficient and Accurate Code LLM | | 0 |
| UniCoRN: Latent Diffusion-based Unified Controllable Image Restoration Network across Multiple Degradations | | 0 |
| Expert Race: A Flexible Routing Strategy for Scaling Diffusion Transformer with Mixture of Experts | | 0 |
| SemEval-2025 Task 1: AdMIRe -- Advancing Multimodal Idiomaticity Representation | | 0 |
| Leveraging MoE-based Large Language Model for Zero-Shot Multi-Task Semantic Communication | | 0 |
| Core-Periphery Principle Guided State Space Model for Functional Connectome Classification | | 0 |
| MAST-Pro: Dynamic Mixture-of-Experts for Adaptive Segmentation of Pan-Tumors with Knowledge-Driven Prompts | | 0 |
| Fast filtering of non-Gaussian models using Amortized Optimal Transport Maps | Code | 0 |
| Adaptive Mixture of Low-Rank Experts for Robust Audio Spoofing Detection | | 0 |
| A Review of DeepSeek Models' Key Innovative Techniques | | 0 |
| MoLEx: Mixture of Layer Experts for Finetuning with Sparse Upcycling | Code | 0 |
| Ensemble Learning for Large Language Models in Text and Code Generation: A Survey | | 0 |
| dFLMoE: Decentralized Federated Learning via Mixture of Experts for Medical Data Analysis | | 0 |
| Astrea: A MOE-based Visual Understanding Model with Progressive Alignment | | 0 |
| FaVChat: Unlocking Fine-Grained Facial Video Understanding with Multimodal Large Language Models | | 0 |
| Towards Robust Multimodal Representation: A Unified Approach with Adaptive Experts and Alignment | Code | 0 |
| Priority-Aware Preemptive Scheduling for Mixed-Priority Workloads in MoE Inference | | 0 |
| Double-Stage Feature-Level Clustering-Based Mixture of Experts Framework | | 0 |
| Automatic Operator-level Parallelism Planning for Distributed Deep Learning -- A Mixed-Integer Programming Approach | | 0 |
| UniF^2ace: Fine-grained Face Understanding and Generation with Unified Multimodal Models | | 0 |
| MoE-Loco: Mixture of Experts for Multitask Locomotion | | 0 |
| MoRE: Unlocking Scalability in Reinforcement Learning for Quadruped Vision-Language-Action Models | | 0 |
| Accelerating MoE Model Inference with Expert Sharding | | 0 |
| GM-MoE: Low-Light Enhancement with Gated-Mechanism Mixture-of-Experts | | 0 |
| eMoE: Task-aware Memory Efficient Mixture-of-Experts-Based (MoE) Model Inference | | 0 |
| ResMoE: Space-efficient Compression of Mixture of Experts LLMs via Residual Restoration | Code | 0 |
| MoFE: Mixture of Frozen Experts Architecture | | 0 |
| Swift Hydra: Self-Reinforcing Generative Framework for Anomaly Detection with Multiple Mamba Models | Code | 0 |
| MANDARIN: Mixture-of-Experts Framework for Dynamic Delirium and Coma Prediction in ICU Patients: Development and Validation of an Acute Brain Dysfunction Prediction Model | | 0 |
| A Novel Trustworthy Video Summarization Algorithm Through a Mixture of LoRA Experts | | 0 |
| MoEMoE: Question Guided Dense and Scalable Sparse Mixture-of-Expert for Multi-source Multi-modal Answering | | 0 |
| FMT: A Multimodal Pneumonia Detection Model Based on Stacking MOE Framework | | 0 |
| Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs | | 0 |
| Capacity-Aware Inference: Mitigating the Straggler Effect in Mixture of Experts | | 0 |
| Symbolic Mixture-of-Experts: Adaptive Skill-based Routing for Heterogeneous Reasoning | | 0 |
| TS-RAG: Retrieval-Augmented Generation based Time Series Foundation Models are Stronger Zero-Shot Forecaster | | 0 |
| Continual Pre-training of MoEs: How robust is your router? | | 0 |
| A Generalist Cross-Domain Molecular Learning Framework for Structure-Based Drug Discovery | | 0 |
| Predictable Scale: Part I -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining | | 0 |
| Speculative MoE: Communication Efficient Parallel MoE Inference with Speculative Token and Expert Pre-scheduling | | 0 |
| Convergence Rates for Softmax Gating Mixture of Experts | | 0 |
| BrainNet-MoE: Brain-Inspired Mixture-of-Experts Learning for Neurological Disease Identification | | 0 |
| VoiceGRPO: Modern MoE Transformers with Group Relative Policy Optimization GRPO for AI Voice Health Care Applications on Voice Pathology Detection | Code | 0 |
| Tabby: Tabular Data Synthesis with Language Models | | 0 |
| Union of Experts: Adapting Hierarchical Routing to Equivalently Decomposed Transformer | Code | 0 |
| How Do Consumers Really Choose: Exposing Hidden Preferences with the Mixture of Experts Model | | 0 |
| PROPER: A Progressive Learning Framework for Personalized Large Language Models with Group-Level Adaptation | | 0 |
Page 11 of 27

No leaderboard results yet.