SOTAVerified

Mixture-of-Experts Papers

Showing 301–350 of 1312 papers

Title | Status | Hype
HydraSum: Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models | Code | 1
Sparse MoEs meet Efficient Ensembles | Code | 1
Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss | Code | 1
Few-Shot and Continual Learning with Attentive Independent Mechanisms | Code | 1
Go Wider Instead of Deeper | Code | 1
Heterogeneous Multi-task Learning with Expert Diversity | Code | 1
Scaling Vision with Sparse Mixture of Experts | Code | 1
RetGen: A Joint framework for Retrieval and Grounded Text Generation Modeling | Code | 1
SpeechMoE: Scaling to Large Acoustic Models with Dynamic Routing Mixture of Experts | Code | 1
MiCE: Mixture of Contrastive Experts for Unsupervised Image Clustering | Code | 1
Cross-Domain Label-Adaptive Stance Detection | Code | 1
VDSM: Unsupervised Video Disentanglement with State-Space Modeling and Deep Mixtures of Experts | Code | 1
Real-time Relevant Recommendation Suggestion | Code | 1
Multimodal Variational Autoencoders for Semi-Supervised Learning: In Defense of Product-of-Experts | Code | 1
PFL-MoE: Personalized Federated Learning Based on Mixture of Experts | Code | 1
Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks | Code | 1
Specialized federated learning using a mixture of experts | Code | 1
Transformer Based Multi-Source Domain Adaptation | Code | 1
Making Neural Networks Interpretable with Attribution: Application to Implicit Signals Prediction | Code | 1
Task-Agnostic Online Reinforcement Learning with an Infinite Mixture of Gaussian Processes | Code | 1
Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts | Code | 1
Variational Mixture-of-Experts Autoencoders for Multi-Modal Deep Generative Models | Code | 1
MoËT: Mixture of Expert Trees and its Application to Verifiable Reinforcement Learning | Code | 1
Gated Multimodal Units for Information Fusion | Code | 1
Distilling the Knowledge in a Neural Network | Code | 1
GEMINUS: Dual-aware Global and Scene-Adaptive Mixture-of-Experts for End-to-End Autonomous Driving | Code | 0
R^2MoE: Redundancy-Removal Mixture of Experts for Lifelong Concept Learning | Code | 0
Mixture of Experts in Large Language Models | – | 0
Inter2Former: Dynamic Hybrid Attention for Efficient High-Precision Interactive | – | 0
KAT-V1: Kwai-AutoThink Technical Report | – | 0
A Survey on Prompt Tuning | Code | 0
Speech Quality Assessment Model Based on Mixture of Experts: System-Level Performance Enhancement and Utterance-Level Challenge Analysis | – | 0
What You Have is What You Track: Adaptive and Robust Multimodal Tracking | Code | 0
Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate | Code | 0
Efficient Training of Large-Scale AI Models Through Federated Mixture-of-Experts: A System-Level Approach | – | 0
UGG-ReID: Uncertainty-Guided Graph Model for Multi-Modal Object Re-Identification | – | 0
Sub-MoE: Efficient Mixture-of-Expert LLMs Compression via Subspace Expert Merging | Code | 0
EVA: Mixture-of-Experts Semantic Variant Alignment for Compositional Zero-Shot Learning | – | 0
Little By Little: Continual Learning via Self-Activated Sparse Mixture-of-Rank Adaptive Learning | – | 0
Latent Prototype Routing: Achieving Near-Perfect Load Balancing in Mixture-of-Experts | Code | 0
Opportunistic Osteoporosis Diagnosis via Texture-Preserving Self-Supervision, Mixture of Experts and Multi-Task Integration | – | 0
Security Assessment of DeepSeek and GPT Series Models against Jailbreak Attacks | – | 0
An Audio-centric Multi-task Learning Framework for Streaming Ads Targeting on Spotify | – | 0
SAFEx: Analyzing Vulnerabilities of MoE-Based LLMs via Stable Safety-critical Expert Identification | – | 0
Utility-Driven Speculative Decoding for Mixture-of-Experts | – | 0
Scaling Intelligence: Designing Data Centers for Next-Gen Language Models | – | 0
Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs | – | 0
Exploring Speaker Diarization with Mixture of Experts | – | 0
MoTE: Mixture of Ternary Experts for Memory-efficient Large Multimodal Models | – | 0
Single-Example Learning in a Mixture of GPDMs with Latent Geometries | – | 0
Page 7 of 27
