SOTAVerified

Mixture-of-Experts

Papers

Showing 801–850 of 1312 papers

| Title | Status | Hype |
| --- | --- | --- |
| Wolf: Captioning Everything with a World Summarization Framework | | 0 |
| Yi-Lightning Technical Report | | 0 |
| PMoE: Progressive Mixture of Experts with Asymmetric Transformer for Continual Learning | | 0 |
| Zero-Resource Multilingual Model Transfer: Learning What to Share | | 0 |
| Multimodal Fusion and Coherence Modeling for Video Topic Segmentation | | 0 |
| HMDN: Hierarchical Multi-Distribution Network for Click-Through Rate Prediction | | 0 |
| Mixture-of-Noises Enhanced Forgery-Aware Predictor for Multi-Face Manipulation Detection and Localization | | 0 |
| Routing in Sparsely-gated Language Models responds to Context | | 0 |
| On DeepSeekMoE: Statistical Benefits of Shared Experts and Normalized Sigmoid Gating | | 0 |
| A Fast Kernel-based Conditional Independence test with Application to Causal Discovery | | 0 |
| MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems | | 0 |
| MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production | | 0 |
| A Survey of Generative Categories and Techniques in Multimodal Large Language Models | | 0 |
| 3D Gaussian Splatting Data Compression with Mixture of Priors | | 0 |
| 3D-MoE: A Mixture-of-Experts Multi-modal LLM for 3D Vision and Pose Diffusion via Rectified Flow | | 0 |
| Accelerating Mixture-of-Experts Training with Adaptive Expert Replication | | 0 |
| Accelerating MoE Model Inference with Expert Sharding | | 0 |
| Acquiring Diverse Skills using Curriculum Reinforcement Learning with Mixture of Experts | | 0 |
| Modular Action Concept Grounding in Semantic Video Prediction | | 0 |
| AdaEnsemble: Learning Adaptively Sparse Structured Ensemble Network for Click-Through Rate Prediction | | 0 |
| Ada-K Routing: Boosting the Efficiency of MoE-based LLMs | | 0 |
| AdaMV-MoE: Adaptive Multi-Task Vision Mixture-of-Experts | | 0 |
| Adapted-MoE: Mixture of Experts with Test-Time Adaption for Anomaly Detection | | 0 |
| Adaptive Conditional Expert Selection Network for Multi-domain Recommendation | | 0 |
| Adaptive Detection of Fast Moving Celestial Objects Using a Mixture of Experts and Physical-Inspired Neural Network | | 0 |
| Adaptive Gating in Mixture-of-Experts based Language Models | | 0 |
| Adaptive Mixture of Experts Learning for Generalizable Face Anti-Spoofing | | 0 |
| Adaptive Mixture of Low-Rank Experts for Robust Audio Spoofing Detection | | 0 |
| Adaptive Prompting for Continual Relation Extraction: A Within-Task Variance Perspective | | 0 |
| Adaptive Prompt: Unlocking the Power of Visual Prompt Tuning | | 0 |
| Adaptive Segmentation-Based Initialization for Steered Mixture of Experts Image Regression | | 0 |
| AdaSpeech 3: Adaptive Text to Speech for Spontaneous Style | | 0 |
| AdaTag: Multi-Attribute Value Extraction from Product Profiles with Adaptive Decoding | | 0 |
| Addressing Complex and Subjective Product-Related Queries with Customer Reviews | | 0 |
| ADMoE: Anomaly Detection with Mixture-of-Experts from Noisy Labels | | 0 |
| Advancing Enterprise Spatio-Temporal Forecasting Applications: Data Mining Meets Instruction Tuning of Language Models For Multi-modal Time Series Analysis in Low-Resource Settings | | 0 |
| Advancing Expert Specialization for Better MoE | | 0 |
| Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert Parallelism Design | | 0 |
| Advancing Robust Underwater Acoustic Target Recognition through Multi-task Learning and Multi-Gate Mixture-of-Experts | | 0 |
| A Dynamic Approach to Stock Price Prediction: Comparing RNN and Mixture of Experts Models Across Different Volatility Profiles | | 0 |
| Affect in Tweets Using Experts Model | | 0 |
| A Generalist Cross-Domain Molecular Learning Framework for Structure-Based Drug Discovery | | 0 |
| A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts | | 0 |
| Agent4Ranking: Semantic Robust Ranking via Personalized Query Rewriting Using Multi-agent LLM | | 0 |
| AIREX: Neural Network-based Approach for Air Quality Inference in Unmonitored Cities | | 0 |
| A Large-scale Medical Visual Task Adaptation Benchmark | | 0 |
| Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception | | 0 |
| Alternating Updates for Efficient Transformers | | 0 |
| AMEND: A Mixture of Experts Framework for Long-tailed Trajectory Prediction | | 0 |
| A Mixture of Expert Approach for Low-Cost Customization of Deep Neural Networks | | 0 |
Page 17 of 27

No leaderboard results yet.