
Mixture-of-Experts

Papers

Showing 1151–1175 of 1312 papers

Title | Status | Hype
MoE-MLoRA for Multi-Domain CTR Prediction: Efficient Adaptation with Expert Specialization | Code | 0
Checkmating One, by Using Many: Combining Mixture of Experts with MCTS to Improve in Chess | Code | 0
MoE-LPR: Multilingual Extension of Large Language Models through Mixture-of-Experts with Language Priors Routing | Code | 0
Subjective and Objective Analysis of Indian Social Media Video Quality | Code | 0
Sub-MoE: Efficient Mixture-of-Expert LLMs Compression via Subspace Expert Merging | Code | 0
Nesti-Net: Normal Estimation for Unstructured 3D Point Clouds using Convolutional Neural Networks | Code | 0
Catching Attention with Automatic Pull Quote Selection | Code | 0
EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization | Code | 0
DynMoLE: Boosting Mixture of LoRA Experts Fine-Tuning with a Hybrid Routing Mechanism | Code | 0
Hierarchical Mixture of Experts: Generalizable Learning for High-Level Synthesis | Code | 0
A Mixture-of-Experts Model for Antonym-Synonym Discrimination | Code | 0
Hierarchical Deep Recurrent Architecture for Video Understanding | Code | 0
Swift Hydra: Self-Reinforcing Generative Framework for Anomaly Detection with Multiple Mamba Models | Code | 0
DutyTTE: Deciphering Uncertainty in Origin-Destination Travel Time Estimation | Code | 0
DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning | Code | 0
Domain-Agnostic Neural Architecture for Class Incremental Continual Learning in Document Processing Platform | Code | 0
MoE-I^2: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition | Code | 0
Non-Normal Mixtures of Experts | Code | 0
Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts | Code | 0
Modality-Independent Brain Lesion Segmentation with Privacy-aware Continual Learning | Code | 0
Not All Models Suit Expert Offloading: On Local Routing Consistency of Mixture-of-Expert Models | Code | 0
Not Eliminate but Aggregate: Post-Hoc Control over Mixture-of-Experts to Address Shortcut Shifts in Natural Language Understanding | Code | 0
MLP-KAN: Unifying Deep Representation and Function Learning | Code | 0
Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts | Code | 0
Mixture of Nested Experts: Adaptive Processing of Visual Tokens | Code | 0
Page 47 of 53
