SOTAVerified

Mixture-of-Experts

Papers

Showing 1151–1175 of 1312 papers

| Title | Status | Hype |
| --- | --- | --- |
| On Multi-objective Policy Optimization as a Tool for Reinforcement Learning: Case Studies in Offline RL and Finetuning | | 0 |
| Automatic Document Sketching: Generating Drafts from Analogous Texts | | 0 |
| Scaling Vision with Sparse Mixture of Experts | Code | 1 |
| DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning | Code | 0 |
| AdaTag: Multi-Attribute Value Extraction from Product Profiles with Adaptive Decoding | | 0 |
| GEMNET: Effective Gated Gazetteer Representations for Recognizing Complex Entities in Low-context Input | | 0 |
| M6-T: Exploring Sparse Expert Models and Beyond | | 0 |
| Data Expansion using Back Translation and Paraphrasing for Hate Speech Detection | | 0 |
| Mixture of ELM based experts with trainable gating network | | 0 |
| Generalizable Person Re-identification with Relevance-aware Mixture of Experts | | 0 |
| RetGen: A Joint framework for Retrieval and Grounded Text Generation Modeling | Code | 1 |
| MTNet: A Multi-Task Neural Network for On-Field Calibration of Low-Cost Air Monitoring Sensors | | 0 |
| KDExplainer: A Task-oriented Attention Model for Explaining Knowledge Distillation | | 0 |
| SpeechMoE: Scaling to Large Acoustic Models with Dynamic Routing Mixture of Experts | Code | 1 |
| MiCE: Mixture of Contrastive Experts for Unsupervised Image Clustering | Code | 1 |
| Robust Federated Learning by Mixture of Experts | Code | 0 |
| Probabilistic Rainfall Estimation from Automotive Lidar | Code | 0 |
| Probabilistic Mixture-of-Experts for Efficient Deep Reinforcement Learning | Code | 0 |
| Non-asymptotic model selection in block-diagonal mixture of polynomial experts models | | 0 |
| Cross-Domain Label-Adaptive Stance Detection | Code | 1 |
| A non-asymptotic approach for model selection via penalization in high-dimensional mixture of experts models | Code | 0 |
| Multi-GAT: A Graphical Attention-based Hierarchical Multimodal Representation Learning Approach for Human Activity Recognition | | 0 |
| Cross-Topic Rumor Detection using Topic-Mixtures | | 0 |
| Imitation Learning from MPC for Quadrupedal Multi-Gait Control | | 0 |
| VDSM: Unsupervised Video Disentanglement with State-Space Modeling and Deep Mixtures of Experts | Code | 1 |
Page 47 of 53

No leaderboard results yet.