SOTAVerified

Mixture-of-Experts

Papers

Showing 1151–1200 of 1312 papers

Title | Status | Hype
MoE-MLoRA for Multi-Domain CTR Prediction: Efficient Adaptation with Expert Specialization | Code | 0
Checkmating One, by Using Many: Combining Mixture of Experts with MCTS to Improve in Chess | Code | 0
MoE-LPR: Multilingual Extension of Large Language Models through Mixture-of-Experts with Language Priors Routing | Code | 0
Subjective and Objective Analysis of Indian Social Media Video Quality | Code | 0
Sub-MoE: Efficient Mixture-of-Expert LLMs Compression via Subspace Expert Merging | Code | 0
Nesti-Net: Normal Estimation for Unstructured 3D Point Clouds using Convolutional Neural Networks | Code | 0
Catching Attention with Automatic Pull Quote Selection | Code | 0
EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization | Code | 0
DynMoLE: Boosting Mixture of LoRA Experts Fine-Tuning with a Hybrid Routing Mechanism | Code | 0
Hierarchical Mixture of Experts: Generalizable Learning for High-Level Synthesis | Code | 0
A Mixture-of-Experts Model for Antonym-Synonym Discrimination | Code | 0
Hierarchical Deep Recurrent Architecture for Video Understanding | Code | 0
Swift Hydra: Self-Reinforcing Generative Framework for Anomaly Detection with Multiple Mamba Models | Code | 0
DutyTTE: Deciphering Uncertainty in Origin-Destination Travel Time Estimation | Code | 0
DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning | Code | 0
Domain-Agnostic Neural Architecture for Class Incremental Continual Learning in Document Processing Platform | Code | 0
MoE-I^2: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition | Code | 0
Non-Normal Mixtures of Experts | Code | 0
Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts | Code | 0
Modality-Independent Brain Lesion Segmentation with Privacy-aware Continual Learning | Code | 0
Not All Models Suit Expert Offloading: On Local Routing Consistency of Mixture-of-Expert Models | Code | 0
Not Eliminate but Aggregate: Post-Hoc Control over Mixture-of-Experts to Address Shortcut Shifts in Natural Language Understanding | Code | 0
MLP-KAN: Unifying Deep Representation and Function Learning | Code | 0
Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts | Code | 0
Mixture of Nested Experts: Adaptive Processing of Visual Tokens | Code | 0
Octavius: Mitigating Task Interference in MLLMs via LoRA-MoE | Code | 0
Distribution-aware Fairness Learning in Medical Image Segmentation From A Control-Theoretic Perspective | Code | 0
Mixture of Modular Experts: Distilling Knowledge from a Multilingual Teacher into Specialized Modular Language Models | Code | 0
Discontinuity-Sensitive Optimal Control Learning by Mixture of Experts | Code | 0
H^3Fusion: Helpful, Harmless, Honest Fusion of Aligned LLMs | Code | 0
A Survey on Prompt Tuning | Code | 0
On-Device Collaborative Language Modeling via a Mixture of Generalists and Specialists | Code | 0
GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory | Code | 0
AskChart: Universal Chart Understanding through Textual Enhancement | Code | 0
GuiLoMo: Allocating Expert Number and Rank for LoRA-MoE via Bilevel Optimization with GuidedSelection Vectors | Code | 0
Guiding the Experts: Semantic Priors for Efficient and Focused MoE Routing | Code | 0
CartesianMoE: Boosting Knowledge Sharing among Experts via Cartesian Product Routing in Mixture-of-Experts | Code | 0
Online Action Recognition for Human Risk Prediction with Anticipated Haptic Alert via Wearables | Code | 0
Table-based Fact Verification with Self-adaptive Mixture of Experts | Code | 0
VE: Modeling Multivariate Time Series Correlation with Variate Embedding | Code | 0
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding | Code | 0
Deep Mixture of Experts via Shallow Embedding | Code | 0
Build a Robust QA System with Transformer-based Mixture of Experts | Code | 0
TAMER: A Test-Time Adaptive MoE-Driven Framework for EHR Representation Learning | Code | 0
DESIRE-ME: Domain-Enhanced Supervised Information REtrieval using Mixture-of-Experts | Code | 0
DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale | Code | 0
SEKE: Specialised Experts for Keyword Extraction | Code | 0
Mixture of Link Predictors on Graphs | Code | 0
Mixture-of-Experts Variational Autoencoder for Clustering and Generating from Similarity-Based Representations on Single Cell Data | Code | 0
Opponent Modeling in Deep Reinforcement Learning | Code | 0
Page 24 of 27

No leaderboard results yet.