SOTAVerified

Mixture-of-Experts

Papers

Showing 251–275 of 1312 papers

Title | Status | Hype
Multi-Task Reinforcement Learning with Mixture of Orthogonal Experts | Code | 1
DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets | Code | 1
SiDA-MoE: Sparsity-Inspired Data-Aware Serving for Efficient and Scalable Large Mixture-of-Experts Models | Code | 1
SteloCoder: a Decoder-Only LLM for Multi-Language to Python Code Translation | Code | 1
Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach | Code | 1
Merging Experts into One: Improving Computational Efficiency of Mixture of Experts | Code | 1
Sparse Universal Transformer | Code | 1
Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy | Code | 1
MoCaE: Mixture of Calibrated Experts Significantly Improves Object Detection | Code | 1
LLMCarbon: Modeling the end-to-end Carbon Footprint of Large Language Models | Code | 1
Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis | Code | 1
Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference | Code | 1
Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts | Code | 1
HyperFormer: Enhancing Entity and Relation Interaction for Hyper-Relational Knowledge Graph Completion | Code | 1
MLP Fusion: Towards Efficient Fine-tuning of Dense and Mixture-of-Experts Language Models | Code | 1
Deep learning techniques for blind image super-resolution: A high-scale multi-domain perspective evaluation | Code | 1
ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer | Code | 1
Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks | Code | 1
COMET: Learning Cardinality Constrained Mixture of Experts with Trees and Local Search | Code | 1
Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts | Code | 1
Emergent Modularity in Pre-trained Transformers | Code | 1
Lifting the Curse of Capacity Gap in Distilling Language Models | Code | 1
Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration | Code | 1
Long-Tailed Visual Recognition via Self-Heterogeneous Integration with Knowledge Excavation | Code | 1
Re-IQA: Unsupervised Learning for Image Quality Assessment in the Wild | Code | 1
Page 11 of 53
