SOTAVerified

Mixture-of-Experts

Papers

Showing 251–300 of 1,312 papers

Title | Status | Hype
Multi-Task Reinforcement Learning with Mixture of Orthogonal Experts | Code | 1
DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets | Code | 1
SiDA-MoE: Sparsity-Inspired Data-Aware Serving for Efficient and Scalable Large Mixture-of-Experts Models | Code | 1
SteloCoder: a Decoder-Only LLM for Multi-Language to Python Code Translation | Code | 1
Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach | Code | 1
Merging Experts into One: Improving Computational Efficiency of Mixture of Experts | Code | 1
Sparse Universal Transformer | Code | 1
Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy | Code | 1
MoCaE: Mixture of Calibrated Experts Significantly Improves Object Detection | Code | 1
LLMCarbon: Modeling the end-to-end Carbon Footprint of Large Language Models | Code | 1
Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis | Code | 1
Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference | Code | 1
Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts | Code | 1
HyperFormer: Enhancing Entity and Relation Interaction for Hyper-Relational Knowledge Graph Completion | Code | 1
MLP Fusion: Towards Efficient Fine-tuning of Dense and Mixture-of-Experts Language Models | Code | 1
Deep learning techniques for blind image super-resolution: A high-scale multi-domain perspective evaluation | Code | 1
ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer | Code | 1
Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks | Code | 1
COMET: Learning Cardinality Constrained Mixture of Experts with Trees and Local Search | Code | 1
Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts | Code | 1
Emergent Modularity in Pre-trained Transformers | Code | 1
Lifting the Curse of Capacity Gap in Distilling Language Models | Code | 1
Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration | Code | 1
Long-Tailed Visual Recognition via Self-Heterogeneous Integration with Knowledge Excavation | Code | 1
Re-IQA: Unsupervised Learning for Image Quality Assessment in the Wild | Code | 1
MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering | Code | 1
Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers | Code | 1
Mixture of Decision Trees for Interpretable Machine Learning | Code | 1
Spatial Mixture-of-Experts | Code | 1
PAD-Net: An Efficient Framework for Dynamic Networks | Code | 1
M^3ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design | Code | 1
AutoMoE: Heterogeneous Mixture-of-Experts with Adaptive Computation for Efficient Neural Machine Translation | Code | 1
Mixture of Attention Heads: Selecting Attention Heads Per Token | Code | 1
Meta-DMoE: Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts | Code | 1
Mask and Reason: Pre-Training Knowledge Graph Transformers for Complex Logical Queries | Code | 1
Towards Understanding Mixture of Experts in Deep Learning | Code | 1
Learning Soccer Juggling Skills with Layer-wise Mixture-of-Experts | Code | 1
Sparse Mixture-of-Experts are Domain Generalizable Learners | Code | 1
Patcher: Patch Transformers with Mixture of Experts for Precise Medical Image Segmentation | Code | 1
Addressing Confounding Feature Issue for Causal Recommendation | Code | 1
StableMoE: Stable Routing Strategy for Mixture of Experts | Code | 1
MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation | Code | 1
3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition | Code | 1
Efficient and Degradation-Adaptive Network for Real-World Image Super-Resolution | Code | 1
SummaReranker: A Multi-Task Mixture-of-Experts Re-ranking Framework for Abstractive Summarization | Code | 1
Parameter-Efficient Mixture-of-Experts Architecture for Pre-trained Language Models | Code | 1
EvoMoE: An Evolutional Mixture-of-Experts Training Framework via Dense-To-Sparse Gate | Code | 1
Mimic Embedding via Adaptive Aggregation: Learning Generalizable Person Re-identification | Code | 1
Unsupervised Foreground Extraction via Deep Region Competition | Code | 1
HydraSum: Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models | Code | 1
Page 6 of 27