| Title | Date | Tags | Code | # |
| --- | --- | --- | --- | --- |
| Multi-Task Reinforcement Learning with Mixture of Orthogonal Experts | Nov 19, 2023 | Diversity, Mixture-of-Experts | Code Available | 1 |
| Memory Augmented Language Models through Mixture of Word Experts | Nov 15, 2023 | Mixture-of-Experts | Unverified | 0 |
| Intentional Biases in LLM Responses | Nov 11, 2023 | Language Modeling | Unverified | 0 |
| DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets | Nov 8, 2023 | Mixture-of-Experts, object-detection | Code Available | 1 |
| CAME: Competitively Learning a Mixture-of-Experts Model for First-stage Retrieval | Nov 6, 2023 | Mixture-of-Experts, Retrieval | Unverified | 0 |
| Octavius: Mitigating Task Interference in MLLMs via LoRA-MoE | Nov 5, 2023 | Decoder, Mixture-of-Experts | Code Available | 0 |
| Mixture-of-Experts for Open Set Domain Adaptation: A Dual-Space Detection Approach | Nov 1, 2023 | Domain Adaptation, Mixture-of-Experts | Unverified | 0 |
| SiDA-MoE: Sparsity-Inspired Data-Aware Serving for Efficient and Scalable Large Mixture-of-Experts Models | Oct 29, 2023 | GPU, Mixture-of-Experts | Code Available | 1 |
| QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models | Oct 25, 2023 | GPU, Mixture-of-Experts | Code Available | 2 |
| Mixture of Tokens: Continuous MoE through Cross-Example Aggregation | Oct 24, 2023 | Language Modelling, Large Language Model | Code Available | 2 |
| SteloCoder: a Decoder-Only LLM for Multi-Language to Python Code Translation | Oct 24, 2023 | Code Generation, Code Translation | Code Available | 1 |
| A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts | Oct 22, 2023 | Density Estimation, Mixture-of-Experts | Unverified | 0 |
| Manifold-Preserving Transformers are Effective for Short-Long Range Encoding | Oct 22, 2023 | Language Modeling | Code Available | 0 |
| Direct Neural Machine Translation with Task-level Mixture of Experts models | Oct 18, 2023 | Direct NMT, Large Language Model | Unverified | 0 |
| Multi-view Contrastive Learning for Entity Typing over Knowledge Graphs | Oct 18, 2023 | Contrastive Learning, Entity Typing | Code Available | 0 |
| Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach | Oct 18, 2023 | Blind Super-Resolution, Decoder | Code Available | 1 |
| Merging Experts into One: Improving Computational Efficiency of Mixture of Experts | Oct 15, 2023 | Computational Efficiency, Mixture-of-Experts | Code Available | 1 |
| Diversifying the Mixture-of-Experts Representation for Language Models with Orthogonal Optimizer | Oct 15, 2023 | Diversity, Mixture-of-Experts | Unverified | 0 |
| Adaptive Gating in Mixture-of-Experts based Language Models | Oct 11, 2023 | Mixture-of-Experts | Unverified | 0 |
| Sparse Universal Transformer | Oct 11, 2023 | Mixture-of-Experts | Code Available | 1 |
| Beyond the Typical: Modeling Rare Plausible Patterns in Chemical Reactions by Leveraging Sequential Mixture-of-Experts | Oct 7, 2023 | Mixture-of-Experts | Unverified | 0 |
| Exploiting Activation Sparsity with Dense to Dynamic-k Mixture-of-Experts Conversion | Oct 6, 2023 | Mixture-of-Experts | Code Available | 0 |
| Reinforcement Learning-based Mixture of Vision Transformers for Video Violence Recognition | Oct 4, 2023 | Mixture-of-Experts, reinforcement-learning | Unverified | 0 |
| Mixture of Quantized Experts (MoQE): Complementary Effect of Low-bit Quantization and Robustness | Oct 3, 2023 | GPU, Machine Translation | Unverified | 0 |
| FT-Shield: A Watermark Against Unauthorized Fine-tuning in Text-to-Image Diffusion Models | Oct 3, 2023 | Face Transfer, Mixture-of-Experts | Code Available | 0 |