
Mixture-of-Experts

Papers

Showing 401–450 of 1312 papers

Title | Status | Hype
Object Detection using Event Camera: A MoE Heat Conduction based Detector and A New Benchmark Dataset | Code | 2
UniPaint: Unified Space-time Video Inpainting via Mixture-of-Experts | – | 0
An Entailment Tree Generation Approach for Multimodal Multi-Hop Question Answering with Mixture-of-Experts and Iterative Feedback Mechanism | – | 0
Towards 3D Acceleration for low-power Mixture-of-Experts and Multi-Head Attention Spiking Transformers | – | 0
SAME: Learning Generic Language-Guided Visual Navigation with State-Adaptive Mixture of Experts | Code | 1
RSUniVLM: A Unified Vision Language Model for Remote Sensing via Granularity-oriented Mixture of Experts | Code | 1
Steps are all you need: Rethinking STEM Education with Prompt Engineering | – | 0
Monet: Mixture of Monosemantic Experts for Transformers | Code | 2
Convolutional Neural Networks and Mixture of Experts for Intrusion Detection in 5G Networks and beyond | – | 0
Yi-Lightning Technical Report | – | 0
Mixture of Experts for Node Classification | – | 0
MQFL-FHE: Multimodal Quantum Federated Learning Framework with Fully Homomorphic Encryption | – | 0
HiMoE: Heterogeneity-Informed Mixture-of-Experts for Fair Spatial-Temporal Forecasting | – | 0
LaVIDE: A Language-Vision Discriminator for Detecting Changes in Satellite Image with Map References | – | 0
On the effectiveness of discrete representations in sparse mixture of experts | – | 0
Mixture of Cache-Conditional Experts for Efficient Mobile Device Inference | – | 0
Complexity Experts are Task-Discriminative Learners for Any Image Restoration | – | 0
Mixture of Experts in Image Classification: What's the Sweet Spot? | – | 0
UOE: Unlearning One Expert Is Enough For Mixture-of-experts LLMS | – | 0
Condense, Don't Just Prune: Enhancing Efficiency and Performance in MoE Layer Pruning | Code | 1
Enhancing Code-Switching ASR Leveraging Non-Peaky CTC Loss and Deep Language Posterior Injection | – | 0
H^3Fusion: Helpful, Harmless, Honest Fusion of Aligned LLMs | Code | 0
MH-MoE: Multi-Head Mixture-of-Experts | – | 0
LDACP: Long-Delayed Ad Conversions Prediction Model for Bidding Strategy | – | 0
LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training | Code | 2
Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts | – | 0
MERLOT: A Distilled LLM-based Mixture-of-Experts Framework for Scalable Encrypted Traffic Classification | – | 0
KAAE: Numerical Reasoning for Knowledge Graphs via Knowledge-aware Attributes Learning | – | 0
Ultra-Sparse Memory Network | – | 0
CNMBERT: A Model for Converting Hanyu Pinyin Abbreviations to Chinese Characters | Code | 2
MoE-Lightning: High-Throughput MoE Inference on Memory-constrained GPUs | – | 0
Awaker2.5-VL: Stably Scaling MLLMs with Parameter-Efficient Mixture of Experts | Code | 1
Weakly-Supervised Multimodal Learning on MIMIC-CXR | Code | 0
Sparse Upcycling: Inference Inefficient Finetuning | – | 0
Lynx: Enabling Efficient MoE Inference through Dynamic Batch-Aware Expert Selection | – | 0
Imitation Learning from Observations: An Autoregressive Mixture of Experts Approach | – | 0
PERFT: Parameter-Efficient Routed Fine-Tuning for Mixture-of-Expert Model | – | 0
Towards Vision Mixture of Experts for Wildlife Monitoring on the Edge | – | 0
Adaptive Conditional Expert Selection Network for Multi-domain Recommendation | – | 0
WDMoE: Wireless Distributed Mixture of Experts for Large Language Models | – | 0
NeKo: Toward Post Recognition Generative Correction Large Language Models with Task-Oriented Experts | – | 0
DA-MoE: Addressing Depth-Sensitivity in Graph-Level Analysis through Mixture of Experts | Code | 0
Advancing Robust Underwater Acoustic Target Recognition through Multi-task Learning and Multi-Gate Mixture-of-Experts | – | 0
FedMoE-DA: Federated Mixture of Experts via Domain Aware Fine-grained Aggregation | – | 0
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent | Code | 5
RS-MoE: Mixture of Experts for Remote Sensing Image Captioning and Visual Question Answering | – | 0
HOBBIT: A Mixed Precision Expert Offloading System for Fast MoE Inference | – | 0
Facet-Aware Multi-Head Mixture-of-Experts Model for Sequential Recommendation | – | 0
PMoL: Parameter Efficient MoE for Preference Mixing of LLM Alignment | – | 0
SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models | Code | 2
Page 9 of 27
