SOTAVerified

Mixture-of-Experts

Papers

Showing 751–800 of 1312 papers

Title | Hype
Towards Vision Mixture of Experts for Wildlife Monitoring on the Edge | 0
Training-efficient density quantum machine learning | 0
Training of Neural Networks with Uncertain Data: A Mixture of Experts Approach | 0
TrajMoE: Spatially-Aware Mixture of Experts for Unified Human Mobility Modeling | 0
Transformer Layer Injection: A Novel Approach for Efficient Upscaling of Large Language Models | 0
Tree-gated Deep Mixture-of-Experts For Pose-robust Face Alignment | 0
Trend Filtered Mixture of Experts for Automated Gating of High-Frequency Flow Cytometry Data | 0
Towards Incremental Learning in Large Language Models: A Critical Review | 0
True Zero-Shot Inference of Dynamical Systems Preserving Long-Term Statistics | 0
TS-RAG: Retrieval-Augmented Generation based Time Series Foundation Models are Stronger Zero-Shot Forecaster | 0
Tuning of Mixture-of-Experts Mixed-Precision Neural Networks | 0
Turn Waste into Worth: Rectifying Top-k Router of MoE | 0
Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training | 0
Two Is Better Than One: Rotations Scale LoRAs | 0
U2++ MoE: Scaling 4.7x parameters with minimal impact on RTF | 0
UGG-ReID: Uncertainty-Guided Graph Model for Multi-Modal Object Re-Identification | 0
Fast Deep Mixtures of Gaussian Process Experts | 0
Ultra-Sparse Memory Network | 0
UME: Upcycling Mixture-of-Experts for Scalable and Efficient Automatic Speech Recognition | 0
UMoE: Unifying Attention and FFN with Shared Experts | 0
Unbiased Gradient Estimation with Balanced Assignments for Mixtures of Experts | 0
Uncertainty-Aware Driver Trajectory Prediction at Urban Intersections | 0
Uncertainty-Encoded Multi-Modal Fusion for Robust Object Detection in Autonomous Driving | 0
Understanding Expert Structures on Minimax Parameter Estimation in Contaminated Mixture of Experts | 0
UniAdapt: A Universal Adapter for Knowledge Calibration | 0
UNIALIGN: Scaling Multimodal Alignment within One Unified Model | 0
UniCodec: Unified Audio Codec with Single Domain-Adaptive Codebook | 0
UniCoRN: Latent Diffusion-based Unified Controllable Image Restoration Network across Multiple Degradations | 0
Unified Modeling of Multi-Domain Multi-Device ASR Systems | 0
Unify and Anchor: A Context-Aware Transformer for Cross-Domain Time Series Forecasting | 0
Unimodal-driven Distillation in Multimodal Emotion Recognition with Dynamic Fusion | 0
UniPaint: Unified Space-time Video Inpainting via Mixture-of-Experts | 0
UniF^2ace: Fine-grained Face Understanding and Generation with Unified Multimodal Models | 0
UniUIR: Considering Underwater Image Restoration as An All-in-One Learner | 0
Unraveling the Localized Latents: Learning Stratified Manifold Structures in LLM Embedding Space with Sparse Mixture-of-Experts | 0
Unveiling Hidden Collaboration within Mixture-of-Experts in Large Language Models | 0
UOE: Unlearning One Expert Is Enough For Mixture-of-experts LLMs | 0
Upcycling Instruction Tuning from Dense to Mixture-of-Experts via Parameter Merging | 0
Upcycling Large Language Models into Mixture of Experts | 0
Using Deep Mixture-of-Experts to Detect Word Meaning Shift for TempoWiC | 0
Utility-Driven Speculative Decoding for Mixture-of-Experts | 0
Vanilla Transformers are Transfer Capability Teachers | 0
Variational Distillation of Diffusion Policies into Mixture of Experts | 0
Variational Mixture of Gaussian Process Experts | 0
ViMoE: An Empirical Study of Designing Vision Mixture-of-Experts | 0
Visual Saliency Prediction Using a Mixture of Deep Neural Networks | 0
WDMoE: Wireless Distributed Large Language Models with Mixture of Experts | 0
WDMoE: Wireless Distributed Mixture of Experts for Large Language Models | 0
WeNet: Weighted Networks for Recurrent Network Architecture Search | 0
Who Says Elephants Can't Run: Bringing Large Scale MoE Models into Cloud Scale Production | 0
Page 16 of 27

No leaderboard results yet.