SOTAVerified

Mixture-of-Experts

Papers

Showing 751–800 of 1312 papers

| Title | Status | Hype |
| --- | --- | --- |
| Upcycling Instruction Tuning from Dense to Mixture-of-Experts via Parameter Merging | | 0 |
| EC-DIT: Scaling Diffusion Transformers with Adaptive Expert-Choice Routing | | 0 |
| UniAdapt: A Universal Adapter for Knowledge Calibration | | 0 |
| MoS: Unleashing Parameter Efficiency of Low-Rank Adaptation with Mixture of Shards | | 0 |
| Robust Traffic Forecasting against Spatial Shift over Years | Code | 0 |
| MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning | | 0 |
| IDEA: An Inverse Domain Expert Adaptation Based Active DNN IP Protection Method | | 0 |
| SciDFM: A Large Language Model with Mixture-of-Experts for Science | | 0 |
| Toward Mixture-of-Experts Enabled Trustworthy Semantic Communication for 6G Networks | | 0 |
| Boosting Code-Switching ASR with Mixture of Experts Enhanced Speech-Conditioned LLM | | 0 |
| Leveraging Mixture of Experts for Improved Speech Deepfake Detection | | 0 |
| Multi-Modal Generative AI: Multi-modal LLM, Diffusion and Beyond | | 0 |
| A Gated Residual Kolmogorov-Arnold Networks for Mixtures of Experts | Code | 0 |
| Routing in Sparsely-gated Language Models responds to Context | | 0 |
| Multi-omics data integration for early diagnosis of hepatocellular carcinoma (HCC) using machine learning | | 0 |
| On-Device Collaborative Language Modeling via a Mixture of Generalists and Specialists | Code | 0 |
| Robust Audiovisual Speech Recognition Models with Mixture-of-Experts | | 0 |
| Mixture of Diverse Size Experts | | 0 |
| GRIN: GRadient-INformed MoE | | 0 |
| LPT++: Efficient Training on Mixture of Long-tailed Experts | | 0 |
| Adaptive Segmentation-Based Initialization for Steered Mixture of Experts Image Regression | | 0 |
| Integrating AI's Carbon Footprint into Risk Management Frameworks: Strategies and Tools for Sustainable Compliance in Banking Sector | | 0 |
| STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning | | 0 |
| VE: Modeling Multivariate Time Series Correlation with Variate Embedding | Code | 0 |
| DA-MoE: Towards Dynamic Expert Allocation for Mixture-of-Experts Models | | 0 |
| Adapted-MoE: Mixture of Experts with Test-Time Adaption for Anomaly Detection | | 0 |
| Interpretable mixture of experts for time series prediction under recurrent and non-recurrent conditions | | 0 |
| Pluralistic Salient Object Detection | | 0 |
| Configurable Foundation Models: Building LLMs from a Modular Perspective | | 0 |
| Enhancing Code-Switching Speech Recognition with LID-Based Collaborative Mixture of Experts Model | | 0 |
| Duplex: A Device for Large Language Models with Mixture of Experts, Grouped Query Attention, and Continuous Batching | | 0 |
| Beyond Parameter Count: Implicit Bias in Soft Mixture of Experts | | 0 |
| Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts | | 0 |
| Auxiliary-Loss-Free Load Balancing Strategy for Mixture-of-Experts | | 0 |
| Parameter-Efficient Quantized Mixture-of-Experts Meets Vision-Language Instruction Tuning for Semiconductor Electron Micrograph Analysis | | 0 |
| Advancing Enterprise Spatio-Temporal Forecasting Applications: Data Mining Meets Instruction Tuning of Language Models For Multi-modal Time Series Analysis in Low-Resource Settings | | 0 |
| La-SoftMoE CLIP for Unified Physical-Digital Face Attack Detection | | 0 |
| DutyTTE: Deciphering Uncertainty in Origin-Destination Travel Time Estimation | Code | 0 |
| The Ultimate Guide to Fine-Tuning LLMs from Basics to Breakthroughs: An Exhaustive Review of Technologies, Research, Best Practices, Applied Research Challenges and Opportunities | | 0 |
| Multi-Treatment Multi-Task Uplift Modeling for Enhancing User Growth | | 0 |
| SQL-GEN: Bridging the Dialect Gap for Text-to-SQL Via Synthetic Data And Model Merging | | 0 |
| Improving Factuality in Large Language Models via Decoding-Time Hallucinatory and Truthful Comparators | Code | 0 |
| MoE-LPR: Multilingual Extension of Large Language Models through Mixture-of-Experts with Language Priors Routing | Code | 0 |
| FedMoE: Personalized Federated Learning via Heterogeneous Mixture of Experts | | 0 |
| HMoE: Heterogeneous Mixture of Experts for Language Modeling | | 0 |
| A Unified Framework for Iris Anti-Spoofing: Introducing IrisGeneral Dataset and Masked-MoE Method | | 0 |
| FEDKIM: Adaptive Federated Knowledge Injection into Medical Foundation Models | Code | 0 |
| Integrating Multi-view Analysis: Multi-view Mixture-of-Expert for Textual Personality Detection | Code | 0 |
| FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models | Code | 0 |
| BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts | | 0 |

No leaderboard results yet.