SOTAVerified

Mixture-of-Experts

Papers

Showing 831–840 of 1312 papers

| Title | Status | Hype |
| --- | --- | --- |
| How does Architecture Influence the Base Capabilities of Pre-trained Language Models? A Case Study Based on FFN-Wider and MoE Transformers | | 0 |
| How Lightweight Can A Vision Transformer Be | | 0 |
| How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines | | 0 |
| Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought | | 0 |
| HydraSum - Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models | | 0 |
| Hypertext Entity Extraction in Webpage | | 0 |
| IDEA: An Inverse Domain Expert Adaptation Based Active DNN IP Protection Method | | 0 |
| Identifying Shopping Intent in Product QA for Proactive Recommendations | | 0 |
| iMedImage Technical Report | | 0 |
| Imitation Learning from MPC for Quadrupedal Multi-Gait Control | | 0 |

No leaderboard results yet.