SOTAVerified

Mixture-of-Experts

Papers

Showing 151–175 of 1312 papers

| Title | Status | Hype |
| --- | --- | --- |
| PICO: Secure Transformers via Robust Prompt Isolation and Cybersecurity Oversight | | 0 |
| NoEsis: Differentially Private Knowledge Transfer in Modular LLM Adaptation | | 0 |
| Unveiling the Hidden: Movie Genre and User Bias in Spoiler Detection | Code | 0 |
| BadMoE: Backdooring Mixture-of-Experts LLMs via Optimizing Routing Triggers and Infecting Dormant Experts | | 0 |
| Manifold Induced Biases for Zero-shot and Few-shot Detection of Generated Images | Code | 1 |
| MoE Parallel Folding: Heterogeneous Parallelism Mappings for Efficient Large-Scale MoE Model Training with Megatron Core | | 0 |
| Distribution-aware Forgetting Compensation for Exemplar-Free Lifelong Person Re-identification | Code | 1 |
| HAECcity: Open-Vocabulary Scene Understanding of City-Scale Point Clouds with Superpoint Graph Clustering | | 0 |
| Multi-Type Context-Aware Conversational Recommender Systems via Mixture-of-Experts | | 0 |
| D^2MoE: Dual Routing and Dynamic Scheduling for Efficient On-Device MoE-based LLM Serving | | 0 |
| Unveiling Hidden Collaboration within Mixture-of-Experts in Large Language Models | | 0 |
| Dense Backpropagation Improves Training for Sparse Mixture-of-Experts | Code | 1 |
| Trend Filtered Mixture of Experts for Automated Gating of High-Frequency Flow Cytometry Data | | 0 |
| Plasticity-Aware Mixture of Experts for Learning Under QoE Shifts in Adaptive Video Streaming | | 0 |
| Mixture-of-Shape-Experts (MoSE): End-to-End Shape Dictionary Framework to Prompt SAM for Generalizable Medical Segmentation | | 0 |
| MoE-Lens: Towards the Hardware Limit of High-Throughput MoE LLM Serving Under Resource Constraints | | 0 |
| RouterKT: Mixture-of-Experts for Knowledge Tracing | Code | 0 |
| Regularized infill criteria for multi-objective Bayesian optimization with application to aircraft design | | 0 |
| Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning | | 0 |
| Kimi-VL Technical Report | Code | 5 |
| C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing | Code | 1 |
| Cluster-Driven Expert Pruning for Mixture-of-Experts Large Language Models | Code | 0 |
| Scaling Laws for Native Multimodal Models | | 0 |
| Adaptive Detection of Fast Moving Celestial Objects Using a Mixture of Experts and Physical-Inspired Neural Network | | 0 |
| Holistic Capability Preservation: Towards Compact Yet Comprehensive Reasoning Models | | 0 |

No leaderboard results yet.