SOTAVerified

Mixture-of-Experts

Papers

Showing 1101–1150 of 1312 papers

Title | Status | Hype
Powering In-Database Dynamic Model Slicing for Structured Data Analytics | | 0
Predictable Scale: Part I -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining | | 0
Predicting assisted ventilation in Amyotrophic Lateral Sclerosis using a mixture of experts and conformal predictors | | 0
Prediction Sets for High-Dimensional Mixture of Experts Models | | 0
Preferential Mixture-of-Experts: Interpretable Models that Rely on Human Expertise as much as Possible | | 0
Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding | | 0
Priority-Aware Preemptive Scheduling for Mixed-Priority Workloads in MoE Inference | | 0
Probabilistic partition of unity networks for high-dimensional regression problems | | 0
Probing the Robustness of Theory of Mind in Large Language Models | | 0
ProMoE: Fast MoE-based LLM Serving using Proactive Caching | | 0
Prompt-based mental health screening from social media text | | 0
PROPER: A Progressive Learning Framework for Personalized Large Language Models with Group-Level Adaptation | | 0
PRUDEX-Compass: Towards Systematic Evaluation of Reinforcement Learning in Financial Markets | | 0
PSReg: Prior-guided Sparse Mixture of Experts for Point Cloud Registration | | 0
Psychometry: An Omnifit Model for Image Reconstruction from Human Brain Activity | | 0
P-Tailor: Customizing Personality Traits for Language Models via Mixture of Specialized LoRA Experts | | 0
PWC-MoE: Privacy-Aware Wireless Collaborative Mixture of Experts | | 0
QoS-Efficient Serving of Multiple Mixture-of-Expert LLMs Using Partial Runtime Reconfiguration | | 0
PMoE: Progressive Mixture of Experts with Asymmetric Transformer for Continual Learning | | 0
Quality Resilient Deep Neural Networks | | 0
Quantitative Stock Investment by Routing Uncertainty-Aware Trading Experts: A Multi-Task Learning Approach | | 0
RankLLM: A Python Package for Reranking with LLMs | | 0
Ray-Tracing for Conditionally Activated Neural Networks | | 0
Realizing Video Summarization from the Path of Language-based Semantic Understanding | | 0
Reasoning Beyond Limits: Advances and Open Problems for LLMs | | 0
Recommending what video to watch next: a multitask ranking system | | 0
ReGNet: Reciprocal Space-Aware Long-Range Modeling for Crystalline Property Prediction | | 0
Regularized infill criteria for multi-objective Bayesian optimization with application to aircraft design | | 0
MoVEInt: Mixture of Variational Experts for Learning Human-Robot Interactions from Demonstrations | Code | 0
Mosaic: Data-Free Knowledge Distillation via Mixture-of-Experts for Heterogeneous Distributed Environments | Code | 0
Multi-modal Collaborative Optimization and Expansion Network for Event-assisted Single-eye Expression Recognition | Code | 0
Efficient and Interpretable Grammatical Error Correction with Mixture of Experts | Code | 0
Multimodal Cultural Safety: Evaluation Frameworks and Alignment Strategies | Code | 0
Effective Approaches to Batch Parallelization for Dynamic Neural Network Architectures | Code | 0
Multimodal Fusion Strategies for Mapping Biophysical Landscape Features | Code | 0
Robust Traffic Forecasting against Spatial Shift over Years | Code | 0
Towards Rehearsal-Free Continual Relation Extraction: Capturing Within-Task Variance with Adaptive Prompting | Code | 0
RoME: Role-aware Mixture-of-Expert Transformer for Text-to-Video Retrieval | Code | 0
More Experts Than Galaxies: Conditionally-overlapping Experts With Biologically-Inspired Fixed Routing | Code | 0
RouterKT: Mixture-of-Experts for Knowledge Tracing | Code | 0
Multi-Source Domain Adaptation with Mixture of Experts | Code | 0
MoRE-Brain: Routed Mixture of Experts for Interpretable and Generalizable Cross-Subject fMRI Visual Decoding | Code | 0
Towards Robust Multimodal Representation: A Unified Approach with Adaptive Experts and Alignment | Code | 0
MOoSE: Multi-Orientation Sharing Experts for Open-set Scene Text Recognition | Code | 0
A Bird's-eye View of Reranking: from List Level to Page Level | Code | 0
Hierarchical Mixtures of Generators for Adversarial Learning | Code | 0
Multi-view Contrastive Learning for Entity Typing over Knowledge Graphs | Code | 0
MoNTA: Accelerating Mixture-of-Experts Training with Network-Traffc-Aware Parallel Optimization | Code | 0
Mol-MoE: Training Preference-Guided Routers for Molecule Generation | Code | 0
MoLEx: Mixture of Layer Experts for Finetuning with Sparse Upcycling | Code | 0
Page 23 of 27

No leaderboard results yet.