SOTAVerified

Mixture-of-Experts

Papers

Showing 951–1000 of 1312 papers

Title | Status | Hype
Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts? | | 0
M^3TN: Multi-gate Mixture-of-Experts based Multi-valued Treatment Network for Uplift Modeling | | 0
Towards A Better Metric for Text-to-Video Generation | | 0
Prompt-based mental health screening from social media text | | 0
Robust Calibration For Improved Weather Prediction Under Distributional Shift | | 0
Incorporating Visual Experts to Resolve the Information Loss in Multimodal Large Language Models | | 0
Subjective and Objective Analysis of Indian Social Media Video Quality | Code | 0
k-Winners-Take-All Ensemble Neural Network | Code | 0
Efficient Deweather Mixture-of-Experts with Uncertainty-aware Feature-wise Linear Modulation | | 0
Agent4Ranking: Semantic Robust Ranking via Personalized Query Rewriting Using Multi-agent LLM | | 0
Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning | | 0
Generator Assisted Mixture of Experts For Feature Acquisition in Batch | | 0
From Google Gemini to OpenAI Q* (Q-Star): A Survey of Reshaping the Generative Artificial Intelligence (AI) Research Landscape | | 0
Online Action Recognition for Human Risk Prediction with Anticipated Haptic Alert via Wearables | Code | 0
Training of Neural Networks with Uncertain Data: A Mixture of Experts Approach | | 0
MoE-AMC: Enhancing Automatic Modulation Classification Performance Using Mixture-of-Experts | | 0
MoEC: Mixture of Experts Implicit Neural Compression | | 0
Language-driven All-in-one Adverse Weather Removal | | 0
Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts | | 0
HOMOE: A Memory-Based and Composition-Aware Framework for Zero-Shot Learning with Hopfield Network and Soft Mixture of Experts | | 0
Efficient Model Agnostic Approach for Implicit Neural Representation Based Arbitrary-Scale Image Super-Resolution | | 0
Memory Augmented Language Models through Mixture of Word Experts | | 0
Intentional Biases in LLM Responses | | 0
CAME: Competitively Learning a Mixture-of-Experts Model for First-stage Retrieval | | 0
Octavius: Mitigating Task Interference in MLLMs via LoRA-MoE | Code | 0
Mixture-of-Experts for Open Set Domain Adaptation: A Dual-Space Detection Approach | | 0
A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts | | 0
Manifold-Preserving Transformers are Effective for Short-Long Range Encoding | Code | 0
Direct Neural Machine Translation with Task-level Mixture of Experts models | | 0
Multi-view Contrastive Learning for Entity Typing over Knowledge Graphs | Code | 0
Diversifying the Mixture-of-Experts Representation for Language Models with Orthogonal Optimizer | | 0
Adaptive Gating in Mixture-of-Experts based Language Models | | 0
Beyond the Typical: Modeling Rare Plausible Patterns in Chemical Reactions by Leveraging Sequential Mixture-of-Experts | | 0
Exploiting Activation Sparsity with Dense to Dynamic-k Mixture-of-Experts Conversion | Code | 0
Reinforcement Learning-based Mixture of Vision Transformers for Video Violence Recognition | | 0
Mixture of Quantized Experts (MoQE): Complementary Effect of Low-bit Quantization and Robustness | | 0
FT-Shield: A Watermark Against Unauthorized Fine-tuning in Text-to-Image Diffusion Models | Code | 0
Statistical Perspective of Top-K Sparse Softmax Gating Mixture of Experts | | 0
Mobile V-MoEs: Scaling Down Vision Transformers via Sparse Mixture-of-Experts | | 0
Learning multi-modal generative models with permutation-invariant encoders and tighter variational objectives | Code | 0
Task-Based MoE for Multitask Multilingual Machine Translation | | 0
SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable Memory Budget | | 0
EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE | | 0
Beyond Sharing: Conflict-Aware Multivariate Time Series Anomaly Detection | Code | 0
FineQuant: Unlocking Efficiency with Fine-Grained Weight-Only Quantization for LLMs | | 0
Experts Weights Averaging: A New General Training Scheme for Vision Transformers | | 0
A Novel Temporal Multi-Gate Mixture-of-Experts Approach for Vehicle Trajectory and Driving Intention Prediction | | 0
Uncertainty-Encoded Multi-Modal Fusion for Robust Object Detection in Autonomous Driving | | 0
Domain-Agnostic Neural Architecture for Class Incremental Continual Learning in Document Processing Platform | Code | 0
Bidirectional Attention as a Mixture of Continuous Word Experts | Code | 0
Page 20 of 27

No leaderboard results yet.