| Quadratic Gating Functions in Mixture of Experts: A Statistical Insight | Oct 15, 2024 | Computational Efficiency, Mixture-of-Experts | —Unverified | 0 |
| Scalable Multi-Domain Adaptation of Language Models using Modular Experts | Oct 14, 2024 | Domain Adaptation, General Knowledge | —Unverified | 0 |
| Learning to Ground VLMs without Forgetting | Oct 14, 2024 | Decoder, Language Modelling | —Unverified | 0 |
| Ada-K Routing: Boosting the Efficiency of MoE-based LLMs | Oct 14, 2024 | Computational Efficiency, Mixture-of-Experts | —Unverified | 0 |
| ContextWIN: Whittle Index Based Mixture-of-Experts Neural Model For Restless Bandits Via Deep RL | Oct 13, 2024 | Decision Making, Mixture-of-Experts | —Unverified | 0 |
| MoIN: Mixture of Introvert Experts to Upcycle an LLM | Oct 13, 2024 | GPU, Language Modeling | —Unverified | 0 |
| GETS: Ensemble Temperature Scaling for Calibration in Graph Neural Networks | Oct 12, 2024 | Mixture-of-Experts | —Unverified | 0 |
| AT-MoE: Adaptive Task-planning Mixture of Experts via LoRA Approach | Oct 12, 2024 | Mixture-of-Experts, Task Planning | —Unverified | 0 |
| Upcycling Large Language Models into Mixture of Experts | Oct 10, 2024 | Mixture-of-Experts, MMLU | —Unverified | 0 |
| More Experts Than Galaxies: Conditionally-overlapping Experts With Biologically-Inspired Fixed Routing | Oct 10, 2024 | Image Classification | Code Available | 0 |
| Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training | Oct 10, 2024 | Mixture-of-Experts, Visual Question Answering | —Unverified | 0 |
| Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs | Oct 9, 2024 | Common Sense Reasoning, Mixture-of-Experts | —Unverified | 0 |
| Toward generalizable learning of all (linear) first-order methods via memory augmented Transformers | Oct 8, 2024 | All, Mixture-of-Experts | —Unverified | 0 |
| Scaling Laws Across Model Architectures: A Comparative Analysis of Dense and MoE Models in Large Language Models | Oct 8, 2024 | Mixture-of-Experts | —Unverified | 0 |
| Probing the Robustness of Theory of Mind in Large Language Models | Oct 8, 2024 | Mixture-of-Experts | —Unverified | 0 |
| Multimodal Fusion Strategies for Mapping Biophysical Landscape Features | Oct 7, 2024 | Mixture-of-Experts | Code Available | 0 |
| Realizing Video Summarization from the Path of Language-based Semantic Understanding | Oct 6, 2024 | Mixture-of-Experts, Video Generation | —Unverified | 0 |
| Structure-Enhanced Protein Instruction Tuning: Towards General-Purpose Protein Understanding with LLMs | Oct 4, 2024 | Contrastive Learning, Denoising | —Unverified | 0 |
| A Dynamic Approach to Stock Price Prediction: Comparing RNN and Mixture of Experts Models Across Different Volatility Profiles | Oct 4, 2024 | Mixture-of-Experts, Stock Price Prediction | —Unverified | 0 |
| On Expert Estimation in Hierarchical Mixture of Experts: Beyond Softmax Gating Functions | Oct 3, 2024 | Image Classification | —Unverified | 0 |
| Neutral residues: revisiting adapters for model extension | Oct 3, 2024 | Domain Adaptation, Language Modelling | —Unverified | 0 |
| Efficient Residual Learning with Mixture-of-Experts for Universal Dexterous Grasping | Oct 3, 2024 | GPU, Mixture-of-Experts | —Unverified | 0 |
| Revisiting Prefix-tuning: Statistical Benefits of Reparameterization among Prompts | Oct 3, 2024 | Mixture-of-Experts, parameter estimation | Code Available | 0 |
| MLP-KAN: Unifying Deep Representation and Function Learning | Oct 3, 2024 | Kolmogorov-Arnold Networks, Mixture-of-Experts | Code Available | 0 |
| The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs | Oct 2, 2024 | Benchmarking, Hallucination | —Unverified | 0 |