| MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts | Oct 18, 2024 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets | Oct 10, 2023 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Ocean-OCR: Towards General OCR Application via a Vision-Language Model | Jan 26, 2025 | document understanding, Language Modeling | Code Available | 1 | 5 |
| SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient | Jan 27, 2023 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Matching Visual Features to Hierarchical Semantic Topics for Image Paragraph Captioning | May 10, 2021 | Image Paragraph Captioning, Language Modeling | Code Available | 0 | 5 |
| Calibrating LLM-Based Evaluator | Sep 23, 2023 | In-Context Learning, Language Modeling | Code Available | 0 | 5 |
| MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical Reasoning | Feb 27, 2024 | 8k, Language Modeling | Code Available | 0 | 5 |
| Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More | Feb 11, 2025 | Decoder, Information Retrieval | Code Available | 0 | 5 |
| Agentic Society: Merging skeleton from real world and texture from Large Language Model | Sep 2, 2024 | Language Modeling, Language Modelling | Code Available | 0 | 5 |
| Mask-Free Neuron Concept Annotation for Interpreting Neural Networks in Medical Domain | Jul 16, 2024 | Decision Making, Language Modeling | Code Available | 0 | 5 |