| Routoo: Learning to Route to Large Language Models Effectively | Jan 25, 2024 | MMLUMulti-task Language Understanding | CodeCode Available | 2 |
| LLaMA Beyond English: An Empirical Study on Language Capability Transfer | Jan 2, 2024 | GPUInformativeness | —Unverified | 0 |
| Assessing the Impact of Prompting Methods on ChatGPT's Mathematical Capabilities | Dec 22, 2023 | ChatbotGSM8K | —Unverified | 0 |
| Aurora:Activating Chinese chat capability for Mixtral-8x7B sparse Mixture-of-Experts through Instruction-Tuning | Dec 22, 2023 | Instruction FollowingMixture-of-Experts | CodeCode Available | 2 |
| YAYI 2: Multilingual Open-Source Large Language Models | Dec 22, 2023 | MMLU | —Unverified | 0 |
| Gemini: A Family of Highly Capable Multimodal Models | Dec 19, 2023 | 1 Image, 2*2 StitchingArithmetic Reasoning | CodeCode Available | 1 |
| EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models | Dec 11, 2023 | BenchmarkingEmotional Intelligence | CodeCode Available | 2 |
| Efficient Online Data Mixing For Language Model Pre-Training | Dec 5, 2023 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| Prompt Optimization via Adversarial In-Context Learning | Dec 5, 2023 | Arithmetic ReasoningData-to-Text Generation | CodeCode Available | 1 |
| ArcMMLU: A Library and Information Science Benchmark for Large Language Models | Nov 30, 2023 | MMLU | CodeCode Available | 1 |