| Gemini: A Family of Highly Capable Multimodal Models | Dec 19, 2023 | 1 Image, 2*2 Stitching; Arithmetic Reasoning | Code Available | 1 | 5 |
| Control LLM: Controlled Evolution for Intelligence Retention in LLM | Jan 19, 2025 | Math; Mathematical Reasoning | Code Available | 1 | 5 |
| Mobile-MMLU: A Mobile Intelligence Language Understanding Benchmark | Mar 26, 2025 | MMLU; Multiple-choice | Code Available | 1 | 5 |
| Compresso: Structured Pruning with Collaborative Prompting Learns Compact Large Language Models | Oct 8, 2023 | MMLU; Natural Language Understanding | Code Available | 1 | 5 |
| MedCaseReasoning: Evaluating and learning diagnostic reasoning from clinical case reports | May 16, 2025 | Diagnostic; Math | Code Available | 1 | 5 |
| ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization | Nov 22, 2023 | GPU; Language Modelling | Code Available | 1 | 5 |
| Efficient Online Data Mixing For Language Model Pre-Training | Dec 5, 2023 | Language Modeling; Language Modelling | Code Available | 1 | 5 |
| LawInstruct: A Resource for Studying Language Model Adaptation to the Legal Domain | Apr 2, 2024 | Argument Mining; Decision Making | Code Available | 1 | 5 |
| Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs | Jul 5, 2024 | General Knowledge; Instruction Following | Code Available | 1 | 5 |
| CoMAT: Chain of Mathematically Annotated Thought Improves Mathematical Reasoning | Oct 14, 2024 | Math; Mathematical Reasoning | Code Available | 1 | 5 |