| Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data | Feb 24, 2023 | Arithmetic Reasoning; Language Modelling | Code Available | 1 |
| MoT: Memory-of-Thought Enables ChatGPT to Self-Improve | May 9, 2023 | Arithmetic Reasoning; Natural Language Inference | Code Available | 1 |
| Gemini: A Family of Highly Capable Multimodal Models | Dec 19, 2023 | 1 Image, 2*2 Stitching; Arithmetic Reasoning | Code Available | 1 |
| Generative Parameter-Efficient Fine-Tuning | Dec 1, 2023 | Arithmetic Reasoning; Fine-Grained Image Classification | Code Available | 1 |
| Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs | Jun 22, 2023 | Arithmetic Reasoning; Benchmarking | Code Available | 1 |
| Empirical Study of Zero-Shot NER with ChatGPT | Oct 16, 2023 | Arithmetic Reasoning; Named Entity Recognition | Code Available | 1 |
| HALO: Hierarchical Autonomous Logic-Oriented Orchestration for Multi-Agent LLM Systems | May 17, 2025 | Arithmetic Reasoning; Code Generation | Code Available | 1 |
| Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models Aligned with Human Cognitive Principles | Jun 18, 2024 | Arithmetic Reasoning; Code Generation | Code Available | 1 |
| Is the Reversal Curse a Binding Problem? Uncovering Limitations of Transformers from a Basic Generalization Failure | Apr 2, 2025 | Arithmetic Reasoning; Data Augmentation | Code Available | 1 |
| A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration | Oct 3, 2023 | Arithmetic Reasoning; Code Generation | Code Available | 1 |