| QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning | Aug 20, 2024 | BenchmarkingLanguage Modelling | —Unverified | 0 |
| Benchmarking Large Language Models for Math Reasoning Tasks | Aug 20, 2024 | BenchmarkingIn-Context Learning | CodeCode Available | 0 |
| A Study of PHOC Spatial Region Configurations for Math Formula Retrieval | Aug 17, 2024 | MathRetrieval | —Unverified | 0 |
| Large Language Models Might Not Care What You Are Saying: Prompt Format Beats Descriptions | Aug 16, 2024 | DescriptiveHallucination | —Unverified | 0 |
| Does Reasoning Emerge? Examining the Probabilities of Causation in Large Language Models | Aug 15, 2024 | Math | —Unverified | 0 |
| Leveraging Web-Crawled Data for High-Quality Fine-Tuning | Aug 15, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 0 |
| MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark | Aug 14, 2024 | MathMathematical Reasoning | CodeCode Available | 0 |
| A Perspective on Large Language Models, Intelligent Machines, and Knowledge Acquisition | Aug 13, 2024 | Common Sense ReasoningMath | —Unverified | 0 |
| P3: A Policy-Driven, Pace-Adaptive, and Diversity-Promoted Framework for data pruning in LLM Training | Aug 10, 2024 | DiversityLogical Reasoning | —Unverified | 0 |
| Examining the Behavior of LLM Architectures Within the Framework of Standardized National Exams in Brazil | Aug 9, 2024 | MathMultiple-choice | —Unverified | 0 |