SOTAVerified

Math

Papers

Showing 851–900 of 1596 papers

| Title | Status | Hype |
|---|---|---|
| Strictly monotone mean-variance preferences with applications to portfolio selection | | 0 |
| Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models | | 0 |
| LinguaLIFT: An Effective Two-stage Instruction Tuning Framework for Low-Resource Language Tasks | | 0 |
| CoinMath: Harnessing the Power of Coding Instruction for Math LLMs | Code | 0 |
| A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges | | 0 |
| Combining Large Language Models with Tutoring System Intelligence: A Case Study in Caregiver Homework Support | Code | 0 |
| Dipper: Diversity in Prompts for Producing Large Language Model Ensembles in Reasoning tasks | | 0 |
| Geo-LLaVA: A Large Multi-Modal Model for Solving Geometry Math Problems with Meta In-Context Learning | | 0 |
| Learning to Solve Domain-Specific Calculation Problems with Knowledge-Intensive Programs Generator | | 0 |
| A Context-Enhanced Framework for Sequential Graph Reasoning | Code | 0 |
| A Graph-Based Synthetic Data Pipeline for Scaling High-Quality Reasoning Instructions | | 0 |
| Greek2MathTex: A Greek Speech-to-Text Framework for LaTeX Equations Generation | Code | 0 |
| MNIST-Fraction: Enhancing Math Education with AI-Driven Fraction Detection and Analysis | | 0 |
| LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation | Code | 0 |
| When Dimensionality Reduction Meets Graph (Drawing) Theory: Introducing a Common Framework, Challenges and Opportunities | | 0 |
| Mining Math Conjectures from LLMs: A Pruning Approach | | 0 |
| Chimera: Improving Generalist Model with Domain-Specific Experts | | 0 |
| Neuro-Symbolic Data Generation for Math Reasoning | | 0 |
| Hard Math -- Easy UVM: Pragmatic solutions for verifying hardware algorithms using UVM | | 0 |
| Automated LaTeX Code Generation from Handwritten Math Expressions Using Vision Transformer | | 0 |
| Enhancing Mathematical Reasoning in LLMs with Background Operators | | 0 |
| RedStone: Curating General, Code, Math, and QA Data for Large Language Models | | 0 |
| Unsupervised learning-based calibration scheme for Rough Bergomi model | Code | 0 |
| MALT: Improving Reasoning with Multi-Agent LLM Training | | 0 |
| Yi-Lightning Technical Report | | 0 |
| Reverse Thinking Makes LLMs Stronger Reasoners | | 0 |
| Mars-PO: Multi-Agent Reasoning System Preference Optimization | | 0 |
| A Lean Dataset for International Math Olympiad: Small Steps towards Writing Math Proofs for Hard Problems | | 0 |
| Embracing AI in Education: Understanding the Surge in Large Language Model Use by Secondary Students | | 0 |
| Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS | Code | 0 |
| Learning by Analogy: Enhancing Few-Shot Prompting for Math Word Problem Solving with Computational Graph-Based Retrieval | | 0 |
| Unraveling Arithmetic in Large Language Models: The Role of Algebraic Structures | | 0 |
| Velocitune: A Velocity-based Dynamic Domain Reweighting Method for Continual Pre-training | | 0 |
| MM-Eval: A Hierarchical Benchmark for Modern Mongolian Evaluation in LLMs | Code | 0 |
| RESOLVE: Relational Reasoning with Symbolic and Object-Level Features Using Vector Symbolic Processing | Code | 0 |
| OpenAI-o1 AB Testing: Does the o1 model really do good reasoning in math problem solving? | | 0 |
| VISTA: Visual Integrated System for Tailored Automation in Math Problem Generation Using LLM | | 0 |
| Meta-Reasoning Improves Tool Use in Large Language Models | Code | 0 |
| Evaluating GPT-4 at Grading Handwritten Solutions in Math Exams | | 0 |
| Self-Consistency Preference Optimization | | 0 |
| Automatic Generation of Question Hints for Mathematics Problems using Large Language Models in Educational Technology | | 0 |
| Leveraging Label Semantics and Meta-Label Refinement for Multi-Label Question Classification | Code | 0 |
| Dictionary Insertion Prompting for Multilingual Reasoning on Multilingual Large Language Models | | 0 |
| STEM-POM: Evaluating Language Models Math-Symbol Reasoning in Document Parsing | | 0 |
| DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models | | 0 |
| Automated Feedback in Math Education: A Comparative Analysis of LLMs for Open-Ended Responses | | 0 |
| Improving Math Problem Solving in Large Language Models Through Categorization and Strategy Tailoring | | 0 |
| EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation | | 0 |
| Guiding Through Complexity: What Makes Good Supervision for Hard Reasoning Tasks? | Code | 0 |
| Library Learning Doesn't: The Curious Case of the Single-Use "Library" | Code | 0 |

No leaderboard results yet.