SOTAVerified

Math

Papers

Showing 1126-1150 of 1596 papers

| Title | Status | Hype |
| --- | --- | --- |
| MathScale: Scaling Instruction Tuning for Mathematical Reasoning | Code | 0 |
| Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning | | 0 |
| The Claude 3 Model Family: Opus, Sonnet, Haiku | | 0 |
| Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training | | 0 |
| Experimenting with Generative AI: Does ChatGPT Really Increase Everyone's Productivity? | | 0 |
| ClickTree: A Tree-based Method for Predicting Math Students' Performance Based on Clickstream Data | | 0 |
| PRSA: Prompt Stealing Attacks against Real-World Prompt Services | | 0 |
| Data Interpreter: An LLM Agent For Data Science | | 0 |
| Adversarial Math Word Problem Generation | Code | 0 |
| MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical Reasoning | Code | 0 |
| MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs | | 0 |
| How Do Humans Write Code? Large Models Do It the Same Way Too | Code | 0 |
| Brain-Inspired Two-Stage Approach: Enhancing Mathematical Reasoning by Imitating Human Thought Processes | Code | 0 |
| MoELoRA: Contrastive Learning Guided Mixture of Experts on Parameter-Efficient Fine-Tuning for Large Language Models | | 0 |
| LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative Tasks | | 0 |
| Orca-Math: Unlocking the potential of SLMs in Grade School Math | | 0 |
| Mathematical Opportunities in Digital Twins (MATH-DT) | | 0 |
| Language Models with Conformal Factuality Guarantees | | 0 |
| AutoTutor meets Large Language Models: A Language Model Tutor with Rich Pedagogy and Guardrails | Code | 0 |
| Towards better Human-Agent Alignment: Assessing Task Utility in LLM-Powered Applications | | 0 |
| GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements | | 0 |
| EvoGPT-f: An Evolutionary GPT Framework for Benchmarking Formal Math Languages | | 0 |
| Understanding the Progression of Educational Topics via Semantic Matching | | 0 |
| V-STaR: Training Verifiers for Self-Taught Reasoners | | 0 |
| In-Context Principle Learning from Mistakes | Code | 0 |

No leaderboard results yet.