SOTAVerified

Math

Papers

Showing 501-550 of 1596 papers

| Title | Status | Hype |
| --- | --- | --- |
| Large (Vision) Language Models are Unsupervised In-Context Learners | Code | 1 |
| JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models | Code | 1 |
| Explaining Datasets in Words: Statistical Models with Natural Language Parameters | Code | 1 |
| JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding | Code | 1 |
| Improving the Validity of Automatically Generated Feedback via Reinforcement Learning | Code | 1 |
| From GAN to WGAN | Code | 1 |
| EXAONE Deep: Reasoning Enhanced Language Models | Code | 1 |
| Evolving Prompts In-Context: An Open-ended, Self-replicating Perspective | Code | 1 |
| Injecting Numerical Reasoning Skills into Language Models | Code | 1 |
| Language Models as Science Tutors | Code | 1 |
| MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data | Code | 1 |
| Self-Training Elicits Concise Reasoning in Large Language Models | Code | 1 |
| Examining the Robustness of Large Language Models across Language Complexity | | 0 |
| Examining the Behavior of LLM Architectures Within the Framework of Standardized National Exams in Brazil | | 0 |
| Can Stories Help LLMs Reason? Curating Information Space Through Narrative | | 0 |
| Evolving LLMs' Self-Refinement Capability via Iterative Preference Optimization | | 0 |
| Can LLMs understand Math? -- Exploring the Pitfalls in Mathematical Reasoning | | 0 |
| A range characterization of the single-quadrant ADRT | | 0 |
| EvoGPT-f: An Evolutionary GPT Framework for Benchmarking Formal Math Languages | | 0 |
| Illinois Math Solver: Math Reasoning on the Web | | 0 |
| AI4Math: A Native Spanish Benchmark for University-Level Mathematical Reasoning in Large Language Models | | 0 |
| Identifying equivalent Calabi--Yau topologies: A discrete challenge from math and physics for machine learning | | 0 |
| Improve Mathematical Reasoning in Language Models by Automated Process Supervision | | 0 |
| Evaluating the Design Features of an Intelligent Tutoring System for Advanced Mathematics Learning | | 0 |
| Evaluating Robustness of Reward Models for Mathematical Reasoning | | 0 |
| Can LLMs Reason Abstractly Over Math Word Problems Without CoT? Disentangling Abstract Formulation From Arithmetic Computation | | 0 |
| A Graph-Based Synthetic Data Pipeline for Scaling High-Quality Reasoning Instructions | | 0 |
| HyperCLOVA X Technical Report | | 0 |
| Evaluating Grounded Reasoning by Code-Assisted Large Language Models for Mathematics | | 0 |
| Human Learning about AI | | 0 |
| Evaluating GPT-4 at Grading Handwritten Solutions in Math Exams | | 0 |
| A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students' Formative Assessment Responses in Science | | 0 |
| Hydrodynamics of Markets: Hidden Links Between Physics and Finance | | 0 |
| Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models | | 0 |
| Improving Academic Plagiarism Detection for STEM Documents by Analyzing Mathematical Content and Citations | | 0 |
| Can I understand what I create? Self-Knowledge Evaluation of Large Language Models | | 0 |
| Can ChatGPT Defend its Belief in Truth? Evaluating LLM Reasoning via Debate | | 0 |
| A Practice of Post-Training on Llama-3 70B with Optimal Selection of Additional Language Mixture Ratio | | 0 |
| Error Classification of Large Language Models on Math Word Problems: A Dynamically Adaptive Framework | | 0 |
| AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning | | 0 |
| How well do Computers Solve Math Word Problems? Large-Scale Dataset Construction and Evaluation | | 0 |
| EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation | | 0 |
| Approximation properties of Residual Neural Networks for Kolmogorov PDEs | | 0 |
| Entropy Martingale Optimal Transport and Nonlinear Pricing-Hedging Duality | | 0 |
| Calculus on MDPs: Potential Shaping as a Gradient | | 0 |
| Approximating Sparse PCA from Incomplete Data | | 0 |
| Entropy-Guided Watermarking for LLMs: A Test-Time Framework for Robust and Traceable Text Generation | | 0 |
| BurTorch: Revisiting Training from First Principles by Coupling Autodiff, Math Optimization, and Systems | | 0 |
| Entropy Adaptive Decoding: Dynamic Model Switching for Efficient Inference | | 0 |
| Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity | | 0 |

No leaderboard results yet.