SOTAVerified

Math

Papers

Showing 751–800 of 1596 papers

| Title | Status | Hype |
| --- | --- | --- |
| Entropy-Guided Watermarking for LLMs: A Test-Time Framework for Robust and Traceable Text Generation | | 0 |
| Entropy Martingale Optimal Transport and Nonlinear Pricing-Hedging Duality | | 0 |
| EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation | | 0 |
| Error Classification of Large Language Models on Math Word Problems: A Dynamically Adaptive Framework | | 0 |
| The Effect of Teacher Gender on Student Achievement in Primary School | | 0 |
| Can LLMs Reason Abstractly Over Math Word Problems Without CoT? Disentangling Abstract Formulation From Arithmetic Computation | | 0 |
| The Entropic Measure Transform | | 0 |
| Evaluating GPT-4 at Grading Handwritten Solutions in Math Exams | | 0 |
| Evaluating Grounded Reasoning by Code-Assisted Large Language Models for Mathematics | | 0 |
| The Function Transformation Omics - Funomics | | 0 |
| Evaluating Robustness of Reward Models for Mathematical Reasoning | | 0 |
| Evaluating the Design Features of an Intelligent Tutoring System for Advanced Mathematics Learning | | 0 |
| EvoGPT-f: An Evolutionary GPT Framework for Benchmarking Formal Math Languages | | 0 |
| Can I understand what I create? Self-Knowledge Evaluation of Large Language Models | | 0 |
| Evolving LLMs' Self-Refinement Capability via Iterative Preference Optimization | | 0 |
| The Gap of Semantic Parsing: A Survey on Automatic Math Word Problem Solvers | | 0 |
| Examining the Behavior of LLM Architectures Within the Framework of Standardized National Exams in Brazil | | 0 |
| Examining the Robustness of Large Language Models across Language Complexity | | 0 |
| Wavelet GPT: Wavelet Inspired Large Language Models | | 0 |
| Exploring Educational Equity: A Machine Learning Approach to Unravel Achievement Disparities in Georgia | | 0 |
| Can ChatGPT Defend its Belief in Truth? Evaluating LLM Reasoning via Debate | | 0 |
| Exploring the Hidden Reasoning Process of Large Language Models by Misleading Them | | 0 |
| Exploring the Impact of Instruction Data Scaling on Large Language Models: An Empirical Study on Real-World Use Cases | | 0 |
| Calculus on MDPs: Potential Shaping as a Gradient | | 0 |
| Exploring the Mystery of Influential Data for Mathematical Reasoning | | 0 |
| Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning | | 0 |
| The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity | | 0 |
| Extracting the Unknown from Long Math Problems | | 0 |
| Fairness Hub Technical Briefs: AUC Gap | | 0 |
| Fairshare Data Pricing via Data Valuation for Large Language Models | | 0 |
| FANS -- Formal Answer Selection for Natural Language Math Reasoning Using Lean4 | | 0 |
| BurTorch: Revisiting Training from First Principles by Coupling Autodiff, Math Optimization, and Systems | | 0 |
| Fast Diffusion Inhibits Disease Outbreaks | | 0 |
| Faster and Better LLMs via Latency-Aware Test-Time Scaling | | 0 |
| Feature Selection Based on Confidence Machine | | 0 |
| The Impact of Item-Writing Flaws on Difficulty and Discrimination in Item Response Theory | | 0 |
| Few-Shot Recalibration of Language Models | | 0 |
| FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning | | 0 |
| FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models | | 0 |
| The Invalsi Benchmarks: measuring Linguistic and Mathematical understanding of Large Language Models in Italian | | 0 |
| Weakest Link in the Chain: Security Vulnerabilities in Advanced Reasoning Models | | 0 |
| First-Step Advantage: Importance of Starting Right in Multi-Step Math Reasoning | | 0 |
| Fixation probabilities for the Moran process in evolutionary games with two strategies: graph shapes and large population asymptotics | | 0 |
| Fixation probabilities for the Moran process with three or more strategies: general and coupling results | | 0 |
| Building Math Agents with Multi-Turn Iterative Preference Learning | | 0 |
| Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration | | 0 |
| The Logic of Political Survival Revisited: Consequences of Elite Uncertainty Under Authoritarian Rule | | 0 |
| Formal Mathematical Reasoning: A New Frontier in AI | | 0 |
| The Long-Term Effects of Teachers' Gender Stereotypes | | 0 |
| fPLSA: Learning Semantic Structures in Document Collections Using Foundation Models | | 0 |
