SOTAVerified

Math

Papers

Showing 1101-1125 of 1596 papers

| Title | Status | Hype |
| --- | --- | --- |
| Scaling up ridge regression for brain encoding in a massive individual fMRI dataset | Code | 0 |
| Large Language Models Are Struggle to Cope with Unreasonability in Math Problems | | 0 |
| ML2SC: Deploying Machine Learning Models as Smart Contracts on the Blockchain | | 0 |
| Few-Shot Recalibration of Language Models | | 0 |
| The Invalsi Benchmarks: measuring Linguistic and Mathematical understanding of Large Language Models in Italian | | 0 |
| Automate Knowledge Concept Tagging on Math Questions with LLMs | | 0 |
| To Err is Machine: Vulnerability Detection Challenges LLM Reasoning | | 0 |
| From Large to Tiny: Distilling and Refining Mathematical Expertise for Math Word Problems with Weakly Supervision | | 0 |
| A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students' Formative Assessment Responses in Science | | 0 |
| MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? | | 0 |
| PARAMANU-AYN: Pretrain from scratch or Continual Pretraining of LLMs for Legal Domain Adaptation? | | 0 |
| Instructing Large Language Models to Identify and Ignore Irrelevant Conditions | Code | 0 |
| An upper bound of the mutation probability in the genetic algorithm for general 0-1 knapsack problem | | 0 |
| What Makes Math Word Problems Challenging for LLMs? | Code | 0 |
| Incorporating Graph Attention Mechanism into Geometric Problem Solving Based on Deep Reinforcement Learning | Code | 0 |
| Sabiá-2: A New Generation of Portuguese Large Language Models | | 0 |
| Hydrodynamics of Markets:Hidden Links Between Physics and Finance | | 0 |
| Laying the Foundation First? Investigating the Generalization from Atomic Skills to Complex Reasoning Tasks | | 0 |
| Self-Consistency Boosts Calibration for Math Reasoning | | 0 |
| Mastering Text, Code and Math Simultaneously via Fusing Highly Specialized Language Models | | 0 |
| SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models | Code | 0 |
| FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models | | 0 |
| Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM | | 0 |
| Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem | Code | 0 |
| Evaluating and Optimizing Educational Content with Large Language Model Judgments | Code | 0 |

No leaderboard results yet.