SOTAVerified

Math

Papers

Showing 376-400 of 1596 papers

Title | Status | Hype
ModelingAgent: Bridging LLMs and Mathematical Modeling for Real-World Challenges | Code | 1
MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models | Code | 1
A Relation Spectrum Inheriting Taylor Series: Muscle Synergy and Coupling for Hand | Code | 1
A Symbolic Character-Aware Model for Solving Geometry Problems | Code | 1
Evaluating and Improving Tool-Augmented Computation-Intensive Math Reasoning | Code | 1
Mathfish: Evaluating Language Model Math Reasoning via Grounding in Educational Curricula | Code | 1
EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees | Code | 1
CoMAT: Chain of Mathematically Annotated Thought Improves Mathematical Reasoning | Code | 1
Collective Constitutional AI: Aligning a Language Model with Public Input | Code | 1
A Categorical Archive of ChatGPT Failures | Code | 1
MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data | Code | 1
Entropy-Regularized Process Reward Model | Code | 1
Memory-Efficient and Secure DNN Inference on TrustZone-enabled Consumer IoT Devices | Code | 1
Code-Vision: Evaluating Multimodal LLMs Logic Understanding and Code Generation Capabilities | Code | 1
MedCaseReasoning: Evaluating and learning diagnostic reasoning from clinical case reports | Code | 1
Entropy-Based Adaptive Weighting for Self-Training | Code | 1
Enhancing Cross-Tokenizer Knowledge Distillation with Contextual Dynamical Mapping | Code | 1
Language Models Encode the Value of Numbers Linearly | Code | 1
Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning | Code | 1
Eliminating Position Bias of Language Models: A Mechanistic Approach | Code | 1
Large Language Models Can Be Easily Distracted by Irrelevant Context | Code | 1
Large (Vision) Language Models are Unsupervised In-Context Learners | Code | 1
Discovering Mathematical Objects of Interest -- A Study of Mathematical Notations | Code | 1
Escape Sky-high Cost: Early-stopping Self-Consistency for Multi-step Reasoning | Code | 1
Efficient Reasoning for LLMs through Speculative Chain-of-Thought | Code | 1

No leaderboard results yet.