SOTAVerified

Math

Papers

Showing 1401-1450 of 1596 papers

Title | Status | Hype
Gemma 3 Technical Report | | 0
Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data | | 0
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM | | 0
Brains vs. Bytes: Evaluating LLM Proficiency in Olympiad Mathematics | | 0
Generate & Rank: A Multi-task Framework for Math Word Problems | | 0
Generating Equation by Utilizing Operators : GEO model | | 0
Controlling Equational Reasoning in Large Language Models with Prompt Interventions | | 0
Generating Math Word Problems from Equations with Topic Controlling and Commonsense Enforcement | | 0
Generating Narrated Lecture Videos from Slides with Synchronized Highlights | | 0
SelfBudgeter: Adaptive Token Allocation for Efficient LLM Reasoning | | 0
Generative AI for Enhancing Active Learning in Education: A Comparative Study of GPT-3.5 and GPT-4 in Crafting Customized Test Questions | | 0
Generative Discovery of Partial Differential Equations by Learning from Math Handbooks | | 0
Generative Verifiers: Reward Modeling as Next-Token Prediction | | 0
GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning | | 0
Geo-LLaVA: A Large Multi-Modal Model for Solving Geometry Math Problems with Meta In-Context Learning | | 0
Geometry is All You Need: A Unified Taxonomy of Matrix and Tensor Factorization for Compression of Generative Language Models | | 0
The Perfect Blend: Redefining RLHF with Mixture of Judges | | 0
Giving BERT a Calculator: Finding Operations and Arguments with Reading Comprehension | | 0
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements | | 0
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2 | | 0
Fewer is More: Boosting LLM Reasoning with Reinforced Context Pruning | | 0
BOLT: Bootstrap Long Chain-of-Thought in Language Models without Distillation | | 0
GPT Carry-On: Training Foundation Model for Customization Could Be Simple, Scalable and Affordable | | 0
GPT takes the SAT: Tracing changes in Test Difficulty and Math Performance of Students | | 0
GPU Domain Specialization via Composable On-Package Architecture | | 0
Graders should cheat: privileged information enables expert-level automated evaluations | | 0
Graph2Tac: Online Representation Learning of Formal Math Concepts | | 0
GRIN: GRadient-INformed MoE | | 0
BloomWise: Enhancing Problem-Solving capabilities of Large Language Models using Bloom's-Taxonomy-Inspired Prompts | | 0
Blink of an eye: a simple theory for feature localization in generative models | | 0
GSSF: A Generative Sequence Similarity Function based on a Seq2Seq model for clustering online handwritten mathematical answers | | 0
Guideline Forest: Experience-Induced Multi-Guideline Reasoning with Stepwise Aggregation | | 0
Guiding Language Model Reasoning with Planning Tokens | | 0
Hallucinating AI Hijacking Attack: Large Language Models and Malicious Code Recommenders | | 0
The Role of Diversity in In-Context Learning for Large Language Models | | 0
The Search-and-Mix Paradigm in Approximate Nash Equilibrium Algorithms | | 0
Hard Math -- Easy UVM: Pragmatic solutions for verifying hardware algorithms using UVM | | 0
The Self-Improvement Paradox: Can Language Models Bootstrap Reasoning Capabilities without External Scaffolding? | | 0
Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation | | 0
Hawkeye:Efficient Reasoning with Model Collaboration | | 0
Heimdall: test-time scaling on the generative verification | | 0
HelpSteer3: Human-Annotated Feedback and Edit Data to Empower Inference-Time Scaling in Open-Ended General-Domain Tasks | | 0
hep-th | | 0
Herald: A Natural Language Annotated Lean 4 Dataset | | 0
Hierarchical Attention Decoder for Solving Math Word Problems | | 0
Hierarchical evolutive systems, fuzzy categories and the living single cell | | 0
WebMIaS on Docker: Deploying Math-Aware Search in a Single Line of Code | | 0
Homeostatic Mechanisms in Biological Systems | | 0
Big Math and the One-Brain Barrier A Position Paper and Architecture Proposal | | 0
How Difficulty-Aware Staged Reinforcement Learning Enhances LLMs' Reasoning Capabilities: A Preliminary Experimental Study | | 0
Page 29 of 32

No leaderboard results yet.