SOTAVerified

Math

Papers

Showing 1026-1050 of 1596 papers

| Title | Status | Hype |
| --- | --- | --- |
| ConvNLP: Image-based AI Text Detection | | 0 |
| Who is better at math, Jenny or Jingzhen? Uncovering Stereotypes in Large Language Models | Code | 0 |
| Solving for X and Beyond: Can Large Language Models Solve Complex Math Problems with More-Than-Two Unknowns? | Code | 0 |
| Smart Vision-Language Reasoners | Code | 0 |
| Helpful assistant or fruitful facilitator? Investigating how personas affect language model behavior | Code | 0 |
| Advancing Process Verification for Large Language Models via Tree-Based Preference Learning | | 0 |
| CMMaTH: A Chinese Multi-modal Math Skill Evaluation Benchmark for Foundation Models | | 0 |
| ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting | | 0 |
| DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions | Code | 0 |
| Task Oriented In-Domain Data Augmentation | | 0 |
| Generative AI for Enhancing Active Learning in Education: A Comparative Study of GPT-3.5 and GPT-4 in Crafting Customized Test Questions | | 0 |
| Towards Infinite-Long Prefix in Transformer | Code | 0 |
| Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning | | 0 |
| Can LLMs Reason in the Wild with Programs? | Code | 0 |
| Knowledge Tagging System on Math Questions via LLMs with Flexible Demonstration Retriever | | 0 |
| Navigating the Labyrinth: Evaluating and Enhancing LLMs' Ability to Reason About Search Problems | | 0 |
| GeoGPT4V: Towards Geometric Multi-modal Large Language Models with Geometric Image Generation | Code | 0 |
| Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts | | 0 |
| Program Synthesis Benchmark for Visual Programming in XLogoOnline Environment | | 0 |
| Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning | | 0 |
| ReMI: A Dataset for Reasoning with Multiple Images | | 0 |
| CLST: Cold-Start Mitigation in Knowledge Tracing by Aligning a Generative Language Model as a Students' Knowledge Tracer | | 0 |
| Can I understand what I create? Self-Knowledge Evaluation of Large Language Models | | 0 |
| Human Learning about AI | | 0 |
| A multi-core periphery perspective: Ranking via relative centrality | | 0 |

No leaderboard results yet.