Math

Papers

Showing 901–950 of 1596 papers

Title	Date	Tasks	Status	Hype
Automate Knowledge Concept Tagging on Math Questions with LLMs	Mar 26, 2024	Few-Shot Learning; Math	Unverified	0
To Err is Machine: Vulnerability Detection Challenges LLM Reasoning	Mar 25, 2024	Code Generation; In-Context Learning	Unverified	0
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?	Mar 21, 2024	Math; Mathematical Reasoning	Unverified	0
A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students' Formative Assessment Responses in Science	Mar 21, 2024	Active Learning; Math	Unverified	0
From Large to Tiny: Distilling and Refining Mathematical Expertise for Math Word Problems with Weakly Supervision	Mar 21, 2024	Math	Unverified	0
PARAMANU-AYN: Pretrain from scratch or Continual Pretraining of LLMs for Legal Domain Adaptation?	Mar 20, 2024	Abstractive Text Summarization; Continual Pretraining	Unverified	0
Evolutionary Optimization of Model Merging Recipes	Mar 19, 2024	Evolutionary Algorithms; Math	Code Available	5
Memory-Efficient and Secure DNN Inference on TrustZone-enabled Consumer IoT Devices	Mar 19, 2024	Math	Code Available	1
Instructing Large Language Models to Identify and Ignore Irrelevant Conditions	Mar 19, 2024	Math; Mathematical Reasoning	Code Available	0
What Makes Math Word Problems Challenging for LLMs?	Mar 17, 2024	Math	Code Available	0
An upper bound of the mutation probability in the genetic algorithm for general 0-1 knapsack problem	Mar 17, 2024	Diversity; Evolutionary Algorithms	Unverified	0
Incorporating Graph Attention Mechanism into Geometric Problem Solving Based on Deep Reinforcement Learning	Mar 14, 2024	Deep Reinforcement Learning; Graph Attention	Code Available	0
Hydrodynamics of Markets: Hidden Links Between Physics and Finance	Mar 14, 2024	Math	Unverified	0
Self-Consistency Boosts Calibration for Math Reasoning	Mar 14, 2024	GSM8K; Math	Unverified	0
Sabiá-2: A New Generation of Portuguese Large Language Models	Mar 14, 2024	Math	Unverified	0
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision	Mar 14, 2024	Math; Reinforcement Learning (RL)	Code Available	2
The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models?	Mar 14, 2024	Hallucination; image-classification	Code Available	1
Laying the Foundation First? Investigating the Generalization from Atomic Skills to Complex Reasoning Tasks	Mar 14, 2024	Math; Skill Generalization	Unverified	0
Mastering Text, Code and Math Simultaneously via Fusing Highly Specialized Language Models	Mar 13, 2024	Math	Unverified	0
FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models	Mar 12, 2024	Math; Mathematical Reasoning	Unverified	0
SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models	Mar 12, 2024	Math; Mathematical Problem-Solving	Code Available	0
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM	Mar 12, 2024	Arithmetic Reasoning; Code Generation	Unverified	0
Common 7B Language Models Already Possess Strong Math Capabilities	Mar 7, 2024	GSM8K; Math	Code Available	5
Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem	Mar 6, 2024	Benchmarking; Hallucination	Code Available	0
MathScale: Scaling Instruction Tuning for Mathematical Reasoning	Mar 5, 2024	GSM8K; Math	Code Available	0
Evaluating and Optimizing Educational Content with Large Language Model Judgments	Mar 5, 2024	Language Modeling; Language Modelling	Code Available	0
Experimenting with Generative AI: Does ChatGPT Really Increase Everyone's Productivity?	Mar 4, 2024	Econometrics; Math	Unverified	0
The Claude 3 Model Family: Opus, Sonnet, Haiku	Mar 4, 2024	1 Image, 2*2 Stitching; Arithmetic Reasoning	Unverified	0
Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training	Mar 4, 2024	Math; Phrase Grounding	Unverified	0
Brilla AI: AI Contestant for the National Science and Maths Quiz	Mar 4, 2024	Math; Question Answering	Code Available	1
Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models	Mar 4, 2024	Data Augmentation; GSM8K	Code Available	1
Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning	Mar 4, 2024	GSM8K; Math	Unverified	0
Improving the Validity of Automatically Generated Feedback via Reinforcement Learning	Mar 2, 2024	Math; Misconceptions	Code Available	1
ClickTree: A Tree-based Method for Predicting Math Students' Performance Based on Clickstream Data	Mar 1, 2024	Math	Unverified	0
Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap	Feb 29, 2024	Math	Code Available	2
PRSA: Prompt Stealing Attacks against Real-World Prompt Services	Feb 29, 2024	Math	Unverified	0
GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers	Feb 29, 2024	GSM8K; Math	Code Available	2
StarCoder 2 and The Stack v2: The Next Generation	Feb 29, 2024	Code Completion; Code Generation	Code Available	7
Data Interpreter: An LLM Agent For Data Science	Feb 28, 2024	Code Generation; Language Modelling	Unverified	0
Adversarial Math Word Problem Generation	Feb 27, 2024	Math	Code Available	0
MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical Reasoning	Feb 27, 2024	8k; Language Modeling	Code Available	0
Case-Based or Rule-Based: How Do Transformers Do the Math?	Feb 27, 2024	Math; Systematic Generalization	Code Available	1
MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs	Feb 26, 2024	GSM8K; Math	Unverified	0
Stepwise Self-Consistent Mathematical Reasoning with Large Language Models	Feb 24, 2024	Math; Mathematical Reasoning	Code Available	1
How Do Humans Write Code? Large Models Do It the Same Way Too	Feb 24, 2024	Code Generation; Math	Code Available	0
MATHWELL: Generating Educational Math Word Problems Using Teacher Annotations	Feb 24, 2024	Language Modeling; Language Modelling	Code Available	1
Brain-Inspired Two-Stage Approach: Enhancing Mathematical Reasoning by Imitating Human Thought Processes	Feb 23, 2024	Math; Mathematical Reasoning	Code Available	0
ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models	Feb 22, 2024	Math; Mathematical Reasoning	Code Available	1
Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset	Feb 22, 2024	Diversity; Math	Code Available	2
MoELoRA: Contrastive Learning Guided Mixture of Experts on Parameter-Efficient Fine-Tuning for Large Language Models	Feb 20, 2024	Common Sense Reasoning; Contrastive Learning	Unverified	0
