REL: Working out is all you need

2024-12-05

Toby Simonds, Jey Han Lau, Chaithanya Bandi


Code

github.com/tamassimonds/rel
Official · In paper · ★ 7

Abstract

Recent developments, particularly OpenAI's O1 model, have demonstrated the remarkable potential of Large Language Models (LLMs) for complex reasoning tasks. Through analysis of O1's outputs and provided sample Chain-of-Thought (CoT) demonstrations, we observe that it approaches problem-solving in a distinctly human-like manner, systematically brainstorming ideas, testing hypotheses, verifying results, and planning comprehensive solutions. These sophisticated reasoning capabilities remain notably absent in other state-of-the-art language models. In this paper, we hypothesize that this performance gap stems from the limited availability of high-quality reasoning process data in current training sets. We demonstrate that by constructing a specialized dataset focused on explicit problem-solving workflows ("worked solutions"), we can elicit substantially improved planning capabilities from existing models. Additionally, we propose the Reasoning Enhancement Loop (REL), a method for generating synthetic worked solutions.

