Benchmarking Defeasible Reasoning with Large Language Models -- Initial Experiments and Future Directions
2024-10-16
Ilias Tachmazidis, Sotiris Batsakis, Grigoris Antoniou
Abstract
Large Language Models (LLMs) have gained prominence in the AI landscape due to their exceptional performance. It is therefore essential to better understand their capabilities and limitations, including with respect to nonmonotonic reasoning. This paper proposes a benchmark that covers various defeasible rule-based reasoning patterns. We modified an existing benchmark for defeasible logic reasoners by translating defeasible rules into text suitable for LLMs. We conducted preliminary experiments on nonmonotonic rule-based reasoning using ChatGPT and compared its behavior with the reasoning patterns defined by defeasible logic.
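
To make the idea of translating defeasible rules into text more concrete, the following is a minimal Python sketch. It is an illustration only and not the translation used in the paper: the `Rule` class, the `~` prefix for negated literals, the helper names, and the wording templates ("definitely" for strict rules, "typically" for defeasible rules) are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule representation (not taken from the paper)."""
    label: str
    body: Tuple[str, ...]  # antecedent literals; "~p" denotes the negation of p
    head: str              # consequent literal
    kind: str              # "strict" or "defeasible"


def literal_to_text(literal: str) -> str:
    """Render a literal as a short English clause (assumed wording)."""
    if literal.startswith("~"):
        return f"{literal[1:]} is false"
    return f"{literal} is true"


def rule_to_text(rule: Rule) -> str:
    """Render one rule as an English sentence (assumed template)."""
    qualifier = "definitely" if rule.kind == "strict" else "typically"
    conclusion = literal_to_text(rule.head)
    if not rule.body:
        # A rule with an empty body has no "if" part.
        return f"{rule.label}: {qualifier.capitalize()}, {conclusion}."
    conditions = " and ".join(literal_to_text(lit) for lit in rule.body)
    return f"{rule.label}: If {conditions}, then {qualifier} {conclusion}."


if __name__ == "__main__":
    # Two conflicting defeasible rules, a common nonmonotonic pattern.
    rules = [
        Rule("r1", ("a",), "b", "defeasible"),
        Rule("r2", ("a", "c"), "~b", "defeasible"),
    ]
    for rule in rules:
        print(rule_to_text(rule))
```

Under these assumptions, a prompt for an LLM could be assembled by joining the generated sentences with a set of facts and a question about whether a given literal can be concluded. A full benchmark would also need to verbalize superiority relations between conflicting rules, which this sketch leaves out.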