Benchmarking Defeasible Reasoning with Large Language Models -- Initial Experiments and Future Directions

2024-10-16

Ilias Tachmazidis, Sotiris Batsakis, Grigoris Antoniou

Abstract

Large Language Models (LLMs) have gained prominence in the AI landscape due to their exceptional performance. It is therefore essential to better understand their capabilities and limitations, including with respect to nonmonotonic reasoning. This paper proposes a benchmark that covers various defeasible rule-based reasoning patterns. We modified an existing benchmark for defeasible logic reasoners by translating defeasible rules into text suitable for LLMs. We conducted preliminary experiments on nonmonotonic rule-based reasoning using ChatGPT and compared its answers with the reasoning patterns defined by defeasible logic.

Tasks

Benchmarking
