Known Unknowns
Language models tend to generate text containing false statements, often referred to as "hallucinations." The primary purpose of this task is to test for this failure case by probing whether a model can correctly identify that the answer to a question is unknown. A common failure mode is to give a false answer to a question whose answer is unknown, rather than to predict that the answer is unknown.
Source: BIG-bench
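The failure mode described above can be made concrete with a small scoring sketch. This is not the BIG-bench implementation: the `Item` type, the `says_unknown` keyword heuristic, and the `false_answer_rate` metric are all hypothetical names introduced here for illustration, and the sketch assumes each question is labeled only as "answer known" or "answer unknown."

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One evaluation question (hypothetical schema, not the BIG-bench format)."""
    question: str
    answer_is_unknown: bool


def says_unknown(response: str) -> bool:
    """Crude keyword heuristic (an assumption): did the model decline to answer?"""
    markers = ("unknown", "cannot be known", "no one knows")
    text = response.lower()
    return any(marker in text for marker in markers)


def score(items: list[Item], responses: list[str]) -> dict[str, float]:
    """Return accuracy plus the rate of concrete answers given to unknowable questions."""
    if not items or len(items) != len(responses):
        raise ValueError("items and responses must be non-empty and the same length")
    correct = 0
    false_answers = 0
    for item, response in zip(items, responses):
        predicted_unknown = says_unknown(response)
        if predicted_unknown == item.answer_is_unknown:
            correct += 1
        elif item.answer_is_unknown:
            # The model asserted an answer where the truth is unknown.
            false_answers += 1
    total = len(items)
    return {
        "accuracy": correct / total,
        "false_answer_rate": false_answers / total,
    }
```

Tracking `false_answer_rate` separately from accuracy isolates the specific error the task targets: confidently answering a question that has no known answer. A real evaluation would likely need something more robust than keyword matching, such as multiple-choice scoring.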