Instruction Following
Instruction following is a core capability of large language models. This task evaluates how well a model follows human instructions, with the goal of producing controllable and safe responses.
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AutoIF (Llama3 70B) | Inst-level loose-accuracy | 90.4 | — | Unverified |
| 2 | AutoIF (Qwen2 72B) | Inst-level loose-accuracy | 88 | — | Unverified |
| 3 | GPT-4 | Inst-level loose-accuracy | 85.37 | — | Unverified |
| 4 | PaLM 2 S | Inst-level loose-accuracy | 59.11 | — | Unverified |
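
The page does not define the metric in the table, so the following is only a minimal sketch of one plausible reading of "Inst-level loose-accuracy". It assumes the score is the percentage of individual instructions satisfied across all prompts, and that "loose" means a response is also checked after light relaxations: dropping its first or last line and removing markdown asterisks. The names `loose_variants`, `followed_loosely`, and `inst_level_loose_accuracy` are hypothetical and are not taken from any benchmark's code.

```python
from typing import Callable, List, Sequence, Tuple

# A checker reports whether a response satisfies one instruction (hypothetical type).
Checker = Callable[[str], bool]


def loose_variants(response: str) -> List[str]:
    """Return the response plus relaxed variants (assumed relaxations)."""
    lines = response.split("\n")
    bases = [
        response,
        "\n".join(lines[1:]).strip(),    # drop the first line (e.g. a preamble)
        "\n".join(lines[:-1]).strip(),   # drop the last line (e.g. a sign-off)
        "\n".join(lines[1:-1]).strip(),  # drop both
    ]
    # Also try every variant with markdown asterisks removed.
    return bases + [b.replace("*", "") for b in bases]


def followed_loosely(response: str, checker: Checker) -> bool:
    """An instruction counts as followed if any non-empty variant passes."""
    return any(v.strip() and checker(v) for v in loose_variants(response))


def inst_level_loose_accuracy(
    samples: Sequence[Tuple[str, Sequence[Checker]]]
) -> float:
    """Percentage of instructions followed, pooled over all prompts."""
    total = 0
    passed = 0
    for response, checkers in samples:
        for checker in checkers:
            total += 1
            passed += int(followed_loosely(response, checker))
    return 100.0 * passed / total if total else 0.0


# Toy data: each sample is (model response, checkers for its instructions).
samples = [
    ("Sure, here it is:\nHELLO WORLD",
     [str.isupper, lambda r: len(r.split()) <= 5]),
    ("**done**",
     [lambda r: r == "done", lambda r: r.endswith("!")]),
]
score = inst_level_loose_accuracy(samples)
print(score)
```

In this toy data three of the four instructions pass under the loose check, whereas a strict check on the raw responses would pass none, which illustrates why loose scores tend to be higher than strict ones.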