SOTAVerified

Instruction Following

Instruction following is a fundamental capability of large language models. This task evaluates how well a model follows human instructions, with the goal of producing controllable and safe responses.

Papers

Showing 621–630 of 1135 papers

Title | Status | Hype
MedXChat: A Unified Multimodal Large Language Model Framework towards CXRs Understanding and Generation | | 0
Ask, Fail, Repeat: Meeseeks, an Iterative Feedback Benchmark for LLMs' Multi-turn Instruction-Following Ability | | 0
A Comprehensive Evaluation of Large Language Models on Mental Illnesses in Arabic Context | | 0
ChatGPT is a Knowledgeable but Inexperienced Solver: An Investigation of Commonsense Problem in Large Language Models | | 0
MetaMorph: Multimodal Understanding and Generation via Instruction Tuning | | 0
ChartMind: A Comprehensive Benchmark for Complex Real-world Multimodal Chart Question Answering | | 0
Causal Head Gating: A Framework for Interpreting Roles of Attention Heads in Transformers | | 0
Towards Better Evaluation of Instruction-Following: A Case-Study in Summarization | | 0
MIDB: Multilingual Instruction Data Booster for Enhancing Multilingual Instruction Synthesis | | 0
A Comparative Study between Full-Parameter and LoRA-based Fine-Tuning on Chinese Instruction Data for Instruction Following Large Language Model | | 0
Page 63 of 114

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | AutoIF (Llama3 70B) | Inst-level loose-accuracy | 90.4 | | Unverified
2 | AutoIF (Qwen2 72B) | Inst-level loose-accuracy | 88 | | Unverified
3 | GPT-4 | Inst-level loose-accuracy | 85.37 | | Unverified
4 | PaLM 2 S | Inst-level loose-accuracy | 59.11 | | Unverified
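The metric above, instruction-level loose accuracy, is typically computed in IFEval-style evaluation: every prompt may contain several verifiable instructions, each instruction is checked against the response under "loose" normalization (e.g. stripped markdown or dropped boilerplate lines), and accuracy is averaged over all instructions rather than over prompts. As a minimal sketch (the function name and the pre-computed pass flags are assumptions for illustration, not the benchmark's actual code):

```python
# Hypothetical sketch of instruction-level "loose" accuracy.
# Assumes each example carries boolean pass flags, one per instruction,
# already evaluated under loose response normalization.

def inst_level_accuracy(results):
    """results: one inner list per prompt, one boolean per instruction
    in that prompt. Returns accuracy as a percentage over all
    instructions pooled together (not averaged per prompt)."""
    flags = [flag for example in results for flag in example]
    if not flags:
        return 0.0
    return 100.0 * sum(flags) / len(flags)

# Example: 3 prompts with 2, 1, and 3 instructions respectively;
# 4 of the 6 instructions were followed.
score = inst_level_accuracy([[True, False], [True], [True, True, False]])
print(round(score, 2))  # 66.67
```

Note that pooling all instructions weights prompts with more instructions more heavily; a prompt-level metric would instead require every instruction in a prompt to pass.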