SOTAVerified

Instruction Following

Instruction following is a foundational capability of large language models. This task evaluates how well a model follows human instructions, with the goal of generating controllable and safe responses.

Papers

Showing 171–180 of 1135 papers

| Title | Status | Hype |
|---|---|---|
| Open-World Skill Discovery from Unsegmented Demonstrations | | 0 |
| Robust Multi-Objective Controlled Decoding of Large Language Models | Code | 0 |
| Seedream 2.0: A Native Chinese-English Bilingual Image Generation Foundation Model | Code | 2 |
| XIFBench: Evaluating Large Language Models on Multilingual Instruction Following | | 0 |
| DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs | Code | 2 |
| REF-VLM: Triplet-Based Referring Paradigm for Unified Visual Decoding | Code | 1 |
| Dr Genre: Reinforcement Learning from Decoupled LLM Feedback for Generic Text Rewriting | | 0 |
| WildIFEval: Instruction Following in the Wild | Code | 0 |
| RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMs | Code | 2 |
| S2S-Arena, Evaluating Speech2Speech Protocols on Instruction Following with Paralinguistic Information | | 0 |
Page 18 of 114

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AutoIF (Llama3 70B) | Inst-level loose-accuracy | 90.4 | | Unverified |
| 2 | AutoIF (Qwen2 72B) | Inst-level loose-accuracy | 88 | | Unverified |
| 3 | GPT-4 | Inst-level loose-accuracy | 85.37 | | Unverified |
| 4 | PaLM 2 S | Inst-level loose-accuracy | 59.11 | | Unverified |