SOTAVerified

Instruction Following

Instruction following is a core capability of large language models. This task evaluates how well a model follows human instructions, with the goal of producing controllable and safe responses.

Papers

Showing 271–280 of 1135 papers

| Title | Status | Hype |
|---|---|---|
| Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models | Code | 1 |
| Infer Human's Intentions Before Following Natural Language Instructions | Code | 1 |
| EventHallusion: Diagnosing Event Hallucinations in Video LLMs | Code | 1 |
| MM-CamObj: A Comprehensive Multimodal Dataset for Camouflaged Object Scenarios | Code | 1 |
| ToolPlanner: A Tool Augmented LLM for Multi Granularity Instructions with Path Planning and Feedback | Code | 1 |
| Instruction Following without Instruction Tuning | Code | 1 |
| ChemEval: A Comprehensive Multi-Level Chemical Evaluation for Large Language Models | Code | 1 |
| Unlocking Reasoning Potential in Large Langauge Models by Scaling Code-form Planning | Code | 1 |
| Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement | Code | 1 |
| LongGenBench: Benchmarking Long-Form Generation in Long Context LLMs | Code | 1 |
Page 28 of 114

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AutoIF (Llama3 70B) | Inst-level loose-accuracy | 90.4 | | Unverified |
| 2 | AutoIF (Qwen2 72B) | Inst-level loose-accuracy | 88 | | Unverified |
| 3 | GPT-4 | Inst-level loose-accuracy | 85.37 | | Unverified |
| 4 | PaLM 2 S | Inst-level loose-accuracy | 59.11 | | Unverified |
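The "Inst-level loose-accuracy" metric reported above is the style of scoring used by IFEval-like benchmarks: every prompt carries one or more verifiable instructions, instruction-level accuracy is the fraction of all individual instructions a model's responses satisfy, and "loose" scoring also accepts a response if a relaxed variant of it (e.g. with markdown markers stripped or the first/last line removed) passes the check. The sketch below is illustrative only; the function names, the relaxation rules, and the checker interface are assumptions, not the benchmark's actual implementation.

```python
# Hedged sketch of instruction-level "loose" accuracy aggregation.
# Each example pairs a model response with a list of checkers, where a
# checker is a callable str -> bool verifying one instruction.

def relaxed_variants(response: str) -> list[str]:
    """Illustrative relaxations tried under 'loose' scoring."""
    lines = response.splitlines()
    variants = [response, response.replace("*", "")]  # strip markdown emphasis
    if len(lines) > 1:
        variants.append("\n".join(lines[1:]))   # drop first line (e.g. a preamble)
        variants.append("\n".join(lines[:-1]))  # drop last line (e.g. a sign-off)
    return variants

def inst_level_loose_accuracy(examples) -> float:
    """examples: iterable of (response, [checker, ...]) pairs.

    An instruction counts as followed if ANY relaxed variant of the
    response passes its checker; the score is passed / total.
    """
    passed = total = 0
    for response, checkers in examples:
        for check in checkers:
            total += 1
            if any(check(v) for v in relaxed_variants(response)):
                passed += 1
    return passed / total if total else 0.0
```

Under this scheme a response like `**Hello!**` would still satisfy a "no markdown" instruction, because the variant with `*` stripped passes the check, which is why loose accuracy is always at least as high as strict accuracy.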