Instruction Following

Instruction following is a fundamental capability of large language models. This task evaluates how well a model follows human instructions, with the goal of producing controllable and safe responses.

Papers

Showing 891–900 of 1135 papers

Title	Date	Tasks	Status
Look Before You Decide: Prompting Active Deduction of MLLMs for Assumptive Reasoning	Apr 19, 2024	Benchmarking, counterfactual	Unverified
FaceGPT: Self-supervised Learning to Chat about 3D Human Faces	Jun 11, 2024	3D Face Reconstruction, Face Model	Unverified
Stay on the Path: Instruction Fidelity in Vision-and-Language Navigation	May 29, 2019	Instruction Following, Vision and Language Navigation	Unverified
FactLLaMA: Optimizing Instruction-Following Language Models with External Knowledge for Automated Fact-Checking	Sep 1, 2023	Fact Checking, Instruction Following	Unverified
AlignFormer: Modality Matching Can Achieve Better Zero-shot Instruction-Following Speech-LLM	Dec 2, 2024	Instruction Following, Question Answering	Unverified
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs	Apr 8, 2024	Instruction Following	Unverified
Zero-shot cross-lingual transfer in instruction tuning of large language models	Feb 22, 2024	Cross-Lingual Transfer, Instruction Following	Unverified
Few-shot Dialogue Strategy Learning for Motivational Interviewing via Inductive Reasoning	Mar 23, 2024	Instruction Following	Unverified
StressPrompt: Does Stress Impact Large Language Models and Human Performance Similarly?	Sep 14, 2024	Emotional Intelligence, Instruction Following	Unverified
Stronger Models are NOT Stronger Teachers for Instruction Tuning	Nov 11, 2024	Instruction Following	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	AutoIF (Llama3 70B)	Inst-level loose-accuracy	90.4	—	Unverified
2	AutoIF (Qwen2 72B)	Inst-level loose-accuracy	88	—	Unverified
3	GPT-4	Inst-level loose-accuracy	85.37	—	Unverified
4	PaLM 2 S	Inst-level loose-accuracy	59.11	—	Unverified