Instruction Following

Instruction following is a core capability of large language models. This task evaluates how well a model follows human instructions, with the goal of producing controllable and safe responses.

Papers

Showing 941–950 of 1135 papers

Title	Date	Tasks	Status	Hype
UniDoc: A Universal Large Multimodal Model for Simultaneous Text Detection, Recognition, Spotting and Understanding	Aug 19, 2023	Instruction Following, Text Detection	Unverified	0
PUMGPT: A Large Vision-Language Model for Product Understanding	Aug 18, 2023	Attribute, Attribute Extraction	Unverified	0
Multi-Level Compositional Reasoning for Interactive Instruction Following	Aug 18, 2023	Instruction Following	Code Available	0
Evaluating the Instruction-Following Robustness of Large Language Models to Prompt Injection	Aug 17, 2023	Instruction Following	Code Available	0
EcomGPT: Instruction-tuning Large Language Models with Chain-of-Task Tasks for E-commerce	Aug 14, 2023	Diversity, Instruction Following	Code Available	2
#InsTag: Instruction Tagging for Analyzing Supervised Fine-tuning of Large Language Models	Aug 14, 2023	Diversity, Instruction Following	Code Available	2
Context-Aware Planning and Environment-Aware Memory for Instruction Following Embodied Agents	Aug 14, 2023	Instruction Following, Visual Navigation	Code Available	1
VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use	Aug 12, 2023	Instruction Following	Code Available	1
Self-Alignment with Instruction Backtranslation	Aug 11, 2023	Instruction Following, Language Modeling	Code Available	1
LLaMA-E: Empowering E-commerce Authoring with Object-Interleaved Instruction Following	Aug 9, 2023	Common Sense Reasoning, Instruction Following	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	AutoIF (Llama3 70B)	Inst-level loose-accuracy	90.4	—	Unverified
2	AutoIF (Qwen2 72B)	Inst-level loose-accuracy	88	—	Unverified
3	GPT-4	Inst-level loose-accuracy	85.37	—	Unverified
4	PaLM 2 S	Inst-level loose-accuracy	59.11	—	Unverified
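
The page does not define "Inst-level loose-accuracy" or name the benchmark behind it. Assuming it follows the common IFEval-style definition (the fraction of individual instructions satisfied across all prompts, where a response is also accepted if a lightly relaxed variant of it passes), a minimal sketch could look like the following. The relaxation rules, the checker functions, and the sample data are all hypothetical illustrations, not the procedure used to produce the numbers above.

```python
from typing import Callable, List, Tuple

# A checker returns True when a response satisfies one instruction.
Checker = Callable[[str], bool]


def loose_variants(response: str) -> List[str]:
    """Relaxed views of a response (assumed rules, for illustration only)."""
    lines = response.split("\n")
    return [
        response,                   # the response as written
        response.replace("*", ""),  # markdown emphasis markers removed
        "\n".join(lines[1:]),       # first line dropped (e.g. a preamble)
        "\n".join(lines[:-1]),      # last line dropped (e.g. a sign-off)
    ]


def follows(response: str, checker: Checker, loose: bool) -> bool:
    """Strict: check the raw response. Loose: any relaxed variant may pass."""
    if not loose:
        return checker(response)
    return any(checker(variant) for variant in loose_variants(response))


def instruction_level_accuracy(
    examples: List[Tuple[str, List[Checker]]], loose: bool = True
) -> float:
    """Fraction of individual instructions followed, pooled over all prompts."""
    total = 0
    followed = 0
    for response, checkers in examples:
        for checker in checkers:
            total += 1
            followed += int(follows(response, checker, loose))
    return followed / total if total else 0.0


# Hypothetical data: each entry pairs a model response with the checkers
# for the instructions attached to that prompt.
examples = [
    (
        "Sure, here you go:\nHELLO WORLD",
        [lambda r: r.isupper(), lambda r: len(r.split()) <= 10],
    ),
    ("**done**", [lambda r: r == "done"]),
]

strict_score = instruction_level_accuracy(examples, loose=False)
loose_score = instruction_level_accuracy(examples, loose=True)
```

In this toy data the strict score is 1/3 and the loose score is 1.0: the uppercase check passes only once the preamble line is dropped, and the exact-match check passes only once the asterisks are removed. Pooling over instructions rather than prompts is what distinguishes an instruction-level score from a prompt-level one.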