Instruction Following

Instruction following is a core capability of large language models. This task evaluates how well a model follows human instructions, with the goal of producing controllable and safe responses.

Papers

Showing 1001–1010 of 1135 papers

Title	Date	Tasks	Status	Hype
Schema-Driven Information Extraction from Heterogeneous Tables	May 23, 2023	Attribute Extraction, Instruction Following	Code Available	1
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback	May 22, 2023	Instruction Following	Code Available	3
Multi-Task Instruction Tuning of LLaMa for Specific Scenarios: A Preliminary Study on Writing Assistance	May 22, 2023	Instruction Following	Unverified	0
Lion: Adversarial Distillation of Proprietary Large Language Models	May 22, 2023	Instruction Following, Knowledge Distillation	Code Available	2
LLM-CXR: Instruction-Finetuned LLM for CXR Image Understanding and Generation	May 19, 2023	Image Generation, Instruction Following	Code Available	1
Multimodal Web Navigation with Instruction-Finetuned Foundation Models	May 19, 2023	Autonomous Web Navigation, Instruction Following	Unverified	0
Aligning Instruction Tasks Unlocks Large Language Models as Zero-Shot Relation Extractors	May 18, 2023	Instruction Following, Question Answering	Code Available	1
M3KE: A Massive Multi-Level Multi-Subject Knowledge Evaluation Benchmark for Chinese Large Language Models	May 17, 2023	Instruction Following, Multiple-choice	Code Available	1
Recommendation as Instruction Following: A Large Language Model Empowered Recommendation Approach	May 11, 2023	Instruction Following, Language Modeling	Unverified	0
Accessible Instruction-Following Agent	May 8, 2023	Instruction Following, Language Modeling	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	AutoIF (Llama3 70B)	Inst-level loose-accuracy	90.4	—	Unverified
2	AutoIF (Qwen2 72B)	Inst-level loose-accuracy	88	—	Unverified
3	GPT-4	Inst-level loose-accuracy	85.37	—	Unverified
4	PaLM 2 S	Inst-level loose-accuracy	59.11	—	Unverified
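
The page does not define the "Inst-level loose-accuracy" metric used in the table above. The sketch below assumes it follows the convention common in IFEval-style evaluation: every individual instruction across all prompts is scored separately, and an instruction counts as followed if any relaxed variant of the response (first or last line dropped, markdown asterisks removed) passes its check. The function names, the checker interface, and the sample data are all hypothetical and only illustrate how such a score might be computed.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: a checker returns True when the response
# satisfies one verifiable instruction.
Checker = Callable[[str], bool]


def loose_variants(response: str) -> List[str]:
    """Build relaxed versions of a response (assumed IFEval-style relaxations)."""
    lines = response.split("\n")
    base = [
        response,
        "\n".join(lines[1:]).strip(),    # drop a leading preamble line
        "\n".join(lines[:-1]).strip(),   # drop a trailing sign-off line
        "\n".join(lines[1:-1]).strip(),  # drop both
    ]
    # Also try every variant with markdown asterisks stripped.
    return base + [v.replace("*", "") for v in base]


def followed_loosely(response: str, check: Checker) -> bool:
    """An instruction passes if any non-empty relaxed variant passes."""
    return any(check(v) for v in loose_variants(response) if v)


def inst_level_loose_accuracy(
    samples: Sequence[Tuple[str, Sequence[Checker]]]
) -> float:
    """Percentage of individual instructions followed, pooled over all prompts."""
    total = 0
    passed = 0
    for response, checkers in samples:
        for check in checkers:
            total += 1
            passed += followed_loosely(response, check)
    return 100.0 * passed / total if total else 0.0


# Made-up example data: two responses carrying three instructions in total.
samples = [
    ("Sure, here you go:\nHELLO WORLD",
     [str.isupper, lambda r: len(r.split()) <= 2]),
    ("a short answer", [lambda r: r.endswith("!")]),
]
print(f"{inst_level_loose_accuracy(samples):.2f}")  # 66.67
```

In this toy data the first response fails both checks as written but passes them once its preamble line is dropped, which is the effect the loose criterion is meant to capture. A strict variant would apply each checker to the unmodified response only.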