Instruction Following

Instruction following is a core capability of large language models. This task evaluates how well a model follows human instructions, with the goal of producing controllable and safe responses.

Papers

Showing 1101–1110 of 1135 papers

Title	Date	Tasks	Status
Rewrite to Jailbreak: Discover Learnable and Transferable Implicit Harmfulness Instruction	Feb 16, 2025	Instruction Following	Code Available
MLAN: Language-Based Instruction Tuning Improves Zero-Shot Generalization of Multimodal Large Language Models	Nov 15, 2024	Instruction Following, Zero-shot Generalization	Code Available
Guiding Policies with Language via Meta-Learning	Nov 19, 2018	Imitation Learning, Instruction Following	Code Available
Learning To Follow Directions in Street View	Mar 1, 2019	Deep Reinforcement Learning, Instruction Following	Code Available
Learning to Follow Instructions in Text-Based Games	Nov 8, 2022	Decision Making, Instruction Following	Code Available
Self-Powered LLM Modality Expansion for Large Speech-Text Models	Oct 4, 2024	Automatic Speech Recognition, Instruction Following	Code Available
Towards Interactive Deepfake Analysis	Jan 2, 2025	DeepFake Detection, Face Swapping	Code Available
Capability Instruction Tuning: A New Paradigm for Dynamic LLM Routing	Feb 24, 2025	Instruction Following, Model Selection	Code Available
PediaBench: A Comprehensive Chinese Pediatric Dataset for Benchmarking Large Language Models	Dec 9, 2024	Benchmarking, Instruction Following	Code Available
Learning to Recombine and Resample Data for Compositional Generalization	Oct 8, 2020	Data Augmentation, Instruction Following	Code Available

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	AutoIF (Llama3 70B)	Inst-level loose-accuracy	90.4	—	Unverified
2	AutoIF (Qwen2 72B)	Inst-level loose-accuracy	88	—	Unverified
3	GPT-4	Inst-level loose-accuracy	85.37	—	Unverified
4	PaLM 2 S	Inst-level loose-accuracy	59.11	—	Unverified