Instruction Following

Instruction following is a core capability of large language models. This task evaluates how well a model follows human instructions, with the goal of producing controllable and safe responses.

Papers

Showing 876–900 of 1135 papers

Title	Date	Tasks	Status
Zero-shot and Few-shot Learning with Instruction-following LLMs for Claim Matching in Automated Fact-checking	Jan 18, 2025	Binary Classification, Fact Checking	Unverified
SpeechVerse: A Large-scale Generalizable Audio Language Model	May 14, 2024	Automatic Speech Recognition, Benchmarking	Unverified
Evaluating Robustness of Large Audio Language Models to Audio Injection: An Empirical Study	May 26, 2025	Instruction Following	Unverified
Evaluating the Robustness to Instructions of Large Language Models	Aug 28, 2023	Instruction Following, Relation Extraction	Unverified
Evaluation of Instruction-Following Ability for Large Language Models on Story-Ending Generation	Jun 24, 2024	Instruction Following, Machine Reading Comprehension	Unverified
Evolutionary Contrastive Distillation for Language Model Alignment	Oct 10, 2024	Contrastive Learning, Instruction Following	Unverified
Evolving Graphical Planner: Contextual Global Planning for Vision-and-Language Navigation	Jul 11, 2020	Decision Making, Imitation Learning	Unverified
SSP: A Simple and Safe automatic Prompt engineering method towards realistic image synthesis on LVM	Jan 2, 2024	Image Generation, Instruction Following	Unverified
EXAONE 3.0 7.8B Instruction Tuned Language Model	Aug 7, 2024	Instruction Following, Language Modeling	Unverified
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases	Dec 6, 2024	Instruction Following	Unverified
Exo2Ego: Exocentric Knowledge Guided MLLM for Egocentric Video Understanding	Mar 12, 2025	Instruction Following, Video Understanding	Unverified
Do we Really Need Visual Instructions? Towards Visual Instruction-Free Fine-tuning for Large Vision-Language Models	Feb 17, 2025	Instruction Following, visual instruction following	Unverified
Explicit Object Relation Alignment for Vision and Language Navigation	Nov 16, 2021	Instruction Following, Relation	Unverified
Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks	Feb 11, 2023	Computer Security, Instruction Following	Unverified
Exploring the Integration of Large Language Models into Automatic Speech Recognition Systems: An Empirical Study	Jul 13, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Look Before You Decide: Prompting Active Deduction of MLLMs for Assumptive Reasoning	Apr 19, 2024	Benchmarking, counterfactual	Unverified
FaceGPT: Self-supervised Learning to Chat about 3D Human Faces	Jun 11, 2024	3D Face Reconstruction, Face Model	Unverified
Stay on the Path: Instruction Fidelity in Vision-and-Language Navigation	May 29, 2019	Instruction Following, Vision and Language Navigation	Unverified
FactLLaMA: Optimizing Instruction-Following Language Models with External Knowledge for Automated Fact-Checking	Sep 1, 2023	Fact Checking, Instruction Following	Unverified
AlignFormer: Modality Matching Can Achieve Better Zero-shot Instruction-Following Speech-LLM	Dec 2, 2024	Instruction Following, Question Answering	Unverified
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs	Apr 8, 2024	Instruction Following	Unverified
Zero-shot cross-lingual transfer in instruction tuning of large language models	Feb 22, 2024	Cross-Lingual Transfer, Instruction Following	Unverified
Few-shot Dialogue Strategy Learning for Motivational Interviewing via Inductive Reasoning	Mar 23, 2024	Instruction Following	Unverified
StressPrompt: Does Stress Impact Large Language Models and Human Performance Similarly?	Sep 14, 2024	Emotional Intelligence, Instruction Following	Unverified
Stronger Models are NOT Stronger Teachers for Instruction Tuning	Nov 11, 2024	Instruction Following	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	AutoIF (Llama3 70B)	Inst-level loose-accuracy	90.4	—	Unverified
2	AutoIF (Qwen2 72B)	Inst-level loose-accuracy	88	—	Unverified
3	GPT-4	Inst-level loose-accuracy	85.37	—	Unverified
4	PaLM 2 S	Inst-level loose-accuracy	59.11	—	Unverified
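
The page does not say which benchmark these scores come from or how the metric is defined. As a rough illustration only, the sketch below assumes an IFEval-style definition: "inst-level" means the score is the percentage of individual instructions followed, pooled over all prompts, and "loose" means an instruction counts as followed if the raw response or any relaxed variant of it (first line dropped, last line dropped, asterisks removed) passes the check. The names `Checker`, `loose_variants`, `followed_loose`, and `inst_level_loose_accuracy` are hypothetical and are not taken from any published evaluation code.

```python
from typing import Callable, List, Tuple

# Hypothetical type: a checker decides whether one instruction was followed.
Checker = Callable[[str], bool]


def loose_variants(response: str) -> List[str]:
    """Return the response plus relaxed variants (an assumed set of relaxations)."""
    lines = response.split("\n")
    bases = [
        response,
        "\n".join(lines[1:]).strip(),    # drop the first line (e.g. a preamble)
        "\n".join(lines[:-1]).strip(),   # drop the last line (e.g. a sign-off)
        "\n".join(lines[1:-1]).strip(),  # drop both
    ]
    # Also try each variant with markdown asterisks removed.
    return bases + [b.replace("*", "") for b in bases]


def followed_loose(response: str, check: Checker) -> bool:
    """An instruction counts as followed if any non-empty variant passes."""
    return any(v.strip() and check(v) for v in loose_variants(response))


def inst_level_loose_accuracy(samples: List[Tuple[str, List[Checker]]]) -> float:
    """Percentage of individual instructions followed, pooled over all prompts."""
    total = 0
    followed = 0
    for response, checkers in samples:
        for check in checkers:
            total += 1
            followed += followed_loose(response, check)
    return 100.0 * followed / total if total else 0.0


# Toy data: each sample pairs one response with the checks for its instructions.
samples = [
    ("Sure, here it is:\nhello world",
     [lambda r: r == r.lower(), lambda r: len(r.split()) <= 2]),
    ("**Answer:** 42",
     [lambda r: r.startswith("Answer:"), lambda r: "43" in r]),
]
print(inst_level_loose_accuracy(samples))
```

In this toy data, three of the four instructions pass under the loose criterion, while none would pass if only the raw responses were checked. That gap is why loose scores are typically at least as high as strict ones.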