Instruction Following

Instruction following is a core capability of large language models. This task evaluates how well a model follows human instructions, with the goal of producing controllable and safe responses.

Papers

Showing 426–450 of 1135 papers

Title	Date	Tasks	Status	Hype
Efficiently Enhancing Zero-Shot Performance of Instruction Following Model via Retrieval of Soft Prompt	Oct 6, 2022	Instruction Following, Retrieval	Code Available	1
Engineering flexible machine learning systems by traversing functionally-invariant paths	Apr 30, 2022	Adversarial Robustness, Continual Learning	Code Available	1
Inferring Rewards from Language in Context	Apr 5, 2022	Instruction Following, Reinforcement Learning (RL)	Code Available	1
Counterfactual Cycle-Consistent Learning for Instruction Following and Generation in Vision-Language Navigation	Mar 30, 2022	counterfactual, Data Augmentation	Code Available	1
Combining Modular Skills in Multitask Learning	Feb 28, 2022	Instruction Following, reinforcement-learning	Code Available	1
DialFRED: Dialogue-Enabled Agents for Embodied Instruction Following	Feb 27, 2022	Instruction Following, Navigate	Code Available	1
Guiding Multi-Step Rearrangement Tasks with Natural Language Instructions	Nov 8, 2021	Instruction Following	Code Available	1
FILM: Following Instructions in Language with Modular Methods	Oct 12, 2021	Imitation Learning, Instruction Following	Code Available	1
Waypoint Models for Instruction-guided Navigation in Continuous Environments	Oct 5, 2021	Instruction Following, Visual Navigation	Code Available	1
Lexicon Learning for Few Shot Sequence Modeling	Aug 1, 2021	Instruction Following, Machine Translation	Code Available	1
Room-and-Object Aware Knowledge Reasoning for Remote Embodied Referring Expression	Jun 19, 2021	Instruction Following, Navigate	Code Available	1
Lexicon Learning for Few-Shot Neural Sequence Modeling	Jun 7, 2021	Instruction Following, Machine Translation	Code Available	1
A modular vision language navigation and manipulation framework for long horizon compositional tasks in indoor environment	Jan 19, 2021	Instruction Following, Vision-Language Navigation	Code Available	1
Factorizing Perception and Policy for Interactive Instruction Following	Dec 6, 2020	Instruction Following, Navigate	Code Available	1
Few-shot Object Grounding and Mapping for Natural Language Robot Instruction Following	Nov 14, 2020	continuous-control, Continuous Control	Code Available	1
RMM: A Recursive Mental Model for Dialogue Navigation	Nov 1, 2020	Answer Generation, Instruction Following	Code Available	1
AllenAct: A Framework for Embodied AI Research	Aug 28, 2020	Deep Reinforcement Learning, Embodied Question Answering	Code Available	1
RMM: A Recursive Mental Model for Dialog Navigation	May 2, 2020	Answer Generation, Instruction Following	Code Available	1
Zero-Shot Compositional Policy Learning via Language Grounding	Apr 15, 2020	Descriptive, Domain Adaptation	Code Available	1
Learning to Map Natural Language Instructions to Physical Quadcopter Control using Simulated Flight	Oct 21, 2019	continuous-control, Continuous Control	Code Available	1
Following High-level Navigation Instructions on a Simulated Quadcopter with Imitation Learning	May 31, 2018	Imitation Learning, Instruction Following	Code Available	1
AnyCap Project: A Unified Framework, Dataset, and Benchmark for Controllable Omni-modal Captioning	Jul 17, 2025	Instruction Following	Unverified	0
How Many Instructions Can LLMs Follow at Once?	Jul 15, 2025	Instruction Following	Unverified	0
Multilingual Multimodal Software Developer for Code Generation	Jul 11, 2025	Code Generation, Instruction Following	Unverified	0
TuneShield: Mitigating Toxicity in Conversational AI while Fine-tuning on Untrusted Data	Jul 8, 2025	Chatbot, Instruction Following	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	AutoIF (Llama3 70B)	Inst-level loose-accuracy	90.4	—	Unverified
2	AutoIF (Qwen2 72B)	Inst-level loose-accuracy	88	—	Unverified
3	GPT-4	Inst-level loose-accuracy	85.37	—	Unverified
4	PaLM 2 S	Inst-level loose-accuracy	59.11	—	Unverified
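
The table does not say how "Inst-level loose-accuracy" is computed. The sketch below assumes the convention used by IFEval-style evaluations: every instruction attached to a prompt is checked separately, and an instruction counts as followed if the raw response or any relaxed variant of it passes. The relaxations shown (dropping the first line, dropping the last line, removing markdown asterisks) and all names (`loose_variants`, `follows_loosely`, `inst_level_loose_accuracy`, the example checkers) are illustrative assumptions, not the scoring code behind the numbers above.

```python
from typing import Callable, List, Sequence, Tuple

# A checker returns True when a response satisfies one instruction.
Checker = Callable[[str], bool]


def loose_variants(response: str) -> List[str]:
    """Return the response plus relaxed variants (assumed relaxation set)."""
    lines = response.split("\n")
    base = [
        response,
        "\n".join(lines[1:]).strip(),    # drop a leading preamble line
        "\n".join(lines[:-1]).strip(),   # drop a trailing sign-off line
        "\n".join(lines[1:-1]).strip(),  # drop both
    ]
    # Also try every variant with markdown asterisks removed.
    return base + [v.replace("*", "") for v in base]


def follows_loosely(response: str, checker: Checker) -> bool:
    """True if any non-empty variant of the response passes the checker."""
    return any(v.strip() and checker(v) for v in loose_variants(response))


def inst_level_loose_accuracy(
    examples: Sequence[Tuple[str, Sequence[Checker]]]
) -> float:
    """Percentage of individual instructions followed, pooled over all prompts."""
    total = 0
    followed = 0
    for response, checkers in examples:
        for checker in checkers:
            total += 1
            followed += bool(follows_loosely(response, checker))
    return 100.0 * followed / total if total else 0.0


# Hypothetical checkers, for illustration only.
def all_lowercase(text: str) -> bool:
    return text == text.lower()


def at_most_two_words(text: str) -> bool:
    return len(text.split()) <= 2


def no_asterisks(text: str) -> bool:
    return "*" not in text


examples = [
    ("Sure, here it is:\nhello world", [all_lowercase, at_most_two_words]),
    ("**Done**", [no_asterisks]),
    ("HELLO", [all_lowercase]),
]
print(inst_level_loose_accuracy(examples))  # 3 of 4 instructions pass: 75.0
```

Under this assumption, a prompt with several instructions contributes one count per instruction, so the score can differ from a prompt-level accuracy that requires every instruction in a prompt to pass.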