Instruction Following

Instruction following is a core capability of large language models. This task evaluates how well a model follows human instructions, with the aim of producing controllable and safe responses.

Papers

Showing 1051–1075 of 1135 papers

Title	Date	Tasks	Status	Hype
UGIF: UI Grounded Instruction Following	Nov 14, 2022	Instruction Following, Navigate	Unverified	0
Learning to Follow Instructions in Text-Based Games	Nov 8, 2022	Decision Making, Instruction Following	Code Available	0
Prompter: Utilizing Large Language Model Prompting for a Data Efficient Embodied Instruction Following	Nov 7, 2022	Instruction Following, Language Modeling	Unverified	0
Instruction-Following Agents with Multimodal Transformer	Oct 24, 2022	Instruction Following, Visual Grounding	Code Available	1
DANLI: Deliberative Agent for Following Natural Language Instructions	Oct 22, 2022	Instruction Following, Vision-Language Navigation	Code Available	1
Don't Copy the Teacher: Data and Model Challenges in Embodied Dialogue	Oct 10, 2022	Imitation Learning, Instruction Following	Code Available	0
Efficiently Enhancing Zero-Shot Performance of Instruction Following Model via Retrieval of Soft Prompt	Oct 6, 2022	Instruction Following, Retrieval	Code Available	1
A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning	Oct 6, 2022	Imitation Learning, Instruction Following	Unverified	0
Iterative Vision-and-Language Navigation	Oct 6, 2022	Instruction Following, Vision and Language Navigation	Unverified	0
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action	Jul 10, 2022	Instruction Following, Language Modeling	Code Available	2
Language Models are General-Purpose Interfaces	Jun 13, 2022	Causal Language Modeling, Few-Shot Learning	Unverified	0
GoalNet: Inferring Conjunctive Goal Predicates from Human Plan Demonstrations for Robot Instruction Following	May 14, 2022	Decision Making, Instruction Following	Code Available	0
Engineering flexible machine learning systems by traversing functionally-invariant paths	Apr 30, 2022	Adversarial Robustness, Continual Learning	Code Available	1
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks	Apr 16, 2022	Benchmarking, Instruction Following	Code Available	3
Inferring Rewards from Language in Context	Apr 5, 2022	Instruction Following, Reinforcement Learning (RL)	Code Available	1
Counterfactual Cycle-Consistent Learning for Instruction Following and Generation in Vision-Language Navigation	Mar 30, 2022	counterfactual, Data Augmentation	Code Available	1
Summarizing a virtual robot's past actions in natural language	Mar 13, 2022	Instruction Following	Unverified	0
Combining Modular Skills in Multitask Learning	Feb 28, 2022	Instruction Following, reinforcement-learning	Code Available	1
DialFRED: Dialogue-Enabled Agents for Embodied Instruction Following	Feb 27, 2022	Instruction Following, Navigate	Code Available	1
Compositionality as Lexical Symmetry	Jan 30, 2022	Data Augmentation, Inductive Bias	Code Available	0
Less is More: Generating Grounded Navigation Instructions from Landmarks	Nov 25, 2021	Decoder, Instruction Following	Unverified	0
Explicit Object Relation Alignment for Vision and Language Navigation	Nov 16, 2021	Instruction Following, Relation	Unverified	0
Skill Induction and Planning with Latent Language	Nov 16, 2021	Decision Making, Instruction Following	Unverified	0
Guiding Multi-Step Rearrangement Tasks with Natural Language Instructions	Nov 8, 2021	Instruction Following	Code Available	1
Compositional Data and Task Augmentation for Instruction Following	Nov 1, 2021	Instruction Following	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	AutoIF (Llama3 70B)	Inst-level loose-accuracy	90.4	—	Unverified
2	AutoIF (Qwen2 72B)	Inst-level loose-accuracy	88	—	Unverified
3	GPT-4	Inst-level loose-accuracy	85.37	—	Unverified
4	PaLM 2 S	Inst-level loose-accuracy	59.11	—	Unverified
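
The metric in the table above is reported at the instruction level. As a rough illustration only, the sketch below assumes that "instruction-level" accuracy means the percentage of individual instructions satisfied across all prompts, as opposed to a prompt-level score that requires every instruction in a prompt to be satisfied. The function names and the example data are hypothetical, and the "loose" matching step (which is assumed to relax how a response is checked) is not implemented here: the code takes per-instruction pass/fail judgments as given.

```python
from typing import List


def instruction_level_accuracy(results: List[List[bool]]) -> float:
    """Percentage of individual instructions that were followed.

    results[i][j] is True if the response to prompt i satisfied
    the j-th instruction attached to that prompt (assumed input format).
    """
    total = sum(len(per_prompt) for per_prompt in results)
    if total == 0:
        raise ValueError("no instructions to score")
    followed = sum(sum(per_prompt) for per_prompt in results)
    return 100.0 * followed / total


def prompt_level_accuracy(results: List[List[bool]]) -> float:
    """Percentage of prompts for which every instruction was followed."""
    if not results:
        raise ValueError("no prompts to score")
    fully_followed = sum(all(per_prompt) for per_prompt in results)
    return 100.0 * fully_followed / len(results)


# Hypothetical judgments for three prompts with 2, 3, and 1 instructions.
example = [[True, True], [True, False, True], [False]]

inst_acc = instruction_level_accuracy(example)   # 4 of 6 instructions
prompt_acc = prompt_level_accuracy(example)      # 1 of 3 prompts
print(f"instruction-level: {inst_acc:.2f}")
print(f"prompt-level: {prompt_acc:.2f}")
```

Under this reading, the instruction-level score is never lower than the prompt-level score on the same judgments, since a single failed instruction fails the whole prompt but costs only one item at the instruction level.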