Instruction Following

Instruction following is a fundamental task for large language models. It evaluates a model's ability to follow human instructions, with the aim of producing controllable and safe responses.

Papers


Showing 421–430 of 1135 papers

Title	Date	Tasks	Status	Hype
"No, to the Right" -- Online Language Corrections for Robotic Manipulation via Shared Autonomy	Jan 6, 2023	Instruction Following	Code Available	1
On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning	Dec 15, 2022	Instruction Following, Language Modeling	Code Available	1
Language-Conditioned Reinforcement Learning to Solve Misunderstandings with Action Corrections	Nov 18, 2022	Instruction Following, reinforcement-learning	Code Available	1
Instruction-Following Agents with Multimodal Transformer	Oct 24, 2022	Instruction Following, Visual Grounding	Code Available	1
DANLI: Deliberative Agent for Following Natural Language Instructions	Oct 22, 2022	Instruction Following, Vision-Language Navigation	Code Available	1
Efficiently Enhancing Zero-Shot Performance of Instruction Following Model via Retrieval of Soft Prompt	Oct 6, 2022	Instruction Following, Retrieval	Code Available	1
Engineering flexible machine learning systems by traversing functionally-invariant paths	Apr 30, 2022	Adversarial Robustness, Continual Learning	Code Available	1
Inferring Rewards from Language in Context	Apr 5, 2022	Instruction Following, Reinforcement Learning (RL)	Code Available	1
Counterfactual Cycle-Consistent Learning for Instruction Following and Generation in Vision-Language Navigation	Mar 30, 2022	counterfactual, Data Augmentation	Code Available	1
Combining Modular Skills in Multitask Learning	Feb 28, 2022	Instruction Following, reinforcement-learning	Code Available	1



Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	AutoIF (Llama3 70B)	Inst-level loose-accuracy	90.4	—	Unverified
2	AutoIF (Qwen2 72B)	Inst-level loose-accuracy	88	—	Unverified
3	GPT-4	Inst-level loose-accuracy	85.37	—	Unverified
4	PaLM 2 S	Inst-level loose-accuracy	59.11	—	Unverified