Instruction Following

Instruction following is a core capability of large language models. This task evaluates how well a model follows human instructions, with the goal of producing controllable and safe responses.

Papers

Title	Date	Tasks	Status	Hype
Aurora: Activating Chinese chat capability for Mixtral-8x7B sparse Mixture-of-Experts through Instruction-Tuning	Dec 22, 2023	Instruction Following, Mixture-of-Experts	Code Available	2
T-Eval: Evaluating the Tool Utilization Capability of Large Language Models Step by Step	Dec 21, 2023	Instruction Following, Retrieval	Code Available	2
LMDrive: Closed-Loop End-to-End Driving with Large Language Models	Dec 12, 2023	Autonomous Driving, Instruction Following	Code Available	2
TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding	Dec 4, 2023	Dense Captioning, Highlight Detection	Code Available	2
GeoChat: Grounded Large Vision-Language Model for Remote Sensing	Nov 24, 2023	Instruction Following, Language Modeling	Code Available	2
To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning	Nov 13, 2023	Instruction Following, MM-Vet	Code Available	2
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents	Nov 9, 2023	Instruction Following, LLM real-life tasks	Code Available	2
PhoGPT: Generative Pre-training for Vietnamese	Nov 6, 2023	Instruction Following	Code Available	2
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch	Nov 6, 2023	Decoder, GSM8K	Code Available	2
LLark: A Multimodal Instruction-Following Language Model for Music	Oct 11, 2023	Instruction Following, Language Modeling	Code Available	2

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	AutoIF (Llama3 70B)	Inst-level loose-accuracy	90.4	—	Unverified
2	AutoIF (Qwen2 72B)	Inst-level loose-accuracy	88	—	Unverified
3	GPT-4	Inst-level loose-accuracy	85.37	—	Unverified
4	PaLM 2 S	Inst-level loose-accuracy	59.11	—	Unverified
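
The table does not define its metric. As a minimal sketch, assuming "Inst-level accuracy" follows the common convention of scoring each instruction separately (rather than each prompt as a whole), the calculation might look like the following. The function names and the input layout are hypothetical, and the "loose" relaxation of the per-instruction checks is not modeled here; the checks are taken as already-computed booleans.

```python
from typing import List


def instruction_level_accuracy(results: List[List[bool]]) -> float:
    """Percentage of individual instructions that were followed.

    results[i][j] is True if the response to prompt i satisfied
    the j-th instruction attached to that prompt (assumed layout).
    """
    total = sum(len(per_prompt) for per_prompt in results)
    if total == 0:
        raise ValueError("no instructions to score")
    followed = sum(sum(per_prompt) for per_prompt in results)
    return 100.0 * followed / total


def prompt_level_accuracy(results: List[List[bool]]) -> float:
    """Percentage of prompts for which every instruction was followed."""
    if not results:
        raise ValueError("no prompts to score")
    fully_followed = sum(all(per_prompt) for per_prompt in results)
    return 100.0 * fully_followed / len(results)


# Three prompts carrying 2, 3, and 1 instructions respectively.
example = [[True, True], [True, False, True], [False]]
inst_acc = instruction_level_accuracy(example)    # 4 of 6 instructions
prompt_acc = prompt_level_accuracy(example)       # 1 of 3 prompts
```

Under this assumption, instruction-level scores are never lower than prompt-level scores on the same data, since a single missed instruction fails the whole prompt but only one instruction.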