Instruction Following

Instruction following is a core capability of large language models. This task evaluates how well a model follows human instructions, with the goal of producing controllable and safe responses.

Papers


Showing 176–200 of 1135 papers

Title	Date	Tasks	Status	Hype
To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning	Nov 13, 2023	Instruction Following, MM-Vet	Code Available	2
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents	Nov 9, 2023	Instruction Following, LLM real-life tasks	Code Available	2
PhoGPT: Generative Pre-training for Vietnamese	Nov 6, 2023	Instruction Following	Code Available	2
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch	Nov 6, 2023	Decoder, GSM8K	Code Available	2
LLark: A Multimodal Instruction-Following Language Model for Music	Oct 11, 2023	Instruction Following, Language Modeling	Code Available	2
Reformulating Vision-Language Foundation Models and Datasets Towards Universal Multimodal Assistants	Oct 1, 2023	Instruction Following	Code Available	2
ModuLoRA: Finetuning 2-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers	Sep 28, 2023	GPU, Instruction Following	Code Available	2
MentaLLaMA: Interpretable Mental Health Analysis on Social Media with Large Language Models	Sep 24, 2023	Instruction Following	Code Available	2
Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following	Sep 1, 2023	3D Generation, 3D Question Answering (3D-QA)	Code Available	2
LLaSM: Large Language and Speech Model	Aug 30, 2023	Instruction Following, Language Modeling	Code Available	2
From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning	Aug 23, 2023	Instruction Following	Code Available	2
EcomGPT: Instruction-tuning Large Language Models with Chain-of-Task Tasks for E-commerce	Aug 14, 2023	Diversity, Instruction Following	Code Available	2
#InsTag: Instruction Tagging for Analyzing Supervised Fine-tuning of Large Language Models	Aug 14, 2023	Diversity, Instruction Following	Code Available	2
Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions	Aug 8, 2023	Caption Generation, Image Captioning	Code Available	2
Zhongjing: Enhancing the Chinese Medical Capabilities of Large Language Model through Expert Feedback and Real-world Multi-turn Dialogue	Aug 7, 2023	Instruction Following, Language Modeling	Code Available	2
FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets	Jul 20, 2023	Instruction Following, Language Model Evaluation	Code Available	2
BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs	Jul 17, 2023	Instruction Following, Sentence	Code Available	2
What Matters in Training a GPT4-Style Language Model with Multimodal Inputs?	Jul 5, 2023	Instruction Following, Language Modeling	Code Available	2
LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding	Jun 29, 2023	16k, Image Captioning	Code Available	2
BayLing: Bridging Cross-lingual Alignment and Instruction Following through Interactive Translation for Large Language Models	Jun 19, 2023	Instruction Following, Text Generation	Code Available	2
LVLM-eHub: A Comprehensive Evaluation Benchmark for Large Vision-Language Models	Jun 15, 2023	Hallucination, Image Captioning	Code Available	2
MiniLLM: Knowledge Distillation of Large Language Models	Jun 14, 2023	Instruction Following, Knowledge Distillation	Code Available	2
Valley: Video Assistant with Large Language model Enhanced abilitY	Jun 12, 2023	Action Recognition, Instruction Following	Code Available	2
STEVE-1: A Generative Model for Text-to-Behavior in Minecraft	Jun 1, 2023	Decision Making, Image Generation	Code Available	2
GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction	May 30, 2023	Image Generation, Instruction Following	Code Available	2


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	AutoIF (Llama3 70B)	Inst-level loose-accuracy	90.4	—	Unverified
2	AutoIF (Qwen2 72B)	Inst-level loose-accuracy	88	—	Unverified
3	GPT-4	Inst-level loose-accuracy	85.37	—	Unverified
4	PaLM 2 S	Inst-level loose-accuracy	59.11	—	Unverified
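
The table reports "Inst-level loose-accuracy" without defining it. The sketch below shows one plausible reading, under the assumption that the metric counts the share of individual instructions satisfied across all prompts, and that "loose" means a response is also accepted if a lightly transformed variant of it passes (first or last line dropped, markdown asterisks removed). The function names, the variant set, and the checker format are illustrative assumptions, not the scoring code behind the numbers above.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical checker type: returns True if the response satisfies one instruction.
Checker = Callable[[str], bool]


def loose_variants(response: str) -> List[str]:
    """Relaxed versions of a response (assumed set of transformations)."""
    lines = response.split("\n")
    candidates = [
        response,                # original text
        "\n".join(lines[1:]),    # drop a leading preamble line
        "\n".join(lines[:-1]),   # drop a trailing sign-off line
        "\n".join(lines[1:-1]),  # drop both
    ]
    # Also try each candidate with markdown emphasis markers stripped.
    candidates += [c.replace("*", "") for c in candidates]
    return candidates


def follows_loosely(response: str, check: Checker) -> bool:
    """An instruction counts as followed if any non-empty variant passes."""
    return any(check(v) for v in loose_variants(response) if v.strip())


def instruction_level_loose_accuracy(
    samples: Sequence[Tuple[str, Sequence[Checker]]]
) -> float:
    """Percentage of individual instructions followed, pooled over all prompts."""
    total = 0
    followed = 0
    for response, checkers in samples:
        for check in checkers:
            total += 1
            followed += follows_loosely(response, check)
    return 100.0 * followed / total if total else 0.0
```

Pooling over instructions rather than prompts means a response that satisfies two of its three instructions still earns partial credit, which is why instruction-level scores tend to sit above prompt-level ones for the same model.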