Instruction Following

Instruction following is a core capability of large language models. This task evaluates how well a model follows human instructions, with the goal of producing controllable and safe responses.

Papers


Showing 1001–1025 of 1135 papers

Title	Date	Tasks	Status
Incorporating Visual Experts to Resolve the Information Loss in Multimodal Large Language Models	Jan 6, 2024	Instruction Following, Mixture-of-Experts	Unverified
Inference-Time Language Model Alignment via Integrated Value Guidance	Sep 26, 2024	Instruction Following, Language Modeling	Unverified
Teach Better or Show Smarter? On Instructions and Exemplars in Automatic Prompt Optimization	Jun 22, 2024	Instruction Following, Prompt Engineering	Unverified
Data Diversity Matters for Robust Instruction Tuning	Nov 21, 2023	Diversity, Instruction Following	Unverified
InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training	Mar 4, 2025	Instruction Following, text-to-speech	Unverified
InsightEdit: Towards Better Instruction Following for Image Editing	Nov 26, 2024	Instruction Following	Unverified
DAFE: LLM-Based Evaluation Through Dynamic Arbitration for Free-Form Question-Answering	Mar 11, 2025	Form, Instruction Following	Unverified
Temporal Representation Alignment: Successor Features Enable Emergent Compositionality in Robot Instruction Following	Feb 8, 2025	Instruction Following	Unverified
InstructBooth: Instruction-following Personalized Text-to-Image Generation	Dec 4, 2023	Image Generation, Instruction Following	Unverified
Attend and Enrich: Enhanced Visual Prompt for Zero-Shot Learning	Jun 5, 2024	Attribute, Domain Generalization	Unverified
Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy	Oct 9, 2024	Instruction Following	Unverified
InstructionCP: A fast approach to transfer Large Language Models into target language	May 30, 2024	Instruction Following	Unverified
Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game	Nov 2, 2023	Instruction Following	Unverified
Instruction Following by Boosting Attention of Large Language Models	Jun 16, 2025	Instruction Following, Prompt Engineering	Unverified
D3: Diversity, Difficulty, and Dependability-Aware Data Selection for Sample-Efficient LLM Instruction Tuning	Mar 14, 2025	Diversity, Instruction Following	Unverified
Instruction-following Evaluation through Verbalizer Manipulation	Jul 20, 2023	Instruction Following	Unverified
Instruction-Following Pruning for Large Language Models	Jan 3, 2025	Instruction Following, Math	Unverified
Instruction-Following Speech Recognition	Sep 18, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Aligner: One Global Token is Worth Millions of Parameters When Aligning Large Language Models	Dec 9, 2023	Instruction Following, parameter-efficient fine-tuning	Unverified
Instruction Mining: Instruction Data Selection for Tuning Large Language Models	Jul 12, 2023	Instruction Following, Language Modeling	Unverified
Tests as Prompt: A Test-Driven-Development Benchmark for LLM Code Generation	May 13, 2025	Code Generation, In-Context Learning	Unverified
Instruction Tuning on Public Government and Cultural Data for Low-Resource Language: a Case Study in Kazakh	Feb 19, 2025	Instruction Following, Multiple-choice	Unverified
CycleAlign: Iterative Distillation from Black-box LLM to White-box Models for Better Human Alignment	Oct 25, 2023	In-Context Learning, Instruction Following	Unverified
Text as Image: Learning Transferable Adapter for Multi-Label Classification	Dec 7, 2023	image-classification, Image Classification	Unverified
Only-IF: Revealing the Decisive Effect of Instruction Diversity on Generalization	Oct 7, 2024	Diversity, Instruction Following	Unverified


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	AutoIF (Llama3 70B)	Inst-level loose-accuracy	90.4	—	Unverified
2	AutoIF (Qwen2 72B)	Inst-level loose-accuracy	88	—	Unverified
3	GPT-4	Inst-level loose-accuracy	85.37	—	Unverified
4	PaLM 2 S	Inst-level loose-accuracy	59.11	—	Unverified
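
The table does not say how "Inst-level loose-accuracy" is computed. The sketch below assumes it follows the IFEval-style definition: every individual instruction counts as one unit, and under the "loose" criterion an instruction counts as followed if the raw response or any relaxed variant of it (first line dropped, last line dropped, asterisks removed) passes the check. The function names and the checker interface are hypothetical, chosen only for illustration, and are not taken from any benchmark's code.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: a checker takes a response and says whether
# one specific instruction was followed.
Checker = Callable[[str], bool]


def loose_variants(response: str) -> List[str]:
    """Return the response plus relaxed variants of it.

    Assumed relaxations: drop the first line, drop the last line,
    drop both, and each of those again with '*' characters removed.
    """
    lines = response.split("\n")
    bases = [
        response,
        "\n".join(lines[1:]).strip(),    # without first line
        "\n".join(lines[:-1]).strip(),   # without last line
        "\n".join(lines[1:-1]).strip(),  # without first and last line
    ]
    return bases + [b.replace("*", "") for b in bases]


def followed_loosely(response: str, checker: Checker) -> bool:
    """True if any non-empty variant of the response passes the checker."""
    return any(v.strip() and checker(v) for v in loose_variants(response))


def inst_level_loose_accuracy(
    samples: Sequence[Tuple[str, Sequence[Checker]]]
) -> float:
    """Percentage of individual instructions followed under the loose criterion.

    Each sample is (model response, checkers for that prompt's instructions).
    """
    total = 0
    followed = 0
    for response, checkers in samples:
        for checker in checkers:
            total += 1
            followed += followed_loosely(response, checker)
    return 100.0 * followed / total if total else 0.0


if __name__ == "__main__":
    demo = [
        (
            "Sure, here you go:\n**hello world**",
            [lambda r: r == "hello world", lambda r: "goodbye" in r],
        )
    ]
    print(inst_level_loose_accuracy(demo))
```

In the demo, the first instruction passes only after the greeting line and the asterisks are stripped, which is the kind of case the loose criterion is meant to forgive; the second instruction fails under every variant.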