Instruction Following

Instruction following is a core capability of large language models. This task evaluates how well a model follows human instructions, with the goal of producing controllable and safe responses.

Papers

Showing 81–90 of 1135 papers

Title	Date	Tasks	Status	Hype
ThinkLess: A Training-Free Inference-Efficient Method for Reducing Reasoning Redundancy	May 21, 2025	Instruction Following, Transfer Learning	Unverified	0
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective	May 21, 2025	Instruction Following, Language Modeling	Unverified	0
Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought	May 21, 2025	Chatbot, Instruction Following	Unverified	0
FlowKV: Enhancing Multi-Turn Conversational Coherence in LLMs via Isolated Key-Value Cache Management	May 21, 2025	Instruction Following, Management	Unverified	0
Joint Flashback Adaptation for Forgetting-Resistant Instruction Tuning	May 21, 2025	Arithmetic Reasoning, Instruction Following	Unverified	0
Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models	May 20, 2025	Instruction Following, Mathematical Reasoning	Code Available	1
Domain Adaptation of VLM for Soccer Video Understanding	May 20, 2025	Action Classification, Domain Adaptation	Unverified	0
Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training	May 20, 2025	All, Domain Generalization	Unverified	0
DecIF: Improving Instruction-Following through Meta-Decomposition	May 20, 2025	Instruction Following, Response Generation	Unverified	0
Ground-V: Teaching VLMs to Ground Complex Instructions in Pixels	May 20, 2025	Instruction Following, Knowledge Distillation	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	AutoIF (Llama3 70B)	Inst-level loose-accuracy	90.4	—	Unverified
2	AutoIF (Qwen2 72B)	Inst-level loose-accuracy	88	—	Unverified
3	GPT-4	Inst-level loose-accuracy	85.37	—	Unverified
4	PaLM 2 S	Inst-level loose-accuracy	59.11	—	Unverified