Instruction Following

Instruction following is a core capability of large language models. This task evaluates how well a model follows human instructions, with the goal of producing controllable and safe responses.

Papers

Showing 271–280 of 1135 papers

Title	Date	Tasks	Status	Hype
Audio-CoT: Exploring Chain-of-Thought Reasoning in Large Audio Language Model	Jan 13, 2025	Audio captioning, Instruction Following	Unverified	0
A Comprehensive Evaluation of Large Language Models on Mental Illnesses in Arabic Context	Jan 12, 2025	Binary Classification, Diagnostic	Unverified	0
MinMo: A Multimodal Large Language Model for Seamless Voice Interaction	Jan 10, 2025	Instruction Following, Language Modeling	Unverified	0
Scalable Vision Language Model Training via High Quality Data Curation	Jan 10, 2025	Instruction Following, Language Modeling	Unverified	0
Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models	Jan 10, 2025	Form, Image Comprehension	Unverified	0
Demystifying Domain-adaptive Post-training for Financial LLMs	Jan 9, 2025	Continual Pretraining, Domain Adaptation	Code Available	1
LongViTU: Instruction Tuning for Long-Form Video Understanding	Jan 9, 2025	EgoSchema, Form	Unverified	0
Language and Planning in Robotic Navigation: A Multilingual Evaluation of State-of-the-Art Models	Jan 7, 2025	Instruction Following, Vision and Language Navigation	Unverified	0
DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm for Direct Preference Optimization	Jan 5, 2025	Instruction Following	Unverified	0
Instruction-Following Pruning for Large Language Models	Jan 3, 2025	Instruction Following, Math	Unverified	0


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	AutoIF (Llama3 70B)	Inst-level loose-accuracy	90.4	—	Unverified
2	AutoIF (Qwen2 72B)	Inst-level loose-accuracy	88	—	Unverified
3	GPT-4	Inst-level loose-accuracy	85.37	—	Unverified
4	PaLM 2 S	Inst-level loose-accuracy	59.11	—	Unverified
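
The table reports "Inst-level loose-accuracy" without defining it. The sketch below shows one plausible reading, assuming an IFEval-style definition: accuracy is counted per individual instruction (not per prompt), and "loose" means a response counts as following an instruction if any lightly relaxed variant of it passes the check. The checker functions, the exact set of relaxations, and all names here are illustrative assumptions, not the benchmark's actual code.

```python
from typing import Callable, List, Tuple

# Hypothetical checker type: returns True if a response satisfies one instruction.
Checker = Callable[[str], bool]


def loose_variants(response: str) -> List[str]:
    """Build relaxed variants of a response (assumed relaxation set):
    drop the first line, drop the last line, drop both, and strip '*' markers."""
    lines = response.split("\n")
    base = [
        response,
        "\n".join(lines[1:]).strip(),    # without first line
        "\n".join(lines[:-1]).strip(),   # without last line
        "\n".join(lines[1:-1]).strip(),  # without first and last line
    ]
    return base + [v.replace("*", "") for v in base]


def follows_loose(response: str, checker: Checker) -> bool:
    """An instruction counts as followed if any non-empty variant passes."""
    return any(v.strip() and checker(v) for v in loose_variants(response))


def inst_level_loose_accuracy(samples: List[Tuple[str, List[Checker]]]) -> float:
    """Percentage of individual instructions followed, pooled over all prompts."""
    total = 0
    followed = 0
    for response, checkers in samples:
        for checker in checkers:
            total += 1
            followed += follows_loose(response, checker)
    return 100.0 * followed / total if total else 0.0


if __name__ == "__main__":
    # Toy data: each entry is (model response, checkers for that prompt's instructions).
    samples = [
        ("Sure, here it is:\nhello world",
         [lambda r: r.islower(), lambda r: len(r.split()) <= 2]),
        ("**DONE**", [lambda r: r == "DONE"]),
        ("Hello", [lambda r: r.endswith("!")]),
    ]
    print(inst_level_loose_accuracy(samples))  # prints 75.0
```

In the toy data, the first response passes both checks only after its preamble line is dropped, and the second only after the asterisks are stripped, which is the kind of formatting noise a loose criterion is meant to tolerate. A strict variant would apply each checker to the raw response only.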