SOTAVerified

Instruction Following

Instruction following is a fundamental capability of large language models. This task evaluates how well a model follows human instructions, with the goal of generating controllable and safe answers.
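Benchmarks in this area (IFEval is a common example) typically make the task measurable by attaching verifiable constraints to each prompt, so that compliance can be checked programmatically rather than judged by hand. A minimal sketch of one such check, assuming a hypothetical keyword-frequency instruction; the checker and its signature are illustrative, not taken from any particular benchmark:

```python
import re

def check_keyword_frequency(response: str, keyword: str, min_count: int) -> bool:
    """Verifiable instruction of the form:
    'mention the word <keyword> at least <min_count> times'.
    (Hypothetical checker, for illustration only.)"""
    occurrences = len(re.findall(rf"\b{re.escape(keyword)}\b", response.lower()))
    return occurrences >= min_count

# The model was asked to mention "safety" at least twice.
print(check_keyword_frequency("Safety first: we audit safety twice.", "safety", 2))  # True
```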

Papers

Showing 851–875 of 1135 papers

| Title | Status | Hype |
| --- | --- | --- |
| EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM | | 0 |
| DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model | | 0 |
| Dr Genre: Reinforcement Learning from Decoupled LLM Feedback for Generic Text Rewriting | | 0 |
| Effectively Controlling Reasoning Models through Thinking Intervention | | 0 |
| Efficient Finetuning Large Language Models For Vietnamese Chatbot | | 0 |
| Sparse Activation Editing for Reliable Instruction Following in Narratives | | 0 |
| Efficient Telecom Specific LLM: TSLAM-Mini with QLoRA and Digital Twin Data | | 0 |
| Eliciting Instruction-tuned Code Language Models' Capabilities to Utilize Auxiliary Function for Code Generation | | 0 |
| Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following | | 0 |
| Embodied Instruction Following in Unknown Environments | | 0 |
| Emo-DPO: Controllable Emotional Speech Synthesis through Direct Preference Optimization | | 0 |
| D-Rax: Domain-specific Radiologic assistant leveraging multi-modal data and eXpert model predictions | | 0 |
| Empirical Analysis of Large Vision-Language Models against Goal Hijacking via Visual Prompt Injection | | 0 |
| Empowering LLMs to Understand and Generate Complex Vector Graphics | | 0 |
| Draw Me a Flower: Processing and Grounding Abstraction in Natural Language | | 0 |
| Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment | | 0 |
| VL-Trojan: Multimodal Instruction Backdoor Attacks against Autoregressive Visual Language Models | | 0 |
| Enhancing Complex Instruction Following for Large Language Models with Mixture-of-Contexts Fine-tuning | | 0 |
| Enhancing Function-Calling Capabilities in LLMs: Strategies for Prompt Formats, Data Integration, and Multilingual Translation | | 0 |
| Enhancing Instruction-Following Capability of Visual-Language Models by Reducing Image Redundancy | | 0 |
| Enhancing Low-Resource Language and Instruction Following Capabilities of Audio Language Models | | 0 |
| Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling | | 0 |
| Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity | | 0 |
| ETHER: Aligning Emergent Communication for Hindsight Experience Replay | | 0 |
| SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models | | 0 |

Benchmark Results

| # | Model | Metric | Claimed (%) | Verified (%) | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | AutoIF (Llama3 70B) | Inst-level loose-accuracy | 90.4 | | Unverified |
| 2 | AutoIF (Qwen2 72B) | Inst-level loose-accuracy | 88 | | Unverified |
| 3 | GPT-4 | Inst-level loose-accuracy | 85.37 | | Unverified |
| 4 | PaLM 2 S | Inst-level loose-accuracy | 59.11 | | Unverified |
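For context, "Inst-level loose-accuracy" follows the IFEval convention: each prompt may carry several verifiable instructions, instruction-level accuracy is the fraction of individual instructions satisfied, and the loose variant re-runs each check on lightly normalized copies of the response (for example with markdown emphasis stripped, or a boilerplate first or last line removed) so formatting quirks are not scored as failures. A hedged sketch of that aggregation; the checker type and the specific normalizations here are assumptions, not the exact IFEval transformations:

```python
from typing import Callable, List

# A checker returns True if the response obeys one verifiable instruction.
Checker = Callable[[str], bool]

def loose_variants(response: str) -> List[str]:
    """Lightly normalized copies of the response, in the spirit of IFEval's
    loose accuracy. The exact set of transformations is an assumption."""
    lines = response.splitlines()
    variants = [
        response,
        response.replace("*", ""),  # strip markdown emphasis markers
        "\n".join(lines[1:]),       # drop a leading line (e.g., "Sure! Here is ...")
        "\n".join(lines[:-1]),      # drop a trailing sign-off line
    ]
    return [v for v in variants if v]

def inst_level_loose_accuracy(responses: List[str],
                              instructions: List[List[Checker]]) -> float:
    """Fraction of individual instructions passed across all prompts, where an
    instruction counts as passed if any loose variant of the response satisfies it."""
    passed = total = 0
    for response, checks in zip(responses, instructions):
        for check in checks:
            total += 1
            if any(check(variant) for variant in loose_variants(response)):
                passed += 1
    return passed / total if total else 0.0
```

Strict accuracy would be the same loop run on the raw response only, which is why loose numbers are always at least as high as their strict counterparts.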