SOTAVerified

Instruction Following

Instruction following is a fundamental capability of large language models. This task evaluates how well a model follows human instructions, with the goal of generating controllable and safe answers.
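Benchmarks in this area (IFEval is a common example) typically make the task measurable by attaching verifiable constraints to each prompt, so that compliance can be checked programmatically rather than judged by hand. A minimal sketch of one such check, assuming a hypothetical keyword-frequency instruction; the checker and its signature are illustrative, not taken from any particular benchmark:

```python
import re

def check_keyword_frequency(response: str, keyword: str, min_count: int) -> bool:
    """Verifiable instruction of the form:
    'mention the word <keyword> at least <min_count> times'.
    (Hypothetical checker, for illustration only.)"""
    occurrences = len(re.findall(rf"\b{re.escape(keyword)}\b", response.lower()))
    return occurrences >= min_count

# The model was asked to mention "safety" at least twice.
print(check_keyword_frequency("Safety first: we audit safety twice.", "safety", 2))  # True
```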

Papers

Showing 851–875 of 1135 papers

| Title | Status | Hype |
| --- | --- | --- |
| EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM | | 0 |
| DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model | | 0 |
| Dr Genre: Reinforcement Learning from Decoupled LLM Feedback for Generic Text Rewriting | | 0 |
| Effectively Controlling Reasoning Models through Thinking Intervention | | 0 |
| Efficient Finetuning Large Language Models For Vietnamese Chatbot | | 0 |
| Sparse Activation Editing for Reliable Instruction Following in Narratives | | 0 |
| Efficient Telecom Specific LLM: TSLAM-Mini with QLoRA and Digital Twin Data | | 0 |
| Eliciting Instruction-tuned Code Language Models' Capabilities to Utilize Auxiliary Function for Code Generation | | 0 |
| Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following | | 0 |
| Embodied Instruction Following in Unknown Environments | | 0 |
| Emo-DPO: Controllable Emotional Speech Synthesis through Direct Preference Optimization | | 0 |
| D-Rax: Domain-specific Radiologic assistant leveraging multi-modal data and eXpert model predictions | | 0 |
| Empirical Analysis of Large Vision-Language Models against Goal Hijacking via Visual Prompt Injection | | 0 |
| Empowering LLMs to Understand and Generate Complex Vector Graphics | | 0 |
| Draw Me a Flower: Processing and Grounding Abstraction in Natural Language | | 0 |
| Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment | | 0 |
| VL-Trojan: Multimodal Instruction Backdoor Attacks against Autoregressive Visual Language Models | | 0 |
| Enhancing Complex Instruction Following for Large Language Models with Mixture-of-Contexts Fine-tuning | | 0 |
| Enhancing Function-Calling Capabilities in LLMs: Strategies for Prompt Formats, Data Integration, and Multilingual Translation | | 0 |
| Enhancing Instruction-Following Capability of Visual-Language Models by Reducing Image Redundancy | | 0 |
| Enhancing Low-Resource Language and Instruction Following Capabilities of Audio Language Models | | 0 |
| Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling | | 0 |
| Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity | | 0 |
| ETHER: Aligning Emergent Communication for Hindsight Experience Replay | | 0 |
| SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models | | 0 |

Benchmark Results

| # | Model | Metric | Claimed (%) | Verified (%) | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | AutoIF (Llama3 70B) | Inst-level loose-accuracy | 90.4 | | Unverified |
| 2 | AutoIF (Qwen2 72B) | Inst-level loose-accuracy | 88 | | Unverified |
| 3 | GPT-4 | Inst-level loose-accuracy | 85.37 | | Unverified |
| 4 | PaLM 2 S | Inst-level loose-accuracy | 59.11 | | Unverified |
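For context, "Inst-level loose-accuracy" follows the IFEval convention: each prompt may carry several verifiable instructions, instruction-level accuracy is the fraction of individual instructions satisfied, and the loose variant re-runs each check on lightly normalized copies of the response (for example with markdown emphasis stripped, or a boilerplate first or last line removed) so formatting quirks are not scored as failures. A hedged sketch of that aggregation; the checker type and the specific normalizations here are assumptions, not the exact IFEval transformations:

```python
from typing import Callable, List

# A checker returns True if the response obeys one verifiable instruction.
Checker = Callable[[str], bool]

def loose_variants(response: str) -> List[str]:
    """Lightly normalized copies of the response, in the spirit of IFEval's
    loose accuracy. The exact set of transformations is an assumption."""
    lines = response.splitlines()
    variants = [
        response,
        response.replace("*", ""),  # strip markdown emphasis markers
        "\n".join(lines[1:]),       # drop a leading line (e.g., "Sure! Here is ...")
        "\n".join(lines[:-1]),      # drop a trailing sign-off line
    ]
    return [v for v in variants if v]

def inst_level_loose_accuracy(responses: List[str],
                              instructions: List[List[Checker]]) -> float:
    """Fraction of individual instructions passed across all prompts, where an
    instruction counts as passed if any loose variant of the response satisfies it."""
    passed = total = 0
    for response, checks in zip(responses, instructions):
        for check in checks:
            total += 1
            if any(check(variant) for variant in loose_variants(response)):
                passed += 1
    return passed / total if total else 0.0
```

Strict accuracy would be the same loop run on the raw response only, which is why loose numbers are always at least as high as their strict counterparts.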