SOTAVerified

Instruction Following

Instruction following is a foundational capability of large language models. This task evaluates how well a model follows human instructions, with the goal of producing controllable and safe responses.

Papers

Showing 851-900 of 1135 papers

Title | Status | Hype
EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM | - | 0
DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model | - | 0
Dr Genre: Reinforcement Learning from Decoupled LLM Feedback for Generic Text Rewriting | - | 0
Effectively Controlling Reasoning Models through Thinking Intervention | - | 0
Efficient Finetuning Large Language Models For Vietnamese Chatbot | - | 0
Sparse Activation Editing for Reliable Instruction Following in Narratives | - | 0
Efficient Telecom Specific LLM: TSLAM-Mini with QLoRA and Digital Twin Data | - | 0
Eliciting Instruction-tuned Code Language Models' Capabilities to Utilize Auxiliary Function for Code Generation | - | 0
Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following | - | 0
Embodied Instruction Following in Unknown Environments | - | 0
Emo-DPO: Controllable Emotional Speech Synthesis through Direct Preference Optimization | - | 0
D-Rax: Domain-specific Radiologic assistant leveraging multi-modal data and eXpert model predictions | - | 0
Empirical Analysis of Large Vision-Language Models against Goal Hijacking via Visual Prompt Injection | - | 0
Empowering LLMs to Understand and Generate Complex Vector Graphics | - | 0
Draw Me a Flower: Processing and Grounding Abstraction in Natural Language | - | 0
Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment | - | 0
VL-Trojan: Multimodal Instruction Backdoor Attacks against Autoregressive Visual Language Models | - | 0
Enhancing Complex Instruction Following for Large Language Models with Mixture-of-Contexts Fine-tuning | - | 0
Enhancing Function-Calling Capabilities in LLMs: Strategies for Prompt Formats, Data Integration, and Multilingual Translation | - | 0
Enhancing Instruction-Following Capability of Visual-Language Models by Reducing Image Redundancy | - | 0
Enhancing Low-Resource Language and Instruction Following Capabilities of Audio Language Models | - | 0
Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling | - | 0
Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity | - | 0
ETHER: Aligning Emergent Communication for Hindsight Experience Replay | - | 0
SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models | - | 0
Zero-shot and Few-shot Learning with Instruction-following LLMs for Claim Matching in Automated Fact-checking | - | 0
SpeechVerse: A Large-scale Generalizable Audio Language Model | - | 0
Evaluating Robustness of Large Audio Language Models to Audio Injection: An Empirical Study | - | 0
Evaluating the Robustness to Instructions of Large Language Models | - | 0
Evaluation of Instruction-Following Ability for Large Language Models on Story-Ending Generation | - | 0
Evolutionary Contrastive Distillation for Language Model Alignment | - | 0
Evolving Graphical Planner: Contextual Global Planning for Vision-and-Language Navigation | - | 0
SSP: A Simple and Safe automatic Prompt engineering method towards realistic image synthesis on LVM | - | 0
EXAONE 3.0 7.8B Instruction Tuned Language Model | - | 0
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases | - | 0
Exo2Ego: Exocentric Knowledge Guided MLLM for Egocentric Video Understanding | - | 0
Do we Really Need Visual Instructions? Towards Visual Instruction-Free Fine-tuning for Large Vision-Language Models | - | 0
Explicit Object Relation Alignment for Vision and Language Navigation | - | 0
Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks | - | 0
Exploring the Integration of Large Language Models into Automatic Speech Recognition Systems: An Empirical Study | - | 0
Look Before You Decide: Prompting Active Deduction of MLLMs for Assumptive Reasoning | - | 0
FaceGPT: Self-supervised Learning to Chat about 3D Human Faces | - | 0
Stay on the Path: Instruction Fidelity in Vision-and-Language Navigation | - | 0
FactLLaMA: Optimizing Instruction-Following Language Models with External Knowledge for Automated Fact-Checking | - | 0
AlignFormer: Modality Matching Can Achieve Better Zero-shot Instruction-Following Speech-LLM | - | 0
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs | - | 0
Zero-shot cross-lingual transfer in instruction tuning of large language models | - | 0
Few-shot Dialogue Strategy Learning for Motivational Interviewing via Inductive Reasoning | - | 0
StressPrompt: Does Stress Impact Large Language Models and Human Performance Similarly? | - | 0
Stronger Models are NOT Stronger Teachers for Instruction Tuning | - | 0
Page 18 of 23

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | AutoIF (Llama3 70B) | Inst-level loose-accuracy | 90.4 | - | Unverified
2 | AutoIF (Qwen2 72B) | Inst-level loose-accuracy | 88 | - | Unverified
3 | GPT-4 | Inst-level loose-accuracy | 85.37 | - | Unverified
4 | PaLM 2 S | Inst-level loose-accuracy | 59.11 | - | Unverified
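The metric in the table, instruction-level loose accuracy, comes from the IFEval family of benchmarks: each prompt carries one or more verifiable instructions, every instruction counts once across the whole set, and "loose" means a response passes an instruction if any relaxed variant of the response (e.g., with markdown emphasis stripped, or with a leading/trailing boilerplate line removed) satisfies it. The sketch below is a minimal illustration of that scoring scheme, not the official implementation; the exact set of relaxations and the checker functions are assumptions for demonstration.

```python
def loose_pass(response: str, check) -> bool:
    """A response passes 'loosely' if any relaxed variant satisfies the checker.
    The relaxations here (strip asterisks, drop the first or last line) follow
    the spirit of loose evaluation; official variants may differ."""
    lines = response.splitlines()
    variants = [
        response,
        response.replace("*", ""),   # remove markdown emphasis markers
        "\n".join(lines[1:]),        # drop a possible leading boilerplate line
        "\n".join(lines[:-1]),       # drop a possible trailing boilerplate line
    ]
    return any(check(v) for v in variants)

def inst_level_loose_accuracy(examples) -> float:
    """examples: list of (response, [checker, ...]) pairs.
    Instruction-level: every individual instruction counts once,
    pooled across all prompts."""
    passed = total = 0
    for response, checkers in examples:
        for check in checkers:
            total += 1
            passed += loose_pass(response, check)
    return passed / total if total else 0.0
```

A checker is any boolean predicate on the response text, e.g. `lambda r: len(r.split()) >= 50` for a minimum-length instruction, so a response wrapped in `*...*` can still pass an exact-match check once the emphasis markers are stripped.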