SOTAVerified

Instruction Following

Instruction following is a fundamental capability of large language models. This task evaluates how well a model follows human instructions, with the aim of producing controllable and safe responses.

Papers

Showing 1-10 of 1135 papers

Title | Status | Hype
AnyCap Project: A Unified Framework, Dataset, and Benchmark for Controllable Omni-modal Captioning | | 0
How Many Instructions Can LLMs Follow at Once? | | 0
DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering | Code | 2
Multilingual Multimodal Software Developer for Code Generation | | 0
TuneShield: Mitigating Toxicity in Conversational AI while Fine-tuning on Untrusted Data | | 0
DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment | Code | 2
Meta SecAlign: A Secure Foundation LLM Against Prompt Injection Attacks | Code | 2
Kwai Keye-VL Technical Report | Code | 4
Bridging Offline and Online Reinforcement Learning for LLMs | | 0
LLaVA-Pose: Enhancing Human Pose and Action Understanding via Keypoint-Integrated Instruction Tuning | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | AutoIF (Llama3 70B) | Inst-level loose-accuracy | 90.4 | | Unverified
2 | AutoIF (Qwen2 72B) | Inst-level loose-accuracy | 88 | | Unverified
3 | GPT-4 | Inst-level loose-accuracy | 85.37 | | Unverified
4 | PaLM 2 S | Inst-level loose-accuracy | 59.11 | | Unverified
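
The page does not define the metric. As a rough sketch, assuming "Inst-level" follows the common convention of scoring each instruction separately (rather than requiring every instruction in a prompt to be satisfied), the aggregation could look like the following. The function names and the `results` layout are hypothetical, and the "loose" part (a more lenient check of each response) is not modeled here; the per-instruction pass/fail flags are simply taken as given.

```python
from typing import List


def instruction_level_accuracy(results: List[List[bool]]) -> float:
    """Percentage of individual instructions that were followed.

    results[i][j] is True if the response to prompt i satisfied its
    j-th instruction (hypothetical layout, chosen for illustration).
    """
    total = sum(len(per_prompt) for per_prompt in results)
    if total == 0:
        raise ValueError("no instructions to score")
    followed = sum(sum(per_prompt) for per_prompt in results)
    return 100.0 * followed / total


def prompt_level_accuracy(results: List[List[bool]]) -> float:
    """Percentage of prompts whose instructions were all followed."""
    if not results:
        raise ValueError("no prompts to score")
    fully_followed = sum(all(per_prompt) for per_prompt in results)
    return 100.0 * fully_followed / len(results)


# Three prompts carrying 2, 3, and 1 instructions respectively.
example = [[True, True], [True, False, True], [False]]
inst_acc = instruction_level_accuracy(example)
prompt_acc = prompt_level_accuracy(example)
```

In this made-up example, 4 of 6 instructions are followed (about 66.7) but only 1 of 3 prompts is fully satisfied (about 33.3), which illustrates why an instruction-level score is typically higher than a prompt-level one.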