SOTAVerified

Instruction Following

Instruction following is a core capability of large language models. This task evaluates how well a model follows human instructions, with the aim that its answers are controllable and safe.

Papers

Showing 576–600 of 1135 papers

| Title | Status | Hype |
|---|---|---|
| DistilQwen2.5: Industrial Practices of Training Distilled Open Lightweight Language Models | | 0 |
| Distilling Internet-Scale Vision-Language Models into Embodied Agents | | 0 |
| Generalization in Instruction Following Systems | | 0 |
| Can Query Expansion Improve Generalization of Strong Cross-Encoder Rankers? | | 0 |
| Generate Subgoal Images before Act: Unlocking the Chain-of-Thought Reasoning in Diffusion Model for Robot Manipulation with Multimodal Prompts | | 0 |
| Conversational Code Generation: a Case Study of Designing a Dialogue System for Generating Driving Scenarios for Testing Autonomous Vehicles | | 0 |
| SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models | | 0 |
| Distilling Instruction-following Abilities of Large Language Models with Task-aware Curriculum Planning | | 0 |
| Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models | | 0 |
| Goal Representations for Instruction Following: A Semi-Supervised Language Interface to Control | | 0 |
| Got Compute, but No Data: Lessons From Post-training a Finnish LLM | | 0 |
| Disentangling Length Bias In Preference Learning Via Response-Conditioned Modeling | | 0 |
| GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation | | 0 |
| DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding | | 0 |
| Griffon-G: Bridging Vision-Language and Vision-Centric Tasks via Large Multimodal Models | | 0 |
| GROOT-2: Weakly Supervised Multi-Modal Instruction Following Agents | | 0 |
| GROOT: Learning to Follow Instructions by Watching Gameplay Videos | | 0 |
| Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective | | 0 |
| Grounding Language by Continuous Observation of Instruction Following | | 0 |
| Language Conditioned Imitation Learning over Unstructured Data | | 0 |
| Ground-level Viewpoint Vision-and-Language Navigation in Continuous Environments | | 0 |
| Ground-V: Teaching VLMs to Ground Complex Instructions in Pixels | | 0 |
| Guided Adaptive Credit Assignment for Sample Efficient Policy Optimization | | 0 |
| Differential Information: An Information-Theoretic Perspective on Preference Optimization | | 0 |
| Systematic Evaluation of Long-Context LLMs on Financial Concepts | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AutoIF (Llama3 70B) | Inst-level loose-accuracy | 90.4 | | Unverified |
| 2 | AutoIF (Qwen2 72B) | Inst-level loose-accuracy | 88 | | Unverified |
| 3 | GPT-4 | Inst-level loose-accuracy | 85.37 | | Unverified |
| 4 | PaLM 2 S | Inst-level loose-accuracy | 59.11 | | Unverified |