SOTAVerified

Instruction Following

Instruction following is a core capability of large language models. This task evaluates how well a model follows human instructions, with the goal of producing controllable and safe responses.

Papers

Showing 251–260 of 1135 papers

Title | Status | Hype
Self-supervised Quantized Representation for Seamlessly Integrating Knowledge Graphs with Large Language Models | - | 0
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate | Code | 2
Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling | Code | 11
MultiChallenge: A Realistic Multi-Turn Conversation Evaluation Benchmark Challenging to Frontier LLMs | Code | 2
3D-MoE: A Mixture-of-Experts Multi-modal LLM for 3D Vision and Pose Diffusion via Rectified Flow | - | 0
How well can LLMs Grade Essays in Arabic? | - | 0
Advancing Mathematical Reasoning in Language Models: The Impact of Problem-Solving Data, Data Synthesis Methods, and Training Stages | - | 0
The Breeze 2 Herd of Models: Traditional Chinese LLMs Based on Llama with Vision-Aware and Function-Calling Capabilities | Code | 3
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback | Code | 2
Online Preference Alignment for Language Models via Count-based Exploration | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | AutoIF (Llama3 70B) | Inst-level loose-accuracy | 90.4 | - | Unverified
2 | AutoIF (Qwen2 72B) | Inst-level loose-accuracy | 88 | - | Unverified
3 | GPT-4 | Inst-level loose-accuracy | 85.37 | - | Unverified
4 | PaLM 2 S | Inst-level loose-accuracy | 59.11 | - | Unverified