Instruction Following

Instruction following is a core capability of large language models. This task evaluates how well a model follows human instructions, with the goal of producing controllable and safe responses.

Papers

Showing 676–700 of 1135 papers

Title	Date	Tasks	Status
Compositional Image Retrieval via Instruction-Aware Contrastive Learning	Dec 7, 2024	Contrastive Learning, Image Retrieval	Code Available
LLM-Align: Utilizing Large Language Models for Entity Alignment in Knowledge Graphs	Dec 6, 2024	Entity Alignment, Entity Embeddings	Unverified
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases	Dec 6, 2024	Instruction Following	Unverified
If You Can't Use Them, Recycle Them: Optimizing Merging at Scale Mitigates Performance Tradeoffs	Dec 5, 2024	Code Generation, Instruction Following	Unverified
VidHalluc: Evaluating Temporal Hallucinations in Multimodal Large Language Models for Video Understanding	Dec 4, 2024	Hallucination, Instruction Following	Unverified
From Words to Workflows: Automating Business Processes	Dec 4, 2024	Decision Making, Instruction Following	Unverified
Optimizing Latent Goal by Learning from Trajectory Preference	Dec 3, 2024	Continual Learning, Instruction Following	Unverified
T-REG: Preference Optimization with Token-Level Reward Regularization	Dec 3, 2024	Instruction Following	Code Available
Enhancing Function-Calling Capabilities in LLMs: Strategies for Prompt Formats, Data Integration, and Multilingual Translation	Dec 2, 2024	Data Integration, Instruction Following	Unverified
AlignFormer: Modality Matching Can Achieve Better Zero-shot Instruction-Following Speech-LLM	Dec 2, 2024	Instruction Following, Question Answering	Unverified
MiningGPT -- A Domain-Specific Large Language Model for the Mining Industry	Dec 2, 2024	Instruction Following, Language Modeling	Unverified
VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation	Dec 1, 2024	Instruction Following, Video Understanding	Unverified
InsightEdit: Towards Better Instruction Following for Image Editing	Nov 26, 2024	Instruction Following	Unverified
Gaussian Scenes: Pose-Free Sparse-View Scene Reconstruction using Depth-Enhanced Diffusion Priors	Nov 24, 2024	Depth Estimation, Instruction Following	Unverified
From MTEB to MTOB: Retrieval-Augmented Classification for Descriptive Grammars	Nov 23, 2024	Descriptive, In-Context Learning	Code Available
Enhancing Instruction-Following Capability of Visual-Language Models by Reducing Image Redundancy	Nov 23, 2024	Instruction Following, MME	Unverified
Separable Mixture of Low-Rank Adaptation for Continual Visual Instruction Tuning	Nov 21, 2024	Continual Learning, Instruction Following	Unverified
MpoxVLM: A Vision-Language Model for Diagnosing Skin Lesions from Mpox Virus Infection	Nov 16, 2024	Diagnostic, Instruction Following	Code Available
MLAN: Language-Based Instruction Tuning Improves Zero-Shot Generalization of Multimodal Large Language Models	Nov 15, 2024	Instruction Following, Zero-shot Generalization	Code Available
Adaptive Decoding via Latent Preference Optimization	Nov 14, 2024	GSM8K, Instruction Following	Unverified
Zero-shot Object-Centric Instruction Following: Integrating Foundation Models with Traditional Navigation	Nov 12, 2024	Instruction Following, Object	Unverified
Stronger Models are NOT Stronger Teachers for Instruction Tuning	Nov 11, 2024	Instruction Following	Unverified
MrSteve: Instruction-Following Agents in Minecraft with What-Where-When Memory	Nov 11, 2024	Instruction Following, Minecraft	Unverified
LIFBench: Evaluating the Instruction Following Performance and Stability of Large Language Models in Long-Context Scenarios	Nov 11, 2024	Instruction Following	Code Available
IOPO: Empowering LLMs with Complex Instruction Following via Input-Output Preference Optimization	Nov 9, 2024	Instruction Following	Unverified


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	AutoIF (Llama3 70B)	Inst-level loose-accuracy	90.4	—	Unverified
2	AutoIF (Qwen2 72B)	Inst-level loose-accuracy	88	—	Unverified
3	GPT-4	Inst-level loose-accuracy	85.37	—	Unverified
4	PaLM 2 S	Inst-level loose-accuracy	59.11	—	Unverified