Instruction Following

Instruction following is a core capability of large language models. This task evaluates how well a model follows human instructions, with the goal of producing controllable and safe responses.

Papers

Showing 301–350 of 1135 papers

Title	Date	Tasks	Status	Hype
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models	Dec 16, 2024	Instruction Following	Code Available	1
LLM-RG4: Flexible and Factual Radiology Report Generation across Diverse Input Contexts	Dec 16, 2024	General Knowledge, Instruction Following	Code Available	2
ChipAlign: Instruction Alignment in Large Language Models for Chip Design via Geodesic Interpolation	Dec 15, 2024	Instruction Following	Unverified	0
Leveraging Large Vision-Language Model as User Intent-aware Encoder for Composed Image Retrieval	Dec 15, 2024	Image Retrieval, Instruction Following	Unverified	0
Empowering LLMs to Understand and Generate Complex Vector Graphics	Dec 15, 2024	Instruction Following, Vector Graphics	Unverified	0
VLR-Bench: Multilingual Benchmark Dataset for Vision-Language Retrieval Augmented Generation	Dec 13, 2024	Instruction Following, Question Answering	Unverified	0
EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM	Dec 12, 2024	Image Comprehension, Image Generation	Unverified	0
SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs	Dec 11, 2024	ARC, GSM8K	Unverified	0
LLaVA-Zip: Adaptive Visual Token Compression with Intrinsic Image Information	Dec 11, 2024	Data Augmentation, Instruction Following	Unverified	0
LLMs for Generalizable Language-Conditioned Policy Learning under Minimal Data Requirements	Dec 9, 2024	Decision Making, Instruction Following	Unverified	0
PediaBench: A Comprehensive Chinese Pediatric Dataset for Benchmarking Large Language Models	Dec 9, 2024	Benchmarking, Instruction Following	Code Available	0
Sloth: scaling laws for LLM skills to predict multi-benchmark performance across families	Dec 9, 2024	Emotional Intelligence, Instruction Following	Code Available	0
KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models	Dec 8, 2024	Instruction Following, Natural Language Understanding	Code Available	1
GROOT-2: Weakly Supervised Multi-Modal Instruction Following Agents	Dec 7, 2024	Instruction Following	Unverified	0
Compositional Image Retrieval via Instruction-Aware Contrastive Learning	Dec 7, 2024	Contrastive Learning, Image Retrieval	Code Available	0
RSUniVLM: A Unified Vision Language Model for Remote Sensing via Granularity-oriented Mixture of Experts	Dec 7, 2024	Change Detection, Image Comprehension	Code Available	1
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases	Dec 6, 2024	Instruction Following	Unverified	0
LLM-Align: Utilizing Large Language Models for Entity Alignment in Knowledge Graphs	Dec 6, 2024	Entity Alignment, Entity Embeddings	Unverified	0
If You Can't Use Them, Recycle Them: Optimizing Merging at Scale Mitigates Performance Tradeoffs	Dec 5, 2024	Code Generation, Instruction Following	Unverified	0
VidHalluc: Evaluating Temporal Hallucinations in Multimodal Large Language Models for Video Understanding	Dec 4, 2024	Hallucination, Instruction Following	Unverified	0
From Words to Workflows: Automating Business Processes	Dec 4, 2024	Decision Making, Instruction Following	Unverified	0
PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation	Dec 4, 2024	Instruction Following	Code Available	1
Agri-LLaVA: Knowledge-Infused Large Multimodal Assistant on Agricultural Pests and Diseases	Dec 3, 2024	Instruction Following	Code Available	1
Optimizing Latent Goal by Learning from Trajectory Preference	Dec 3, 2024	Continual Learning, Instruction Following	Unverified	0
T-REG: Preference Optimization with Token-Level Reward Regularization	Dec 3, 2024	Instruction Following	Code Available	0
AlignFormer: Modality Matching Can Achieve Better Zero-shot Instruction-Following Speech-LLM	Dec 2, 2024	Instruction Following, Question Answering	Unverified	0
MiningGPT -- A Domain-Specific Large Language Model for the Mining Industry	Dec 2, 2024	Instruction Following, Language Modeling	Unverified	0
Enhancing Function-Calling Capabilities in LLMs: Strategies for Prompt Formats, Data Integration, and Multilingual Translation	Dec 2, 2024	Data Integration, Instruction Following	Unverified	0
VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation	Dec 1, 2024	Instruction Following, Video Understanding	Unverified	0
InsightEdit: Towards Better Instruction Following for Image Editing	Nov 26, 2024	Instruction Following	Unverified	0
ShowUI: One Vision-Language-Action Model for GUI Visual Agent	Nov 26, 2024	Instruction Following, Natural Language Visual Grounding	Code Available	5
Parameter Efficient Instruction Tuning: An Empirical Study	Nov 25, 2024	Instruction Following, Memorization	Code Available	4
Gaussian Scenes: Pose-Free Sparse-View Scene Reconstruction using Depth-Enhanced Diffusion Priors	Nov 24, 2024	Depth Estimation, Instruction Following	Unverified	0
From MTEB to MTOB: Retrieval-Augmented Classification for Descriptive Grammars	Nov 23, 2024	Descriptive, In-Context Learning	Code Available	0
Enhancing Instruction-Following Capability of Visual-Language Models by Reducing Image Redundancy	Nov 23, 2024	Instruction Following, MME	Unverified	0
Separable Mixture of Low-Rank Adaptation for Continual Visual Instruction Tuning	Nov 21, 2024	Continual Learning, Instruction Following	Unverified	0
GeoGround: A Unified Large Vision-Language Model for Remote Sensing Visual Grounding	Nov 16, 2024	Instruction Following, Language Modeling	Code Available	2
MpoxVLM: A Vision-Language Model for Diagnosing Skin Lesions from Mpox Virus Infection	Nov 16, 2024	Diagnostic, Instruction Following	Code Available	0
MLAN: Language-Based Instruction Tuning Improves Zero-Shot Generalization of Multimodal Large Language Models	Nov 15, 2024	Instruction Following, Zero-shot Generalization	Code Available	0
Adaptive Decoding via Latent Preference Optimization	Nov 14, 2024	GSM8K, Instruction Following	Unverified	0
LHRS-Bot-Nova: Improved Multimodal Large Language Model for Remote Sensing Vision-Language Interpretation	Nov 14, 2024	Earth Observation, Instruction Following	Code Available	2
Zero-shot Object-Centric Instruction Following: Integrating Foundation Models with Traditional Navigation	Nov 12, 2024	Instruction Following, Object	Unverified	0
LIFBench: Evaluating the Instruction Following Performance and Stability of Large Language Models in Long-Context Scenarios	Nov 11, 2024	Instruction Following	Code Available	0
SetLexSem Challenge: Using Set Operations to Evaluate the Lexical and Semantic Robustness of Language Models	Nov 11, 2024	Instruction Following	Code Available	1
Stronger Models are NOT Stronger Teachers for Instruction Tuning	Nov 11, 2024	Instruction Following	Unverified	0
MrSteve: Instruction-Following Agents in Minecraft with What-Where-When Memory	Nov 11, 2024	Instruction Following, Minecraft	Unverified	0
IOPO: Empowering LLMs with Complex Instruction Following via Input-Output Preference Optimization	Nov 9, 2024	Instruction Following	Unverified	0
Fox-1 Technical Report	Nov 8, 2024	2k, 8k	Unverified	0
Bayesian Calibration of Win Rate Estimation with LLM Evaluators	Nov 7, 2024	Bayesian Inference, Instruction Following	Code Available	0
Multi-Reward as Condition for Instruction-based Image Editing	Nov 6, 2024	Descriptive, Instruction Following	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	AutoIF (Llama3 70B)	Inst-level loose-accuracy	90.4	—	Unverified
2	AutoIF (Qwen2 72B)	Inst-level loose-accuracy	88	—	Unverified
3	GPT-4	Inst-level loose-accuracy	85.37	—	Unverified
4	PaLM 2 S	Inst-level loose-accuracy	59.11	—	Unverified
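
The metric in the table above is not defined on this page. The sketch below shows one plausible reading, assuming it follows the IFEval-style definition: accuracy is counted per individual instruction rather than per prompt, and the "loose" variant accepts a response if any lightly relaxed version of it passes the check. The checker functions, the sample data, and the exact set of relaxations are illustrative assumptions, not the benchmark's actual implementation.

```python
from typing import Callable, List, Tuple

# A checker returns True if a response satisfies one verifiable instruction.
Checker = Callable[[str], bool]


def loose_variants(response: str) -> List[str]:
    """Relaxed versions of a response (assumed set of transformations)."""
    lines = response.split("\n")
    base = [
        response,
        "\n".join(lines[1:]).strip(),    # drop a leading intro line
        "\n".join(lines[:-1]).strip(),   # drop a trailing sign-off line
        "\n".join(lines[1:-1]).strip(),  # drop both
    ]
    # Also try every variant with markdown asterisks removed.
    return base + [v.replace("*", "") for v in base]


def followed_loose(response: str, checker: Checker) -> bool:
    """An instruction counts as followed if any non-empty variant passes."""
    return any(v.strip() and checker(v) for v in loose_variants(response))


def inst_level_loose_accuracy(samples: List[Tuple[str, List[Checker]]]) -> float:
    """Fraction of individual instructions followed, pooled over all prompts."""
    total = followed = 0
    for response, checkers in samples:
        for checker in checkers:
            total += 1
            followed += followed_loose(response, checker)
    return followed / total if total else 0.0


# Hypothetical checkers and responses, for illustration only.
all_lowercase = lambda r: r == r.lower()
ends_with_done = lambda r: r.rstrip().endswith("done")
mentions_python = lambda r: "python" in r.lower()

samples = [
    ("Sure, here you go:\nhello world", [all_lowercase]),
    ("the task is done\nHope this helps!", [ends_with_done, mentions_python]),
]

# Two of the three instructions pass under the loose criterion.
print(f"{inst_level_loose_accuracy(samples):.3f}")
```

Under a strict criterion, only the unmodified response would be checked, so the first two instructions here would fail as well. The loose variant is meant to avoid penalizing a model for polite framing around an otherwise compliant answer.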