MultiChallenge: A Realistic Multi-Turn Conversation Evaluation Benchmark Challenging to Frontier LLMs | Jan 29, 2025 | All Instruction Following | Code Available 25
NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models | May 26, 2023 | Instruction Following Vision and Language Navigation | Code Available 25
Precise Zero-Shot Dense Retrieval without Relevance Labels | Dec 20, 2022 | Fact Verification Instruction Following | Code Available 25
mFollowIR: a Multilingual Benchmark for Instruction Following in Retrieval | Jan 31, 2025 | Instruction Following Retrieval | Code Available 25
MentaLLaMA: Interpretable Mental Health Analysis on Social Media with Large Language Models | Sep 24, 2023 | Instruction Following | Code Available 25
GSCo: Towards Generalizable AI in Medicine via Generalist-Specialist Collaboration | Apr 23, 2024 | Collaborative Inference In-Context Learning | Code Available 25
Long-Context Language Modeling with Parallel Context Encoding | Feb 26, 2024 | In-Context Learning Instruction Following | Code Available 25
LVLM-eHub: A Comprehensive Evaluation Benchmark for Large Vision-Language Models | Jun 15, 2023 | Hallucination Image Captioning | Code Available 25
CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts | May 9, 2024 | Image Captioning Instruction Following | Code Available 25
Meta SecAlign: A Secure Foundation LLM Against Prompt Injection Attacks | Jul 3, 2025 | Instruction Following | Code Available 25
MM-IFEngine: Towards Multimodal Instruction Following | Apr 10, 2025 | Instruction Following | Code Available 25
BayLing: Bridging Cross-lingual Alignment and Instruction Following through Interactive Translation for Large Language Models | Jun 19, 2023 | Instruction Following Text Generation | Code Available 25
MAPLM: A Real-World Large-Scale Vision-Language Benchmark for Map and Traffic Scene Understanding | Jan 1, 2024 | Autonomous Driving Instruction Following | Code Available 25
ModuLoRA: Finetuning 2-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers | Sep 28, 2023 | GPU Instruction Following | Code Available 25
MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control | Mar 18, 2024 | Instruction Following Minecraft | Code Available 25
Benchmarking Complex Instruction-Following with Multiple Constraints Composition | Jul 4, 2024 | Benchmarking Instruction Following | Code Available 25
Autonomous Improvement of Instruction Following Skills via Foundation Models | Jul 30, 2024 | Image Generation Instruction Following | Code Available 25
Aligning Modalities in Vision Large Language Models via Preference Fine-tuning | Feb 18, 2024 | Hallucination Instruction Following | Code Available 25
ExpertPrompting: Instructing Large Language Models to be Distinguished Experts | May 24, 2023 | In-Context Learning Instruction Following | Code Available 25
Open6DOR: Benchmarking Open-instruction 6-DoF Object Rearrangement and A VLM-based Approach | Oct 24, 2024 | Benchmarking Instruction Following | Code Available 25
LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding | Jun 29, 2023 | 16k Image Captioning | Code Available 25
Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous Driving and Zero-Shot Instruction Following | Feb 9, 2024 | Autonomous Driving Denoising | Code Available 25
FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets | Jul 20, 2023 | Instruction Following Language Model Evaluation | Code Available 25
A Critical Evaluation of AI Feedback for Aligning Large Language Models | Feb 19, 2024 | Instruction Following reinforcement-learning | Code Available 25
LLM-RG4: Flexible and Factual Radiology Report Generation across Diverse Input Contexts | Dec 16, 2024 | General Knowledge Instruction Following | Code Available 25
LMDrive: Closed-Loop End-to-End Driving with Large Language Models | Dec 12, 2023 | Autonomous Driving Instruction Following | Code Available 25
AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks | Mar 2, 2024 | Instruction Following LLM real-life tasks | Code Available 25
LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning | Mar 19, 2025 | Instruction Following Multimodal Reasoning | Code Available 25
Aurora:Activating Chinese chat capability for Mixtral-8x7B sparse Mixture-of-Experts through Instruction-Tuning | Dec 22, 2023 | Instruction Following Mixture-of-Experts | Code Available 25
CrystalFormer-RL: Reinforcement Fine-Tuning for Materials Design | Apr 3, 2025 | Band Gap Dielectric Constant | Code Available 25
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents | Nov 9, 2023 | Instruction Following LLM real-life tasks | Code Available 25
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action | Jul 10, 2022 | Instruction Following Language Modeling | Code Available 25
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate | Jan 29, 2025 | Instruction Following Math | Code Available 25
CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning | Jun 7, 2024 | Instruction Following Math | Code Available 25
Lion: Adversarial Distillation of Proprietary Large Language Models | May 22, 2023 | Instruction Following Knowledge Distillation | Code Available 25
Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models | Oct 23, 2024 | Instruction Following Language Modelling | Code Available 25
LHRS-Bot-Nova: Improved Multimodal Large Language Model for Remote Sensing Vision-Language Interpretation | Nov 14, 2024 | Earth Observation Instruction Following | Code Available 25
LITA: Language Instructed Temporal-Localization Assistant | Mar 27, 2024 | Instruction Following Temporal Localization | Code Available 25
Learning to Decode Collaboratively with Multiple Language Models | Mar 6, 2024 | Instruction Following | Code Available 25
EmoLLMs: A Series of Emotional Large Language Models and Annotation Tools for Comprehensive Affective Analysis | Jan 16, 2024 | Instruction Following regression | Code Available 25
Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions | Aug 8, 2023 | Caption Generation Image Captioning | Code Available 25
AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension | Feb 12, 2024 | 2k Automatic Speech Recognition | Code Available 25
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch | Nov 6, 2023 | Decoder GSM8K | Code Available 25
Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic | Feb 19, 2024 | Instruction Following Math | Code Available 25
LLark: A Multimodal Instruction-Following Language Model for Music | Oct 11, 2023 | Instruction Following Language Modeling | Code Available 25
EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain | Jan 30, 2024 | Image Comprehension Instruction Following | Code Available 25
EcomGPT: Instruction-tuning Large Language Models with Chain-of-Task Tasks for E-commerce | Aug 14, 2023 | Diversity Instruction Following | Code Available 25
Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models | Apr 3, 2024 | Instruction Following | Code Available 25
Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems | Feb 26, 2025 | Instruction Following | Code Available 25
DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering | Jul 15, 2025 | Benchmarking Instruction Following | Code Available 25