TIIF-Bench: How Does Your T2I Model Follow Your Instructions? Jun 2, 2025 Benchmarking Instruction Following
Towards LLM-guided Causal Explainability for Black-box Text Classifiers Sep 23, 2023 counterfactual Counterfactual Explanation
LLMs can be easily Confused by Instructional Distractions Feb 5, 2025 Bias Detection Code Generation
LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints Oct 9, 2024 Instruction Following
LLMs for Generalizable Language-Conditioned Policy Learning under Minimal Data Requirements Dec 9, 2024 Decision Making Instruction Following
CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following Jun 14, 2025 Beat Tracking Genre classification
Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V Apr 16, 2024 Instruction Following Multimodal Reasoning
CLIP-Nav: Using CLIP for Zero-Shot Vision-and-Language Navigation Nov 30, 2022 Diversity Instruction Following
LoGra-Med: Long Context Multi-Graph Alignment for Medical Vision-Language Model Oct 3, 2024 image-classification Image Classification
LoHoRavens: A Long-Horizon Language-Conditioned Benchmark for Robotic Tabletop Manipulation Oct 18, 2023 Caption Generation Instruction Following
clembench-2024: A Challenging, Dynamic, Complementary, Multilingual Benchmark and Underlying Flexible Framework for LLMs as Multi-Action Agents May 31, 2024 Instruction Following
Long Context Alignment with Short Instructions and Synthesized Positions May 7, 2024 16k Instruction Following
CIF-Bench: A Chinese Instruction-Following Benchmark for Evaluating the Generalizability of Large Language Models Feb 20, 2024 Instruction Following
ChipAlign: Instruction Alignment in Large Language Models for Chip Design via Geodesic Interpolation Dec 15, 2024 Instruction Following
LongViTU: Instruction Tuning for Long-Form Video Understanding Jan 9, 2025 EgoSchema Form
ToDi: Token-wise Distillation via Fine-Grained Divergence Control May 22, 2025 Instruction Following Knowledge Distillation
ChEF: A Comprehensive Evaluation Framework for Standardized Assessment of Multimodal Large Language Models Nov 5, 2023 Hallucination In-Context Learning
LVLM-empowered Multi-modal Representation Learning for Visual Place Recognition Jul 9, 2024 Instruction Following Representation Learning
A Framework for Fine-Tuning LLMs using Heterogeneous Feedback Aug 5, 2024 Instruction Following Text Summarization
M4CXR: Exploring Multi-task Potentials of Multi-modal Large Language Models for Chest X-ray Interpretation Aug 29, 2024 Instruction Following Medical Report Generation
Magistral Jun 12, 2025 Instruction Following Reinforcement Learning (RL)
Active Reasoning in an Open-World Environment Nov 3, 2023 Instruction Following Minecraft
ManipDreamer: Boosting Robotic Manipulation World Model with Action Tree and Visual Guidance Apr 23, 2025 Instruction Following SSIM
ManipLVM-R1: Reinforcement Learning for Reasoning in Embodied Manipulation with Large Vision-Language Models May 22, 2025 Instruction Following reinforcement-learning
ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning Jul 18, 2023 Instruction Following Language Modeling
MAP's not dead yet: Uncovering true language model modes by conditioning away degeneracy Nov 15, 2023 Instruction Following Language Modeling
MART: Improving LLM Safety with Multi-round Automatic Red-Teaming Nov 13, 2023 Instruction Following Red Teaming
MASTER: Enhancing Large Language Model via Multi-Agent Simulated Teaching Jun 3, 2025 Data Augmentation Instruction Following
Mastering Robot Manipulation with Multimodal Prompts through Pretraining and Multi-task Fine-tuning Oct 14, 2023 In-Context Learning Instruction Following
MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records Aug 27, 2023 2k Instruction Following
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities Jul 19, 2024 4k 8k
MedXChat: A Unified Multimodal Large Language Model Framework towards CXRs Understanding and Generation Dec 4, 2023 Instruction Following Language Modeling
Ask, Fail, Repeat: Meeseeks, an Iterative Feedback Benchmark for LLMs' Multi-turn Instruction-Following Ability Apr 30, 2025 Instruction Following Intent Recognition
A Comprehensive Evaluation of Large Language Models on Mental Illnesses in Arabic Context Jan 12, 2025 Binary Classification Diagnostic
ChatGPT is a Knowledgeable but Inexperienced Solver: An Investigation of Commonsense Problem in Large Language Models Mar 29, 2023 Instruction Following
MetaMorph: Multimodal Understanding and Generation via Instruction Tuning Dec 18, 2024 Instruction Following MORPH
ChartMind: A Comprehensive Benchmark for Complex Real-world Multimodal Chart Question Answering May 29, 2025 Chart Question Answering Chart Understanding
Causal Head Gating: A Framework for Interpreting Roles of Attention Heads in Transformers May 19, 2025 In-Context Learning Instruction Following
Towards Better Evaluation of Instruction-Following: A Case-Study in Summarization Oct 12, 2023 Instruction Following
MIDB: Multilingual Instruction Data Booster for Enhancing Multilingual Instruction Synthesis May 23, 2025 Instruction Following Machine Translation
A Comparative Study between Full-Parameter and LoRA-based Fine-Tuning on Chinese Instruction Data for Instruction Following Large Language Model Apr 17, 2023 Instruction Following Language Modeling
Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models Jan 10, 2025 Form Image Comprehension
Mimicking User Data: On Mitigating Fine-Tuning Risks in Closed Large Language Models Jun 12, 2024 Instruction Following Safety Alignment
Case Study: Fine-tuning Small Language Models for Accurate and Private CWE Detection in Python Code Apr 23, 2025 Instruction Following Privacy Preserving
MiningGPT -- A Domain-Specific Large Language Model for the Mining Industry Dec 2, 2024 Instruction Following Language Modeling
MinMo: A Multimodal Large Language Model for Seamless Voice Interaction Jan 10, 2025 Instruction Following Language Modeling
Mitigating Dialogue Hallucination for Large Vision Language Models via Adversarial Instruction Tuning Mar 15, 2024 Hallucination Instruction Following
Mitigating the Influence of Distractor Tasks in LMs with Prior-Aware Decoding Jan 31, 2024 Instruction Following
Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning Dec 19, 2023 Diversity Instruction Following
Mixture of Weight-shared Heterogeneous Group Attention Experts for Dynamic Token-wise KV Optimization Jun 16, 2025 Causal Language Modeling Instruction Following