| PRISM: Progressive Restoration for Scene Graph-based Image Manipulation | Nov 3, 2023 | Denoising, Descriptive | Unverified | 0 |
| FaithScore: Fine-grained Evaluations of Hallucinations in Large Vision-Language Models | Nov 2, 2023 | Descriptive, Instruction Following | Code Available | 1 |
| A Systematic Evaluation of GPT-4V's Multimodal Capability for Medical Image Analysis | Oct 31, 2023 | Descriptive, Medical Image Analysis | Unverified | 0 |
| Inverse Decision Modeling: Learning Interpretable Representations of Behavior | Oct 28, 2023 | Descriptive | Unverified | 0 |
| Online Decision Mediation | Oct 28, 2023 | Decision Making, Descriptive | Unverified | 0 |
| Matching of Descriptive Labels to Glossary Descriptions | Oct 27, 2023 | Descriptive, STS | Unverified | 0 |
| Utilizing Language Models for Energy Load Forecasting | Oct 26, 2023 | Decision Making, Descriptive | Code Available | 0 |
| This is not a Dataset: A Large Negation Benchmark to Challenge Large Language Models | Oct 24, 2023 | Descriptive, Negation | Code Available | 1 |
| Videoprompter: an ensemble of foundational models for zero-shot video understanding | Oct 23, 2023 | Action Recognition, Descriptive | Unverified | 0 |
| Semi-supervised multimodal coreference resolution in image narrations | Oct 20, 2023 | coreference-resolution, Coreference Resolution | Code Available | 0 |