| Improved Few-Shot Image Classification Through Multiple-Choice Questions | Jul 23, 2024 | Articles, Few-Shot Image Classification | Unverified | 0 |
| Do LLMs Know When to NOT Answer? Investigating Abstention Abilities of Large Language Models | Jul 23, 2024 | Language Modelling, Large Language Model | Unverified | 0 |
| MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity | Jul 22, 2024 | Diversity, Multiple-choice | Code Available | 2 |
| Annealed Multiple Choice Learning: Overcoming limitations of Winner-takes-all with annealing | Jul 22, 2024 | All, Diversity | Code Available | 1 |
| LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding | Jul 22, 2024 | Multiple-choice, Question Answering | Code Available | 2 |
| Answer, Assemble, Ace: Understanding How Transformers Answer Multiple Choice Questions | Jul 21, 2024 | Multiple-choice, Multiple Choice Question Answering (MCQA) | Unverified | 0 |
| MIBench: Evaluating Multimodal Large Language Models over Multiple Images | Jul 21, 2024 | In-Context Learning, Multiple-choice | Unverified | 0 |
| Modular Sentence Encoders: Separating Language Specialization from Cross-Lingual Alignment | Jul 20, 2024 | Contrastive Learning, Multiple-choice | Code Available | 0 |
| Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data | Jul 20, 2024 | Language Modelling, Machine Translation | Unverified | 0 |
| Evaluating language models as risk scores | Jul 19, 2024 | Multiple-choice, Question Answering | Code Available | 1 |