| Measuring Meaning Composition in the Human Brain with Composition Scores from Large Language Models | Mar 7, 2024 | Sentence | Code Available | 0 |
| Designing Informative Metrics for Few-Shot Example Selection | Mar 6, 2024 | Few-Shot Learning, few-shot-ner | Unverified | 0 |
| Detecting AI-Generated Sentences in Human-AI Collaborative Hybrid Texts: Challenges, Strategies, and Insights | Mar 6, 2024 | Boundary Detection, Sentence | Code Available | 0 |
| PPTC-R benchmark: Towards Evaluating the Robustness of Large Language Models for PowerPoint Task Completion | Mar 6, 2024 | Sentence | Code Available | 0 |
| BiVert: Bidirectional Vocabulary Evaluation using Relations for Machine Translation | Mar 6, 2024 | Machine Translation, NMT | Unverified | 0 |
| MeaCap: Memory-Augmented Zero-shot Image Captioning | Mar 6, 2024 | Caption Generation, Image Captioning | Code Available | 2 |
| Bridging Language and Items for Retrieval and Recommendation | Mar 6, 2024 | Retrieval, Sentence | Code Available | 3 |
| Japanese-English Sentence Translation Exercises Dataset for Automatic Grading | Mar 6, 2024 | Few-Shot Learning, In-Context Learning | Unverified | 0 |
| Exploring Naive Approaches to Tell Apart LLMs Productions from Human-written Text | Mar 5, 2024 | LLM-generated Text Detection, Sentence | Code Available | 0 |
| Revisiting Meta-evaluation for Grammatical Error Correction | Mar 5, 2024 | Grammatical Error Correction, Sentence | Code Available | 0 |