| Automatically Generating Visual Hallucination Test Cases for Multimodal Large Language Models | Oct 15, 2024 | Hallucination, Large Language Model | Code Available | 0 |
| An agentic system with reinforcement-learned subsystem improvements for parsing form-like documents | May 16, 2025 | Form, Language Modeling | Code Available | 0 |
| Automated title and abstract screening for scoping reviews using the GPT-4 Large Language Model | Nov 14, 2023 | Language Modeling, Language Modelling | Code Available | 0 |
| WaterDrum: Watermarking for Data-centric Unlearning Metric | May 8, 2025 | Large Language Model | Code Available | 0 |
| TruthEval: A Dataset to Evaluate LLM Truthfulness and Reliability | Jun 4, 2024 | Benchmarking, Language Modeling | Code Available | 0 |
| Conversations in Galician: a Large Language Model for an Underrepresented Language | Nov 7, 2023 | Language Modeling, Language Modelling | Code Available | 0 |
| Vamos: Versatile Action Models for Video Understanding | Nov 22, 2023 | EgoSchema, Hard Attention | Code Available | 0 |
| Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models | Apr 2, 2024 | Distractor Generation, In-Context Learning | Code Available | 0 |
| PeriGuru: A Peripheral Robotic Mobile App Operation Assistant based on GUI Image Understanding and Prompting with LLM | Sep 14, 2024 | Language Modeling, Language Modelling | Code Available | 0 |
| Leveraging Content and Acoustic Representations for Speech Emotion Recognition | Sep 9, 2024 | Emotion Recognition, Language Modelling | Code Available | 0 |
| Can a Large Language Model Learn Matrix Functions In Context? | Nov 24, 2024 | In-Context Learning, Language Modeling | Code Available | 0 |
| The Factuality Tax of Diversity-Intervened Text-to-Image Generation: Benchmark and Fact-Augmented Intervention | Jun 29, 2024 | Diversity, Image Generation | Code Available | 0 |
| Can a large language model be a gaslighter? | Oct 11, 2024 | Language Modeling, Language Modelling | Code Available | 0 |
| Conversational Feedback in Scripted versus Spontaneous Dialogues: A Comparative Analysis | Sep 27, 2023 | Language Modeling, Language Modelling | Code Available | 0 |
| Can AI Relate: Testing Large Language Model Response for Mental Health Support | May 20, 2024 | Chatbot, Language Modeling | Code Available | 0 |
| SemEval-2017 Task 4: Sentiment Analysis in Twitter using BERT | Jan 15, 2024 | Binary Classification, Classification | Code Available | 0 |
| TULUN: Transparent and Adaptable Low-resource Machine Translation | May 24, 2025 | Domain Adaptation, Language Modeling | Code Available | 0 |
| A Multi-Pass Large Language Model Framework for Precise and Efficient Radiology Report Error Detection | Jun 25, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| Exploiting the Vulnerability of Large Language Models via Defense-Aware Architectural Backdoor | Sep 3, 2024 | Backdoor Attack, Large Language Model | Code Available | 0 |
| Personalized LLM for Generating Customized Responses to the Same Query from Different Users | Dec 16, 2024 | Contrastive Learning, Diversity | Code Available | 0 |
| Automated Privacy Information Annotation in Large Language Model Interactions | May 27, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| HSI: Head-Specific Intervention Can Induce Misaligned AI Coordination in Large Language Models | Feb 9, 2025 | Answer Generation, Language Modeling | Code Available | 0 |
| Let Me Think! A Long Chain-of-Thought Can Be Worth Exponentially Many Short Ones | May 27, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| Exploiting ChatGPT for Diagnosing Autism-Associated Language Disorders and Identifying Distinct Features | May 3, 2024 | Diagnostic, Language Modelling | Code Available | 0 |
| Scaling Reasoning can Improve Factuality in Large Language Models | May 16, 2025 | Knowledge Graphs, Large Language Model | Code Available | 0 |