| Scaling Inference-Time Search with Vision Value Model for Improved Visual Comprehension | Dec 4, 2024 | Descriptive, Language Modeling | Code Available | 1 |
| GraphXAIN: Narratives to Explain Graph Neural Networks | Nov 4, 2024 | Descriptive, Feature Importance | Code Available | 1 |
| SpeakGer: A meta-data enriched speech corpus of German state and federal parliaments | Oct 23, 2024 | Descriptive, Sentiment Analysis | Code Available | 1 |
| Scene Graph Generation with Role-Playing Large Language Models | Oct 20, 2024 | Descriptive, Graph Generation | Code Available | 1 |
| Enriching Music Descriptions with a Finetuned-LLM and Metadata for Text-to-Music Retrieval | Oct 4, 2024 | Descriptive, Language Modeling | Code Available | 1 |
| ReCLAP: Improving Zero Shot Audio Classification by Describing Sounds | Sep 13, 2024 | Audio Classification, Descriptive | Code Available | 1 |
| RSTeller: Scaling Up Visual Language Modeling in Remote Sensing with Rich Linguistic Semantics from Openly Available Data and Large Language Models | Aug 27, 2024 | Descriptive, Language Modeling | Code Available | 1 |
| Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization | Aug 26, 2024 | Descriptive, Image Captioning | Code Available | 1 |
| Leveraging Large Language Models for Enhancing the Understandability of Generated Unit Tests | Aug 21, 2024 | Bug fixing, Descriptive | Code Available | 1 |
| FFAA: Multimodal Large Language Model based Explainable Open-World Face Forgery Analysis Assistant | Aug 19, 2024 | Descriptive, Face Swapping | Code Available | 1 |