| Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbreaks | Nov 23, 2024 | Language Modeling | Code Available | 2 |
| Large Language Model with Region-guided Referring and Grounding for CT Report Generation | Nov 23, 2024 | Computed Tomography (CT), Diagnostic | Code Available | 2 |
| ScribeAgent: Towards Specialized Web Agents Using Production-Scale Workflow Data | Nov 22, 2024 | Language Modeling | Code Available | 2 |
| RE-Bench: Evaluating frontier AI R&D capabilities of language model agents against human experts | Nov 22, 2024 | AI Agent, Language Modeling | Code Available | 2 |
| GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI | Nov 21, 2024 | Decision Making, Language Modeling | Code Available | 2 |
| MC-LLaVA: Multi-Concept Personalized Vision-Language Model | Nov 18, 2024 | Language Modeling | Code Available | 2 |
| BianCang: A Traditional Chinese Medicine Large Language Model | Nov 17, 2024 | Diagnostic, Language Modeling | Code Available | 2 |
| GeoGround: A Unified Large Vision-Language Model for Remote Sensing Visual Grounding | Nov 16, 2024 | Instruction Following, Language Modeling | Code Available | 2 |
| SEAGULL: No-reference Image Quality Assessment for Regions of Interest via Vision-Language Instruction Tuning | Nov 15, 2024 | Image Quality Assessment, Language Modeling | Code Available | 2 |
| LHRS-Bot-Nova: Improved Multimodal Large Language Model for Remote Sensing Vision-Language Interpretation | Nov 14, 2024 | Earth Observation, Instruction Following | Code Available | 2 |