| CLIP-ReID: Exploiting Vision-Language Model for Image Re-Identification without Concrete Text Labels | Nov 25, 2022 | Image Classification | Code Available | 2 |
| ClipCap: CLIP Prefix for Image Captioning | Nov 18, 2021 | Image Captioning, Language Modeling | Code Available | 2 |
| Beyond Text-Visual Attention: Exploiting Visual Cues for Effective Token Pruning in VLMs | Dec 2, 2024 | All, Language Modeling | Code Available | 2 |
| A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity | Jan 3, 2024 | Language Modeling | Code Available | 2 |
| GPT Understands, Too | Mar 18, 2021 | Knowledge Probing, Language Modeling | Code Available | 2 |
| LingoQA: Visual Question Answering for Autonomous Driving | Dec 21, 2023 | Autonomous Driving, Decision Making | Code Available | 2 |
| MedCPT: Contrastive Pre-trained Transformers with Large-scale PubMed Search Logs for Zero-shot Biomedical Information Retrieval | Jul 2, 2023 | Biomedical Information Retrieval, Contrastive Learning | Code Available | 2 |
| Granite Guardian | Dec 10, 2024 | Hallucination, Language Modeling | Code Available | 2 |
| GPT-Driver: Learning to Drive with GPT | Oct 2, 2023 | Autonomous Driving, Autonomous Vehicles | Code Available | 2 |
| Language Model Powered Digital Biology with BRAD | Sep 4, 2024 | Chatbot, Code Generation | Code Available | 2 |