| DCR-Consistency: Divide-Conquer-Reasoning for Consistency Evaluation and Improvement of Large Language Models | Jan 4, 2024 | Hallucination, Sentence | Code Available | 1 |
| Towards Efficient and Effective Text-to-Video Retrieval with Coarse-to-Fine Visual Representation Learning | Jan 1, 2024 | Representation Learning, Retrieval | Code Available | 1 |
| Gaussian Mixture Proposals with Pull-Push Learning Scheme to Capture Diverse Events for Weakly Supervised Temporal Video Grounding | Dec 27, 2023 | Sentence, Temporal Sentence Grounding | Code Available | 1 |
| Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval | Dec 19, 2023 | cross-modal alignment, Moment Retrieval | Code Available | 1 |
| Mask Grounding for Referring Image Segmentation | Dec 19, 2023 | cross-modal alignment, Image Segmentation | Code Available | 1 |
| Lever LM: Configuring In-Context Sequence to Lever Large Vision Language Models | Dec 15, 2023 | Image Captioning, In-Context Learning | Code Available | 1 |
| Dense X Retrieval: What Retrieval Granularity Should We Use? | Dec 11, 2023 | Retrieval, Sentence | Code Available | 1 |
| NuScenes-MQA: Integrated Evaluation of Captions and QA for Autonomous Driving Datasets using Markup Annotations | Dec 11, 2023 | Autonomous Driving, Descriptive | Code Available | 1 |
| BAM-DETR: Boundary-Aligned Moment Detection Transformer for Temporal Sentence Grounding in Videos | Nov 30, 2023 | Moment Retrieval, Natural Language Moment Retrieval | Code Available | 1 |
| Contrastive Vision-Language Alignment Makes Efficient Instruction Learner | Nov 29, 2023 | Contrastive Learning, Image Captioning | Code Available | 1 |