| Comprehending and Ordering Semantics for Image Captioning | Jun 14, 2022 | Cross-Modal Retrieval, Image Captioning | Code Available | 2 | 5 |
| Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation | Apr 4, 2024 | Contrastive Learning, Referring Expression | Code Available | 2 | 5 |
| Aligning and Prompting Everything All at Once for Universal Visual Perception | Dec 4, 2023 | All, Object | Code Available | 2 | 5 |
| ALBERT: A Lite BERT for Self-supervised Learning of Language Representations | Sep 26, 2019 | Common Sense Reasoning, GPU | Code Available | 2 | 5 |
| DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory | Oct 10, 2024 | Document Translation, Machine Translation | Code Available | 2 | 5 |
| Mammo-CLIP: A Vision Language Foundation Model to Enhance Data Efficiency and Robustness in Mammography | May 20, 2024 | Breast Cancer Detection, Diversity | Code Available | 2 | 5 |
| MeaCap: Memory-Augmented Zero-shot Image Captioning | Mar 6, 2024 | Caption Generation, Image Captioning | Code Available | 2 | 5 |
| MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions | Aug 16, 2023 | Motion Expressions Guided Video Segmentation, Object | Code Available | 2 | 5 |
| MuCGEC: a Multi-Reference Multi-Source Evaluation Dataset for Chinese Grammatical Error Correction | Apr 23, 2022 | Grammatical Error Correction, Sentence | Code Available | 2 | 5 |
| NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality | May 9, 2022 | Sentence, Speech Synthesis | Code Available | 2 | 5 |
| PAWS-X: A Cross-lingual Adversarial Dataset for Paraphrase Identification | Aug 30, 2019 | Paraphrase Identification, Sentence | Code Available | 2 | 5 |
| ANAH: Analytical Annotation of Hallucinations in Large Language Models | May 30, 2024 | Generative Question Answering, Hallucination | Code Available | 2 | 5 |
| CCTC: A Cross-Sentence Chinese Text Correction Dataset for Native Speakers | Oct 1, 2022 | Grammatical Error Correction, Sentence | Code Available | 2 | 5 |
| BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric | Dec 16, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 | 5 |
| MedCPT: Contrastive Pre-trained Transformers with Large-scale PubMed Search Logs for Zero-shot Biomedical Information Retrieval | Jul 2, 2023 | Biomedical Information Retrieval, Contrastive Learning | Code Available | 2 | 5 |
| CLUE: A Chinese Language Understanding Evaluation Benchmark | Apr 13, 2020 | General Classification, Machine Reading Comprehension | Code Available | 2 | 5 |
| AutoRE: Document-Level Relation Extraction with Large Language Models | Mar 21, 2024 | Document-level Relation Extraction, Relation | Code Available | 2 | 5 |
| beeFormer: Bridging the Gap Between Semantic and Interaction Similarity in Recommender Systems | Sep 16, 2024 | Collaborative Filtering, Recommendation Systems | Code Available | 2 | 5 |
| BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs | Jul 17, 2023 | Instruction Following, Sentence | Code Available | 2 | 5 |
| CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing | Feb 21, 2022 | Few-Shot Learning, Sentence | Code Available | 2 | 5 |
| Advancing Large Language Models to Capture Varied Speaking Styles and Respond Properly in Spoken Conversations | Feb 20, 2024 | Sentence | Code Available | 2 | 5 |
| Compositional Visual Generation with Composable Diffusion Models | Jun 3, 2022 | Sentence | Code Available | 2 | 5 |
| ARAGOG: Advanced RAG Output Grading | Apr 1, 2024 | Document Embedding, Language Modeling | Code Available | 2 | 5 |
| CVSS Corpus and Massively Multilingual Speech-to-Speech Translation | Jan 11, 2022 | Sentence, Speech-to-Speech Translation | Code Available | 2 | 5 |
| Active Retrieval Augmented Generation | May 11, 2023 | Retrieval, Retrieval-augmented Generation | Code Available | 2 | 5 |