| MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions | Aug 16, 2023 | Motion Expressions Guided Video SegmentationObject | CodeCode Available | 2 |
| MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection | Mar 4, 2024 | GPUMamba | CodeCode Available | 2 |
| CVSS Corpus and Massively Multilingual Speech-to-Speech Translation | Jan 11, 2022 | SentenceSpeech-to-Speech Translation | CodeCode Available | 2 |
| BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs | Jul 17, 2023 | Instruction FollowingSentence | CodeCode Available | 2 |
| MedCPT: Contrastive Pre-trained Transformers with Large-scale PubMed Search Logs for Zero-shot Biomedical Information Retrieval | Jul 2, 2023 | Biomedical Information RetrievalContrastive Learning | CodeCode Available | 2 |
| BEYOND DIALOGUE: A Profile-Dialogue Alignment Framework Towards General Role-Playing Language Model | Aug 20, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric | Dec 16, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 2 |
| CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing | Feb 21, 2022 | Few-Shot LearningSentence | CodeCode Available | 2 |
| AutoRE: Document-Level Relation Extraction with Large Language Models | Mar 21, 2024 | Document-level Relation ExtractionRelation | CodeCode Available | 2 |
| Abstractive Summarization of Spoken andWritten Instructions with BERT | Aug 21, 2020 | Abstractive Text SummarizationArticles | CodeCode Available | 2 |