| Re-mine, Learn and Reason: Exploring the Cross-modal Semantic Correlations for Language-guided HOI detection | Jul 25, 2023 | Human-Object Interaction Detection, Sentence | —Unverified | 0 |
| 3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding | Jul 25, 2023 | 3D visual grounding, Object | —Unverified | 0 |
| Enhancing image captioning with depth information using a Transformer-based framework | Jul 24, 2023 | Image Captioning, Image Paragraph Captioning | —Unverified | 0 |
| PRIOR: Prototype Representation Joint Learning from Medical Images and Reports | Jul 24, 2023 | Contrastive Learning, Image to text | Code Available | 1 |
| Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset and Comprehensive Framework | Jul 24, 2023 | Contrastive Learning, Multimodal Reasoning | Code Available | 1 |
| Transformer-based Joint Source Channel Coding for Textual Semantic Communication | Jul 23, 2023 | Semantic Communication, Semantic Similarity | —Unverified | 0 |
| Identifying Misinformation on YouTube through Transcript Contextual Analysis with Transformer Models | Jul 22, 2023 | Articles, Classification | Code Available | 0 |
| Explainable Topic-Enhanced Argument Mining from Heterogeneous Sources | Jul 22, 2023 | Argument Mining, Language Modelling | —Unverified | 0 |
| MeetEval: A Toolkit for Computation of Word Error Rates for Meeting Transcription Systems | Jul 21, 2023 | Sentence | Code Available | 1 |
| Jina Embeddings: A Novel Set of High-Performance Sentence Embedding Models | Jul 20, 2023 | Negation, Retrieval | —Unverified | 0 |