| NLPHut’s Participation at WAT2021 | Aug 1, 2021 | Caption Generation, Image Captioning | —Unverified | 0 |
| NOC-REK: Novel Object Captioning with Retrieved Vocabulary from External Knowledge | Mar 28, 2022 | Caption Generation, Object | —Unverified | 0 |
| O2NA: An Object-Oriented Non-Autoregressive Approach for Controllable Video Captioning | Aug 5, 2021 | Attribute, Caption Generation | —Unverified | 0 |
| OBJ2TEXT: Generating Visually Descriptive Language from Object Layouts | Jul 22, 2017 | Caption Generation, Descriptive | —Unverified | 0 |
| PathM3: A Multimodal Multi-Task Multiple Instance Learning Framework for Whole Slide Image Classification and Captioning | Mar 13, 2024 | Caption Generation, Diagnostic | —Unverified | 0 |
| Predicting the Mumble of Wireless Channel with Sequence-to-Sequence Models | Jan 14, 2019 | Caption Generation, Language Modeling | —Unverified | 0 |
| Relationship-based Neural Baby Talk | Mar 8, 2021 | Caption Generation, Graph Attention | —Unverified | 0 |
| REST: REtrieve & Self-Train for generative action recognition | Sep 29, 2022 | Action Recognition, Caption Generation | —Unverified | 0 |
| Rethinking the Form of Latent States in Image Captioning | Jul 26, 2018 | Caption Generation, Form | —Unverified | 0 |
| Retrieval-Augmented Multimodal Language Modeling | Nov 22, 2022 | Caption Generation, Image Captioning | —Unverified | 0 |
| Review Networks for Caption Generation | May 25, 2016 | Caption Generation, Decoder | —Unverified | 0 |
| RUC+CMU: System Report for Dense Captioning Events in Videos | Jun 22, 2018 | Caption Generation, Dense Captioning | —Unverified | 0 |
| Scene-based Factored Attention for Image Captioning | Aug 7, 2019 | Caption Generation, Decoder | —Unverified | 0 |
| Scene Graph Generation for Better Image Captioning? | Sep 23, 2021 | Caption Generation, Graph Generation | —Unverified | 0 |
| Scene Understanding for Autonomous Manipulation with Deep Learning | Mar 23, 2019 | Action Understanding, Affordance Detection | —Unverified | 0 |
| See It All: Contextualized Late Aggregation for 3D Dense Captioning | Aug 14, 2024 | 3D dense captioning, All | —Unverified | 0 |
| Seq2Mol: Automatic design of de novo molecules conditioned by the target protein sequences through deep neural networks | Oct 29, 2020 | Caption Generation, Language Modelling | —Unverified | 0 |
| Sequence to Sequence - Video to Text | Dec 1, 2015 | Caption Generation, Language Modeling | —Unverified | 0 |
| Set Prediction Guided by Semantic Concepts for Diverse Video Captioning | Dec 25, 2023 | Caption Generation, Diversity | —Unverified | 0 |
| Simultaneous Segmentation and Recognition: Towards more accurate Ego Gesture Recognition | Sep 18, 2019 | Activity Recognition, Caption Generation | —Unverified | 0 |
| Skip-Gram − Zipf + Uniform = Vector Additivity | Jul 1, 2017 | Caption Generation, Dimensionality Reduction | —Unverified | 0 |
| Social Media Ready Caption Generation for Brands | Jan 3, 2024 | Caption Generation, Image Captioning | —Unverified | 0 |
| Soft + Hardwired Attention: An LSTM Framework for Human Trajectory Prediction and Abnormal Event Detection | Feb 18, 2017 | Caption Generation, Event Detection | —Unverified | 0 |
| Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning | Feb 27, 2019 | Attribute, Caption Generation | —Unverified | 0 |
| Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning | Feb 8, 2023 | Caption Generation, Decoder | —Unverified | 0 |