| Mining Fine-Grained Image-Text Alignment for Zero-Shot Captioning via Text-Only Training | Jan 4, 2024 | Descriptive, Image Captioning | Code Available | 1 | 5 |
| Mixture of Low-rank Experts for Transferable AI-Generated Image Detection | Apr 7, 2024 | Descriptive, parameter-efficient fine-tuning | Code Available | 1 | 5 |
| A Fine-tuning Dataset and Benchmark for Large Language Models for Protein Understanding | Jun 8, 2024 | Descriptive, Language Modelling | Code Available | 1 | 5 |
| Deep Implicit Statistical Shape Models for 3D Medical Image Delineation | Apr 7, 2021 | Descriptive, Liver Segmentation | Code Available | 1 | 5 |
| A Bi-directional Transformer for Musical Chord Recognition | Jul 5, 2019 | Chord Recognition, Descriptive | Code Available | 1 | 5 |
| Deep learning based geometric registration for medical images: How accurate can we get without visual features? | Mar 1, 2021 | Decoder, Descriptive | Code Available | 1 | 5 |
| Can Machines Learn Morality? The Delphi Experiment | Oct 14, 2021 | Descriptive, Ethics | Code Available | 1 | 5 |
| Mozart's Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models | May 5, 2024 | Descriptive, Language Modeling | Code Available | 1 | 5 |
| A Foundation Language-Image Model of the Retina (FLAIR): Encoding Expert Knowledge in Text Supervision | Aug 15, 2023 | Descriptive, Language Modelling | Code Available | 1 | 5 |
| JAMMIN-GPT: Text-based Improvisation using LLMs in Ableton Live | Dec 6, 2023 | Descriptive | Code Available | 1 | 5 |
| IDAS: Intent Discovery with Abstractive Summarization | May 31, 2023 | Abstractive Text Summarization, Descriptive | Code Available | 1 | 5 |
| CRAFT: A Benchmark for Causal Reasoning About Forces and inTeractions | Dec 8, 2020 | counterfactual, Descriptive | Code Available | 1 | 5 |
| DFR: Deep Feature Reconstruction for Unsupervised Anomaly Segmentation | Dec 13, 2020 | Anomaly Detection, Anomaly Segmentation | Code Available | 1 | 5 |
| TV-SAM: Increasing Zero-Shot Segmentation Performance on Multimodal Medical Images Using GPT-4 Generated Descriptive Prompts Without Human Annotation | Feb 24, 2024 | Descriptive, Language Modeling | Code Available | 1 | 5 |
| Text-Guided Neural Image Inpainting | Apr 7, 2020 | Descriptive, Image Generation | Code Available | 1 | 5 |
| Distilling BlackBox to Interpretable models for Efficient Transfer Learning | May 26, 2023 | Descriptive, Transfer Learning | Code Available | 1 | 5 |
| DEER: Descriptive Knowledge Graph for Explaining Entity Relationships | May 21, 2022 | BIG-bench Machine Learning, Descriptive | Code Available | 1 | 5 |
| CiteTracker: Correlating Image and Text for Visual Tracking | Aug 22, 2023 | Attribute, Descriptive | Code Available | 1 | 5 |
| A Good Foundation is Worth Many Labels: Label-Efficient Panoptic Segmentation | May 29, 2024 | Autonomous Driving, Boundary Detection | Code Available | 1 | 5 |
| ANNdotNET -- deep learning tool on .NET Platform | Sep 23, 2020 | Deep Learning, Descriptive | Code Available | 1 | 5 |
| Driving Style Recognition Using Interval Type-2 Fuzzy Inference System and Multiple Experts Decision Making | Oct 26, 2021 | Decision Making, Descriptive | Code Available | 1 | 5 |
| DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving | Jun 21, 2025 | Autonomous Driving, Descriptive | Code Available | 1 | 5 |
| EgoTaskQA: Understanding Human Tasks in Egocentric Videos | Oct 8, 2022 | Action Localization, counterfactual | Code Available | 1 | 5 |
| Dual-Level Collaborative Transformer for Image Captioning | Jan 16, 2021 | Descriptive, Image Captioning | Code Available | 1 | 5 |
| Enhancing Monocular 3D Scene Completion with Diffusion Model | Mar 2, 2025 | 3D Reconstruction, 3D Scene Reconstruction | Code Available | 1 | 5 |
| PDNS-Net: A Large Heterogeneous Graph Benchmark Dataset of Network Resolutions for Graph Learning | Mar 15, 2022 | Classification, Descriptive | Code Available | 1 | 5 |
| Confidence-aware Pseudo-label Learning for Weakly Supervised Visual Grounding | Jan 1, 2023 | Descriptive, Object | Code Available | 1 | 5 |
| Ins-HOI: Instance Aware Human-Object Interactions Recovery | Dec 15, 2023 | Descriptive, Disentanglement | Code Available | 1 | 5 |
| Logical Consistency and Greater Descriptive Power for Facial Hair Attribute Learning | Feb 22, 2023 | Attribute, Descriptive | Code Available | 1 | 5 |
| Causal Modeling of Twitter Activity During COVID-19 | May 16, 2020 | Causal Inference, Descriptive | Code Available | 1 | 5 |
| HiTab: A Hierarchical Table Dataset for Question Answering and Natural Language Generation | Aug 15, 2021 | Descriptive, Entity Alignment | Code Available | 1 | 5 |
| A Linear Time and Space Local Point Cloud Geometry Encoder via Vectorized Kernel Mixture (VecKM) | Apr 2, 2024 | Descriptive | Code Available | 1 | 5 |
| CENet: Toward Concise and Efficient LiDAR Semantic Segmentation for Autonomous Driving | Jul 26, 2022 | 3D Semantic Segmentation, Autonomous Driving | Code Available | 1 | 5 |
| High-Fidelity 3D Face Generation from Natural Language Descriptions | May 5, 2023 | Descriptive, Face Generation | Code Available | 1 | 5 |
| Human-like Controllable Image Captioning with Verb-specific Semantic Roles | Mar 22, 2021 | Caption Generation, controllable image captioning | Code Available | 1 | 5 |
| Grounded Entity-Landmark Adaptive Pre-training for Vision-and-Language Navigation | Aug 24, 2023 | cross-modal alignment, Descriptive | Code Available | 1 | 5 |
| Can Knowledge Graphs Simplify Text? | Aug 14, 2023 | Descriptive, KG-to-Text Generation | Code Available | 1 | 5 |
| Hallucination-Aware Multimodal Benchmark for Gastrointestinal Image Analysis with Large Vision-Language Models | May 11, 2025 | Descriptive, Diagnostic | Code Available | 1 | 5 |
| GraphLIME: Local Interpretable Model Explanations for Graph Neural Networks | Jan 17, 2020 | Descriptive, feature selection | Code Available | 1 | 5 |
| GOAL: Global-local Object Alignment Learning | Mar 22, 2025 | Descriptive, Object | Code Available | 1 | 5 |
| GL-RG: Global-Local Representation Granularity for Video Captioning | May 22, 2022 | Caption Generation, Descriptive | Code Available | 1 | 5 |
| Graph Backdoor | Jun 21, 2020 | Backdoor Attack, Descriptive | Code Available | 1 | 5 |
| GraphXAIN: Narratives to Explain Graph Neural Networks | Nov 4, 2024 | Descriptive, Feature Importance | Code Available | 1 | 5 |
| HDCC: A Hyperdimensional Computing compiler for classification on embedded systems and high-performance computing | Apr 24, 2023 | C++ code, Descriptive | Code Available | 1 | 5 |
| Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books | Jun 22, 2015 | Descriptive, Diversity | Code Available | 1 | 5 |
| Natural scene reconstruction from fMRI signals using generative latent diffusion | Mar 9, 2023 | Brain Computer Interface, Brain Decoding | Code Available | 1 | 5 |
| Aligning LLM Agents by Learning Latent Preference from User Edits | Apr 23, 2024 | Descriptive, Language Modelling | Code Available | 1 | 5 |
| A Sparse and Locally Coherent Morphable Face Model for Dense Semantic Correspondence Across Heterogeneous 3D Faces | Jun 6, 2020 | Descriptive, Face Model | Code Available | 1 | 5 |
| Hybrid Symbolic-Numeric Library for Power System Modeling and Analysis | Feb 21, 2020 | Descriptive | Code Available | 1 | 5 |
| A Recipe for Creating Multimodal Aligned Datasets for Sequential Tasks | May 19, 2020 | Descriptive | Code Available | 1 | 5 |