LLaMA: Open and Efficient Foundation Language Models | Feb 27, 2023 | Arithmetic Reasoning, Code Generation | Code available: 7
GPT-4 Technical Report | Mar 15, 2023 | answerability prediction, Arithmetic Reasoning | Code available: 6
MEIA: Multimodal Embodied Perception and Interaction in Unknown Environments | Feb 1, 2024 | Embodied Question Answering, Language Modeling | Code available: 5
Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively | Jan 5, 2024 | image-classification, Image Classification | Code available: 5
Zephyr: Direct Distillation of LM Alignment | Oct 25, 2023 | 2D Cyclist Detection, Few-Shot Learning | Code available: 5
ImageBind: One Embedding Space To Bind Them All | May 9, 2023 | All, Cross-Modal Retrieval | Code available: 5
Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese | Nov 2, 2022 | Contrastive Learning, image-classification | Code available: 5
Scaling Up Biomedical Vision-Language Models: Fine-Tuning, Instruction Tuning, and Multi-Modal Learning | May 23, 2025 | Decoder, Image Captioning | Code available: 4
FG-CLIP: Fine-Grained Visual and Textual Alignment | May 8, 2025 | Image-text Retrieval, object-detection | Code available: 4
A Survey of State of the Art Large Vision Language Models: Alignment, Benchmark, Evaluations and Challenges | Jan 4, 2025 | Fairness, Hallucination | Code available: 4
Multimodal Whole Slide Foundation Model for Pathology | Nov 29, 2024 | Cross-Modal Retrieval, model | Code available: 4
Zero-shot forecasting of chaotic systems | Sep 24, 2024 | Attribute, In-Context Learning | Code available: 4
Multi-label Cluster Discrimination for Visual Representation Learning | Jul 24, 2024 | Contrastive Learning, Image-text Retrieval | Code available: 4
Long-CLIP: Unlocking the Long-Text Capability of CLIP | Mar 22, 2024 | Image Generation, Image Retrieval | Code available: 4
MEDITRON-70B: Scaling Medical Pretraining for Large Language Models | Nov 27, 2023 | Articles, Conditional Text Generation | Code available: 4
Time-LLM: Time Series Forecasting by Reprogramming Large Language Models | Oct 3, 2023 | Time Series, Time Series Forecasting | Code available: 4
The Segment Anything Model (SAM) for Remote Sensing Applications: From Zero to One Shot | Jun 29, 2023 | Image Segmentation, Semantic Segmentation | Code available: 4
Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective | Oct 16, 2022 | Coreference Resolution, Multiple-choice | Code available: 4
Flamingo: a Visual Language Model for Few-Shot Learning | Apr 29, 2022 | Few-Shot Learning, Generative Visual Question Answering | Code available: 4
AnyGraph: Graph Foundation Model in the Wild | Aug 20, 2024 | Graph Learning, Mixture-of-Experts | Code available: 3
Description Boosting for Zero-Shot Entity and Relation Classification | Jun 4, 2024 | Relation, Relation Classification | Code available: 3
Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters | Mar 18, 2024 | Continual Learning, Incremental Learning | Code available: 3
LLM-Pruner: On the Structural Pruning of Large Language Models | May 19, 2023 | Text Generation, zero-shot-classification | Code available: 3
Finetuned Language Models Are Zero-Shot Learners | Sep 3, 2021 | ARC, Common Sense Reasoning | Code available: 3
Language Models are Few-Shot Learners | May 28, 2020 | answerability prediction, Articles | Code available: 3
MegaHan97K: A Large-Scale Dataset for Mega-Category Chinese Character Recognition with over 97K Categories | Jun 5, 2025 | Benchmarking, Optical Character Recognition | Code available: 2
GeoVision Labeler: Zero-Shot Geospatial Classification with Vision and Language Models | May 30, 2025 | Classification, Disaster Response | Code available: 2
Patho-R1: A Multimodal Reinforcement Learning-Based Pathology Expert Reasoner | May 16, 2025 | Cross-Modal Retrieval, Diagnostic | Code available: 2
SALT: A Flexible Semi-Automatic Labeling Tool for General LiDAR Point Clouds with Cross-Scene Adaptability and 4D Consistency | Mar 31, 2025 | Zero-Shot Learning | Code available: 2
DiffCLIP: Differential Attention Meets CLIP | Mar 9, 2025 | Language Modeling, Language Modelling | Code available: 2
Audio-FLAN: A Preliminary Release | Feb 23, 2025 | Zero-Shot Learning | Code available: 2
Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding | Jan 24, 2025 | Anatomy, Contrastive Learning | Code available: 2
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature | Jan 13, 2025 | Articles, Image-text Retrieval | Code available: 2
CorrCLIP: Reconstructing Correlations in CLIP with Off-the-Shelf Foundation Models for Open-Vocabulary Semantic Segmentation | Nov 15, 2024 | Open Vocabulary Semantic Segmentation, Open-Vocabulary Semantic Segmentation | Code available: 2
Boosting Vision-Language Models for Histopathology Classification: Predict all at once | Sep 3, 2024 | All, zero-shot-classification | Code available: 2
Enhancing Remote Sensing Vision-Language Models for Zero-Shot Scene Classification | Sep 1, 2024 | Scene Classification, Transductive Zero-Shot Classification | Code available: 2
MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale | Aug 29, 2024 | Deep Reinforcement Learning, Imitation Learning | Code available: 2
LLMs as Zero-shot Graph Learners: Alignment of GNN Representations with LLM Token Embeddings | Aug 25, 2024 | Language Modelling, Link Prediction | Code available: 2
EasyRec: Simple yet Effective Language Models for Recommendation | Aug 16, 2024 | Collaborative Filtering, Contrastive Learning | Code available: 2
ESP-MedSAM: Efficient Self-Prompting SAM for Universal Domain-Generalized Medical Image Segmentation | Jul 19, 2024 | Decoder, Image Segmentation | Code available: 2
FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models | Jul 1, 2024 | Benchmarking, Fairness | Code available: 2
Mitigate the Gap: Investigating Approaches for Improving Cross-Modal Alignment in CLIP | Jun 25, 2024 | cross-modal alignment, Image Classification | Code available: 2
RWKV-CLIP: A Robust Vision-Language Representation Learner | Jun 11, 2024 | Image-text Retrieval, Representation Learning | Code available: 2
CLIP-Mamba: CLIP Pretrained Mamba Models with OOD and Hessian Evaluation | Apr 30, 2024 | Mamba, State Space Models | Code available: 2
Test-Time Adaptation with SaLIP: A Cascade of SAM and CLIP for Zero shot Medical Image Segmentation | Apr 9, 2024 | Image Segmentation, Medical Image Segmentation | Code available: 2
RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition | Mar 20, 2024 | Contrastive Learning, Fine-Grained Visual Recognition | Code available: 2
OpenGraph: Open-Vocabulary Hierarchical 3D Graph Representation in Large-Scale Outdoor Environments | Mar 14, 2024 | Zero-Shot Learning | Code available: 2
Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement | Mar 11, 2024 | Clinical Knowledge, Descriptive | Code available: 2
CARZero: Cross-Attention Alignment for Radiology Zero-Shot Classification | Feb 27, 2024 | Classification, Diagnostic | Code available: 2
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models | Feb 19, 2024 | Adversarial Defense, Multimodal Deep Learning | Code available: 2