| User-Friendly Customized Generation with Multi-Modal Prompts | May 26, 2024 | Descriptive, Image Generation | Code Available | 1 |
| Benchmarking Hierarchical Image Pyramid Transformer for the classification of colon biopsies and polyps in histopathology images | May 24, 2024 | Benchmarking, Classification | Unverified | 0 |
| Composed Image Retrieval for Remote Sensing | May 24, 2024 | Composed Image Retrieval (CoIR), Descriptive | Code Available | 2 |
| Boosting Medical Image-based Cancer Detection via Text-guided Supervision from Reports | May 23, 2024 | Clinical Knowledge, Descriptive | Unverified | 0 |
| Accelerated Evaluation of Ollivier-Ricci Curvature Lower Bounds: Bridging Theory and Computation | May 22, 2024 | Descriptive | Unverified | 0 |
| Peripheral Nervous System Responses to Food Stimuli: Analysis Using Data Science Approaches | May 21, 2024 | Descriptive, Subgroup Discovery | Unverified | 0 |
| Could a Computer Architect Understand our Brain? | May 21, 2024 | Descriptive, ERP | Unverified | 0 |
| Towards a Framework for Openness in Foundation Models: Proceedings from the Columbia Convening on Openness in Artificial Intelligence | May 17, 2024 | Descriptive | Unverified | 0 |
| A Deep Learning Approach to Heterogeneous Consumer Aesthetics in Retail Fashion | May 17, 2024 | Deep Learning, Descriptive | Unverified | 0 |
| Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots | May 13, 2024 | Code Generation, Descriptive | Unverified | 0 |
| Analysis and prevention of AI-based phishing email attacks | May 8, 2024 | Descriptive | Unverified | 0 |
| Remote Diffusion | May 7, 2024 | Descriptive, RAG | Unverified | 0 |
| Time Series Stock Price Forecasting Based on Genetic Algorithm (GA)-Long Short-Term Memory Network (LSTM) Optimization | May 6, 2024 | Descriptive, Stock Price Prediction | Unverified | 0 |
| Mozart's Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models | May 5, 2024 | Descriptive, Language Modeling | Code Available | 1 |
| SkelCap: Automated Generation of Descriptive Text from Skeleton Keypoint Sequences | May 5, 2024 | Descriptive | Unverified | 0 |
| FITA: Fine-grained Image-Text Aligner for Radiology Report Generation | May 2, 2024 | Descriptive, Triplet | Unverified | 0 |
| CookingSense: A Culinary Knowledgebase with Multidisciplinary Assertions | May 1, 2024 | Descriptive, Language Modeling | Unverified | 0 |
| Bridge to Non-Barrier Communication: Gloss-Prompted Fine-grained Cued Speech Gesture Generation with Diffusion Model | Apr 30, 2024 | Descriptive, Gesture Generation | Unverified | 0 |
| Análise de ambiguidade linguística em modelos de linguagem de grande escala (LLMs) | Apr 25, 2024 | Descriptive | Unverified | 0 |
| Aligning LLM Agents by Learning Latent Preference from User Edits | Apr 23, 2024 | Descriptive, Language Modelling | Code Available | 1 |
| A Survey of Decomposition-Based Evolutionary Multi-Objective Optimization: Part II -- A Data Science Perspective | Apr 22, 2024 | Anatomy, Descriptive | Unverified | 0 |
| Iteratively Prompting Multimodal LLMs to Reproduce Natural and AI-Generated Images | Apr 21, 2024 | Descriptive | Unverified | 0 |
| ANCHOR: LLM-driven News Subject Conditioning for Text-to-Image Synthesis | Apr 15, 2024 | Descriptive, Image Captioning | Code Available | 0 |
| Tokenization, Fusion, and Augmentation: Towards Fine-grained Multi-modal Entity Representation | Apr 15, 2024 | Contrastive Learning, Descriptive | Code Available | 3 |
| TrafficVLM: A Controllable Visual Language Model for Traffic Video Captioning | Apr 14, 2024 | Dense Video Captioning, Descriptive | Code Available | 2 |