| RES-Q: Evaluating Code-Editing Large Language Model Systems at the Repository Scale | Jun 24, 2024 | Code Generation, HumanEval | Code Available | 1 |
| An Enhanced Fake News Detection System With Fuzzy Deep Learning | Jun 24, 2024 | Deep Learning, Fact Checking | Code Available | 1 |
| INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness | Jun 23, 2024 | Code Generation, Navigate | Code Available | 1 |
| EndoUIC: Promptable Diffusion Transformer for Unified Illumination Correction in Capsule Endoscopy | Jun 19, 2024 | Exposure Correction, Image Enhancement | Code Available | 1 |
| Look Further Ahead: Testing the Limits of GPT-4 in Path Planning | Jun 17, 2024 | Navigate | Code Available | 1 |
| OoDIS: Anomaly Instance Segmentation Benchmark | Jun 17, 2024 | Anomaly Instance Segmentation, Anomaly Segmentation | Code Available | 1 |
| SemanticSpray++: A Multimodal Dataset for Autonomous Driving in Wet Surface Conditions | Jun 14, 2024 | Autonomous Driving, Autonomous Vehicles | Code Available | 1 |
| SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature | Jun 10, 2024 | Claim Verification, Instruction Following | Code Available | 1 |
| MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows | Jun 10, 2024 | Navigate | Code Available | 1 |
| Why Only Text: Empowering Vision-and-Language Navigation with Multi-modal Prompts | Jun 4, 2024 | Navigate, Vision and Language Navigation | Code Available | 1 |