| VORTEX: A Spatial Computing Framework for Optimized Drone Telemetry Extraction from First-Person View Flight Data | Dec 24, 2024 | Computational Efficiency, Optical Character Recognition | Unverified | 0 |
| LMV-RPA: Large Model Voting-based Robotic Process Automation | Dec 23, 2024 | Optical Character Recognition, Optical Character Recognition (OCR) | Code Available | 0 |
| Deciphering the Underserved: Benchmarking LLM OCR for Low-Resource Scripts | Dec 20, 2024 | Benchmarking, Optical Character Recognition | Code Available | 0 |
| RoundTripOCR: A Data Generation Technique for Enhancing Post-OCR Error Correction in Low-Resource Devanagari Languages | Dec 14, 2024 | Machine Translation, Optical Character Recognition | Code Available | 0 |
| Advancing Vehicle Plate Recognition: Multitasking Visual Language Models with VehiclePaliGemma | Dec 14, 2024 | GPU, License Plate Recognition | Unverified | 0 |
| Enhancement of text recognition for hanja handwritten documents of Ancient Korea | Dec 14, 2024 | Data Augmentation, object-detection | Unverified | 0 |
| DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding | Dec 13, 2024 | Chart Understanding, Mixture-of-Experts | Code Available | 9 |
| POINTS1.5: Building a Vision-Language Model towards Real World Applications | Dec 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Text Change Detection in Multilingual Documents Using Image Comparison | Dec 5, 2024 | Binarization, Change Detection | Unverified | 0 |
| Aligned Music Notation and Lyrics Transcription | Dec 5, 2024 | Language Modeling, Language Modelling | Code Available | 0 |