GR-NLP-TOOLKIT: An Open-Source NLP Toolkit for Modern Greek (Dec 11, 2024). Tasks: Dependency Parsing, Morphological Tagging
Aksharantar: Open Indic-language Transliteration datasets and models for the Next Billion Users (May 6, 2022). Tasks: Transliteration
Taqyim: Evaluating Arabic NLP Tasks Using ChatGPT Models (Jun 28, 2023). Tasks: Part-Of-Speech Tagging, Sentiment Analysis
An Ensemble Model of Word-based and Character-based Models for Japanese and Chinese Input Method (Dec 1, 2012). Tasks: Transliteration
Question Answering Classification for Amharic Social Media Community Based Questions (Jun 1, 2022). Tasks: 8k, Question Answering
ParaNames: A Massively Multilingual Entity Name Corpus (Feb 28, 2022). Tasks: Named Entity Recognition
Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration (May 25, 2023). Tasks: Speech Synthesis, Text-to-Speech
Show Me the World in My Language: Establishing the First Baseline for Scene-Text to Scene-Text Translation (Aug 6, 2023). Tasks: Machine Translation, Scene Text Editing
A machine transliteration tool between Uzbek alphabets (May 19, 2022). Tasks: Transliteration
XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages (May 19, 2023). Tasks: In-Context Learning, Multilingual NLP
ParaNames 1.0: Creating an Entity Name Corpus for 400+ Languages using Wikidata (May 15, 2024). Tasks: Multilingual Named Entity Recognition, Named Entity Recognition
Processing South Asian Languages Written in the Latin Script: the Dakshina Dataset (Jul 2, 2020). Tasks: Language Modeling
Sub-Character Tokenization for Chinese Pretrained Language Models (Jun 1, 2021). Tasks: Chinese Word Segmentation, Computational Efficiency
Leveraging Multilingual News Websites for Building a Kurdish Parallel Corpus (Oct 4, 2020). Tasks: Articles, Machine Translation
DeepScribe: Localization and Classification of Elamite Cuneiform Signs Via Deep Learning (Jun 2, 2023). Tasks: Transliteration
ParsiPy: NLP Toolkit for Historical Persian Texts in Python (Mar 22, 2025). Tasks: Lemmatization, Part-Of-Speech Tagging
KLPT – Kurdish Language Processing Toolkit (Nov 1, 2020). Tasks: Diversity, Lemmatization
Applying the Transformer to Character-level Transduction (May 20, 2020). Tasks: Grapheme-to-Phoneme Conversion, Morphological Inflection
Beyond Arabic: Software for Perso-Arabic Script Manipulation (Jan 26, 2023). Tasks: Transliteration
Specializing Multilingual Language Models: An Empirical Study (Jun 16, 2021). Tasks: Dependency Parsing, Named Entity Recognition
Sequence-to-sequence neural network models for transliteration (Oct 29, 2016). Tasks: Machine Translation, Translation
Sinhala Transliteration: A Comparative Analysis Between Rule-based and Seq2Seq Approaches (Dec 31, 2024). Tasks: Decoder, Machine Translation
Role of Language Relatedness in Multilingual Fine-tuning of Language Models: A Case Study in Indo-Aryan Languages (Sep 22, 2021). Tasks: Multiple Choice Question Answering (MCQA), Natural Language Inference
A Multi-cascaded Deep Model for Bilingual SMS Classification (Nov 29, 2019). Tasks: Classification, General Classification
Orthographic Transliteration for Kabyle Speech Recognition (Nov 1, 2021). Tasks: Speech Recognition
On Biasing Transformer Attention Towards Monotonicity (Apr 8, 2021). Tasks: Grapheme-to-Phoneme Conversion, Morphological Inflection
Romanized to Native Malayalam Script Transliteration Using an Encoder-Decoder Framework (Dec 13, 2024). Tasks: Decoder, Transliteration
Towards Offensive Language Identification for Dravidian Languages (Apr 1, 2021). Tasks: Few-Shot Learning, Language Identification
Towards Offensive Language Identification for Tamil Code-Mixed YouTube Comments and Posts (Aug 24, 2021). Tasks: Language Identification, Transfer Learning
An Empirical Study of Chinese Name Matching and Applications (Jul 1, 2015). Tasks: Coreference Resolution, Entity Linking
How Transliterations Improve Crosslingual Alignment (Sep 25, 2024). Tasks: Sentence, Transliteration
Jailbreaking LLMs with Arabic Transliteration and Arabizi (Jun 26, 2024). Tasks: Transliteration
A Large-scale Evaluation of Neural Machine Transliteration for Indic Languages (Apr 1, 2021). Tasks: Translation, Transliteration
Exploiting Language Relatedness for Low Web-Resource Language Model Adaptation: An Indic Languages Study (Jun 7, 2021). Tasks: Data Augmentation, Language Modeling
How Grammatical is Character-level Neural Machine Translation? Assessing MT Quality with Contrastive Translation Pairs (Dec 14, 2016). Tasks: Machine Translation, NMT
Neural Machine Translation Techniques for Named Entity Transliteration (Jul 1, 2018). Tasks: Automatic Post-Editing, Decoder
Cross-Lingual Text Classification of Transliterated Hindi and Malayalam (Aug 31, 2021). Tasks: Benchmarking, Classification
Design Challenges in Named Entity Transliteration (Aug 7, 2018). Tasks: Decoder, Transliteration
Creating a Translation Matrix of the Bible's Names Across 591 Languages (May 1, 2018). Tasks: Entity Alignment, Machine Translation
Event detection in Twitter: A keyword volume approach (Jan 3, 2019). Tasks: Binary Classification, Event Detection
Context Independent Term Mapper for European Languages (Sep 1, 2013). Tasks: Information Retrieval, Machine Translation
Creating Large-Scale Multilingual Cognate Tables (May 1, 2018). Tasks: Machine Translation, Semantic Textual Similarity
Does Transliteration Help Multilingual Language Modeling? (Jan 29, 2022). Tasks: Diversity, Language Modeling
IIITT@Dravidian-CodeMix-FIRE2021: Transliterate or translate? Sentiment analysis of code-mixed text in Dravidian languages (Nov 15, 2021). Tasks: Marketing, Sentiment Analysis
Bootstrapping Transliteration with Constrained Discovery for Low-Resource Languages (Sep 20, 2018). Tasks: Entity Linking, Transliteration
Bilingual dictionaries for all EU languages (May 1, 2014). Tasks: All, Machine Translation
Breaking the Script Barrier in Multilingual Pre-Trained Language Models with Transliteration-Based Post-Training Alignment (Jun 28, 2024). Tasks: Cross-Lingual Transfer, Transliteration
A Rule-based Kurdish Text Transliteration System (Nov 26, 2018). Tasks: Transliteration
Can Small Language Models Learn, Unlearn, and Retain Noise Patterns? (Jul 1, 2024). Tasks: In-Context Learning, Transliteration
Efficient Sequence Labeling with Actor-Critic Training (Sep 30, 2018). Tasks: Decision Making, NER