SOTAVerified

Transliteration

Transliteration is a mechanism for converting a word in a source (foreign) language to a target language, and it often adopts approaches from machine translation. In machine translation, the objective is to preserve the semantic meaning of the utterance as much as possible while following the syntactic structure of the target language. In transliteration, the objective is to preserve the original pronunciation of the source word as much as possible while following the phonological structures of the target language.

For example, the city name “Manchester” has become well known to speakers of languages other than English. These new words are often named entities that are important in cross-lingual information retrieval, information extraction, and machine translation, and they often present out-of-vocabulary challenges to spoken language technologies such as automatic speech recognition, spoken keyword search, and text-to-speech.

Source: Phonology-Augmented Statistical Framework for Machine Transliteration using Limited Linguistic Resources
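
As a rough illustration of the idea described above (rendering the sound of a source word in the target script), here is a minimal rule-based sketch in Python. It is not the method of the cited paper, which is statistical and models phonology. The mapping table LATIN_TO_CYRILLIC and the function transliterate are hypothetical names, and the table is a tiny hand-picked subset chosen only so that the "Manchester" example works; it is not a standard transliteration scheme.

```python
# Toy grapheme-to-grapheme transliteration (Latin -> Cyrillic).
# ASSUMPTION: this mapping is a small illustrative subset, not a standard scheme.
LATIN_TO_CYRILLIC = {
    "ch": "ч",  # multi-letter grapheme, must be matched before single letters
    "a": "а",
    "e": "е",
    "m": "м",
    "n": "н",
    "r": "р",
    "s": "с",
    "t": "т",
}


def transliterate(word: str, table: dict = LATIN_TO_CYRILLIC) -> str:
    """Greedy longest-match transliteration; unknown characters pass through."""
    max_len = max(len(key) for key in table)
    out = []
    i = 0
    while i < len(word):
        # Try the longest candidate grapheme first, then shorter ones.
        for size in range(max_len, 0, -1):
            chunk = word[i:i + size]
            target = table.get(chunk.lower())
            if target is not None:
                # Preserve capitalization of the first source letter.
                out.append(target.capitalize() if chunk[0].isupper() else target)
                i += len(chunk)
                break
        else:
            # No rule applies: copy the character unchanged.
            out.append(word[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(transliterate("Manchester"))
```

A fixed table like this ignores context-dependent pronunciation, which is one reason practical systems learn the mapping from data instead.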

Papers

Showing 26–50 of 435 papers

Title | Status | Hype
Exploring the Role of Transliteration in In-Context Learning for Low-resource Languages Written in Non-Latin Scripts | - | 0
Can Small Language Models Learn, Unlearn, and Retain Noise Patterns? | Code | 0
Breaking the Script Barrier in Multilingual Pre-Trained Language Models with Transliteration-Based Post-Training Alignment | Code | 0
Jailbreaking LLMs with Arabic Transliteration and Arabizi | Code | 0
Review of Computational Epigraphy | - | 0
TransMI: A Framework to Create Strong Baselines from Multilingual Pretrained Language Models for Transliterated Data | Code | 0
ParaNames 1.0: Creating an Entity Name Corpus for 400+ Languages using Wikidata | Code | 1
Swa Bhasha: Message-Based Singlish to Sinhala Transliteration | - | 0
Charles Translator: A Machine Translation System between Ukrainian and Czech | - | 0
Cost-Performance Optimization for Processing Low-Resource Language Tasks Using Commercial LLMs | - | 0
Training a Bilingual Language Model by Mapping Tokens onto a Shared Character Space | - | 0
Cross-Lingual Transfer from Related Languages: Treating Low-Resource Maltese as Multilingual Code-Switching | - | 0
TransliCo: A Contrastive Learning Framework to Address the Script Barrier in Multilingual Pretrained Language Models | Code | 0
Language Detection for Transliterated Content | - | 0
Code-Mixed Text to Speech Synthesis under Low-Resource Constraints | - | 0
Character-Level Bangla Text-to-IPA Transcription Using Transformer Architecture with Sequence Alignment | - | 0
BenLLMEval: A Comprehensive Evaluation into the Potentials and Pitfalls of Large Language Models on Bengali NLP | - | 0
Exploring Linguistic Similarity and Zero-Shot Learning for Multilingual Translation of Dravidian Languages | - | 0
Show Me the World in My Language: Establishing the First Baseline for Scene-Text to Scene-Text Translation | Code | 1
Taqyim: Evaluating Arabic NLP Tasks Using ChatGPT Models | Code | 1
Multilingual Neural Machine Translation System for Indic to Indic Languages | - | 0
Learning Cross-lingual Mappings for Data Augmentation to Improve Low-Resource Speech Recognition | - | 0
DeepScribe: Localization and Classification of Elamite Cuneiform Signs Via Deep Learning | Code | 1
Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration | Code | 1
XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages | Code | 1

No leaderboard results yet.