SOTAVerified

Transliteration

Transliteration converts a word from a source (foreign) language into a target language, and it often adopts approaches from machine translation. In machine translation, the objective is to preserve the semantic meaning of the utterance as much as possible while following the syntactic structure of the target language. In transliteration, the objective is to preserve the original pronunciation of the source word as much as possible while following the phonological structure of the target language.

For example, the city name “Manchester” has become well known to speakers of languages other than English. Such borrowed words are often named entities that are important in cross-lingual information retrieval, information extraction, and machine translation, and they often present out-of-vocabulary challenges to spoken language technologies such as automatic speech recognition, spoken keyword search, and text-to-speech.

Source: Phonology-Augmented Statistical Framework for Machine Transliteration using Limited Linguistic Resources
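
To make the idea of preserving pronunciation concrete, here is a minimal sketch of rule-based transliteration. It is only an illustration, not the method of the cited paper or of any paper listed here: the `RULES` table and the `transliterate` helper are hypothetical names, the letter mapping is a rough approximation rather than a romanization standard, and real systems usually learn such mappings statistically or with neural models.

```python
# Hypothetical toy table: a few Russian Cyrillic letters mapped to rough
# Latin equivalents. Illustrative only; not an official romanization scheme.
RULES = {
    "м": "m", "а": "a", "н": "n", "ч": "ch",
    "е": "e", "с": "s", "т": "t", "р": "r",
}


def transliterate(word: str, rules: dict) -> str:
    """Rewrite each character using `rules`; unknown characters pass through unchanged."""
    return "".join(rules.get(ch, ch) for ch in word)


# The Russian spelling of the city name maps back to a Latin-script form.
print(transliterate("манчестер", RULES))  # manchester
```

A per-character lookup like this ignores context, which is exactly where it breaks down: the same letter can call for different outputs depending on its neighbours, and this is one reason the task borrows sequence models from machine translation.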

Papers

Showing 1-50 of 435 papers

Title | Status | Hype
Aksharantar: Open Indic-language Transliteration datasets and models for the Next Billion Users | Code | 2
GR-NLP-TOOLKIT: An Open-Source NLP Toolkit for Modern Greek | Code | 2
Applying the Transformer to Character-level Transduction | Code | 1
ParaNames: A Massively Multilingual Entity Name Corpus | Code | 1
An Ensemble Model of Word-based and Character-based Models for Japanese and Chinese Input Method | Code | 1
Question Answering Classification for Amharic Social Media Community Based Questions | Code | 1
Taqyim: Evaluating Arabic NLP Tasks Using ChatGPT Models | Code | 1
Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration | Code | 1
XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages | Code | 1
Processing South Asian Languages Written in the Latin Script: the Dakshina Dataset | Code | 1
KLPT – Kurdish Language Processing Toolkit | Code | 1
Beyond Arabic: Software for Perso-Arabic Script Manipulation | Code | 1
Show Me the World in My Language: Establishing the First Baseline for Scene-Text to Scene-Text Translation | Code | 1
ParaNames 1.0: Creating an Entity Name Corpus for 400+ Languages using Wikidata | Code | 1
DeepScribe: Localization and Classification of Elamite Cuneiform Signs Via Deep Learning | Code | 1
Leveraging Multilingual News Websites for Building a Kurdish Parallel Corpus | Code | 1
ParsiPy: NLP Toolkit for Historical Persian Texts in Python | Code | 1
Sub-Character Tokenization for Chinese Pretrained Language Models | Code | 1
A machine transliteration tool between Uzbek alphabets | Code | 1
A Comparison of Entity Matching Methods between English and Japanese Katakana | - | 0
A Framework for the Classification and Annotation of Multiword Expressions in Dialectal Arabic | - | 0
ARGUABLY at ComMA@ICON: Detection of Multilingual Aggressive, Gender Biased, and Communally Charged Tweets Using Ensemble and Fine-Tuned IndicBERT | - | 0
A Multilinear Approach to the Unsupervised Learning of Morphology | - | 0
A Digital Swedish-Yiddish/Yiddish-Swedish Dictionary: A Web-Based Dictionary that is also Available Offline | - | 0
A Deep Learning Based Approach to Transliteration | - | 0
A Multi-Orthography Parallel Corpus of Yiddish Nouns | - | 0
A Myanmar (Burmese)-English Named Entity Transliteration Dictionary | - | 0
Analyzing English-Spanish Named-Entity enhanced Machine Translation | - | 0
Analyzing Urdu Social Media for Sentiments using Transfer Learning with Controlled Translations | - | 0
Agreement on Target-bidirectional Neural Machine Translation | - | 0
amLite: Amharic Transliteration Using Key Map Dictionary | - | 0
A Comparative Study of Extremely Low-Resource Transliteration of the World's Languages | - | 0
A Bird's-eye View of Language Processing Projects at the Romanian Academy | - | 0
A Simple but Effective Approach to Improve Arabizi-to-English Statistical Machine Translation | - | 0
Arabic Retrieval Revisited: Morphological Hole Filling | - | 0
A Layered Language Model based Hybrid Approach to Automatic Full Diacritization of Arabic | - | 0
Addressing Noise in Multidialectal Word Embeddings | - | 0
A Correlational Encoder Decoder Architecture for Pivot Based Sequence Generation | - | 0
Applying Sanskrit Concepts for Reordering in MT | - | 0
Applying Neural Networks to English-Chinese Named Entity Transliteration | - | 0
A Classical Chinese Corpus with Nested Part-of-Speech Tags | - | 0
Approche Hybride pour la translitération de l'Arabizi Algérien : une étude préliminaire (A hybrid approach for the transliteration of Algerian Arabizi: A primary study) | - | 0
Arabic Diacritization: Stats, Rules, and Hacks | - | 0
Arabic Diacritization with Recurrent Neural Networks | - | 0
3arif: A Corpus of Modern Standard and Egyptian Arabic Tweets Annotated for Epistemic Modality Using Interactive Crowdsourcing | - | 0
Arabic to English Person Name Transliteration using Twitter | - | 0
Arabizi Detection and Conversion to Arabic | - | 0
Arabizi sentiment analysis based on transliteration and automatic corpus annotation | - | 0
AraNLP: a Java-based Library for the Processing of Arabic Text. | - | 0
Applying mpaligner to Machine Transliteration with Japanese-Specific Heuristics | - | 0

No leaderboard results yet.