RESCUING THE UZBEK LANGUAGE FROM DIGITAL EXTINCTION: A COMPUTATIONAL FRAMEWORK BASED ON TRANSFORMER ARCHITECTURES
Abstract
The digital divide between globally dominant languages and low-resource languages threatens linguistic and cultural diversity. As AI and NLP become the default interfaces for information exchange, languages without adequate computational support risk “digital extinction”. Uzbek is a prominent example: despite its millions of speakers, it remains poorly supported by such systems because of its complex agglutinative morphology and the lack of high-quality annotated datasets. This article outlines a strategic framework that combines philological expertise with contemporary Transformer-based architectures. By establishing a dedicated morphological analyzer and a robust digital corpus, the research aims to reduce semantic errors in AI processing of Uzbek by at least 30%, so that the language can remain a vital tool for science, innovation, and global discourse in the digital era.
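To make the morphological challenge concrete, the following is a minimal sketch of suffix segmentation for one agglutinated Uzbek noun form. It is an illustration only and not the analyzer proposed in the article: the function `segment`, the `SLOTS` table, and the `min_stem` parameter are hypothetical names, and the suffix inventory is a tiny hand-picked sample.

```python
# Toy suffix inventory, ordered by slot from the end of the word inward.
# Hypothetical and far from complete: a real analyzer needs a lexicon
# and rules for sound alternations at morpheme boundaries.
SLOTS = [
    ("CASE", ["ning", "dan", "da", "ga", "ni"]),
    ("POSS", ["imiz", "ingiz", "im", "ing", "i"]),
    ("PLURAL", ["lar"]),
]


def segment(word, min_stem=3):
    """Strip at most one suffix per slot, working from the end of the word."""
    stem = word.lower()
    found = []
    for label, suffixes in SLOTS:
        # Try longer suffixes first so "imiz" wins over "im" and "i".
        for suffix in sorted(suffixes, key=len, reverse=True):
            if stem.endswith(suffix) and len(stem) - len(suffix) >= min_stem:
                stem = stem[: -len(suffix)]
                found.append((suffix, label))
                break
    # Suffixes were collected outermost-first; reverse to surface order.
    return stem, list(reversed(found))


# "kitoblarimizda" ("in our books") = kitob + lar + imiz + da
print(segment("kitoblarimizda"))
# → ('kitob', [('lar', 'PLURAL'), ('imiz', 'POSS'), ('da', 'CASE')])
```

Pure suffix stripping of this kind over-segments easily: the same function splits the loanword "taksi" into "taks" plus a possessive "i". Errors like this are one reason a lexicon-backed analyzer and an annotated corpus matter for agglutinative languages.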