A Comparative Study of Advanced NLP Models for Accurate Gujarati Word Tagging

Pooja Bhatt; Pawan Whig

Date Published: 1 May 2026

Contributors

Pooja Bhatt

Lincoln University College Malaysia

Author

Pawan Whig

Author

Keywords

Natural Language Processing; Gujarati Language; POS Tagging; Deep Learning; Transformer Models; Low Resource Languages

Proceeding

Vol. 1 No. 4 (2026): LGPR Batch 3 Conference 2

Track

Engineering and Sciences

License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Abstract

Part-of-Speech (POS) tagging is a fundamental Natural Language Processing (NLP) task that assigns a grammatical category to each word in a sentence. It underpins downstream applications such as machine translation, information retrieval, sentiment analysis, and syntactic parsing. Developing robust POS taggers for low-resource languages such as Gujarati remains challenging because annotated corpora are limited and the morphology is complex. Gujarati is a morphologically rich language with free word order, which introduces significant ambiguity in linguistic analysis. This research presents a comparative study of statistical, machine learning, and deep learning approaches to Gujarati POS tagging. Specifically, Hidden Markov Model (HMM), Conditional Random Fields (CRF), Bi-directional Long Short-Term Memory (Bi-LSTM), and transformer-based models such as XLM-R are evaluated under a unified experimental setup. The experimental results show that transformer-based architectures achieve the highest tagging accuracy, while Bi-LSTM offers a strong trade-off between computational efficiency and performance. The study contributes a systematic evaluation framework and provides insights for designing efficient NLP tools for low-resource Indian languages.
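
To make the simplest of the compared model families concrete, the following is a minimal sketch of an HMM tagger with Viterbi decoding and a token-level accuracy measure. It is not the authors' implementation. The four-tag set, the three toy sentences, the add-one smoothing, and all function names (`train_hmm`, `viterbi`, `accuracy`) are assumptions made purely for illustration; the abstract does not specify the corpus, tagset, or smoothing scheme used in the study.

```python
import math
from collections import Counter, defaultdict

START = "<s>"  # hypothetical sentence-start marker


def train_hmm(tagged_sentences):
    """Count tag transitions and word emissions from [(word, tag), ...] sentences."""
    trans = defaultdict(Counter)  # trans[prev_tag][tag]
    emit = defaultdict(Counter)   # emit[tag][word]
    vocab = set()
    for sent in tagged_sentences:
        prev = START
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, sorted(emit), vocab


def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log probability (an assumed smoothing choice)."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))


def viterbi(words, model):
    """Return the most probable tag sequence for a list of words."""
    if not words:
        return []
    trans, emit, tags, vocab = model
    n_tags, n_words = len(tags), len(vocab) + 1  # +1 slot for unseen words
    # Each column maps tag -> (best log score, backpointer to previous tag).
    columns = [{
        t: (log_prob(trans[START], t, n_tags)
            + log_prob(emit[t], words[0], n_words), None)
        for t in tags
    }]
    for word in words[1:]:
        column = {}
        for t in tags:
            score, prev = max(
                (columns[-1][p][0] + log_prob(trans[p], t, n_tags), p)
                for p in tags
            )
            column[t] = (score + log_prob(emit[t], word, n_words), prev)
        columns.append(column)
    # Follow backpointers from the best final tag.
    tag = max(tags, key=lambda t: columns[-1][t][0])
    path = [tag]
    for column in reversed(columns[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]


def accuracy(model, tagged_sentences):
    """Token-level accuracy: fraction of words whose predicted tag matches gold."""
    correct = total = 0
    for sent in tagged_sentences:
        words = [w for w, _ in sent]
        gold = [t for _, t in sent]
        correct += sum(p == g for p, g in zip(viterbi(words, model), gold))
        total += len(gold)
    return correct / total


# Toy data with illustrative tags; not drawn from the paper's corpus.
train = [
    [("હું", "PRON"), ("ઘરે", "NOUN"), ("જાઉં", "VERB"), ("છું", "AUX")],
    [("તે", "PRON"), ("પુસ્તક", "NOUN"), ("વાંચે", "VERB"), ("છે", "AUX")],
]
test = [
    [("હું", "PRON"), ("પુસ્તક", "NOUN"), ("વાંચું", "VERB"), ("છું", "AUX")],
]

model = train_hmm(train)
print(viterbi([w for w, _ in test[0]], model))
print(accuracy(model, test))
```

The test sentence contains a verb form that never appears in the training data, so its tag has to come from the transition probabilities alone. Handling unseen inflected forms like this is the kind of difficulty a morphologically rich, low-resource language poses. The same `accuracy` function could in principle score a CRF, Bi-LSTM, or XLM-R tagger on identical held-out sentences, which is one plausible reading of a unified experimental setup.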

How to Cite

Bhatt, P., & Whig, P. (2026). A Comparative Study of Advanced NLP Models for Accurate Gujarati Word Tagging. Sustainable Global Societies Initiative, 1(4). https://vectmag.com/sgsi/paper/view/382