Detection of AI-Generated Text Using Linguistic Features and Machine Learning for Preserving Academic Integrity
Contributors
Saleem
Dr. Basant Kumar
Track
Engineering and Sciences
License
Copyright (c) 2026 Sustainable Global Societies Initiative

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Abstract
The advancement of Large Language Models (LLMs) has reshaped how information is produced and consumed, enabling highly coherent, human-like text across diverse applications such as academic research, tutoring, summarization, and code generation. While these advances enhance productivity and learning efficiency, they also raise pressing ethical concerns, including misinformation, plagiarism, and violations of academic integrity. Current detection methods often rely on model-specific parameters or hidden states, which limits their adaptability to new and evolving architectures. Similarly, linguistic and embedding-based approaches, though effective, are computationally intensive, which reduces scalability and hinders real-time deployment. Proprietary detection tools further complicate accessibility: they are costly and frequently lose accuracy against rapidly advancing models. To address these challenges, a lightweight hybrid framework integrating linguistic features and machine learning techniques is proposed. By combining stylometric, syntactic, and pragmatic features, the approach aims to improve generalization across domains while maintaining computational efficiency. Evaluated on multi-domain datasets with adversarial testing, the framework demonstrates resilience against paraphrasing and evolving LLMs. Such solutions are crucial for trustworthy AI integration, balancing the benefits of LLMs with the need for ethical safeguards and scalable detection mechanisms.
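To make the proposed pipeline concrete, the sketch below shows one way handcrafted stylometric and syntactic features could feed a lightweight classifier. It is a minimal illustration, not the authors' implementation: it assumes scikit-learn is available, and the specific features (type-token ratio, sentence length, word-distribution entropy, punctuation density), the `stylometric_features` helper, and the toy training samples are all hypothetical placeholders chosen to mirror the kinds of cues the abstract names.

```python
import re
import math
from collections import Counter

from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def stylometric_features(text: str) -> list[float]:
    """Extract a small set of stylometric/syntactic cues from raw text.

    Feature choices here are illustrative, not those of the paper.
    """
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    n_sents = max(len(sentences), 1)

    # Lexical diversity: type-token ratio.
    ttr = len(set(words)) / n_words
    # Average sentence length in words (syntactic-complexity proxy).
    avg_sent_len = n_words / n_sents
    # Mean word length (crude vocabulary-sophistication proxy).
    avg_word_len = sum(len(w) for w in words) / n_words
    # Shannon entropy of the word distribution (repetitiveness cue).
    counts = Counter(words)
    entropy = -sum((c / n_words) * math.log2(c / n_words)
                   for c in counts.values())
    # Punctuation density as a simple stylistic/pragmatic cue.
    punct_density = sum(text.count(p) for p in ",;:") / n_words

    return [ttr, avg_sent_len, avg_word_len, entropy, punct_density]


# Lightweight classifier over the handcrafted features; any efficient
# scikit-learn estimator could stand in for logistic regression here.
clf = make_pipeline(StandardScaler(), LogisticRegression())

# Hypothetical training data: texts labeled 0 = human, 1 = AI-generated.
train_texts = ["...human-written sample...", "...LLM-generated sample..."]
labels = [0, 1]
X = [stylometric_features(t) for t in train_texts]
clf.fit(X, labels)

# Score an unseen passage: 0 predicts human, 1 predicts AI-generated.
print(clf.predict([stylometric_features("Some unseen passage to score.")]))
```

Because the features are cheap surface statistics rather than model logits or deep embeddings, inference stays model-agnostic and fast, which is the efficiency and adaptability trade-off the abstract argues for over parameter-dependent or embedding-based detectors.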