The Ethics of AI in Language Assessment: A Critical Examination of Algorithmic Bias in Automated Speaking and Writing Tests

doi:10.70177/ijlul.v3i6.2838

Roni Subhan ⁽¹⁾, Alejandro Díaz ⁽²⁾, Karan Singh ⁽³⁾

(1) Universitas Islam Negeri Kiai Haji Achmad Siddiq Jember, Indonesia

(2) University of Concepción, Chile

(3) Banaras Hindu University (BHU), India

Issue: Vol. 3 No. 6 (2025)

Submitted: 9 July 2025

Published: 24 December 2025

Keywords: Artificial Intelligence, Automated Testing, Language Assessment

Abstract

Background. The growing adoption of artificial intelligence (AI) in language assessment has generated serious ethical concerns, particularly regarding algorithmic bias in automated speaking and writing tests.

Purpose. This study aimed to investigate the ethical challenges associated with algorithmic bias in AI-based language assessments, with a specific focus on automated speaking and writing evaluations.

Method. A qualitative research design was employed. Data were collected through a systematic analysis of existing literature on AI in language assessment, expert interviews with educators and AI developers, and a review of selected case studies involving automated language testing systems.

Results. The findings indicate that algorithmic bias is a significant issue in AI-driven language assessments. Biases in speech recognition and automated text evaluation were found to contribute to inaccurate scoring and unfair assessment outcomes for certain demographic groups. These biases can perpetuate systemic inequalities and undermine the validity and reliability of AI-based language testing.

Conclusion. The study concludes that although AI offers considerable potential for advancing language assessment, its ethical risks must be carefully addressed. Ensuring transparency, fairness, and accountability is essential in the design and implementation of AI-based assessment systems.

Authors

Roni Subhan

Universitas Islam Negeri Kiai Haji Achmad Siddiq Jember

ronisubhan1@uinkhas.ac.id (Primary Contact)

Alejandro Díaz

University of Concepción

Karan Singh

Banaras Hindu University (BHU)

Subhan, R., Díaz, A., & Singh, K. (2025). The Ethics of AI in Language Assessment: A Critical Examination of Algorithmic Bias in Automated Speaking and Writing Tests. International Journal of Language and Ubiquitous Learning, 3(6), 307–317. https://doi.org/10.70177/ijlul.v3i6.2838