Mitigating Intersectional Gender and Racial Bias in Sentiment Analysis: A T5-Based Data Augmentation Approach for English and Low-Resource Bengali

  • Md Saiful Islam, Zhongyuan University of Technology, 450007, China


References

[1] Albladi, A., Islam, M., Seals, C. (2025). Sentiment Analysis of Twitter Data Using NLP Models: A Comprehensive Review, IEEE Access, vol. 13, p. 30444–30468. doi:10.1109/ACCESS.2025.3541494.

[2] He, M. (2024). Enhancing E-Health with Natural Language Processing: The Role of Sentiment Analysis in Modern Healthcare, Trans. Comput. Sci. Intell. Syst. Res., vol. 7, p. 285–290, Nov. doi:10.62051/1b4mdm36.

[3] Lak, A. J., Boostani, R., Alenizi, F. A., Mohammed, A. S., Fakhrahmad, S. M. (2024). RoBERTa, ResNeXt and BiLSTM with self-attention: The ultimate trio for customer sentiment analysis, Appl. Soft Comput., vol. 164, p. 112018, Oct. doi:10.1016/j.asoc.2024.112018.

[4] Venugopal, J. P., Subramanian, A. A. V., Sundaram, G., Rivera, M., Wheeler, P. (2024). A Comprehensive Approach to Bias Mitigation for Sentiment Analysis of Social Media Data, Appl. Sci., 14 (23), p. 11471, Dec. doi:10.3390/app142311471.

[5] Corizzo, R., Hafner, F. S. (2024). Mitigating social bias in sentiment classification via ethnicity-aware algorithmic design, Soc. Netw. Anal. Min., 14 (1), p. 208, Oct. doi:10.1007/s13278-024-01369-9.

[6] Iqbal, M., Karim, A., Kamiran, F. (2019). Balancing Prediction Errors for Robust Sentiment Classification, ACM Trans. Knowl. Discov. Data, 13 (3), p. 1–21, Jun. doi:10.1145/3328795.

[7] Zhao, J., Wang, T., Yatskar, M., Ordonez, V., Chang, K. W. (2018). Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods, In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), New Orleans, Louisiana: Association for Computational Linguistics, doi:10.18653/v1/n18-2003.

[8] Mandis, I. S. (2021). Reducing Racial and Gender Bias in Machine Learning and Natural Language Processing Tasks Using a GAN Approach, Int. J. High Sch. Res., 3 (6), p. 17–24, Dec. doi:10.36838/v3i6.5.

[9] Gao, L., Zhan, H., Sheng, V. S. (2023). Mitigate Gender Bias Using Negative Multi-task Learning, Neural Process. Lett., 55 (8), p. 11131–11146, Dec. doi:10.1007/s11063-023-11368-0.

[10] Garb, H. N. (2021). Race bias and gender bias in the diagnosis of psychological disorders, Clin. Psychol. Rev., vol. 90, p. 102087, Dec. doi:10.1016/j.cpr.2021.102087.

[11] Ding, Y., You, J., Machulla, T. K., Jacobs, J., Sen, P., Höllerer, T. (2022). Impact of Annotator Demographics on Sentiment Dataset Labeling, Proc. ACM Hum.-Comput. Interact., vol. 6, no. CSCW2, p. 1–22, Nov. doi:10.1145/3555632.

[12] Madaio, M., Egede, L., Subramonyam, H., Wortman Vaughan, J., Wallach, H. (2022). Assessing the Fairness of AI Systems: AI Practitioners’ Processes, Challenges, and Needs for Support, Proc. ACM Hum.-Comput. Interact., vol. 6, no. CSCW1, p. 1–26, Mar. doi:10.1145/3512899.

[13] Wang, A., Ramaswamy, V. V., Russakovsky, O. (2022). Towards Intersectionality in Machine Learning: Including More Identities, Handling Underrepresentation, and Performing Evaluation, In: 2022 ACM Conference on Fairness, Accountability, and Transparency, Seoul, Republic of Korea: ACM, Jun., p. 336–349. doi:10.1145/3531146.3533101.

[14] Garcia, N., Hirota, Y., Wu, Y., Nakashima, Y. (2023). Uncurated Image-Text Datasets: Shedding Light on Demographic Bias, In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada: IEEE, Jun., p. 6957–6966. doi:10.1109/cvpr52729.2023.00672.

[15] Bhardwaj, R., Majumder, N., Poria, S. (2021). Investigating Gender Bias in BERT, Cogn. Comput., 13 (4), p. 1008–1018, Jul. doi:10.1007/s12559-021-09881-2.

[16] Hamel, E., Kani, N. (2024). Factors That Influence Automatic Recognition of African American Vernacular English in Machine Learning Models, IEEE/ACM Trans. Audio Speech Lang. Process., vol. 32, p. 509–516, doi:10.1109/TASLP.2023.3331139.

[17] Czarnowska, P., Vyas, Y., Shah, K. (2021). Quantifying Social Biases in NLP: A Generalization and Empirical Comparison of Extrinsic Fairness Metrics, Trans. Assoc. Comput. Linguist., vol. 9, p. 1249–1267, Nov. doi:10.1162/tacl_a_00425.

[18] Field, A., et al. (2023). Examining risks of racial biases in NLP tools for child protective services, In: 2023 ACM Conference on Fairness, Accountability, and Transparency, Chicago, IL, USA: ACM, Jun., p. 1479–1492. doi:10.1145/3593013.3594094.

[19] Turner Lee, N. (2018). Detecting racial bias in algorithms and machine learning, J. Inf. Commun. Ethics Soc., 16 (3), p. 252–260, Aug. doi:10.1108/JICES-06-2018-0056.

[20] Lalor, J., Yang, Y., Smith, K., Forsgren, N., Abbasi, A. (2022). Benchmarking Intersectional Biases in NLP, In: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Seattle, United States: Association for Computational Linguistics, doi:10.18653/v1/2022.naacl-main.263.

[21] Luitel, S., Liu, Y., Anwar, M. (2025). Investigating fairness in machine learning-based audio sentiment analysis, AI Ethics, 5 (2), p. 1099–1108, Apr. doi:10.1007/s43681-024-00453-2.

[22] Chi, Z., Huang, H., Liu, L., Bai, Y., Gao, X., Mao, X. L. (2024). Can Pretrained English Language Models Benefit Non-English NLP Systems in Low-Resource Scenarios?, IEEE/ACM Trans. Audio Speech Lang. Process., vol. 32, p. 1061–1074, doi:10.1109/TASLP.2023.3267618.

[23] Shukla, M., Kumar, A. (2023). An Experimental Analysis of Deep Neural Network Based Classifiers for Sentiment Analysis Task, IEEE Access, vol. 11, p. 36929–36944, doi:10.1109/ACCESS.2023.3266640.

[24] Jang, B., Kim, I., Kim, J. W. (2019). Word2vec convolutional neural networks for classification of news articles and tweets, PLOS ONE, 14 (8), p. e0220976, Aug. doi:10.1371/journal.pone.0220976.

[25] Iatsenko, D. V. (2023). Texts Classification with the usage of Neural Network based on the Word2vec’s Words Representation, Int. J. Soft Comput., 14 (2), p. 1–13, May. doi:10.5121/ijsc.2023.14201.

[26] Choi, J., Lee, S. W. (2020). Improving FastText with inverse document frequency of subwords, Pattern Recognit. Lett., vol. 133, p. 165–172, May. doi:10.1016/j.patrec.2020.03.003.

[27] Dundar, E. B., Alpaydın, E. (2019). Learning Word Representations with Deep Neural Networks for Turkish, In: 2019 27th Signal Processing and Communications Applications Conference (SIU), Sivas, Turkey: IEEE, Apr., p. 1–4. doi:10.1109/SIU.2019.8806491.

[28] Tripathi, Agarwal, R. (2025). Bias Mitigation in NLP: Automated Detection and Correction, Int. J. Res. Mod. Eng. Emerg. Technol., 13 (5), p. 45–60, May. doi:10.63345/ijrmeet.org.v13.i5.130503.

[29] Zhou, H., Inkpen, D., Kantarci, B. (2024). Evaluating and Mitigating Gender Bias in Generative Large Language Models, Int. J. Comput. Commun. Control, 19 (6), Nov. doi:10.15837/ijccc.2024.6.6853.

[30] Sattigeri, P., Hoffman, S. C., Chenthamarakshan, V., Varshney, K. R. (2019). Fairness GAN: Generating datasets with fairness properties using a generative adversarial network, IBM J. Res. Dev., 63 (4/5), p. 3:1–3:9, Jul. doi:10.1147/jrd.2019.2945519.

[31] Zhang, B. H., Lemoine, B., Mitchell, M. (2018). Mitigating Unwanted Biases with Adversarial Learning, In: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, New Orleans, LA, USA: ACM, Dec., p. 335–340. doi:10.1145/3278721.3278779.

[32] Lohia, P. K., Natesan Ramamurthy, K., Bhide, M., Saha, D., Varshney, K. R., Puri, R. (2019). Bias Mitigation Post-processing for Individual and Group Fairness, In: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, United Kingdom: IEEE, May, p. 2847–2851. doi:10.1109/icassp.2019.8682620.

[33] Sze Khoo, L., et al. (2023). Exploring and Repairing Gender Fairness Violations in Word Embedding-based Sentiment Analysis Model through Adversarial Patches, In: 2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), Taipa, Macao: IEEE, Mar., p. 651–662. doi:10.1109/saner56733.2023.00066.

[34] Asyrofi, M. H., Yang, Z., Yusuf, I. N. B., Kang, H. J., Thung, F., Lo, D. (2021). BiasFinder: Metamorphic Test Generation to Uncover Bias for Sentiment Analysis Systems, IEEE Trans. Softw. Eng., p. 1–1, doi:10.1109/tse.2021.3136169.

[35] Almuzaini, A. A., Singh, V. K. (2020). Balancing Fairness and Accuracy in Sentiment Detection using Multiple Black Box Models, In: Proceedings of the 2nd International Workshop on Fairness, Accountability, Transparency and Ethics in Multimedia, Seattle, WA, USA: ACM, Oct., p. 13–19. doi:10.1145/3422841.3423536.

[36] Chen, V. X., Hooker, J. N. (2023). A guide to formulating fairness in an optimization model, Ann. Oper. Res., 326 (1), p. 581–619, Jul. doi:10.1007/s10479-023-05264-y.

[37] Dogra, V., et al. (2022). A Complete Process of Text Classification System Using State-of-the-Art NLP Models, Comput. Intell. Neurosci., vol. 2022, p. 1–26, Jun. doi:10.1155/2022/1883698.

[38] Xu, S., Zhang, C., Hong, D. (2022). BERT-based NLP techniques for classification and severity modeling in basic warranty data study, Insur. Math. Econ., vol. 107, p. 57–67, Nov. doi:10.1016/j.insmatheco.2022.07.013.

[39] Gosho, M., Ohigashi, T., Nagashima, K., Ito, Y., Maruo, K. (2023). Bias in Odds Ratios From Logistic Regression Methods With Sparse Data Sets, J. Epidemiol., 33 (6), p. 265–275, Jun. doi:10.2188/jea.JE20210089.

[40] Tang, Y., Martin, R. (2024). Empirical Bayes inference in sparse high-dimensional generalized linear models, Electron. J. Stat., 18 (2), Jan. doi:10.1214/24-EJS2274.

[41] Du, K. L., Jiang, B., Lu, J., Hua, J., Swamy, M. N. S. (2024). Exploring Kernel Machines and Support Vector Machines: Principles, Techniques, and Future Directions, Mathematics, 12 (24), p. 3935, Dec. doi:10.3390/math12243935.

[42] Perros, H. G. (2021). Support Vector Machines, In: An Introduction to IoT Analytics, 1st ed., Boca Raton: Chapman and Hall/CRC Press, p. 279–302. doi:10.1201/9781003139041-11.

[43] Schonlau, M., Zou, R. Y. (2020). The random forest algorithm for statistical learning, Stata J. Promot. Commun. Stat. Stata, 20 (1), p. 3–29, Mar. doi:10.1177/1536867X20909688.

[44] Hong, S., Lynn, H. S. (2020). Accuracy of random forest-based imputation of missing data in the presence of non-normality, non-linearity, and interaction, BMC Med. Res. Methodol., 20 (1), p. 199, Dec. doi:10.1186/s12874-020-01080-1.
Published
2025-12-13
How to Cite
ISLAM, Md Saiful. Mitigating Intersectional Gender and Racial Bias in Sentiment Analysis: A T5-Based Data Augmentation Approach for English and Low-Resource Bengali. Journal of Digital Information Management (JDIM), [S.l.], v. 23, n. 4, dec. 2025. ISSN 0972-7272. Available at: <https://dline.info/ojs/index.php/jdim/article/view/559>. Date accessed: 21 apr. 2026.