Using FastText-BERT to Extract Semantic Relations and Improve Sentiment Analysis of Persian Healthcare Service Reviews

Forootan, Faezeh; Khayami, Raouf; Shamsinejad, Pirooz

doi:10.30476/smsj.2025.100979.1469

Document Type: Original Article

Authors

¹ PhD Candidate, Department of Computer Engineering and Information Technology, Shiraz University of Technology, Shiraz, Iran

² Associate Professor, Department of Computer Engineering and Information Technology, Shiraz University of Technology, Shiraz, Iran

³ Assistant Professor, Department of Computer Engineering and Information Technology, Shiraz University of Technology, Shiraz, Iran

Abstract

Introduction: Patients' opinions are a valuable indicator of the quality of healthcare services. As the volume of textual healthcare reviews grows, these reviews increasingly shape other patients' decisions when selecting medical services. Consequently, researchers use sentiment analysis to extract insights, classify sentiments, and identify patient needs and behavioral patterns, and thereby to develop strategies that enhance patient satisfaction. However, patient reviews often contain a significant amount of specialized terminology, whereas existing sentiment analysis tools are typically trained on general-domain data. Analyzing these reviews accurately therefore requires models, and combinations of models, that yield reliable and valid results on such text.
Methods: To improve the efficiency and accuracy of sentiment analysis for Persian healthcare reviews, this study utilized the FastText-BERT hybrid embedding model for semantic relation extraction and the CNN-BiLSTM model for sentence-level sentiment classification.
Results: The proposed framework achieved an accuracy of 86% and an F1-score of 84.99%.
Conclusion: The results demonstrated that combining embedding models leverages the strengths of both approaches, enabling the identification of specialized and out-of-domain expressions and the extraction of semantic relationships between them. This combination significantly enhances the efficiency and accuracy of sentiment analysis.
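The idea of combining a subword-based embedding with a contextual one can be illustrated with a minimal sketch. The abstract does not state how the two embeddings are merged, so per-token concatenation is an assumption here, and every name, n-gram table, and vector below is a hypothetical toy value rather than output from a trained FastText or BERT model. The sketch only shows why a subword component can still produce a vector for a word that was never seen as a whole.

```python
from typing import Dict, List

Vector = List[float]


def char_ngrams(word: str, n: int = 3) -> List[str]:
    """Character n-grams of a word wrapped in boundary markers."""
    padded = f"<{word}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def subword_vector(word: str, ngram_table: Dict[str, Vector], dim: int) -> Vector:
    """Average the vectors of the word's known n-grams; zeros if none are known."""
    known = [ngram_table[g] for g in char_ngrams(word) if g in ngram_table]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


def hybrid_embeddings(
    tokens: List[str],
    ngram_table: Dict[str, Vector],
    contextual: List[Vector],
    static_dim: int,
) -> List[Vector]:
    """Concatenate a subword vector and a contextual vector for each token.

    Concatenation is an assumed combination strategy, not the paper's
    confirmed method.
    """
    if len(tokens) != len(contextual):
        raise ValueError("need exactly one contextual vector per token")
    return [
        subword_vector(token, ngram_table, static_dim) + list(ctx)
        for token, ctx in zip(tokens, contextual)
    ]


# Hypothetical toy data: a two-dimensional n-gram table and three-dimensional
# stand-ins for contextual vectors (one per token position).
toy_ngrams = {"<do": [1.0, 0.0], "tor": [0.0, 1.0]}
toy_context = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]

# "doctors" shares the n-grams "<do" and "tor" with "doctor", so it still
# receives a non-zero subword vector.
features = hybrid_embeddings(["doctor", "doctors"], toy_ngrams, toy_context, static_dim=2)
```

In a setup like this, each row of `features` would be passed to a downstream sentence classifier such as the CNN-BiLSTM named in the Methods; that classifier is not sketched here.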


Sadra Medical Journal

Volume 13, Issue 1
January 2025
Pages 155-168
