TY - CHAP
T1 - A Machine Learning Model for Predicting Fetal Hemoglobin Levels in Sickle Cell Disease Patients
AU - Oikonomou, Konstantinos
AU - Steinhöfel, Kathleen
AU - Menzel, Stephan
N1 - Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
PY - 2021/9/24
Y1 - 2021/9/24
N2 - Sickle cell disease is one of the most common genetic diseases and is characterized by a decreased hemoglobin concentration in the blood. The main known factor that can alleviate the disease is the persistence of fetal hemoglobin (HbF), so the aim of our research is to build a model that predicts a patient's HbF% from the three genes that regulate it (BCL11A, Xmn1-HBG2 and HBS1L-MYB). A machine learning approach is employed to improve the accuracy of the model, and several machine learning algorithms are explored. The K-nearest neighbors algorithm is chosen, and an initial version of it is implemented and tested. Finally, the algorithm is optimized, enabling the model to predict the HbF% of a patient with 87.25% accuracy, a major improvement over the existing alternative, which has a mean error of 336.33%. Furthermore, 93.45% of our predictions have an absolute error of less than 0.5. These results support the model as a quick and accurate estimation tool for small and medium-sized clinical trials, where fast HbF% predictions can help adjust for the genetic background variability that obscures test outcomes.
KW - Fetal hemoglobin
KW - Machine learning prediction model
KW - Sickle cell disease
UR - http://www.scopus.com/inward/record.url?scp=85116484947&partnerID=8YFLogxK
U2 - 10.1007/978-981-16-2377-6_10
DO - 10.1007/978-981-16-2377-6_10
M3 - Conference paper
AN - SCOPUS:85116484947
SN - 9789811623769
VL - 1
T3 - Lecture Notes in Networks and Systems
SP - 79
EP - 91
BT - Proceedings of Sixth International Congress on Information and Communication Technology
A2 - Yang, Xin-She
A2 - Sherratt, Simon
A2 - Dey, Nilanjan
A2 - Joshi, Amit
PB - Springer Science and Business Media Deutschland GmbH
T2 - 6th International Congress on Information and Communication Technology, ICICT 2021
Y2 - 25 February 2021 through 26 February 2021
ER -