PEARSON CORRELATION COEFFICIENT AND XGBOOST MODEL FOR LIVE BIRTH PREDICTION

Authors

  • Mrs. V. Kalaiselvi, Dr. S. Poongodi

DOI:

https://doi.org/10.7492/z1q1zy26

Keywords:

Male Factor Infertility, In Vitro Fertilisation (IVF), Intracytoplasmic Sperm Injection (ICSI), Live Birth Prediction, Pearson Correlation Coefficient (PCC), Feature Selection, Extreme Gradient Boosting (XGBoost), Machine Learning, Reproductive Health Analytics, Healthcare Data Mining

Abstract

Male factors and the metabolic health of both partners are frequently under-represented in clinical prediction models for assisted
reproductive technologies, which mostly concentrate on female ovarian reserve markers. Furthermore, conventional parametric models may
struggle to capture nonlinear patterns in reproductive data. In this paper, an Extreme Gradient Boosting (XGBoost) model is developed to
classify the pregnancy outcome of couples after intracytoplasmic sperm injection (ICSI) or in vitro fertilization (IVF). The Pearson
Correlation Coefficient (PCC) is used for feature selection. The dataset was obtained from
https://www.kaggle.com/datasets/deepakloganathan/live-birth-dataset. PCC measures the strength of the linear relationship between two
variables, which helps identify the clinical features that are strongly associated with the target outcome (live birth). XGBoost is an
ensemble learning method based on gradient-boosted decision trees, and it is widely used in healthcare prediction tasks because of its high
accuracy and robustness. It builds on gradient boosting by combining weak learners (trees) sequentially to minimize errors, and it adds
regularization and parallel processing; here it is applied to live birth prediction. XGBoost is a scalable method that improves the
predictive performance and speed of Gradient Boosting Machines (GBM). It accomplishes this with a novel tree learning technique that uses
distributed and parallel computing to speed up model exploration. The XGBoost-based prediction model performed exceptionally well for
IVF/ICSI outcomes in couples with male factor infertility. Logistic Regression (LR), Random Forest (RF), Light Gradient Boosting Machine
(LightGBM), and Extreme Gradient Boosting (XGBoost) are compared on the selected predictive features using precision, recall, F-measure,
and accuracy.
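The two steps that frame the model, PCC-based feature selection and metric-based evaluation, can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names (sperm_motility, female_age, clinic_id), the toy values, the 0.2 threshold, and the example predictions are all hypothetical and are not taken from the Kaggle dataset. The XGBoost training step is omitted, so the predictions are supplied by hand.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, target, threshold=0.2):
    """Keep features whose |PCC| with the target meets the (assumed) threshold."""
    scores = {name: pearson(values, target) for name, values in columns.items()}
    kept = [name for name, r in scores.items() if abs(r) >= threshold]
    kept.sort(key=lambda name: abs(scores[name]), reverse=True)
    return kept, scores


def classification_metrics(y_true, y_pred):
    """Precision, recall, F-measure, and accuracy for binary labels (1 = live birth)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"precision": precision, "recall": recall,
            "f_measure": f_measure, "accuracy": accuracy}


# Hypothetical toy data: 8 couples, target 1 = live birth, 0 = no live birth.
live_birth = [1, 0, 1, 1, 0, 0, 1, 0]
columns = {
    "sperm_motility": [60, 30, 55, 70, 25, 35, 65, 20],
    "female_age": [29, 38, 31, 28, 40, 36, 30, 41],
    "clinic_id": [1, 2, 2, 1, 1, 2, 2, 1],
}

kept, scores = select_features(columns, live_birth)
for name in columns:
    print(f"{name}: r = {scores[name]:+.3f}")
print("selected:", kept)

# Hand-written predictions standing in for a trained classifier's output.
predicted = [1, 0, 1, 0, 0, 1, 1, 0]
metrics = classification_metrics(live_birth, predicted)
print(metrics)
```

Ranking by the absolute value of the coefficient keeps strongly negative predictors as well as strongly positive ones. Because PCC captures only linear association, a feature with a purely nonlinear effect on the outcome could be discarded at this stage even though a tree-based model could have used it.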

Published

1990-2026

Section

Articles

How to Cite

PEARSON CORRELATION COEFFICIENT AND XGBOOST MODEL FOR LIVE BIRTH PREDICTION. (2026). MSW Management Journal, 36(1), 4767-4772. https://doi.org/10.7492/z1q1zy26