Under-Five Children Classification Based on Characteristics of the Child and Family Using the Extreme Gradient Boosted Trees Algorithm
Abstract
Despite various efforts to reduce toddler stunting, its prevalence in
Tuban Regency remains high. The causes go beyond inadequate nutrient intake and include
limited access to balanced nutrition, inadequate access to healthcare
services, and unresolved sanitation problems. Socio-economic factors also play a
significant role in determining the prevalence of stunting. To understand and address this
issue, this research applies the Extreme Gradient Boosted Trees (XGBoost) machine learning
algorithm. The method was chosen for its ability to handle complex prediction problems and
large datasets, which helps identify the patterns and
factors that contribute to stunting in toddlers more efficiently and
accurately. The findings indicate that, in addition to traditional factors such as toddler
height/length and age, environmental factors such as inadequate sanitation,
socio-economic factors such as advanced maternal age, and exclusive breastfeeding
practices also play an essential role in determining the occurrence of stunting. In testing,
the classification model identified stunted toddlers with an
accuracy of 95.9%, a precision of 94.7%, a recall of 98.1%, and an F1-score of 96.4%.
These results demonstrate the strong potential of XGBoost-based machine learning
to support the early identification of stunting cases and to provide a solid foundation
for more effective health interventions.
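
For readers less familiar with the reported metrics, the following is a minimal sketch of how accuracy, precision, recall, and F1-score relate to one another. It is not the evaluation code used in this study. The function names are illustrative assumptions, and the confusion-matrix counts in the usage example are hypothetical, not the study's data. The only figures taken from the abstract are the reported precision (94.7%) and recall (98.1%), which are used to check that the reported F1-score is their harmonic mean.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the F1-score, the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute metrics from binary confusion-matrix counts.

    tp: stunted toddlers correctly flagged as stunted
    fp: non-stunted toddlers wrongly flagged as stunted
    fn: stunted toddlers the model missed
    tn: non-stunted toddlers correctly left unflagged
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_from_precision_recall(precision, recall),
    }


# Consistency check using the precision and recall reported in the abstract.
reported_f1 = f1_from_precision_recall(0.947, 0.981)
print(f"F1 from reported precision and recall: {reported_f1:.3f}")

# Hypothetical counts, for illustration only (not taken from the study).
example = classification_metrics(tp=8, fp=2, fn=2, tn=8)
print(example)
```

Rounded to one decimal place, the harmonic mean of the reported precision and recall is 96.4%, which agrees with the F1-score stated above. In a screening setting, recall is the share of truly stunted toddlers the model detects, so a high recall means few cases are missed.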