Classification of Multidimensional Poverty in Bengkulu Province Using the Random Forest Algorithm

Publisher

Fakultas Ilmu Komputer

Abstract

Poverty is a complex social issue that cannot be fully explained from a monetary perspective alone. The multidimensional poverty approach therefore describes household welfare more comprehensively by considering several aspects of life. Bengkulu Province is one of the regions in Indonesia with a poverty rate above the national average, so further analysis is needed to understand the characteristics of poor households from a multidimensional perspective. This study aims to classify the multidimensional poverty status of households in Bengkulu Province and to identify the most influential factors using the Random Forest algorithm. Multidimensional poverty is measured with the Alkire–Foster method, using indicators that cover the dimensions of health, education, living standards, and economic conditions. The data are microdata from the National Socioeconomic Survey (Susenas) conducted in March 2025 by Statistics Indonesia. After preprocessing and feature engineering, the dataset is split into 70% training data and 30% testing data. The Random Forest model is built with hyperparameter tuning using Randomized Search combined with 5-fold cross-validation. The model achieves an accuracy of 80.3%, a balanced accuracy and recall of 53.1%, a precision of 48.7%, an F1-score of 50.4%, and a weighted F1-score of 80.4%. Weighted distribution analysis using the sampling weight variable (WERT) indicates that the predicted poverty distribution is generally consistent with the Alkire–Foster classification, although the model tends to estimate a slightly higher proportion of households in the severely poor category. Feature importance analysis shows that total expenditure per capita is the most influential variable, followed by the nutritional adequacy ratios of protein, fat, and carbohydrates relative to the recommended dietary allowances (RDA).
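The labelling step described in the abstract can be illustrated with a minimal sketch of an Alkire–Foster deprivation score and a sampling-weighted category distribution. Everything specific here is an assumption made for illustration and is not taken from the study: the indicator names, their weights, the cutoffs of 1/3 and 1/2 (borrowed from the convention of the global Multidimensional Poverty Index), and the example sampling weights standing in for the WERT variable.

```python
from collections import defaultdict

# Hypothetical indicators and weights (assumption, not the study's list).
# Four dimensions are weighted equally at 0.25 each; weights sum to 1.
INDICATOR_WEIGHTS = {
    "protein_below_rda": 0.125,    # health
    "no_health_insurance": 0.125,  # health
    "school_years_low": 0.25,      # education
    "unsafe_water": 0.125,         # living standards
    "no_electricity": 0.125,       # living standards
    "low_expenditure": 0.25,       # economic conditions
}

# Assumed cutoffs, following the global MPI convention.
POOR_CUTOFF = 1 / 3
SEVERE_CUTOFF = 1 / 2


def deprivation_score(deprivations):
    """Sum the weights of the indicators in which the household is deprived."""
    return sum(
        INDICATOR_WEIGHTS[name]
        for name, deprived in deprivations.items()
        if deprived
    )


def classify(score):
    """Map a deprivation score to a poverty category using the assumed cutoffs."""
    if score >= SEVERE_CUTOFF:
        return "severely poor"
    if score >= POOR_CUTOFF:
        return "poor"
    return "not poor"


def weighted_distribution(labels, sampling_weights):
    """Share of each category, with each household counted by its sampling weight."""
    totals = defaultdict(float)
    for label, weight in zip(labels, sampling_weights):
        totals[label] += weight
    grand_total = sum(totals.values())
    return {label: total / grand_total for label, total in totals.items()}


# Three made-up households and made-up sampling weights.
households = [
    {"school_years_low": True, "low_expenditure": True},
    {"protein_below_rda": True, "low_expenditure": True},
    {"unsafe_water": True},
]
weights = [100.0, 300.0, 600.0]

labels = [classify(deprivation_score(h)) for h in households]
shares = weighted_distribution(labels, weights)
```

In a setup like this, the resulting labels would serve as the target for the classifier, and the same weighted distribution function could be applied to both the Alkire–Foster labels and the model's predictions to compare them.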

Description

Repository file validation and finalization, 15 June 2026, Kholif Basri
