Please use this identifier to cite or link to this item: https://repository.unej.ac.id/xmlui/handle/123456789/124006
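Title: Klasifikasi Penyakit Hati Menggunakan Algoritma XGBoost, Neural Network, dan Decision Tree dengan Seleksi Fitur Least Absolute Shrinkage and Selection Operator [Liver Disease Classification Using the XGBoost, Neural Network, and Decision Tree Algorithms with Least Absolute Shrinkage and Selection Operator Feature Selection]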
Authors: KURNIASARI, Ila Rahayu
Keywords: LASSO
LIVER DISEASE CLASSIFICATION
XGBOOST
NEURAL NETWORK
DECISION TREE
MACHINE LEARNING
Issue Date: 25-Jul-2024
Publisher: Fakultas Ilmu Komputer
Abstract: Liver disease is a health problem that significantly affects liver function and quality of life. Common causes include infection, injury, certain medications, exposure to harmful substances, and genetic factors. This study classifies liver disease using three machine learning algorithms and compares them: XGBoost, Neural Network (NN), and Decision Tree (DT). Three datasets are used: the Indian Liver Patient Dataset (ILPD) from the UCI Machine Learning Repository, the BUPA Medical Research Liver Disorder dataset, and the Liver Patient Dataset from Kaggle. The research begins with data pre-processing, which involves cleaning and normalizing the data to prepare it for further analysis. Two scenarios are used for classification. In the first scenario, the pre-processed dataset undergoes feature selection using the Least Absolute Shrinkage and Selection Operator (LASSO) to identify the attributes most correlated with liver disease. In the second scenario, the pre-processed dataset goes directly to classification. The resulting dataset is then used to train and test the XGBoost, NN (Multilayer Perceptron), and DT models. The disease prediction results are as follows. With LASSO feature selection, the Kaggle dataset gives the best results, with XGBoost achieving an accuracy of 1.0, followed by the ILPD dataset with XGBoost at an accuracy of 0.70, and finally the Liver Disorder dataset with NN at an accuracy of 0.68. Without feature selection, XGBoost on the Kaggle dataset reaches an accuracy of 1.0, followed by NN on the Kaggle dataset with an accuracy of 0.82; the lowest result among the three datasets is the Decision Tree on the ILPD dataset, with an accuracy of 0.66.
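The LASSO feature-selection step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the data below is synthetic (it is not ILPD, BUPA, or the Kaggle dataset), the penalty value `lam = 0.1` is an arbitrary assumption, and the function names are hypothetical. The sketch fits LASSO by cyclic coordinate descent and keeps the features whose coefficients stay non-zero.

```python
import random


def soft_threshold(rho, lam):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def standardise(X):
    """Scale every column to zero mean and unit variance."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    stds = []
    for j in range(p):
        var = sum((row[j] - means[j]) ** 2 for row in X) / n
        stds.append(var ** 0.5 if var > 0 else 1.0)
    return [[(row[j] - means[j]) / stds[j] for j in range(p)] for row in X]


def lasso_coordinate_descent(X, y, lam, n_iter=50):
    """Minimise (1/2n) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


# Synthetic stand-in data (an assumption): 5 features, only 0 and 2 drive the label.
rng = random.Random(0)
raw = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(400)]
labels = [1.0 if 2.0 * r[0] - 1.5 * r[2] + rng.gauss(0.0, 0.1) > 0 else 0.0 for r in raw]

X = standardise(raw)
label_mean = sum(labels) / len(labels)
y = [v - label_mean for v in labels]  # centre the 0/1 target

lam = 0.1  # hypothetical penalty strength, not a value from the thesis
weights = lasso_coordinate_descent(X, y, lam)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("selected feature indices:", selected)
```

In the first scenario, only the columns listed in `selected` would be passed on to the classifiers; in the second scenario, all columns would be used unchanged.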
Description: Finalized by Taufik_Lina on 15 August 2024
URI: https://repository.unej.ac.id/xmlui/handle/123456789/124006
Appears in Collections: UT-Faculty of Computer Science

Files in This Item:
File: Ila Rahayu Kurniasari - Ilmu Komputer.pdf
Description: Until 2029-08-12
Size: 1.05 MB
Format: Adobe PDF

