dc.contributor.author | ADRIENUFA, Kharisma Trinanda | |
dc.date.accessioned | 2025-08-04T07:13:17Z | |
dc.date.available | 2025-08-04T07:13:17Z | |
dc.date.issued | 2023-07-26 | |
dc.identifier.nim | 192410103019 | en_US |
dc.identifier.uri | https://repository.unej.ac.id/xmlui/handle/123456789/127707 | |
dc.description | Repository validation by Ratna, August 2025; finalized by Taufik_Alya on 4 August 2025 | en_US |
dc.description.abstract | Document clustering cannot avoid the problem of high dimensionality, which
can be overcome by combining the advantages of statistical and semantic features.
This study aims to determine the clustering performance of the Stamantic
(statistical and semantic) feature extraction technique compared with several
Bag-of-Words (BOW) models (Bag of All Words, Bag of Nouns, Bag of Nouns and
Adjectives (BONA)), and to compare the Spherical K-Means and K-Means++ clustering
algorithms. Stamantic feature extraction uses the WordNet (Wn) database to form
semantic features, while statistical features are obtained from the TF-IDF (Term
Frequency-Inverse Document Frequency) of word senses. The clustering results were
evaluated with several metrics. The highest Silhouette score is 0.162213,
on the BONA feature from the PubMed dataset clustered with the K-Means++
algorithm. The highest Purity score is around 0.949643, on the BONA feature from
the Scopus dataset with the Spherical K-Means algorithm. The highest AMI (Adjusted
Mutual Information) score is 0.880835, on the BONA feature from the Scopus dataset
with the Spherical K-Means clustering algorithm. The test results show that the
Stamantic feature performs worse than all BOW features, owing to information loss
caused by the use of the Wn library and an inappropriate disambiguation process. | en_US |
dc.language.iso | other | en_US |
dc.publisher | Fakultas Ilmu Komputer | en_US |
dc.subject | DOCUMENT CLUSTERING | en_US |
dc.subject | LEXICAL CHAIN | en_US |
dc.title | Peningkatan Performa Clustering Pada Large Text Dataset Menggunakan Stamantic Spherical K-Means | en_US |
dc.title.alternative | Clustering Enhancement Of Large Text Dataset Using Stamantic Spherical K-Means | en_US |
dc.type | Skripsi | en_US |
dc.identifier.prodi | Informatika | en_US |
dc.identifier.pembimbing1 | Achmad Maududie ST, M.Sc. | en_US |
dc.identifier.pembimbing2 | Tio Dharmawan, S.Kom., M.Kom | en_US |
dc.identifier.validator | validasi_repo_ratna_Agustus 2025 | en_US |
dc.identifier.finalization | Taufik | en_US |