Show simple item record

dc.contributor.author: DHARMAWAN, Tio
dc.contributor.author: CANDRAMAYA, Chinta ’Aliyyah
dc.contributor.author: WIDHARTA, Vandha Pradwiyasma
dc.date.accessioned: 2023-03-07T07:03:20Z
dc.date.available: 2023-03-07T07:03:20Z
dc.date.issued: 2023-01-01
dc.identifier.uri: https://repository.unej.ac.id/xmlui/handle/123456789/112593
dc.description.abstract: Universities collect large amounts of undergraduate thesis data but have yet to process them so that students can easily find the references they need. This study aims to group documents and to compare the grouping made by experts with the groupings produced by simple clustering methods. The ground truth was established by experts using OR Boolean Retrieval and keyword generation. Experiments with the K-Means, K-Medoids, and DBSCAN clustering methods and the Euclidean, Manhattan, City Block, and Cosine Similarity metrics were run to find the best clustering. The clustering with the best Silhouette Score was then compared against the ground truth to measure the accuracy of the categorization of each document. The K-Means clustering method with the Cosine Similarity metric gave the best results, with a Silhouette Score of 0.105534. The comparison between the ground truth and the best clustering results shows an accuracy of 33.42%. The results show that simple clustering methods cannot handle data with negative skewness and leptokurtic kurtosis. [en_US]
dc.language.iso: en [en_US]
dc.publisher: INTERNATIONAL JOURNAL OF INNOVATION IN ENTERPRISE SYSTEM [en_US]
dc.subject: Document Clustering [en_US]
dc.subject: Text Mining [en_US]
dc.subject: Relevant Term [en_US]
dc.subject: Information Retrieval [en_US]
dc.subject: Topic Identification [en_US]
dc.title: Forming Dataset of The Undergraduate Thesis using Simple Clustering Methods [en_US]
dc.type: Article [en_US]

