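    Implementation of DBSCAN and Latent Dirichlet Allocation for Thesis Topic Modeling at the Faculty of Computer Science, University of Jember (Implementasi DBSCAN dan Latent Dirichlet Allocation pada Pemodelan Topik Skripsi di Fakultas Ilmu Komputer Universitas Jember)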

    File
    Tugas Akhir - Chinta 'Aliyyah Candramaya - 192410101108.pdf (1.764Mb)
    Date
    2022-11-24
    Author
    CANDRAMAYA, Chinta 'Aliyyah
    Abstract
    A thesis is part of the “Tri Dharma on Higher Education” in Indonesia and must be completed by a student before finishing higher education. The importance of theses is often not matched by topic-grouping management that would maximize users' access to them. Previous research has shown that DBSCAN handles noise and groups text data well, and that Latent Dirichlet Allocation is effective at extracting latent topics from a collection of documents. This study aims to identify the best epsilon (eps) and minimum points (minpts) values for clustering thesis topics with DBSCAN, based on analysis and silhouette scores, and to analyze thesis topics with Latent Dirichlet Allocation. The experimental scenarios use two measurement metrics: Euclidean Distance and Cosine Similarity. The research proceeds through data collection, data selection, data pre-processing, term weighting, clustering experiments and evaluation, and topic modeling, and ends with an analysis of topics, clusters, and noise. The best clustering is obtained at epsilon 0.6 with 2 minimum points, which produces 81 clusters and 143 noise points. This clustering reaches a Silhouette Score of 0.095, a Dunn Index of 0.595, and an average Coherence Score of 0.385. Topic modeling shows that the thesis topics attracting the greatest interest are Machine Learning, Software Construction, and IT/IS Evaluation. The experimental results show that the best clustering has a silhouette score close to 0. This is because the theses come from the same scientific field, which causes the data to follow a leptokurtic distribution. Clustering with DBSCAN can group documents specifically by research objective, method, and research object. Consequently, some noise arises not because a document is poorly written, but because no sufficiently similar documents were found for it to be grouped with.
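The clustering and evaluation steps named in the abstract can be illustrated with a minimal, standard-library sketch. It is not the thesis's implementation: the abstract says only "term weighting", so TF-IDF is an assumption here, as are the toy documents, the function names, and the choice to leave noise points out of the silhouette average. Cosine distance (1 minus cosine similarity) stands in for one of the two metrics. Topic modeling with Latent Dirichlet Allocation is omitted.

```python
import math
from collections import Counter

NOISE = -1

# Hypothetical stand-ins for pre-processed thesis abstracts.
DOCS = [
    "neural network image classification deep learning",
    "deep learning neural network image recognition",
    "software testing unit test automation framework",
    "unit test automation software testing tools",
    "information system evaluation user satisfaction",
    "information system evaluation user acceptance",
    "blockchain supply chain traceability",
]

def tfidf(docs):
    """Return L2-normalised TF-IDF vectors as {term: weight} dicts (assumed weighting)."""
    tokens = [d.split() for d in docs]
    df = Counter(t for toks in tokens for t in set(toks))
    vectors = []
    for toks in tokens:
        vec = {t: c * math.log(len(docs) / df[t]) for t, c in Counter(toks).items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors

def cosine_distance(a, b):
    """1 - cosine similarity; valid because the vectors are unit length."""
    return 1.0 - sum(w * b.get(t, 0.0) for t, w in a.items())

def dbscan(vectors, eps, min_pts):
    """Plain DBSCAN; a point's neighbourhood includes the point itself."""
    n = len(vectors)
    neighbours = [[j for j in range(n)
                   if cosine_distance(vectors[i], vectors[j]) <= eps]
                  for i in range(n)]
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])   # j is a core point: keep expanding
        cluster += 1
    return labels

def silhouette(vectors, labels):
    """Mean silhouette over clustered points only; None with fewer than 2 clusters."""
    clusters = {}
    for i, label in enumerate(labels):
        if label != NOISE:
            clusters.setdefault(label, []).append(i)
    if len(clusters) < 2:
        return None
    def mean_dist(i, members):
        return sum(cosine_distance(vectors[i], vectors[j]) for j in members) / len(members)
    scores = []
    for label, members in clusters.items():
        for i in members:
            others = [j for j in members if j != i]
            if not others:
                scores.append(0.0)     # singleton cluster scores 0 by convention
                continue
            a = mean_dist(i, others)
            b = min(mean_dist(i, m) for l, m in clusters.items() if l != label)
            scores.append((b - a) / max(a, b) if max(a, b) else 0.0)
    return sum(scores) / len(scores)

if __name__ == "__main__":
    vectors = tfidf(DOCS)
    for eps in (0.2, 0.6):             # a tiny grid over eps with minpts fixed at 2
        labels = dbscan(vectors, eps, min_pts=2)
        print(eps, labels, labels.count(NOISE), silhouette(vectors, labels))
```

On this toy corpus, eps 0.6 with 2 minimum points yields three clusters and leaves the blockchain document as noise because nothing similar exists to group it with, which mirrors the abstract's explanation of noise. For unit-length vectors the Euclidean distance equals the square root of twice the cosine distance, so the same eps value defines a different neighbourhood under each metric.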
    URI
    https://repository.unej.ac.id/xmlui/handle/123456789/114216
    Collections
    • UT-Faculty of Computer Science [1026]

    UPA-TIK Copyright © 2024  Library University of Jember
    Contact Us | Send Feedback

    Indonesia DSpace Group :

    University of Jember Repository
    IPB University Scientific Repository
    UIN Syarif Hidayatullah Institutional Repository
     

     
