Please use this identifier to cite or link to this item: https://repository.unej.ac.id/xmlui/handle/123456789/124774
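Title: Analisis Retrieval-Based Question Answering dengan Menerapkan T5 Model pada Fitur FAQ UC3 Universitas Jember (Analysis of Retrieval-Based Question Answering Applying the T5 Model to the FAQ Feature of UC3 Universitas Jember)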
Authors: HIMAWAN, Fadhli Nur
Keywords: Information
Information Retrieval
Question Answering
Retrieval Based Question Answering
T5 Model
Generative Paraphrase Model
Issue Date: 29-Jul-2024
Publisher: ILMU KOMPUTER
Abstract: Information is processed data that has been transformed into something of higher value, and it is crucial in many fields, especially education, where it underpins academic quality. At Universitas Jember, the UC3 unit serves as the information integrator for the whole university. UC3 consolidates the information needs of the various units, but it faces challenges such as receiving the same questions repeatedly from those units. To address this, the study proposes handling inquiries automatically through an FAQ system based on information retrieval. This retrieval-based question answering system uses embeddings to locate the answers in a database that match a user's query. However, because the system returns the stored answers as they are, it is inflexible as a question-answering tool. To increase response variation, this study proposes paraphrasing the retrieved responses with the Text-to-Text Transfer Transformer (T5) model. The primary focus of the research is applying the T5 model to the retrieval-based question answering system that will be implemented in the UC3 unit. To ensure the quality of the retrieval system, three evaluation metrics are employed: cosine similarity, BLEU, and ROUGE. The fine-tuned pretrained T5 models score highly on cosine similarity: both T5-small and T5-base achieve 0.97. The two models also attain identical scores on the n-gram metrics, with 0.27 for BLEU and 0.70 for ROUGE. The key difference between the outputs of the two models lies in the linguistic and grammatical quality of the generated text.
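The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: word-count vectors stand in for the learned embeddings, the FAQ entries are invented for illustration, and the names `embed`, `cosine_similarity`, `retrieve`, and `FAQ` are hypothetical. The T5 paraphrasing step is not shown.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for a learned sentence embedding: lowercase word counts."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text matches nothing
    return dot / (norm_a * norm_b)

def retrieve(query, faq):
    """Return (stored_question, stored_answer, score) for the best match."""
    q = embed(query)
    scored = [(cosine_similarity(q, embed(question)), question, answer)
              for question, answer in faq]
    score, question, answer = max(scored, key=lambda item: item[0])
    return question, answer, score

# Hypothetical FAQ entries, invented for this sketch.
FAQ = [
    ("how do i reset my password", "Use the reset link on the login page."),
    ("where can i get my transcript", "Request it at the academic office."),
]

question, answer, score = retrieve("how can i reset my password", FAQ)
print(answer, round(score, 2))
```

Because the stored answer comes back unchanged, every user who asks a similar question receives the same sentence, which is the inflexibility the abstract proposes to address by paraphrasing the retrieved answer.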
URI: https://repository.unej.ac.id/xmlui/handle/123456789/124774
Appears in Collections: UT-Faculty of Computer Science

Files in This Item:
File: SKRIPSI_FADHLI-NUR-HIMAWAN-202410102039.pdf
  Until 2029-07-30
Size: 2.01 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
