Analysis of Retrieval-Based Question Answering Applying the T5 Model to the FAQ Feature of UC3 Universitas Jember
Abstract
Information is processed data that has been transformed into something of higher value, and it is crucial in many fields, especially education, where it underpins academic quality. At Universitas Jember, the UC3 unit acts as the information integrator for the whole university. UC3 consolidates the information needs of the various units, but it faces challenges such as the same questions arriving repeatedly from different units. To address this, this study proposes handling inquiries automatically through an FAQ system based on information retrieval. This retrieval-based question answering system uses embeddings to locate answers in a database that match user queries. However, because the system returns stored answers verbatim, it is inflexible as a question-answering tool. To increase response variation, this study proposes paraphrasing the retrieved responses using the Text-to-Text Transfer Transformer (T5) model. The primary focus of this research is the application of the T5 model to the retrieval-based question answering system that will be implemented in the UC3 unit. Three evaluation metrics are used to assess the quality of the system: cosine similarity, BLEU, and ROUGE. The fine-tuned pretrained T5 models score highly on the cosine similarity metric: both T5-small and T5-base achieve 0.97. The two models also attain identical scores on the n-gram metrics, 0.27 for BLEU and 0.70 for ROUGE. The key difference between the outputs of the two models lies in the linguistic and grammatical quality of the generated text.
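The retrieval step described above can be illustrated with a minimal sketch. It is not the system built in this study: the FAQ entries are hypothetical, and a simple bag-of-words count vector stands in for the learned retrieval embeddings, which the abstract does not specify. The sketch only shows how cosine similarity selects the stored answer whose question is closest to the user query; the T5 paraphrasing step that follows retrieval is not shown.

```python
import math
from collections import Counter

# Hypothetical FAQ entries for illustration; not taken from the UC3 database.
FAQ = [
    ("How do I reset my student account password?",
     "Use the password reset form on the student portal."),
    ("Where can I get an academic transcript?",
     "Request a transcript at the faculty academic office."),
    ("When does course registration open?",
     "Registration opens at the start of each semester."),
]


def embed(text):
    """Toy embedding: lowercase bag-of-words counts.

    Assumption: this stands in for a learned sentence encoder.
    """
    cleaned = "".join(c if c.isalnum() else " " for c in text.lower())
    return Counter(cleaned.split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, faq=FAQ):
    """Return (score, answer) for the FAQ question most similar to the query."""
    q = embed(query)
    scored = [(cosine_similarity(q, embed(question)), answer)
              for question, answer in faq]
    return max(scored, key=lambda pair: pair[0])


if __name__ == "__main__":
    score, answer = retrieve("How can I reset my password?")
    print(f"{score:.2f} {answer}")
```

Because the answer is returned exactly as stored, every user who asks a similar question receives the same wording, which is the inflexibility that the paraphrasing step is meant to address.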