Abstractive Text Summarization to Generate Indonesian News Highlight Using Transformers Model
DOI:
https://doi.org/10.51519/journalisi.v7i2.1082

Keywords:
Abstractive Summarization, Transformers, mBART, IndoT5, News Highlight

Abstract
The increasing volume of information has led to the phenomenon of information overload, a condition in which individuals struggle to filter and comprehend information efficiently within a limited time. Automatic text summarization is an essential approach to addressing this issue. This research assesses the effectiveness of two transformer-based models, IndoT5 and mBART, by comparing their ability to generate abstractive summaries (highlights) of Indonesian news articles. The abstractive approach allows models to generate new sentences with more natural language structures than extractive methods. Both models were fine-tuned on a dataset of 10,410 news articles from Tempo.co, each containing the full news content and a corresponding highlight used as a reference. ROUGE and BERTScore metrics were employed to assess the structural and semantic correspondence between the references and the generated summaries. The results show that IndoT5 outperformed mBART on ROUGE-1 (0.43087), ROUGE-2 (0.29143), ROUGE-L (0.39224), BERTScore Recall (0.89130), and BERTScore F1 (0.87708), indicating its capability to generate complete and relevant news highlights. Meanwhile, mBART achieved a higher BERTScore Precision (0.86717) but tended to generate less informative outputs. These findings are expected to aid in enhancing the coherence and efficiency of abstractive summarization systems.
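To make the evaluation metrics concrete, the sketch below shows the core computations behind ROUGE-1/2 (n-gram overlap) and ROUGE-L (longest common subsequence), as F1 scores over whitespace tokens. This is a simplified illustration, not the paper's evaluation pipeline (which presumably used standard packages such as rouge-score and bert_score); the Indonesian example sentences are invented for demonstration:

```python
from collections import Counter

def rouge_n(reference: str, candidate: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1: clipped n-gram overlap between reference and candidate."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    ref, cand = reference.lower().split(), candidate.lower().split()
    ref_counts, cand_counts = ngrams(ref, n), ngrams(cand, n)
    overlap = sum((ref_counts & cand_counts).values())  # & clips counts per n-gram
    if overlap == 0:
        return 0.0
    recall = overlap / sum(ref_counts.values())
    precision = overlap / sum(cand_counts.values())
    return 2 * precision * recall / (precision + recall)

def rouge_l(reference: str, candidate: str) -> float:
    """Simplified ROUGE-L F1 via longest common subsequence (LCS) of tokens."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    # Dynamic-programming table: dp[i][j] = LCS length of ref[:i] and cand[:j]
    dp = [[0] * (len(cand) + 1) for _ in range(len(ref) + 1)]
    for i, r in enumerate(ref):
        for j, c in enumerate(cand):
            dp[i + 1][j + 1] = dp[i][j] + 1 if r == c else max(dp[i][j + 1], dp[i + 1][j])
    lcs = dp[len(ref)][len(cand)]
    if lcs == 0:
        return 0.0
    recall, precision = lcs / len(ref), lcs / len(cand)
    return 2 * precision * recall / (precision + recall)

# Hypothetical reference highlight vs. model output:
ref = "presiden meresmikan jembatan baru di jakarta"
cand = "presiden meresmikan jembatan di jakarta"
print(rouge_n(ref, cand, 1))  # ROUGE-1 F1 = 10/11 ≈ 0.909
print(rouge_l(ref, cand))     # ROUGE-L F1 = 10/11 ≈ 0.909
```

Unlike these surface-overlap scores, BERTScore matches tokens by contextual-embedding similarity, which is why the paper reports it alongside ROUGE to capture semantic rather than purely structural correspondence.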
Authors Declaration
- The Authors certify that they have read, understood, and agreed to the Journal of Information Systems and Informatics (JournalISI) submission guidelines, policies, and submission declaration. The submission has been prepared using the provided template.
- The Authors certify that all authors have approved the publication of this manuscript and that there is no conflict of interest.
- The Authors confirm that the manuscript is their original work, has not been previously published, and is not under consideration for publication elsewhere.
- The Authors confirm that all authors listed on the title page have contributed significantly to the work, have read the manuscript, attest to the validity and legitimacy of the data and its interpretation, and agree to its submission.
- The Authors confirm that the manuscript is not copied from or plagiarized from any other published work.
- The Authors declare that the manuscript will not be submitted for publication in any other journal or magazine until a decision is made by the journal editors.
- If the manuscript is finally accepted for publication, the Authors confirm that they will either proceed with publication immediately or withdraw the manuscript in accordance with the journal’s withdrawal policies.
- The Authors agree that, upon publication of the manuscript in this journal, they transfer copyright or assign exclusive rights to the publisher, including commercial rights.