Abstractive Text Summarization to Generate Indonesian News Highlight Using Transformers Model
DOI:
https://doi.org/10.51519/journalisi.v7i2.1082

Keywords:
Abstractive Summarization, Transformers, mBART, IndoT5, News Highlight

Abstract
The increasing volume of information has led to the phenomenon of information overload, a condition in which individuals struggle to filter and comprehend information efficiently within a limited time. Automatic text summarization is an essential approach to addressing this issue. This research assesses the effectiveness of two transformer-based models, IndoT5 and mBART, by comparing their ability to generate abstractive summaries (highlights) of Indonesian news articles. The abstractive approach allows models to generate new sentences with more natural language structures than extractive methods. Both models were fine-tuned on a dataset of 10,410 news articles from Tempo.co, each containing the full news content and a corresponding highlight used as a reference. ROUGE and BERTScore metrics were employed to assess the structural and semantic correspondence between the references and the generated summaries. The results show that IndoT5 outperformed mBART on ROUGE-1 (0.43087), ROUGE-2 (0.29143), ROUGE-L (0.39224), BERTScore Recall (0.89130), and BERTScore F1 (0.87708), indicating its capability to generate complete and relevant news highlights. Meanwhile, mBART achieved a higher BERTScore Precision (0.86717) but tended to generate less informative outputs. These findings are expected to aid in enhancing the coherence and efficiency of abstractive summarization systems.
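To make the evaluation metrics concrete, the sketch below shows the core computations behind ROUGE-1/2 (n-gram overlap) and ROUGE-L (longest common subsequence), as F1 scores over whitespace tokens. This is a simplified illustration, not the paper's evaluation pipeline (which presumably used standard packages such as rouge-score and bert_score); the Indonesian example sentences are invented for demonstration:

```python
from collections import Counter

def rouge_n(reference: str, candidate: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1: clipped n-gram overlap between reference and candidate."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    ref, cand = reference.lower().split(), candidate.lower().split()
    ref_counts, cand_counts = ngrams(ref, n), ngrams(cand, n)
    overlap = sum((ref_counts & cand_counts).values())  # & clips counts per n-gram
    if overlap == 0:
        return 0.0
    recall = overlap / sum(ref_counts.values())
    precision = overlap / sum(cand_counts.values())
    return 2 * precision * recall / (precision + recall)

def rouge_l(reference: str, candidate: str) -> float:
    """Simplified ROUGE-L F1 via longest common subsequence (LCS) of tokens."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    # Dynamic-programming table: dp[i][j] = LCS length of ref[:i] and cand[:j]
    dp = [[0] * (len(cand) + 1) for _ in range(len(ref) + 1)]
    for i, r in enumerate(ref):
        for j, c in enumerate(cand):
            dp[i + 1][j + 1] = dp[i][j] + 1 if r == c else max(dp[i][j + 1], dp[i + 1][j])
    lcs = dp[len(ref)][len(cand)]
    if lcs == 0:
        return 0.0
    recall, precision = lcs / len(ref), lcs / len(cand)
    return 2 * precision * recall / (precision + recall)

# Hypothetical reference highlight vs. model output:
ref = "presiden meresmikan jembatan baru di jakarta"
cand = "presiden meresmikan jembatan di jakarta"
print(rouge_n(ref, cand, 1))  # ROUGE-1 F1 = 10/11 ≈ 0.909
print(rouge_l(ref, cand))     # ROUGE-L F1 = 10/11 ≈ 0.909
```

Unlike these surface-overlap scores, BERTScore matches tokens by contextual-embedding similarity, which is why the paper reports it alongside ROUGE to capture semantic rather than purely structural correspondence.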
Authors Declaration
- The Authors certify that they have read, understood, and agreed to the Journal of Information Systems and Informatics (JournalISI) submission guidelines, policies, and submission declaration. The submission has been prepared using the provided template.
- The Authors certify that all authors have approved the publication of this manuscript and that there is no conflict of interest.
- The Authors confirm that the manuscript is their original work, has not been previously published, and is not under consideration for publication elsewhere.
- The Authors confirm that all authors listed on the title page have contributed significantly to the work, have read the manuscript, attest to the validity and legitimacy of the data and its interpretation, and agree to its submission.
- The Authors confirm that the manuscript is not copied from or plagiarized from any other published work.
- The Authors declare that the manuscript will not be submitted for publication in any other journal or magazine until a decision is made by the journal editors.
- If the manuscript is finally accepted for publication, the Authors confirm that they will either proceed with publication immediately or withdraw the manuscript in accordance with the journal’s withdrawal policies.
- The Authors agree that, upon publication of the manuscript in this journal, they transfer copyright or assign exclusive rights to the publisher, including commercial rights.