BibTex Citation Data :
@article{JSINBIS65126, author = {Widyawan Widyawan and Bayu Utomo and Muhammad Rizala}, title = {A Novel Fusion of Machine Learning Methods for Enhancing Named Entity Recognition in Indonesian Language Text}, journal = {Jurnal Sistem Informasi Bisnis}, volume = {14}, number = {4}, year = {2024}, keywords = {NER; BERT; Pre-training; Machine Learning}, abstract = { One of the important implementations in machine learning is Named Entity Recognition (NER), which is used to process text and extract entities such as people, organizations, laws, religions, and locations. NER for the Indonesian language still faces significant challenges due to the lack of high-quality labelled datasets, which limits the development of more advanced models. To address this issue, we utilized several pre-trained BERT models (bert-base-uncased, indobenchmark/indobert-base-p1, indolem/indobert-base-uncased) and datasets (NERGRIT-IndoNLU, NERGRIT-Corpus, NERUGM, and NERUI). This study proposes a novel fusion approach by integrating deep learning architectures such as CNN, Bi-LSTM, Bi-GRU, and CRF to detect 19 entities. This approach enhances BERT’s sequence modelling and feature extraction capabilities, while CRF improves entity prediction by enforcing global word-sequence constraints. Experimental results demonstrate that the fusion approach outperforms previous methods. On the bert-base-uncased dataset, accuracy reached 94.75%, while indobenchmark/indobert-base-p1 achieved 95.75%, and indolem/indobert-base-uncased achieved 95.85%. This study emphasizes the effectiveness of combining deep learning architectures with pre-trained transformers to improve NER performance in the Indonesian language. The proposed methodology offers significant advancements in entity extraction for languages with limited datasets, such as Indonesian. }, issn = {2502-2377}, pages = {311--320} doi = {10.21456/vol14iss4pp311-320}, url = {https://ejournal.undip.ac.id/index.php/jsinbis/article/view/65126} }
Refworks Citation Data :
One of the important implementations in machine learning is Named Entity Recognition (NER), which is used to process text and extract entities such as people, organizations, laws, religions, and locations. NER for the Indonesian language still faces significant challenges due to the lack of high-quality labelled datasets, which limits the development of more advanced models. To address this issue, we utilized several pre-trained BERT models (bert-base-uncased, indobenchmark/indobert-base-p1, indolem/indobert-base-uncased) and datasets (NERGRIT-IndoNLU, NERGRIT-Corpus, NERUGM, and NERUI). This study proposes a novel fusion approach by integrating deep learning architectures such as CNN, Bi-LSTM, Bi-GRU, and CRF to detect 19 entities. This approach enhances BERT’s sequence modelling and feature extraction capabilities, while CRF improves entity prediction by enforcing global word-sequence constraints. Experimental results demonstrate that the fusion approach outperforms previous methods. On the bert-base-uncased dataset, accuracy reached 94.75%, while indobenchmark/indobert-base-p1 achieved 95.75%, and indolem/indobert-base-uncased achieved 95.85%. This study emphasizes the effectiveness of combining deep learning architectures with pre-trained transformers to improve NER performance in the Indonesian language. The proposed methodology offers significant advancements in entity extraction for languages with limited datasets, such as Indonesian.
Article Metrics:
Last update:
Last update: 2024-11-21 08:49:53
Authors who submit the manuscripts to Journal JSINBIS must understand and agree that if the manuscript is accepted for publication, the copyright of the article belongs to JSINBIS and Diponegoro University as the journal publisher.
Copyright includes the exclusive right to reproduce and provide articles in all forms and media, including reprints, photographs, microfilm and any other similar reproductions, as well as translations. The author reserves the rights to the following:
JSINBIS and Diponegoro University and the Editors make every effort to ensure that no false or misleading data, opinions or statements are published in this journal. The content of articles published in JSINBIS is the sole and exclusive responsibility of the respective authors.
Copyright transfer agreement can be found here: [Copyright transfer agreement in doc] and [Copyright transfer agreement in pdf].
JSINBIS (Jurnal Sistem Informasi Bisnis) is published by the Magister of Information Systems, Post Graduate School Diponegoro University. It has e-ISSN: 2502-2377 dan p-ISSN: 2088-3587 . This is a National Journal accredited SINTA 2 by RISTEK DIKTI No. 48a/KPT/2017.
Journal JSINBIS which can be accessed online by http://ejournal.undip.ac.id/index.php/jsinbis is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
View My Stats