
SPRATAMA MODEL FOR INDONESIAN PARAPHRASE DETECTION USING BIDIRECTIONAL LONG SHORT-TERM MEMORY AND BIDIRECTIONAL GATED RECURRENT UNIT

*Titin Siswantining  -  Department of Mathematics, Universitas Indonesia, Indonesia
Stanley Pratama  -  Department of Mathematics, Universitas Indonesia, Indonesia
Devvi Sarwinda  -  Department of Mathematics, Universitas Indonesia, Indonesia
Open Access. Copyright (c) 2022 MEDIA STATISTIKA, licensed under CC BY-NC-SA 4.0 (http://creativecommons.org/licenses/by-nc-sa/4.0).

Abstract
Paraphrasing is a way of writing a sentence in other words while keeping the same intent or meaning. Automatic paraphrase detection can be performed with Natural Language Sentence Matching (NLSM), a subtask of Natural Language Processing (NLP). NLP covers computational techniques for processing text in general, while NLSM is used specifically to determine the relationship between two sentences. With the development of neural networks (NN), NLP tasks can now be handled more easily by computers. Far more paraphrase-detection models have been developed for English than for Indonesian, which has less training data available. This study proposes the SPratama model, which performs paraphrase detection for Indonesian using Recurrent Neural Networks (RNN), namely the Bidirectional Long Short-Term Memory (BiLSTM) and the Bidirectional Gated Recurrent Unit (BiGRU). The data used are the "Quora Question Pairs" taken from Kaggle and translated into Indonesian with Google Translate. The results indicate that the proposed model achieves an accuracy of around 80% in detecting paraphrased sentences.
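For readers who want a concrete starting point, the sketch below shows how a sentence-pair classifier of the kind described in the abstract can be assembled in Keras: a shared encoder built from a Bidirectional LSTM followed by a Bidirectional GRU produces a vector for each sentence, and a small feed-forward head decides whether the pair is a paraphrase. This is a minimal illustration, not the published SPratama architecture; the vocabulary size, sequence length, layer widths, and the absolute-difference merge are assumed choices for demonstration.

```python
# Minimal sketch of a BiLSTM + BiGRU paraphrase detector (assumed hyperparameters,
# not the authors' exact SPratama configuration).
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

VOCAB_SIZE = 20_000   # assumed vocabulary size
MAX_LEN = 30          # assumed maximum question length in tokens
EMB_DIM = 128         # embedding size, e.g. matching a 128-d pretrained embedding

def build_encoder():
    """Shared encoder: embedding -> BiLSTM -> BiGRU -> sentence vector."""
    inp = layers.Input(shape=(MAX_LEN,), dtype="int32")
    x = layers.Embedding(VOCAB_SIZE, EMB_DIM, mask_zero=True)(inp)
    x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)
    x = layers.Bidirectional(layers.GRU(64))(x)
    return Model(inp, x, name="shared_encoder")

encoder = build_encoder()

q1 = layers.Input(shape=(MAX_LEN,), dtype="int32", name="question_1")
q2 = layers.Input(shape=(MAX_LEN,), dtype="int32", name="question_2")
v1, v2 = encoder(q1), encoder(q2)

# Combine the two sentence vectors (concatenation plus absolute difference),
# then classify paraphrase vs. non-paraphrase.
diff = layers.Lambda(lambda t: tf.abs(t[0] - t[1]))([v1, v2])
merged = layers.Concatenate()([v1, v2, diff])
merged = layers.Dense(64, activation="relu")(merged)
merged = layers.Dropout(0.3)(merged)
out = layers.Dense(1, activation="sigmoid", name="is_paraphrase")(merged)

model = Model([q1, q2], out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Dummy forward pass with random token ids, just to show the expected shapes.
x1 = np.random.randint(1, VOCAB_SIZE, size=(4, MAX_LEN))
x2 = np.random.randint(1, VOCAB_SIZE, size=(4, MAX_LEN))
print(model.predict([x1, x2]).shape)  # (4, 1)
```

In practice, the token-id inputs would come from a tokenizer fitted on the translated Quora Question Pairs, and the sigmoid output would be thresholded at 0.5 to label a pair as paraphrase or not.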
Keywords: natural language processing; natural language sentence matching; recurrent neural network


