K-Means Clustering Algorithm for Grouping Al Quran Verses in the Indonesian Translation

DOI: https://doi.org/10.21456/vol6iss2pp164-176

Article Info
Submitted: 27-09-2016
Published: 26-12-2016
Section: Research Articles

Clustering groups data so that items in the same cluster are highly similar to one another. One of the most widely used clustering algorithms is K-Means, because it is simple, efficient, easy to understand, and easy to apply. Grouping verses that deal with similar themes makes it easier for users to find a theme in the Qur'an. This study aims to produce an information system that groups verses of the Quran using the K-Means method. The work consists of pre-processing the text data, weighting terms with TF-IDF, grouping the data with K-Means clustering, and labeling each group with keywords. The resulting system can display verses in groups associated with a keyword. Evaluation with the silhouette index on Surah Al Fatihah gave a positive value of 0.336, which indicates that the data were placed in appropriate groups, while the ratio of keyword frequency to the amount of data was 53%, which means the keyword represents about half of the data in its cluster. The tests also showed that the silhouette value is directly proportional to the number of clusters and inversely proportional to the number of data dimensions. Improving the test values would require a method for selecting the initial centroids, a reduction of the data dimensions, and alternative measures of distance and similarity.
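
The stages named in the abstract (pre-processing, TF-IDF weighting, K-Means grouping, keyword labeling, silhouette evaluation) can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the system built in the study: the four toy sentences stand in for translated verses, tokenization is a plain lowercase split, the first k vectors seed the centroids, distance is Euclidean, and a cluster's keyword is taken to be the term that appears in the most documents of that cluster. All function names are hypothetical.

```python
import math
from collections import Counter

# Toy stand-ins for translated verses (hypothetical data, not from the study).
TOY_TEXTS = ["Rain water river", "Sun light day", "Water river sea", "Light day sky"]

def tokenize(text):
    """Minimal pre-processing: lowercase and split on whitespace."""
    return text.lower().split()

def tfidf(docs):
    """Return (vocab, vectors) with weight tf * log(N / df) per term."""
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    vectors = []
    for d in docs:
        tf = Counter(d)
        vectors.append([tf[t] * math.log(n / df[t]) for t in vocab])
    return vocab, vectors

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(vectors, k, iters=100):
    """Lloyd's K-Means; seeding with the first k vectors is an assumption."""
    centroids = [list(v) for v in vectors[:k]]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: euclid(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def silhouette(vectors, labels):
    """Mean silhouette width; needs at least two clusters, singletons score 0."""
    scores = []
    for i, v in enumerate(vectors):
        own = [euclid(v, w) for j, w in enumerate(vectors)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance inside the own cluster
        b = min(sum(euclid(v, w) for w, l in zip(vectors, labels) if l == c)
                / labels.count(c)
                for c in set(labels) if c != labels[i])  # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def cluster_keyword(docs, labels, c):
    """Term found in the most documents of cluster c, and that share (0..1)."""
    members = [d for d, l in zip(docs, labels) if l == c]
    df = Counter(t for d in members for t in set(d))
    term = min(df, key=lambda t: (-df[t], t))  # ties broken alphabetically
    return term, df[term] / len(members)

docs = [tokenize(t) for t in TOY_TEXTS]
vocab, vectors = tfidf(docs)
labels, _ = kmeans(vectors, k=2)
print("labels:", labels)
print("silhouette:", round(silhouette(vectors, labels), 3))
for c in sorted(set(labels)):
    print("cluster", c, "keyword and coverage:", cluster_keyword(docs, labels, c))
```

A silhouette value above zero means that, on average, a document lies closer to its own cluster than to the nearest other cluster, which is how the abstract reads the 0.336 reported for Surah Al Fatihah. The coverage share returned by `cluster_keyword` is one possible reading of the keyword-frequency percentage; the study may compute it differently.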

Keywords

Clustering, K-Means, Al Quran, Silhouette

Authors

  1. Miftachur Robani
    Indonesia
  2. Achmad Widodo
    Universitas Diponegoro
    Fakultas Teknik