BibTeX Citation Data:
@article{JSINBIS71815,
  author = {Sutikman Sutikman and Heri Sutanto and Aris Widodo},
  title = {Resolving Data Imbalance using SMOTE for the Analysis and Prediction of Hate Speech Sentences},
  journal = {Jurnal Sistem Informasi Bisnis},
  volume = {15},
  number = {2},
  year = {2025},
  keywords = {Hate Speech Analysis; Data Imbalance; SMOTE; Machine Learning Models; Twitter Sentiment Classification},
  abstract = {Hate speech is characterized as a form of communication that expresses hostility or discontent towards particular individuals, groups, or ethnicities, with the intent to belittle one party. This research aims to examine hate speech expressions on Twitter, assessing their categorization as hate speech through the application of machine learning methodologies. The study incorporates feature engineering techniques, such as Term Frequency-Inverse Document Frequency (TF-IDF) and the Synthetic Minority Over-sampling Technique (SMOTE), to mitigate challenges related to data imbalance. The machine learning models utilized include Logistic Regression (LR), Decision Tree (DT), Gradient Boosting (GB), and Random Forest (RF). Among these models, Logistic Regression (LR) demonstrated the highest efficacy, achieving an accuracy of 91.43%, precision of 88.83%, recall of 93.99%, and an F1 score of 97.10%.},
  issn = {2502-2377},
  pages = {198--203},
  doi = {10.14710/vol15iss2pp198-203},
  url = {https://ejournal.undip.ac.id/index.php/jsinbis/article/view/71815}
}
Last update: 2025-06-14 03:51:57
Authors who submit manuscripts to JSINBIS must understand and agree that, if the manuscript is accepted for publication, the copyright of the article belongs to JSINBIS and Diponegoro University as the journal publisher.
Copyright includes the exclusive right to reproduce and provide articles in all forms and media, including reprints, photographs, microfilm and any other similar reproductions, as well as translations. The author reserves the rights to the following:
JSINBIS, Diponegoro University, and the Editors make every effort to ensure that no false or misleading data, opinions, or statements are published in this journal. The content of articles published in JSINBIS is the sole and exclusive responsibility of the respective authors.
Copyright transfer agreement can be found here: [Copyright transfer agreement in doc] and [Copyright transfer agreement in pdf].
JSINBIS (Jurnal Sistem Informasi Bisnis) is published by the Magister of Information Systems, Post Graduate School Diponegoro University. It has e-ISSN: 2502-2377 and p-ISSN: 2088-3587. It is a national journal accredited SINTA 2 by RISTEK DIKTI No. 48a/KPT/2017.
JSINBIS, which can be accessed online at http://ejournal.undip.ac.id/index.php/jsinbis, is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.