Detection of Hate Speech Code Mix Involving English and Other Nigerian Languages
DOI: https://doi.org/10.51519/journalisi.v5i4.595
Keywords: Hate speech, Code-mix, Social Media, Support Vector Machine, Random Forest
Abstract
Hate speech is a recurrent problem and has become a cause for global concern. Its recent proliferation has created room for violence and discrimination against specific individuals or groups. In Nigeria, message masking (the use of mixed languages) has become the new normal, especially for disseminating hateful and inciting comments. Hence, there is a need to curb its spread on social media. This research therefore focuses on detecting hate speech on social media written in a code-mix of English, Pidgin and any of the three major Nigerian languages (Hausa, Igbo and Yoruba). The research used two machine learning algorithms: Support Vector Machine (SVM) and Random Forest (RF). Data were collected from tweets on the EndSARS protest and the 2023 Nigerian elections. The major features were extracted, and the text was converted into vectors using TF-IDF and Bag-of-Words (BoW), which were used to train and test the models. The results showed that SVM classified hate speech better than RF on both TF-IDF and BoW features, averaging 93.43% for accuracy, 93.70% for precision, 93.43% for recall, and 93.57% for F1-score.
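The two text representations and the four reported metrics can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the L2 normalisation, the binary (single positive class) metric definitions, and the three short Pidgin-style strings are all assumptions made for illustration, and none of the data come from the study's tweet corpus. The abstract does not state how the metrics were averaged across classes, so the figures produced here are not comparable to the reported ones.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed preprocessing: lowercase and split on whitespace only.
    return text.lower().split()


def build_vocab(docs):
    # Sorted list of every distinct token in the corpus.
    return sorted({tok for doc in docs for tok in tokenize(doc)})


def bow_vector(doc, vocab):
    # Bag-of-Words: raw count of each vocabulary term in the document.
    counts = Counter(tokenize(doc))
    return [counts[term] for term in vocab]


def tfidf_matrix(docs, vocab):
    # TF-IDF with a smoothed IDF, ln((1 + n) / (1 + df)) + 1, followed by
    # L2 normalisation of each row. This is one common variant, assumed here.
    n = len(docs)
    token_sets = [set(tokenize(d)) for d in docs]
    df = {t: sum(1 for toks in token_sets if t in toks) for t in vocab}
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for doc in docs:
        raw = [c * idf[t] for c, t in zip(bow_vector(doc, vocab), vocab)]
        norm = math.sqrt(sum(x * x for x in raw)) or 1.0
        rows.append([x / norm for x in raw])
    return rows


def binary_metrics(y_true, y_pred, positive=1):
    # Accuracy, precision, recall and F1 with one class treated as positive.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Invented, neutral toy strings (hypothetical, not from the study's data).
docs = ["make we vote in peace", "dem no go vote again", "peace for town"]
vocab = build_vocab(docs)
bow = [bow_vector(d, vocab) for d in docs]
tfidf = tfidf_matrix(docs, vocab)

# Hypothetical gold labels and predictions, used only to exercise the metrics.
scores = binary_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 0, 1])
print(vocab)
print(bow[0])
print(scores)
```

In a full pipeline, each row of `bow` or `tfidf` would be passed, together with its label, to a classifier such as an SVM or a Random Forest; in practice a library such as scikit-learn would normally handle both the vectorisation and the model training.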