Sentiment Analysis of Distance Learning Using the K-Nearest Neighbor Method

Ni Wayan Devina Maharani; Fitrianingsih Fitrianingsih

doi:10.46799/ijssr.v6i4.1378

Authors

Ni Wayan Devina Maharani, Universitas Gunadarma, Indonesia
Fitrianingsih Fitrianingsih, Universitas Gunadarma, Indonesia

Keywords:

Sentiment Analysis, Twitter, Distance Learning, K-Nearest Neighbor

Abstract

During the pandemic, the Indonesian government issued a Distance Learning (PJJ) policy to reduce the spread of COVID-19. Many people expressed opinions for and against the implementation of this policy on social media, including Twitter. These opinions can be processed through sentiment analysis. In this study, the researchers implement the K-Nearest Neighbor (KNN) method to conduct sentiment analysis of tweets about distance learning. The first stage of the research is collecting 1,014 tweets from Twitter. The dataset is then labeled manually and preprocessed through data cleaning, case folding, tokenization, normalization, stopword removal, and stemming. Next, the dataset is split into training data and test data at an 8:2 ratio, with 80% used for training and 20% for testing. The KNN model is built with several different hyperparameter settings and evaluated on the test data. Accuracy is calculated by comparing the predicted sentiment with the actual sentiment of the test data using a confusion matrix. Classification with the K-Nearest Neighbor method using the best-performing hyperparameters achieved an accuracy of 74.38%. The resulting model is expected to classify positive and negative sentiment in sentences accurately, so that the findings of this study can help inform government policy on distance learning during the pandemic.
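The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the feature weighting, the distance metric, or the values of k that were tried, so the sketch assumes bag-of-words counts, cosine similarity, and k in {1, 3}. The stopword list and the ten English example tweets are hypothetical, and the normalization and stemming steps are omitted.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; the study's actual list is not given.
STOPWORDS = {"the", "is", "a", "and", "so"}

def preprocess(text):
    """Data cleaning, case folding, tokenization, stopword removal.
    Normalization and stemming are left out of this sketch."""
    text = re.sub(r"https?://\S+|[@#]\w+", " ", text)  # drop URLs, mentions, hashtags
    text = text.lower()                                 # case folding
    tokens = re.findall(r"[a-z]+", text)                # tokenization
    return [t for t in tokens if t not in STOPWORDS]    # stopword removal

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters (assumed metric)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def knn_predict(train, query, k):
    """Majority vote among the k training documents most similar to query."""
    ranked = sorted(train, key=lambda item: cosine(item[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def confusion_matrix(actual, predicted, labels=("positive", "negative")):
    """matrix[a][p] counts items whose actual label is a and predicted label is p."""
    matrix = {a: {p: 0 for p in labels} for a in labels}
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix

def accuracy(matrix):
    """Share of items on the diagonal of the confusion matrix."""
    correct = sum(matrix[label][label] for label in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total

# Hypothetical, manually labeled toy data standing in for the 1,014 tweets.
data = [
    ("Online class is great and flexible", "positive"),
    ("I love learning from home", "positive"),
    ("Distance learning is great, so flexible", "positive"),
    ("Love the flexible schedule", "positive"),
    ("Online class is boring and tiring", "negative"),
    ("Bad internet, learning from home is hard", "negative"),
    ("So tiring, too many assignments", "negative"),
    ("Hard to focus, boring lessons", "negative"),
    ("Great flexible class", "positive"),
    ("Boring and hard assignments", "negative"),
]

vectors = [(Counter(preprocess(text)), label) for text, label in data]
split = int(len(vectors) * 0.8)               # 8:2 split, no shuffling in this sketch
train, test = vectors[:split], vectors[split:]
actual = [label for _, label in test]

results = {}
for k in (1, 3):                              # assumed hyperparameter values
    predicted = [knn_predict(train, vec, k) for vec, _ in test]
    results[k] = accuracy(confusion_matrix(actual, predicted))

best_k = max(results, key=results.get)
print(best_k, results[best_k])
```

On real data the split would normally be shuffled and stratified, and the accuracy would be reported for the best k found on the test set, as the abstract describes.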
