Enhancing Sentiment and Emotion Classification with LSTM-Based Semi-Supervised Learning

Rochmat Husaini; Nur Heri Cahyana; Wisnalmawati Wisnalmawati; Tri Mardiana; Yuli Fauziah

doi:10.28989/compiler.v14i1.2965

About The Authors

Rochmat Husaini
https://orcid.org/0009-0008-5148-0389

Universitas Pembangunan Nasional Yogyakarta
Indonesia

Department of Informatics, Faculty of Industrial Technics

Nur Heri Cahyana
Universitas Pembangunan Nasional Yogyakarta
Indonesia

Department of Informatics, Faculty of Industrial Technics

Wisnalmawati Wisnalmawati
Universitas Pembangunan Nasional Yogyakarta
Indonesia

Department of Management, Faculty of Economics and Business

Tri Mardiana
Universitas Pembangunan Nasional Yogyakarta
Indonesia

Department of Management, Faculty of Economics and Business

Yuli Fauziah
Universitas Pembangunan Nasional Yogyakarta
Indonesia

Department of Informatics, Faculty of Industrial Technics

Submitted: 2025-05-04; Published: 2025-06-13

Abstract

Sentiment analysis increasingly relies on semi-supervised learning (SSL) models because they make efficient use of large amounts of unlabeled data. This study used four Indonesian datasets: Sentiment Ridife (sentiment classification), Emotion IndoNLU (emotion classification), Sentiment IndoNLU (sentiment classification), and Hate Speech (offensive content detection). An LSTM model was trained on the labeled data and then used to generate pseudo-labels for the unlabeled data over three iterations. The quality of the pseudo-labels was evaluated with Random Forest, Logistic Regression, and Support Vector Machine (SVM) classifiers. The effectiveness of the LSTM model varied across datasets. On the Sentiment Ridife dataset, LSTM achieved an accuracy of 70.23%, slightly lower than Random Forest but higher than Logistic Regression and SVM. On the Sentiment IndoNLU dataset, LSTM reached an accuracy of 86.12%, a strong result but slightly below Random Forest and Logistic Regression. On the Emotion IndoNLU dataset all models performed similarly, while on the Hate Speech dataset LSTM performed well with an accuracy of 86.49%. The results indicate that LSTM-based SSL can generate useful pseudo-labels and improve model performance, but that the gain depends on the dataset and the task. This study underscores the need for further research into optimizing pseudo-labeling techniques and into more advanced NLP models for sentiment and emotion analysis in diverse languages.
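The pseudo-labeling procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how pseudo-labels were selected, so the confidence threshold used here is an assumption, and a small bag-of-words nearest-centroid classifier (the hypothetical `CentroidClassifier`) stands in for the LSTM so that the example runs with the Python standard library alone. The function name `pseudo_label` and the toy English data are likewise illustrative.

```python
import math
from collections import Counter, defaultdict


class CentroidClassifier:
    """Stand-in for the LSTM (assumption): nearest centroid over word counts."""

    def fit(self, texts, labels):
        per_class = defaultdict(Counter)
        class_sizes = Counter(labels)
        for text, label in zip(texts, labels):
            per_class[label].update(text.lower().split())
        # Average word count per class.
        self.centroids = {
            label: {w: c / class_sizes[label] for w, c in words.items()}
            for label, words in per_class.items()
        }
        return self

    def predict_proba(self, text):
        tokens = Counter(text.lower().split())
        scores = {
            label: sum(n * centroid.get(w, 0.0) for w, n in tokens.items())
            for label, centroid in self.centroids.items()
        }
        # Softmax turns the raw scores into a probability per class.
        top = max(scores.values())
        exps = {label: math.exp(s - top) for label, s in scores.items()}
        total = sum(exps.values())
        return {label: e / total for label, e in exps.items()}


def pseudo_label(labeled, unlabeled, make_model, iterations=3, threshold=0.6):
    """Grow the labeled set with confident predictions over a few iterations.

    `threshold` is an assumed selection rule, not taken from the paper.
    """
    texts = [t for t, _ in labeled]
    labels = [y for _, y in labeled]
    pool = list(unlabeled)
    for _ in range(iterations):
        if not pool:
            break
        model = make_model().fit(texts, labels)
        remaining = []
        for text in pool:
            proba = model.predict_proba(text)
            best = max(proba, key=proba.get)
            if proba[best] >= threshold:
                texts.append(text)      # accept the pseudo-label
                labels.append(best)
            else:
                remaining.append(text)  # retry in the next iteration
        pool = remaining
    final_model = make_model().fit(texts, labels)
    return final_model, list(zip(texts, labels)), pool


labeled = [("good great film", "pos"), ("bad awful film", "neg")]
unlabeled = ["great acting", "awful plot", "good plot", "the film"]
model, augmented, leftover = pseudo_label(labeled, unlabeled, CentroidClassifier)
print(len(augmented), leftover)
```

In the study's setting, the augmented set would then be used to train and score the downstream classifiers (Random Forest, Logistic Regression, SVM). Texts that never reach the threshold stay unlabeled rather than being forced into a class.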

Keywords

Semi-supervised Learning; LSTM; Sentiment Analysis

This work is licensed under a Creative Commons Attribution 4.0 International License.

Compiler

P-ISSN 2252-3839 and E-ISSN 2549-2403

Informatics Department

Adisutjipto Institute of Aerospace Technology
