An End-to-End Machine Learning Pipeline for Online Purchase Intention Prediction Using Random Forest and MLOps Practices

Akas Bagus Setiawan, Hendra Yufit Riskiawan, Hermawan Arief Putranto, Taufiq Rizaldi, Rachmad Andri Atmoko

Submitted: 2026-01-20, Published: 2026-02-20.

Abstract

Predicting online shoppers' purchase intention is a key issue in e-commerce because it directly affects conversion and marketing effectiveness. This article presents a Random Forest purchase-intention model together with an end-to-end MLOps implementation intended to ensure production readiness. The model is trained on the Online Shoppers Intention dataset, which contains 12,330 samples and 18 features representing administrative, informational, and product-related characteristics, along with behavioral metrics. Preprocessing includes missing-value imputation, numerical feature standardization, categorical feature encoding, and outlier removal using the z-score method. The model is optimized with GridSearchCV and 3-fold cross-validation. Test results show 91.38% accuracy, with 73.60% precision, 56.64% recall, and a 64.02% F1-score for the positive class. The MLOps implementation uses MLflow for experiment tracking, Prometheus and Grafana for monitoring, and a GitHub Actions-based CI/CD pipeline for deployment automation. Overall, the Random Forest model achieves high overall accuracy on e-commerce data, with moderate recall for the positive class, and is supported by an MLOps pipeline that improves reproducibility, deployment, and production monitoring.
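
Two of the steps named in the abstract can be illustrated with the Python standard library alone. The following is a minimal sketch, not the authors' implementation: the function names `f1_score` and `remove_outliers_zscore` are hypothetical, and the z-score cutoff of 3.0 is an assumption, since the abstract does not state the threshold used. The first function recomputes the positive-class F1-score from the reported precision and recall; the second shows one common form of z-score outlier removal on a single numeric feature.

```python
from statistics import mean, pstdev


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def remove_outliers_zscore(values: list[float], threshold: float = 3.0) -> list[float]:
    """Keep only values whose absolute z-score is at most `threshold`.

    The default threshold of 3.0 is an assumption for illustration;
    the abstract does not say which cutoff was applied.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return list(values)
    return [v for v in values if abs(v - mu) / sigma <= threshold]


# Positive-class precision and recall as reported in the abstract.
reported_f1 = f1_score(0.7360, 0.5664)
print(f"F1 = {reported_f1:.2%}")
```

Under these assumptions the recomputed F1-score rounds to 64.02%, which agrees with the value reported in the abstract. In a full pipeline the outlier filter would typically be applied per numeric column and only to the training split.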

Keywords

Purchase Intention Prediction, Random Forest, Machine Learning, MLOps, MLflow

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
