Sentiment Analysis of E-Commerce Product-Based Reviews Using a Machine Learning Approach A Case Study of Amazon.com | IJCT Volume 13 – Issue 2 | IJCT-V13I2P10
With the rapid growth of e-commerce and online shopping platforms, product reviews have become a crucial resource for consumers making purchasing decisions. These reviews capture customer sentiments and opinions about products and services and significantly influence other customers’ choices. However, manually analysing large volumes of reviews is time-consuming and challenging. Machine learning offers an automated alternative by extracting and classifying the sentiments expressed in reviews. The primary objective of this research was to develop a sentiment analysis model using supervised machine learning to classify reviews as positive or negative. The Amazon Fine Food dataset from Kaggle was utilised, comprising 568,454 reviews, from which a random sample of 5,685 reviews was drawn for analysis. Data preprocessing involved removing special characters and HTML tags, handling missing values, and eliminating stop words. Five supervised machine learning models were employed and compared: Logistic Regression, Random Forest, Decision Tree, Naive Bayes, and Support Vector Machine (SVM). The dataset was partitioned by holding out 20% of the total as a test set and dividing the remainder into training (75%) and validation (25%) sets. Models were evaluated using accuracy, precision, recall, F1-score, and classification reports. Logistic Regression achieved the highest accuracy (64.47%) and demonstrated the best overall performance. Aspect-based sentiment analysis was also explored to provide granular insights into specific product features. Results were visualised using word clouds, pie charts, bar charts, and heat maps. Logistic Regression was selected for deployment, with the model and TF-IDF vectorizer saved using the pickle library so that new data can be processed without retraining. This research assists both consumers and businesses by supporting informed decisions and improvements in product quality. Future work could explore deep learning architectures, advanced feature engineering, and resampling techniques such as SMOTE and ADASYN to address class imbalance.
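The preprocessing and partitioning steps described in the abstract can be sketched in plain Python. This is a minimal illustration and not the authors' code: the function names `clean_review` and `split_indices`, the short stop-word list, and the random seed are all assumptions, and the split assumes that the 75%/25% training and validation shares apply to the 80% of reviews left after the test set is held out.

```python
import html
import random
import re

# Small illustrative stop-word list (an assumption; the paper does not name its list).
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "and", "of", "to", "i", "was"}

TAG_RE = re.compile(r"<[^>]+>")         # HTML tags such as <br />
NON_ALPHA_RE = re.compile(r"[^a-z\s]")  # anything other than letters and whitespace


def clean_review(text):
    """Return a lower-cased, tag-free, stop-word-free version of a review."""
    if text is None:                    # treat a missing review as empty text
        return ""
    text = html.unescape(text)          # turn entities such as &amp; into characters
    text = TAG_RE.sub(" ", text)        # drop HTML tags
    text = NON_ALPHA_RE.sub(" ", text.lower())  # drop digits, punctuation, symbols
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return " ".join(tokens)


def split_indices(n, test_frac=0.20, val_frac=0.25, seed=42):
    """Shuffle range(n) and return (train, val, test) index lists.

    test_frac is a share of all n items; val_frac is a share of what remains.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_frac)
    n_val = round((n - n_test) * val_frac)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test


print(clean_review("I loved it!<br />Great taste &amp; 100% fresh."))
# loved great taste fresh

train, val, test = split_indices(5685)
print(len(train), len(val), len(test))
# 3411 1137 1137
```

Under that assumption, the 5,685 sampled reviews would divide into 3,411 training, 1,137 validation, and 1,137 test reviews.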
This research developed and evaluated a machine learning-based sentiment analysis system for Amazon product reviews. The following conclusions are drawn:
1. Logistic Regression is the most suitable model for this classification task, achieving the highest accuracy (64.47%) and the most balanced performance metrics among the models tested.
2. All models struggled with minority-class classification due to class imbalance in the dataset, highlighting the need for resampling techniques in future work.
3. TF-IDF vectorisation with unigram features provided a solid baseline for text representation, though more advanced techniques may yield improved performance.
4. The aspect-based sentiment analysis component provided actionable insights into specific product features, offering businesses granular customer feedback beyond overall sentiment labels [15].
5. The deployed Logistic Regression model successfully classified real-world customer reviews, demonstrating practical utility for automated sentiment monitoring in e-commerce contexts.
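The point about minority-class performance can be illustrated with a short sketch that computes per-class precision, recall, and F1 alongside accuracy. The helper `per_class_report` and the toy labels below are invented for illustration; they are not the study's data or results.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return accuracy plus precision, recall, and F1 for each label."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = Counter(zip(y_true, y_pred))  # counts of (actual, predicted) pairs
    report = {}
    for label in labels:
        tp = pairs[(label, label)]
        fp = sum(c for (t, p), c in pairs.items() if p == label and t != label)
        fn = sum(c for (t, p), c in pairs.items() if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    accuracy = sum(pairs[(label, label)] for label in labels) / len(y_true)
    return accuracy, report


# Toy labels, invented for illustration: 8 positive reviews, 2 negative.
y_true = ["pos"] * 8 + ["neg"] * 2
# A classifier that labels every review as positive.
y_pred = ["pos"] * 10

accuracy, report = per_class_report(y_true, y_pred)
print(accuracy)       # 0.8
print(report["neg"])  # precision, recall, and F1 are all 0.0 for the minority class
```

In this toy case accuracy is 0.8 even though no negative review is ever identified, which is why per-class metrics are worth reporting next to accuracy on an imbalanced dataset.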
References
[1] Liu, B. (2020). Sentiment analysis: Mining opinions, sentiments, and emotions (2nd ed.). Cambridge University Press.
[2] Cambria, E., Das, D., Bandyopadhyay, S., & Feraco, A. (2022). A practical guide to sentiment analysis. Springer.
[3] Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction (2nd ed.). Springer.
[4] Murphy, K. P. (2012). Machine learning: A probabilistic perspective. MIT Press.
[5] Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction (2nd ed.). MIT Press.
[6] IBM Cloud Education. (2020). Machine learning. IBM. https://www.ibm.com/cloud/learn/machine-learning
[7] Upma, S., Anand, M., & Agrawal, J. (2017). Sentiment analysis of mobile phone reviews using SVM. International Journal of Computer Applications, 164(9), 34–38.
[8] Shivaprasad, S., & Shetty, J. (2017). Sentiment analysis of product reviews: A review. In Proceedings of the International Conference on Inventive Communication and Computational Technologies (pp. 298–301).
[9] Ali Fauzi, M. (2018). Random forest approach for sentiment analysis in the Indonesian language. Indonesian Journal of Electrical Engineering and Computer Science, 12(1), 46–50.
[10] Ali Fauzi, M., Rofiqoh, U., & Alam, C. N. (2019). Improving SVM-based short text sentiment analysis for the Indonesian language with Word2Vec. Journal of Physics: Conference Series, 1196(1), 012060.
[11] Khanvilkar, G., & Vora, D. (2018). Ordinal sentiment classification and product recommendation using SVM and Random Forest. International Journal of Engineering and Technology, 7(3), 1450–1455.
[12] Sanjay Dey, M., Saha, S., & Chowdhury, S. (2020). Comparative analysis of Naive Bayes and SVM for sentiment analysis of customer reviews. Journal of Ambient Intelligence and Humanized Computing, 11(5), 2189–2198.
[13] Surya Prabha, P. M., & Subbulakshmi, B. (2019). Amazon product review sentiment analysis using Naive Bayes. In Proceedings of the IEEE International Conference on System, Computation, Automation and Networking (pp. 1–5).
[14] Yordanova, S., & Kabakchieva, D. (2017). Customer opinion prediction using supervised machine learning. International Journal of Reasoning-based Intelligent Systems, 9(1–2), 76–84.
[15] Pontiki, M., Galanis, D., Pavlopoulos, J., Papageorgiou, H., Androutsopoulos, I., & Manandhar, S. (2014). SemEval-2014 Task 4: Aspect-based sentiment analysis. In Proceedings of SemEval-2014 (pp. 27–35).
[16] Martin, J. R., & White, P. R. R. (2005). The language of evaluation: Appraisal in English. Palgrave Macmillan.
[17] Chevalier, J. A., & Mayzlin, D. (2006). The effect of word of mouth on sales: Online book reviews. Journal of Marketing Research, 43(3), 345–354.
[18] Mudambi, S. M., & Schuff, D. (2010). What makes a helpful online review? A study of customer reviews on Amazon.com. MIS Quarterly, 34(1), 185–200.
[19] Sarkar, D. (2021). Text analytics with Python: A practitioner’s guide to natural language processing (2nd ed.). Apress.
[20] Hua, Y., Yin, Y., & Chen, X. (2023). Challenges in applying traditional sentiment models to social media data. Journal of Information Science, 49(1), 112–127.
How to Cite This Paper
Duru Juliet Chinenye, Ogbuagu Chinedu Samuel, Chima Aguocha Obingonye, Praise Madumere Chukwubueze (2026). Sentiment Analysis of E-Commerce Product-Based Reviews Using a Machine Learning Approach A Case Study of Amazon.com. International Journal of Computer Techniques, 13(2). ISSN: 2394-2231.