AI-Powered Multi-City Air Quality Forecasting Platform | IJCT Volume 13 – Issue 2 | IJCT-V13I2P27

International Journal of Computer Techniques
ISSN 2394-2231
Volume 13, Issue 2  |  Published: March – April 2026

Authors

Venkata Tharun Rajana, Prof.(Dr.) Ravi Kiran, Rayudu Sadhana, Pandrangi Anjan Sai, Kambham Sunandh

Abstract

Air pollution, particularly fine particulate matter (PM2.5), poses a significant public health threat to rapidly industrialising coastal cities. Visakhapatnam (Vizag), Andhra Pradesh, India, hosts major steel, petroleum, and port industries that contribute substantially to ambient PM2.5 concentrations. Accurate short-term forecasting of PM2.5 is essential for timely government advisories and public health interventions. This paper presents a comparative study of four predictive frameworks applied to hourly PM2.5 forecasting in Visakhapatnam: (i) Long Short-Term Memory (LSTM) networks, (ii) Transformer-based sequence models with multi-head self-attention, (iii) Extreme Gradient Boosting (XGBoost), and (iv) a weighted ensemble of the three individual models. The models are trained on a two-year historical dataset comprising 17,520 hourly records of air quality and meteorological variables sourced from the Open-Meteo API. Forty-two engineered features, including temporal encodings, lag variables, rolling statistics, and meteorological interaction terms, are used as inputs to predict the next-hour PM2.5 concentration. Experimental results show that LSTM achieves the best performance overall (RMSE = 3.52 µg/m³, R² = 0.966), clearly outperforming Transformer (RMSE = 6.10 µg/m³, R² = 0.897) and XGBoost (RMSE = 6.61 µg/m³, R² = 0.879). The ensemble model attains RMSE = 4.45 µg/m³ and R² = 0.945, improving on both XGBoost and Transformer but not matching the standalone LSTM. The proposed system is integrated into a real-time Streamlit-based dashboard that uses a Groq LLM API to generate natural language air quality advisories. The results support the use of deep sequential models, LSTM in particular, for urban air quality time-series prediction and highlight the promise of hybrid ensemble approaches for robust operational deployment.
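
As an illustration of the kinds of engineered inputs listed in the abstract (temporal encodings, lag variables, rolling statistics), the following is a minimal sketch in plain Python. The function name build_features, the lag set (1, 2, 3 hours), the 3-hour rolling window, and the sample readings are assumptions made for illustration; the paper's actual 42 features are not specified on this page.

```python
import math
from statistics import mean


def build_features(pm25, hours, lags=(1, 2, 3), window=3):
    """Pair each hour's engineered features with the next hour's PM2.5.

    pm25  -- hourly PM2.5 readings in µg/m³, oldest first
    hours -- hour of day (0-23) for each reading
    The lag set and rolling window are illustrative assumptions.
    """
    start = max(max(lags), window) - 1  # first index with enough history
    samples = []
    for t in range(start, len(pm25) - 1):
        features = {
            # Cyclical encoding keeps 23:00 and 00:00 close together.
            "hour_sin": math.sin(2 * math.pi * hours[t] / 24),
            "hour_cos": math.cos(2 * math.pi * hours[t] / 24),
            # Mean of the most recent `window` readings, up to and including t.
            "roll_mean": mean(pm25[t - window + 1 : t + 1]),
        }
        for k in lags:
            # lag_k is the reading k hours before the target hour t + 1.
            features[f"lag_{k}"] = pm25[t + 1 - k]
        samples.append((features, pm25[t + 1]))
    return samples


# Hypothetical readings, for illustration only.
pm25 = [10, 12, 14, 16, 18, 20]
hours = [0, 1, 2, 3, 4, 5]
for features, target in build_features(pm25, hours):
    print(features, "->", target)
```

Each sample uses only readings available up to hour t to predict hour t + 1, which avoids leaking the target into the inputs. A production pipeline would more likely build these columns with a dataframe library and would add the meteorological variables and interaction terms.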

Keywords

PM2.5 forecasting, air quality prediction, LSTM, Transformer, XGBoost, ensemble learning, deep learning, time-series, Visakhapatnam, urban air pollution

Conclusion

This study presented a comparative evaluation of LSTM, Transformer, XGBoost, and ensemble approaches for hourly PM2.5 air quality forecasting in Visakhapatnam, India, a heavily industrialised coastal city with a complex pollution regime. The LSTM model achieved the best performance of all evaluated architectures (RMSE = 3.52 µg/m³, R² = 0.966), demonstrating the value of deep sequential modelling for urban air quality time-series. The Transformer model showed moderate performance (R² = 0.897), constrained by the relatively small dataset, while XGBoost (R² = 0.879) underscored the importance of a well-designed feature engineering pipeline. The ensemble model (R² = 0.945) provided a robust intermediate option, with improved stability over the weaker individual models. Future research directions include: (i) extension to multi-step (6h, 12h, 24h) forecasting horizons using encoder-decoder architectures; (ii) incorporation of satellite-derived AOD and land-use regression data to capture spatial heterogeneity within the city; (iii) application of federated learning to combine data from multiple monitoring stations while preserving data privacy; and (iv) evaluation of the LLM advisory system for comprehension and utility through user studies with residents and public health officials. The proposed framework provides a replicable and scalable template for data-driven air quality forecasting systems in Indian industrial cities.
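
The reported metrics and the idea of a weighted ensemble can be made concrete with a short sketch. The function names rmse, r2_score, and weighted_ensemble, the weights, and the toy numbers below are illustrative assumptions; the weighting scheme actually used in the study is not stated on this page.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target (µg/m³)."""
    squared = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_true) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


def weighted_ensemble(predictions, weights):
    """Weighted average of per-model predictions, normalised by total weight."""
    total = sum(weights[name] for name in predictions)
    n = len(next(iter(predictions.values())))
    return [
        sum(weights[name] * preds[i] for name, preds in predictions.items()) / total
        for i in range(n)
    ]


# Toy observations and predictions, for illustration only.
y_true = [10, 20, 30, 40]
predictions = {
    "lstm": [11, 19, 31, 39],
    "transformer": [12, 18, 32, 38],
    "xgboost": [13, 17, 33, 37],
}
weights = {"lstm": 0.5, "transformer": 0.25, "xgboost": 0.25}  # assumed weights

ensemble = weighted_ensemble(predictions, weights)
for name, preds in {**predictions, "ensemble": ensemble}.items():
    print(name, round(rmse(y_true, preds), 3), round(r2_score(y_true, preds), 4))
```

With these toy numbers the ensemble's error falls between that of the strongest member and the two weaker ones, the same qualitative pattern the study reports. In practice the weights would be chosen on a validation set, not fixed by hand.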

How to Cite This Paper

Venkata Tharun Rajana, Prof.(Dr.) Ravi Kiran, Rayudu Sadhana, Pandrangi Anjan Sai, Kambham Sunandh (2026). AI-Powered Multi-City Air Quality Forecasting Platform. International Journal of Computer Techniques, 13(2). ISSN: 2394-2231.

© 2026 International Journal of Computer Techniques (IJCT). All rights reserved.
