
A Scalable and Cost-Efficient Architecture for Migrating Multi-Terabyte Relational Databases to Cloud-Native Data Warehouses | IJCT Volume 13 – Issue 1 | IJCT-V13I1P28

International Journal of Computer Techniques
ISSN 2394-2231
Volume 13, Issue 1 | Published: January – February 2026
Authors
Abhishek Raman Batade, Ms. Rucha Ravindra Galgali
Abstract
The rapid growth of enterprise data has necessitated scalable and cost-efficient strategies for migrating large relational databases to cloud-native data warehouses. Traditional migration approaches often suffer from extended downtime, performance bottlenecks, and high infrastructure costs. This paper proposes a scalable and cost-efficient migration architecture for transferring multi-terabyte relational databases from Microsoft SQL Server to Snowflake using cloud-native services including AWS Database Migration Service, AWS Glue with PySpark, Amazon S3, Amazon Athena, and Snowpipe.
The proposed framework enables distributed extraction, parallel transformation, automated validation, and continuous ingestion while minimizing operational overhead. Performance evaluation demonstrates improved migration throughput, reduced latency, elastic scalability, and significant cost optimization compared to traditional JDBC-based bulk loading mechanisms. The results indicate that the proposed architecture provides a reliable and enterprise-ready solution for large-scale database modernization initiatives.
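The distributed-extraction stage described above can be sketched in plain Python: partition the source table by primary-key range and scan the ranges in parallel, which is the general pattern services such as AWS DMS and AWS Glue use for full-load parallelism. This is an illustrative stand-in under stated assumptions, not the paper's implementation; `extract_chunk`, `key_ranges`, and `SOURCE_ROWS` are hypothetical names, and the in-memory dict stands in for a SQL Server table.

```python
# Sketch: range-partitioned parallel extraction (illustrative only).
# In a real pipeline each range would become a SELECT ... WHERE pk
# BETWEEN lo AND hi issued by a separate DMS task or Glue worker.
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a large source table: primary key -> row payload.
SOURCE_ROWS = {pk: f"row-{pk}" for pk in range(1, 101)}

def key_ranges(min_pk, max_pk, chunks):
    """Split [min_pk, max_pk] into roughly equal contiguous ranges."""
    step = (max_pk - min_pk + 1 + chunks - 1) // chunks
    return [(lo, min(lo + step - 1, max_pk))
            for lo in range(min_pk, max_pk + 1, step)]

def extract_chunk(bounds):
    """Hypothetical per-range extractor (one range scan per worker)."""
    lo, hi = bounds
    return [SOURCE_ROWS[pk] for pk in range(lo, hi + 1) if pk in SOURCE_ROWS]

ranges = key_ranges(1, 100, 4)
with ThreadPoolExecutor(max_workers=4) as pool:
    parts = list(pool.map(extract_chunk, ranges))

extracted = [row for part in parts for row in part]
print(len(extracted))  # all 100 rows recovered across 4 parallel range scans
```

Range partitioning by a monotonic key is what lets throughput scale with worker count: each worker touches a disjoint slice, so no coordination is needed beyond computing the boundaries up front.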
Keywords
Cloud Migration, Snowflake, AWS DMS, PySpark, Data Warehouse Modernization, Cost Optimization, AWS Glue, Athena
Conclusion
This paper presented a scalable and efficient framework for migrating large relational databases from Microsoft SQL Server to Snowflake using a cloud-native architecture built on AWS Database Migration Service, Amazon S3, AWS Glue with PySpark, Amazon Athena, and Snowpipe. The proposed multi-stage pipeline ensures reliable extraction, distributed transformation, secure staging, rigorous validation, and automated loading into the target warehouse. Performance evaluation demonstrated linear scalability, high throughput, reduced migration time, and 100% data-validation accuracy. By leveraging distributed processing and event-driven ingestion, the framework minimizes downtime and operational overhead while maintaining data integrity and security. The results confirm that the proposed solution is well suited for enterprise-scale database modernization and cloud migration initiatives.
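The validation stage the conclusion refers to can be illustrated with a minimal sketch: compare row counts and an order-independent checksum between the source extract and the warehouse load. This is an assumption-laden stand-in, not the paper's Athena-based checks; `table_fingerprint` and the sample rows are hypothetical.

```python
# Sketch: post-load validation via row count + order-independent checksum.
# Illustrative only; the paper's actual validation uses Amazon Athena
# queries over the staged data, which are not reproduced here.
import hashlib

def table_fingerprint(rows):
    """Return (row_count, checksum); XOR of per-row SHA-256 digests
    makes the checksum independent of load order."""
    acc = 0
    for row in rows:
        digest = hashlib.sha256(repr(row).encode("utf-8")).digest()
        acc ^= int.from_bytes(digest, "big")
    return len(rows), acc

source = [(1, "alice"), (2, "bob"), (3, "carol")]
target = [(3, "carol"), (1, "alice"), (2, "bob")]  # same rows, loaded out of order

assert table_fingerprint(source) == table_fingerprint(target)
print("validation passed:", table_fingerprint(source)[0], "rows match")
```

An order-independent checksum matters because parallel extraction and Snowpipe's micro-batch ingestion give no ordering guarantees; comparing sorted dumps row by row would be far more expensive at multi-terabyte scale.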
How to Cite This Paper
Abhishek Raman Batade, Rucha Ravindra Galgali (2026). A Scalable and Cost-Efficient Architecture for Migrating Multi-Terabyte Relational Databases to Cloud-Native Data Warehouses. International Journal of Computer Techniques, 13(1). ISSN: 2394-2231.







