Building Resilient Data Pipelines

International Journal of Computer Techniques
ISSN 2394-2231
Volume 12, Issue 5  |  Published: September – October 2025
Author
Priyanka Kulkarni

Abstract

Cloud-native data pipelines form the nervous system of modern enterprises, enabling real-time decision-making, analytics, and digital services. However, these pipelines are fragile: they are exposed to cloud service outages, misconfigurations, network bottlenecks, and dependency failures. This paper proposes a holistic resilience engineering framework tailored to data pipelines built on Amazon Web Services (AWS). Drawing on resilience theory, distributed systems design, and DevOps practices, the framework sets out proactive measures for availability, scalability, and continuity. Results indicate that proactive resilience engineering can reduce operational overhead, improve availability by up to 2 percentage points, and support business continuity through quantifiable reductions in downtime and cost. A discussion of trade-offs (e.g., added complexity, governance burden, and upfront investment) provides a balanced perspective for practitioners. Future directions include multi-cloud architectures, machine learning-driven anomaly detection, and automated compliance governance.
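To give a sense of scale for the availability figure, the sketch below converts an availability percentage into expected annual downtime. It is illustrative only: the 97.9% baseline and the function names are assumptions made for this example and do not come from the paper; only the 2 percentage point gain is taken from the abstract.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def annual_downtime_hours(availability_pct: float) -> float:
    """Expected hours of downtime per year at a given availability percentage."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return (100.0 - availability_pct) / 100.0 * HOURS_PER_YEAR


def downtime_saved_hours(baseline_pct: float, gain_points: float) -> float:
    """Hours of downtime avoided per year when availability rises by gain_points."""
    improved_pct = min(baseline_pct + gain_points, 100.0)  # cap at 100%
    return annual_downtime_hours(baseline_pct) - annual_downtime_hours(improved_pct)


# Hypothetical baseline (assumption); 2.0 is the upper bound quoted in the abstract.
saved = downtime_saved_hours(baseline_pct=97.9, gain_points=2.0)
print(f"{saved:.1f} hours of downtime avoided per year")
```

Under these assumptions, a gain of 2 percentage points corresponds to roughly 175 fewer hours of downtime per year. The saving depends only on the size of the gain, not on the baseline, as long as the improved figure stays below 100%.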

Keywords

AWS, Cloud Data Pipelines, Resilience Engineering, Fault Tolerance, DevOps, Continuity Planning.

Conclusion

This study has presented a resilience engineering framework for AWS pipelines, validated with both quantitative and qualitative data. By addressing four fragility sources, the framework provides organizations with a systematic approach to improving availability, reducing overhead, and ensuring continuity. The inclusion of quantitative results strengthens the claim that resilience yields tangible operational gains. Discussion of trade-offs grounds these findings in practical reality, acknowledging costs and complexities. The elaborated future work highlights promising directions for academia and industry alike. Resilience, then, is not a luxury; it is the defining characteristic of digital continuity. As enterprises embed data pipelines deeper into critical operations, adopting resilience engineering practices will determine not only uptime, but long-term organizational survival.


© 2025 International Journal of Computer Techniques (IJCT).