Agile Methodologies for Data Engineering Teams: Adoption Patterns and Outcomes | IJCT Volume 9 – Issue 2 | IJCT-V9I2P44

International Journal of Computer Techniques
ISSN 2394-2231
Volume 9, Issue 2  |  Published: March – 2022

Author

Kuladeep Sandra

Abstract

Agile methods, articulated in the Agile Manifesto and operationalized in frameworks including Scrum, Extreme Programming, and Kanban, have been the dominant approach to organizing software engineering work for nearly two decades. Their adoption in data engineering has been slower and more uneven, and the explanation is not principally cultural. Software engineering Agile rests on conditions (clear acceptance criteria, testable outputs, deployment independence, low cross-team coupling) that data engineering only partially satisfies. Data engineering work involves infrastructure dependencies that block teams unpredictably, exploratory analyses whose duration cannot be bounded in advance, shared platform components whose changes affect all consumers simultaneously, and quality failures that surface weeks after the sprint in which the responsible code shipped. Teams that import software Agile patterns unchanged into data engineering settings tend to fail in characteristic ways: estimates shatter against unknowable infrastructure complexity, sprints become ceremonies that document the inability to plan rather than the ability to deliver, and morale erodes as the team learns to discount its own commitments. This paper surveys the Agile adoption patterns that have emerged in data engineering, the data-specific challenges that drive adaptation, and the measurable outcomes that adapted patterns produce. It addresses three lines of inquiry: (RQ1) what Agile adoption patterns exist in data engineering teams and how do these patterns emerge from the unique characteristics of data engineering work; (RQ2) what challenges are specific to Agile adoption in data engineering and what adaptations prove effective; and (RQ3) what measurable outcomes result from Agile adoption and how do teams reconcile productivity gains with the persistence of estimation challenges. 
The paper combines a structured survey of literature published through 2022 with two longitudinal practitioner case studies. The first is a 2018 sprint failure (Sprint 7) in which over-commitment and inadequate risk identification produced a goal completion rate of around 42 percent and a painful retrospective. The second is a 2019–2020 platform enablement squad initiative that achieved about a 20 percent efficiency gain in domain team velocity (measured as the ratio of story points delivered per sprint-week before and after the reorganization, averaged over six sprints) at cost-neutral first-year economics. The closing argument is that Agile works for data engineering when its patterns are adapted intentionally to the conditions of data work; teams that adopt software Agile unchanged should expect failures of the kind documented here, and the failures are recoverable through the kinds of adaptations the case studies illustrate.
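The efficiency metric described in the abstract can be illustrated with a minimal Python sketch. It assumes that "averaged over six sprints" means taking the mean of per-sprint rates (story points divided by sprint length in weeks) on each side of the reorganization; the paper does not spell out the averaging order. The function names and the sample figures are hypothetical and are not the case study's data.

```python
from statistics import mean


def velocity_per_sprint_week(points, weeks):
    """Mean story points delivered per sprint-week across a set of sprints.

    Assumption: each sprint's rate is computed first, then the rates are averaged.
    """
    if not points or len(points) != len(weeks):
        raise ValueError("points and weeks must be non-empty and the same length")
    return mean(p / w for p, w in zip(points, weeks))


def efficiency_gain(before, after):
    """Relative change in velocity, e.g. 0.2 for a 20 percent gain."""
    if before <= 0:
        raise ValueError("baseline velocity must be positive")
    return (after - before) / before


# Hypothetical illustration only: six two-week sprints on each side of the change.
before_points = [40, 42, 38, 41, 39, 40]
after_points = [48, 50, 46, 49, 47, 48]
sprint_weeks = [2] * 6

before_velocity = velocity_per_sprint_week(before_points, sprint_weeks)
after_velocity = velocity_per_sprint_week(after_points, sprint_weeks)
gain = efficiency_gain(before_velocity, after_velocity)
print(f"before={before_velocity:.1f}, after={after_velocity:.1f}, gain={gain:.1%}")
```

Averaging per-sprint rates rather than dividing total points by total weeks weights each sprint equally, which matters only when sprint lengths differ; either reading is compatible with the abstract's wording.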

Keywords

Agile, Scrum, Kanban, data engineering, team organization, productivity, Agile adoption

Conclusion

This paper has surveyed the Agile adoption patterns observed in data engineering teams, examined the challenges that drive adaptation away from software-derived patterns, and presented two longitudinal case studies that illustrate both the failures of naive adoption and the outcomes of considered adaptation. The contribution combines a structured account of the literature with practitioner evidence from a multi-year evolution at a financial services data organization.

The principal finding is that Agile works for data engineering, but only with intentional adaptation. The conditions that make software Agile workable (clear scope, testable outputs, deployment independence, low cross-team coupling) are only partially present in data engineering, and the patterns that succeed in data contexts are those that acknowledge the differences rather than pretend they do not exist. Pure Scrum is appropriate for small data engineering teams whose work resembles software work; hybrid Scrum-Kanban is appropriate for larger teams or teams with significant infrastructure dependencies; data mesh with platform enablement is appropriate at scale; the strangler fig pattern is appropriate for migration work regardless of the surrounding methodology. The choice among these patterns is principally a function of team size, infrastructure stability, and the organization’s tolerance for parallel methodologies.

The case studies illustrate both the failure mode and the recovery. Sprint 7 in 2018 was the failure mode: a sprint that combined infrastructure migration with feature delivery, that was over-committed against unknown unknowns, that lacked formal risk identification and mid-sprint replanning, and that produced a 42 percent goal completion rate against a previously stable 65 percent baseline. The retrospective from that sprint produced the adaptations (range estimation, risk backlogs, mid-sprint checkpoints, platform enablement squads, strangler fig migration, quarterly outcome commitments) that subsequently became the team’s working model. The platform enablement squad case study from 2019–2020 illustrates the recovery and the cumulative effect of the adaptations: a 20 percent efficiency gain in domain team velocity, sprint goal completion rates in the 82 to 85 percent range, infrastructure latency for typical requests reduced from 2–3 weeks to 3–4 days, and cost-neutral first-year economics with positive ROI in subsequent years.

The case studies also surface lessons that the broader literature has not yet engaged with in depth. Sprint 7 illustrates that estimation refinement does not solve the underlying problem of commitment under uncertainty; the workable response is to commit less, identify risks more deliberately, and provide mechanisms for honest mid-sprint replanning when surprises emerge. The platform enablement squad case study illustrates that the team topology of an Agile organization matters as much as its methodology: embedding platform engineers with domain teams produces velocity gains that no methodology refinement could have produced on its own, because the underlying friction was structural rather than methodological. The organizational restructuring from eight direct reports to five, aligning team boundaries with the data mesh model of domain ownership plus platform enablement, was a precondition for the methodology adaptations rather than a consequence of them.

The cost-benefit picture is favorable in the case study context but should not be over-generalized. The 20 percent efficiency gain from the platform enablement squad reflects the specific baseline at the case study site and the specific causes of velocity loss in that environment; teams with different baselines and different causes should not expect the same magnitude of gain from the same intervention. The patterns described here are not a recipe; they are an existence proof that adapted Agile can work in data engineering contexts, accompanied by enough operational detail to make the patterns reproducible by teams whose contexts resemble the case study site’s.

The closing observation, offered as a call to the practitioner community, is that data engineering Agile practices are not yet standardized in the way that software engineering Agile practices are. The literature is sparse, the metrics are inconsistent across organizations, and the failure modes are discussed less openly than they should be. Practitioners who have navigated similar transitions are encouraged to publish their experiences, including the failures, at the level of operational detail required to make the lessons transferable. Sprint 7 is a category of experience, not a singular event; the field will mature faster if more Sprint 7s are documented and analyzed in public rather than privately discussed and selectively forgotten.
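The pattern guidance in this conclusion (pure Scrum for small teams, hybrid Scrum-Kanban for larger or infrastructure-dependent teams, data mesh with platform enablement at scale, strangler fig for migration work) can be sketched as a simple decision function. This is an illustrative reading only: the `TeamContext` fields, the `SMALL_TEAM_MAX` threshold, and the precedence among rules are assumptions introduced here, not values from the paper, and the organization's tolerance for parallel methodologies is not modeled.

```python
from dataclasses import dataclass

# Hypothetical cutoff for a "small" team; the paper gives no numeric threshold.
SMALL_TEAM_MAX = 8


@dataclass(frozen=True)
class TeamContext:
    team_size: int
    heavy_infra_dependencies: bool  # significant infrastructure dependencies
    multi_domain_scale: bool        # organization operates "at scale"
    migration_in_progress: bool     # migration work is part of the backlog


def suggest_patterns(ctx: TeamContext) -> list[str]:
    """Return the patterns the conclusion associates with a given context."""
    patterns = []
    if ctx.multi_domain_scale:
        patterns.append("data mesh with platform enablement")
    elif ctx.team_size > SMALL_TEAM_MAX or ctx.heavy_infra_dependencies:
        patterns.append("hybrid Scrum-Kanban")
    else:
        # Assumes the small team's work resembles software work.
        patterns.append("pure Scrum")
    # Strangler fig applies to migration work regardless of the methodology.
    if ctx.migration_in_progress:
        patterns.append("strangler fig migration")
    return patterns


if __name__ == "__main__":
    example = TeamContext(team_size=12, heavy_infra_dependencies=True,
                          multi_domain_scale=False, migration_in_progress=True)
    print(suggest_patterns(example))
```

A team would still need to weigh infrastructure stability and its own baseline before acting on any such rule; the function only encodes the mapping stated in the text.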

How to Cite This Paper

Kuladeep Sandra (2022). Agile Methodologies for Data Engineering Teams: Adoption Patterns and Outcomes. International Journal of Computer Techniques, 9(2). ISSN: 2394-2231.

© 2022 International Journal of Computer Techniques (IJCT). All rights reserved.
