The ever-expanding backlog. The mounting pressure to deliver faster. All while your team burns the midnight oil, keeping systems running while striving to move things forward. Today’s data engineering managers are in a tough spot: having demonstrated the tremendous value of data innovations, the world naturally wants more, and faster.
Leading data teams is challenging. Few technological domains have undergone such rapid change over the past few years. Yet the vast majority of data teams, 96% to be exact, are at or over capacity. The #1 theme we hear from data leaders is the desire to go faster. To assist in that aspiration, I think it’s first helpful to spend time on how data teams get into hard spots to begin with, what managers can do to avoid those pitfalls, and how to get their teams out if they find themselves in such a situation.
The dreaded slow development cycle—it’s the bane of any engineering manager’s existence. Sometimes we don’t even realize it’s happening—like frogs in slowly boiling water. Eventually we look around and realize a project that used to take one data engineer a couple of days now takes a team of three nearly a month to complete. How could things have gotten so bad?
This loss of productivity drains time, resources, budget, and, ultimately, the morale of an engineering team that is already stretched too thin. So how is it that engineering managers, and by extension their teams, get to this point? Watch out for a few typical paths that lead to slowing development cycles:
1. Allowing Innovations to Become Anchors
Every team eventually arrives at this point and must make a decision. Your previously innovative platform is now looking a lot more like legacy architecture. It was built with older technologies, by people with different skill sets, or even for entirely different users. What once was a shining example of innovation is now brittle, inflexible, and the thing nobody wants to touch. At this stage, many data engineering managers see only two paths to remediation.
- The “double down” method. The idea is that the team has a system and can’t get away from it, the common reasons being time and/or resources. Instead, they’ll “simply” fix it by building a new abstraction layer or extension on top of the existing foundation. Teams often get stuck in this never-ending cycle to keep the system running—it’s always far from optimal, and often borders on life support. It never entirely does what the team or the manager wants, but at least it drags the anchor a bit further. Unfortunately, these fixes are nothing but band-aids; the end result is invariably a Frankenstein data platform that the team is still on the hook to support, maintain, and improve, even as it produces diminishing returns.
- The “v2” method. This typically happens after the double-down method has been in play for a while. The team is drained from supporting a massive system and proposes a larger rewrite of the entire thing. These are always multi-quarter (if not longer) projects, but in the face of cratering productivity and morale, the engineering manager will frequently pitch this to the broader org and greenlight the team to build their new architecture.
This lift-and-shift rebuild is incredibly dangerous. It was scoped as a large time investment from the start, and it always takes longer than expected. External pressure on the team mounts during the rewrite, and these projects are rarely given the time they need to be completed properly. That leads to scope reduction, corners cut, and deadlines missed, leaving the team, at best, only marginally better off than where it began: a system that still doesn’t meet expectations and a burnt-out team carrying the burden of a substandard data architecture, further disgruntled at never having had the opportunity to complete something they’d be proud of.
How to fix: Fortunately, there is a third option: “incrementally evolve.” This one is near and dear to us at Ascend, as “evolve with Intent” is one of our core values—and rightfully so. As engineers, we know that we probably can build just about anything we want, but we constantly ask ourselves whether we should. The answer to this question often changes quarter to quarter, and certainly year to year. As time progresses, we continually look for small (or better yet, large) chunks of the platform to replace entirely with more modern capabilities. In short, we ask ourselves: “what can we stop doing?” As time goes by, you’ll notice an increasingly large number of things you used to have to write code for that can now be handled by a new open source project or SaaS service. At Ascend, we find those pieces and incrementally swap them out to free up our time and energy for the highest-impact capabilities.
Engineering time is too valuable to spend on non-differentiated work (a theme you’ll see repeated in this post). As technical fields mature, common patterns emerge, and new open source software and SaaS solutions are created to address these needs at far better unit economics.
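To make “incrementally evolve” concrete, here’s a minimal sketch of the pattern in Python. All of the names here are hypothetical illustrations, not Ascend’s actual code: the idea is simply to put a thin interface in front of a home-grown component so a commoditized replacement can be swapped in behind a single seam, one piece at a time.

```python
from typing import Protocol


class FileIngester(Protocol):
    """Anything that can land raw files in the warehouse."""

    def ingest(self, source_uri: str) -> int:
        """Return the number of records loaded."""
        ...


class HomegrownIngester:
    """The legacy piece: bespoke code the team maintains today."""

    def ingest(self, source_uri: str) -> int:
        # Imagine hundreds of lines of custom download/retry/parse logic here.
        print(f"[homegrown] ingesting {source_uri}")
        return 0


class ManagedConnectorIngester:
    """A thin wrapper over a managed or SaaS connector that now covers this need."""

    def ingest(self, source_uri: str) -> int:
        # Delegate to the vendor's SDK or API here.
        print(f"[managed] ingesting {source_uri}")
        return 0


def build_ingester(use_managed: bool) -> FileIngester:
    # A single flag flips one seam of the platform; everything downstream
    # depends only on the FileIngester interface, so the homegrown class
    # can be deleted outright once the swap has proven itself.
    return ManagedConnectorIngester() if use_managed else HomegrownIngester()


ingester = build_ingester(use_managed=True)
ingester.ingest("s3://raw-bucket/events/2021-10-01.json")
```

The point isn’t these specific classes; it’s that each seam like this gives you a piece of the platform you can stop maintaining without a wholesale rewrite.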
2. Checkbox-Driven Architectures
It’s an easy trap to fall into—especially when you have a diverse set of users and requirements. Teams often collect large lists of features from their various constituents, and then look to pull together a collection of capabilities that maximizes the number of boxes checked. Users are fantastic at telling you how they want to accomplish something, but that view is always anchored in the present day, and it risks perpetuating the cycles mentioned above. Whether you’re building or buying technologies, starting with an in-depth feature list is, in most cases, backwards. I’m not saying feature sets don’t matter—they absolutely do. They’re simply not where you should start.
How to fix: if we’re not supposed to start with features, what do we start with? Outcomes. What outcomes do you want to drive with your new architecture? Ask yourself, what matters most in the long run? When looking at new technologies, this often takes the form of stated outcomes such as:
- I want to halve the amount of time my team spends maintaining systems
- I want to enable 3x as many individuals in the company to contribute to our data products
- I want to expand the pool of viable candidates for my team by 5x
- I want to deliver new projects 3x faster
- Etc.
One of our team members recently said, “the ultimate measure of great code is its ability to change over time.” We think this is absolutely the case, and you need look no further than the data ecosystem to see the pressing need for systems that can rapidly evolve. Data engineering managers—and really data engineering teams—have to embrace the fact that software will continually change. Formats, protocols, connectors, and more all change, and with relatively high frequency. What differentiates you is the ability to keep up with that pace of change. And what defines excellent data engineering isn’t simply efficient data pipelines; it’s the ability to be nimble and quickly adapt to the organization’s ever-changing environment.
3. Not Assigning Enough Value to Their Team’s Time
This is one that we are incredibly passionate about and, unfortunately, one that we’re seeing happen more and more. Engineering managers are, understandably, rather budget conscious. However, they often place more value on the cost and efficiency of infrastructure, like servers, than on the efficiency of their engineers. It’s been truly fascinating to observe this phenomenon over the past 5-10 years as the data industry has matured. For nearly every company, the efficiency of people is vastly more important than the efficiency of servers. Not only do salaries tend to be a dramatically larger line item on the budget than servers, but nearly every data team is simultaneously under-staffed and facing an ever-increasing backlog.
In fact, our 2021 DataAware Pulse Survey found that while only 21% of organizations still have issues with the scale of their data, a whopping 96% of data teams are at or over capacity. Which, when you think about it, makes sense: engineering managers and teams have collectively spent years solving the infrastructure scale problem, so much so that it has become ingrained in how they think. It’s time for a massive shift in that mindset, however. Teams are underwater, and it is only getting worse.
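A rough back-of-envelope comparison makes the imbalance obvious. All of the numbers below are hypothetical placeholders, not figures from the survey; plug in your own budget:

```python
# Hypothetical back-of-envelope: people vs. servers.
engineers = 5
fully_loaded_cost = 180_000      # per engineer per year (hypothetical)
annual_infra_spend = 300_000     # warehouse + compute (hypothetical)

people_spend = engineers * fully_loaded_cost   # $900,000

# Option A: a heroic 20% cut to the infrastructure bill.
infra_savings = 0.20 * annual_infra_spend      # $60,000

# Option B: tooling and automation that recover 20% of each engineer's time.
productivity_value = 0.20 * people_spend       # $180,000

print(f"infra savings:      ${infra_savings:,.0f}")
print(f"productivity value: ${productivity_value:,.0f}")
```

Even with conservative numbers, the people line dominates, and that’s before counting the value of actually shipping the backlog sooner.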
How to fix: it’s time to flip the priorities and start investing in team productivity. Initiatives around Data Automation and DataOps will greatly expand the output of your team. A favorite thought exercise of mine is to ask folks on the team: “if you had to double your output but couldn’t work a single additional hour, what would you do? More importantly, what would you need to stop doing?” It’s incredible how illuminating the answers to these questions are.
Leading a High Output Data Engineering Team
Falling into any of these traps is not the end of the road for a data engineering team. Step by step, engineering managers can make a habit of looking at the architecture with a critical eye to see which parts provide differentiated value and which have been commoditized and should be automated, moved to SaaS providers, or retired outright to make room for higher-level priorities. By focusing on outcomes, not features, they’ll gain clarity on what ultimately matters in the long term. And by prioritizing their team’s time and productivity, they’ll ensure they can not only recruit and retain exceptional talent but also drive significantly more impactful outcomes.