Ghost Engineers: What is Productivity?

Look! Yet another study about developer productivity! This time, it is a Twitter post made by a Stanford researcher who claims 9.5% of software engineers do almost nothing and are “Ghost Engineers.” In this post, I consider a couple of my own development anecdotes in light of this and the experience gained examining productivity within various teams over the years. The long and short of it: looking at code is not a valuable measure of individual productivity and individual productivity should not be what we are measuring anyway.

The Twitter post in question points to a Stanford University page. The only research paper linked from this page does not appear to be the study the original poster is making claims about, but it is related. In follow-up posts, the poster did backpedal, but the damage is done. At nearly the same time, AWS made the rounds on social media with supposedly shocking news about how much time developers spend not writing code, lamenting, among other things, that they spend time testing (AWS is trying to sell you AI tools to solve this problem, of course). Non-development leaders see blurbs like those and conclude they need to crack the whip on their developers instead of taking a more holistic view.

Going back to my days working on systems at PWC, leaders have periodically engaged me to evaluate team performance. At PWC, it was as a manager in one group looking at another group. During my time at Capax Global (since merged into Hitachi Solutions) and West Monroe, these occasions came as part of a formal consulting engagement. In most cases, the leader knows something is awry, already has a pretty darn good sense of what it is, and is looking for validation and to make sure they’re not missing something.

So in that time, can I conclude that “ghost engineers” exist? Yup, they do. Pick a profession and there are going to be people whose main skill is avoiding useful work, and they will spend an awful lot of energy in the process. There will be others who want to do well but just do not have the right skills. You might be able to help the second group. The first? Good luck. This is not new or unique to the development profession. To quote Ecclesiastes 1:9, “there is no new thing under the sun.”

Can we find those “ghosts” by looking at work output? Let’s start by looking at two of my own coding stories.

A definition before we get too far for those who may not be development managers: when I say “commit” or “code commit” here, think of it as a developer calling some work done and sharing it with the rest of the team by sending it to a shared location.

An anecdote

I came home from the office one Friday night in 2004, ate dinner, and opened my laptop back up. In the course of about four hours, I built an entire database with hundreds of unique tables with all their foreign keys, initial indices, and all the basic stored procedures needed to save, retrieve, and delete records. Monday morning, everyone was astounded. We were planning on it taking several weeks of developer time with the UI and App tier devs needing to drag out other work until the database was ready, yet here we were all ready to go on to the next stage. Was it a miracle? No. Was I a mad genius? No.

My burst of productivity came from the month or so of whiteboarding with the team and creating a data catalog without being bogged down in the minutiae of turning it into a database schema as we went back and forth over the data model. Once I had that first data catalog in hand, all I needed to do was export it into another format and add the SQL Data Definition Language (DDL) syntax around the field and table names we had in our catalog, and I had my DDL scripts. If you looked at my commit that week, I was REALLY productive, but only because a few other people and I hashed everything out and then wrote it all down in a structured format. The only solo work I did was create the scripts to transform the work we did collectively into DDL.
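To make the "catalog in, DDL out" step concrete, here is a minimal sketch in Python. It is not the script from 2004: the catalog layout (`table`, `column`, `type`, `nullable`, `pk`, `references`), the sample tables, and the function name `build_ddl` are all assumptions made up for illustration. The point is only that once the team has agreed on the catalog, generating the schema is mechanical.

```python
# Hypothetical data catalog: one entry per column, as a team might export it
# from a spreadsheet. The field names here are assumptions for illustration.
CATALOG = [
    {"table": "customer", "column": "customer_id", "type": "INT",
     "nullable": False, "pk": True, "references": None},
    {"table": "customer", "column": "name", "type": "VARCHAR(100)",
     "nullable": False, "pk": False, "references": None},
    {"table": "invoice", "column": "invoice_id", "type": "INT",
     "nullable": False, "pk": True, "references": None},
    {"table": "invoice", "column": "customer_id", "type": "INT",
     "nullable": False, "pk": False, "references": ("customer", "customer_id")},
    {"table": "invoice", "column": "total", "type": "DECIMAL(10,2)",
     "nullable": True, "pk": False, "references": None},
]


def build_ddl(catalog):
    """Wrap catalog entries in CREATE TABLE syntax, one statement per table."""
    # Group columns by table, keeping the order they appear in the catalog.
    tables = {}
    for entry in catalog:
        tables.setdefault(entry["table"], []).append(entry)

    statements = []
    for table, columns in tables.items():
        lines = []
        # Column definitions.
        for col in columns:
            null_sql = "NULL" if col["nullable"] else "NOT NULL"
            lines.append(f"    {col['column']} {col['type']} {null_sql}")
        # Primary key constraint, if any column is flagged as a key.
        pk_columns = [col["column"] for col in columns if col["pk"]]
        if pk_columns:
            lines.append(
                f"    CONSTRAINT pk_{table} PRIMARY KEY ({', '.join(pk_columns)})"
            )
        # Foreign key constraints.
        for col in columns:
            if col["references"]:
                ref_table, ref_column = col["references"]
                lines.append(
                    f"    CONSTRAINT fk_{table}_{col['column']} "
                    f"FOREIGN KEY ({col['column']}) "
                    f"REFERENCES {ref_table} ({ref_column})"
                )
        statements.append(f"CREATE TABLE {table} (\n" + ",\n".join(lines) + "\n);")
    return statements


statements = build_ddl(CATALOG)
print("\n\n".join(statements))
```

All of the thinking lives in `CATALOG`, which in the story took a month of team discussion. The function that gets the credit in a commit log is a few dozen lines of string formatting.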

None of those other people would get any credit for that work if all we did was look at code committed in the form of the final DDL. We didn’t even have a way to track anyone doing a code review to give them credit for doing even that part.

Another anecdote

In 2016, a client of mine struggled with the performance of one particular bit of business logic that drove a key feature in their product. They knew it should be faster, and several of their senior developers had spent quite a bit of time trying to improve it, but they still struggled to identify the reason it ran slow.

My toolbox for performance analysis was a little bigger than theirs, so I got cracking. Performance analysis takes time. There will be many red herrings. There will be many obvious things that save a little time but not a lot of time. There will be frustration, burnout, and a feeling HR is coming for you. There is nothing fun when you are in the middle of fixing performance issues. The reward only comes at the end.

By the morning of Day Three, I realized the performance data collected didn’t add up. If code is a tree, the total time all of the “leaves” took to run didn’t match, or even come close to, the total time for all the “branches” those leaves were attached to. It took until that third day to get my clue: the problem wasn’t an algorithm. It also wasn’t a missing database index or an index that needed to be regenerated.

The root cause? Entity Framework (EF)¹ being told to do too much. The code that queries the database returned an IQueryable object. The code that called that code did some manipulation (a loop) and passed IQueryable up to the code that called it. Add a few more layers of that and the core bit that orchestrated all of that business logic still worked with IQueryable objects. EF likes to defer execution of any database query that returns IQueryable as long as it can. It’s trying to understand the whole of what you are doing so that it does not need to make a lot of very small calls to the database. We call making a lot of small calls being “chatty,” and the back and forth can eat up a lot of time. EF assembles all those calls into one big query and most of the time does just fine. This time? It couldn’t figure out how to make an efficient query out of all those layers and ended up with, basically, nested code inside of nested code inside of nested code, and so on. It was sending THIRTY pages of SQL commands to get one result back.

My code commit to fix all this? Figuring out how to break that chain and tell EF to make a few calls instead of one. All I had to do to fix it was change one word. I changed IQueryable to IEnumerable in the right place to stop the deferred execution² and cut the command bloat down. A one-word code commit that had everyone dancing for joy because a process that took dozens of seconds now took less than one.
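For readers who have not run into deferred execution, here is a rough analogy in Python. It is not Entity Framework and not the client's code: `LazyQuery`, `layered_deferred`, `layered_materialized`, the `orders` table, and the sample rows are all hypothetical names invented for this sketch. It only illustrates the general shape of the problem: every layer that keeps composing on a lazy query object makes the eventual SQL statement bigger, while materializing the rows at one point lets the remaining layers work in memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyQuery:
    """Hypothetical deferred query: it only holds SQL text and runs nothing."""
    sql: str

    def where(self, condition: str) -> "LazyQuery":
        # Composing wraps the existing query in one more layer; nothing executes yet.
        return LazyQuery(f"SELECT * FROM ({self.sql}) AS q WHERE {condition}")


# Fake table contents standing in for rows already fetched from a database.
ROWS = [
    {"id": 1, "region": "EU", "total": 50},
    {"id": 2, "region": "EU", "total": 500},
    {"id": 3, "region": "US", "total": 900},
]


def layered_deferred(base: LazyQuery) -> LazyQuery:
    # Three "layers" of business logic, each passing the lazy object upward.
    query = base.where("region = 'EU'")
    query = query.where("total > 100")
    query = query.where("id > 0")
    return query


def layered_materialized(rows: list) -> list:
    # The same three layers, but operating on rows that were fetched once.
    rows = [r for r in rows if r["region"] == "EU"]
    rows = [r for r in rows if r["total"] > 100]
    rows = [r for r in rows if r["id"] > 0]
    return rows


base = LazyQuery("SELECT * FROM orders")
deferred_sql = layered_deferred(base).sql  # one statement, nested once per layer
simple_sql = base.sql                      # all the SQL needed when materializing early
result = layered_materialized(ROWS)

print(deferred_sql)
print(deferred_sql.count("SELECT"), "SELECTs deferred vs", simple_sql.count("SELECT"))
print(result)
```

Three trivial layers already produce a statement nested four deep; real business logic with loops and joins grows far faster. The trade-off cuts both ways, since materializing early can pull more rows than the final result needs, which is why the change has to go "in the right place."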

So what is productivity anyway?

In both cases, I spent some time and provided value. In case one, I spent a little time creating a lot of output precisely because I was the fingers on the keyboard for a whole team. That then accelerated the whole team. If I did it all by myself, it would have taken 4-5 months (more time than the 3 of us who put most of the time into the model spent) and the team would have been blocked. In the second case, others spent time, but it took someone with better tools to get to the bottom of the problem, and my “output” was pretty small: one word. No one cared it was one word; they cared that it worked.

The productivity that matters is the value provided, not the volume produced. How much code I wrote in either case is trivia. If management measured my productivity by raw output, they would be incentivizing me to not get work done. In the first case, I wouldn’t spend a Friday night finding a way to get the result sooner; I would drag it out for several weeks to make my average look better. In the second, the incentives would drive me to add a bunch of useless changes to the code to make my one-word change look bigger (and take more time to do it). In both cases, the outcome is worse for the team and the business.

As I mentioned above, on many occasions, a leader called on me to analyze team productivity. That involved looking at code commits, looking at bugs reported, looking at how much time someone is spending on development, and more. Interviews and discussions with developers, CIOs, CTOs, stakeholders, systems architects, and team leads played a crucial part in understanding the context of both the system and the team. So, who gets the most done when we measure code output? In most healthy organizations, it’s the mid-senior developers. Conversely, the “professional ghosts” (meaning those very few people who would rather do nothing) know their work will eventually be looked at, so they do the bare minimum to get by.

Who commits the least? The team leads and, often, the most skilled developers on the team. They’re helping everyone else! And they ought to.

If the most productive person is a team lead, that is a sign of a problem (Rule 59). Except in small teams, leads and other senior-most developers spending most of their time writing code³ and focusing on volume produced is a sign that they’re not spending their time understanding the business context, understanding the best design, making sure knowledge is spread around, and generally focusing on ways to make the team effective. There are often deep technical issues and organizational dysfunctions at work: a focus on “get it done now” instead of “get it done well” is common and often isn’t even coming from executive leadership. Going fast instead of going well leads to poor testing (Rule 14), mounting technical debt, and the team going slower and slower over time instead of faster and faster as the team grows its capabilities and the systems get better.

Conclusion - follow the money

What incentives are you creating when you measure individual coding “productivity?” You are saying “don’t think” and “don’t research.” You are also encouraging “roll your own” instead of looking for existing libraries. Why reuse when you can reproduce? When you tell me that individual productivity matters, you are telling me that the only thing that matters is how good I look, not how well the team performs or the organization performs. The message I hear, to turn the classic phrase⁴ on its head, is that “you are paying me for the code, not the solution.”

The result is that I’m going to spend weeks developing a calendar dropdown control for the website instead of spending a day or two evaluating which of the many off-the-shelf options (many free in one way or another) I should recommend that we use.

As professional technologists, it is the solution that matters, not the code. When team management becomes code counting, no matter how intelligent that code counting is, we send a very clear message that our developers are drones who should not ask questions, who should do instead of think, and who should expect to be laid off/downsized/RIFed at any moment. Is that the team you really want to work with?

Let your leaders manage the “ghosts” and focus on the team’s well-being. You will be happier, your bosses will be happier, and the team will be happier even if there are a few hard discussions along the way.

¹ Entity Framework is .NET’s standard data abstraction framework, not to be confused with .NET’s standard data access framework, ADO.NET, which is a lower level tool. My general feeling on Entity Framework is that it makes tedious (easy) work less tedious, but hard work harder. See my prior article Choosing Tools to Make Your Life Harder. I personally favor going to the School Zone on data access over the Specialist Sector.
² In this case, that change meant I didn’t need to update unit tests, change algorithms around, or anything else. The structure of the code didn’t change, just what data it actually had when. As I understand, they went back and did more cleanup later, but this got that work unblocked.
³ In light of the AWS and other AI marketing, I really do mean the time putting fingers to keyboard to write code. Not the time spent understanding, testing, or documenting anything else that really is part of software development.
⁴ The classic phrasing is some variation of “you are paying me for the solution, not the code.”