Engineers, Hours, Perks, and Pride

This started as a Google+ post about an article on getting top engineering talent and got way too long, so I’m posting here instead.

I wholeheartedly agree that 18-hour days are just not sustainable. They might work for a brand-new startup cranking out an initial release, understaffed and desperate to be first to market. At that stage, you can expect a small team to have the kind of passion and dedication it takes to put in those hours and give up their lives to build something new.

Once you’ve built it, though, the hours become an issue, and the playpen becomes a nuisance. You can’t expect people to work 18-hour days forever, or even 12-hour days. People far smarter than I have posited that the most productive time an intellectual worker can put in on a regular basis is 4 to 6 hours per day; after that, productivity and effectiveness plummet, and it only gets worse the longer it goes on.

Foosball isn’t a magical sigil protecting engineers from burn-out. Paintball with your coworkers isn’t a substitute for drinks with your friends or a night in with your family. An in-house chef sounds great on paper, until you realize that the only reason they’d need to provide breakfast and dinner is if you’re expected to be there basically every waking moment of your day.

Burn-out isn’t the only concern, either. Engineering is both an art and a science, and like any art, it requires inspiration. Inspiration, in turn, requires experience. The same experience, day-in, day-out – interacting with the same people, in the same place, doing the same things – leaves one’s mind stale, devoid of inspiration. Developers get tunnel-vision, and stop bringing new ideas to the table, because they have no source for them. Thinking outside the box isn’t possible if you haven’t been outside the box in months.

Give your people free coffee. Give them lunch. Give them great benefits. Pay them well. Treat them with dignity and respect. Let them go home and have lives. Let them get the most out of their day, both at work and at home. You’ll keep people longer, and those people will be more productive while they’re there. And you’ll attract more mature engineers, who are more likely to stick around rather than hopping to the next hip startup as soon as the mood strikes them.

There’s a certain pride in being up until sunrise cranking out code. There’s a certain macho attitude, a one-upmanship, a competition to see who worked the longest and got the least sleep and still came back the next morning. I worked from 8am until 4am yesterday, and I’m here, see how tough I am? It’s the geek’s equivalent to fitness nuts talking about their morning 10-mile run. The human ego balloons when given the opportunity to endure self-inflicted tortures.

But I’m inclined to prefer an engineer who takes pride in the output, not the struggle to achieve it. I want someone who is stoked that they achieved so much progress, and still left the office at four in the afternoon. Are they slackers, compared to the guy who stayed another twelve hours, glued to his desk? Not if the output is there. It’s the product that matters, and if the product is good, and gets done in time, then I’d rather have the engineer that can get it done without killing themselves in the process.

“I did this really cool thing! I had to work late into the night, but caffeine is all I need to keep me going. I kept having to put in hacks and work-arounds, but the important thing is that it’s done and it works. I’m a coding GOD!” Your typical young, proud engineer. They’re proud of the battle, not the victory; they’re proud of how difficult it was.

“I did this really cool thing! Because I had set myself up well with past code, it was a breeze. I was amazed how little time it took. I’m a coding GOD!” That’s my kind of developer. That’s pride I can agree with. They’re proud because of how easy it was.

This might sound like an unfair comparison at first, but think about it. When you’re on a 20-hour coding bender, you aren’t writing your best code. You’re frantically trying to write working code, because you’re trying to get it done as fast as you can. Every cut corner, every hack, every workaround makes the next task take that much longer. Long hours breed technical debt, and technical debt slows development, and slow development demands longer hours. It’s a vicious cycle that can be extremely difficult to escape, especially once it’s been institutionalized and turned into a badge of honor.

My Personal Project Workflow/Toolset

I do a lot of side projects, and my personal workflow and tooling is something that’s constantly evolving. Right now, it looks something like this:

  • Prognosticator for tracking features/improvements, measuring the iceberg, and tracking progress
  • WorkFlowy for tracking non-development tasks (the most recent addition to the toolset)
  • Trac for project documentation, and theoretically for defect tracking, though I’ve not been good about entering defects in Trac recently; it doesn’t seem worth the effort on a one-person project, though with multiple people I think it would be a must
  • Trello for cross-cutting all the above and indicating what’s next/in progress/recently completed, and for quickly jotting down ideas/defects. Most of the defect tracking actually goes in here on one-man projects right now. This is a lot of duplication and the main source of waste in my current process.
  • Bitbucket for source control (I also use Atlassian’s excellent SourceTree as a Git/Hg client.)
It’s been working well for me; the only issues are duplication between the tools and my failure to use Trac consistently for defect tracking. What keeps me in Trello is how quick and easy it is to add items, and the fact that I’m using it as a catch-all: I can put a defect, an idea, or a task into it in a couple of seconds. I just have to replicate it to the appropriate place later, which is the problem.

I think the issue boils down to being torn between convenience and specialization: a centralized repository for “stuff to be done” (Trello) versus dedicated repositories tailored to each type of thing to be done (Prognosticator, Trac, and WorkFlowy). Trello is excellent for jotting something down quickly, but lacks the purpose-built utility of the other tools.

I think what I’ll end up doing is creating a “whiteboard” list in WorkFlowy, and using that instead of Trello to jot down quick notes when I don’t have the time to use the individual tools; then I can copy from there to the other tools when I need to. That will allow me to cut Trello down to basically being a Kanban board.

Pragmatic Prioritization

The typical release scheduling process works something like this:

  1. Stakeholders build a backlog of features they’d like to see in the product eventually.
  2. The stakeholders decide among themselves the relative priority of the features in the backlog.
  3. The development team estimates the development time for each feature.
  4. The stakeholders set a target feature list and ship date based on the priorities and estimates.
The problem here is primarily in step 2; this step tends to involve a lot of discussion bordering on arguing bordering on in-fighting. Priorities are set at best based on a sense of relative importance, at worst based on emotional attachment. Business value is a vague and nebulous consideration at most.
I propose a new way of looking at feature priorities:

  1. Stakeholders build a backlog of features they’d like to see in the product eventually.
  2. The stakeholders estimate the business value of each feature in the backlog.
  3. The development team estimates the development time for each feature.
  4. The stakeholders set a target feature list and ship date based on the projected return of each feature – i.e., the estimated business value divided by the estimated development time.
This turns a subjective assessment of relative priorities into an objective estimate of business value, which is used to determine a projected return on investment for each feature. This can then be used to objectively prioritize features and schedule releases.
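
The arithmetic in step 4 is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not how any particular tool implements it; the `Feature` class, the feature names, and the numbers are all made up, and it assumes business value and development time are each expressed in a single consistent unit.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    business_value: float  # stakeholder estimate (hypothetical unit: dollars)
    dev_days: float        # development team estimate, assumed to be > 0

    @property
    def projected_return(self) -> float:
        # Estimated business value divided by estimated development time.
        return self.business_value / self.dev_days


def prioritize(features):
    """Return features ordered by projected return, highest first."""
    return sorted(features, key=lambda f: f.projected_return, reverse=True)


# Hypothetical backlog, purely for illustration.
backlog = [
    Feature("single sign-on", business_value=20000, dev_days=20),
    Feature("dark mode", business_value=1500, dev_days=3),
    Feature("export to CSV", business_value=5000, dev_days=2),
]

for feature in prioritize(backlog):
    print(f"{feature.name}: {feature.projected_return:.0f} per day")
```

Note that the biggest-ticket feature does not come out on top: the small, cheap feature with a high value per day of effort does, which is exactly the kind of result a gut-feel priority discussion tends to miss.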
I’ve been using this workflow recently for one of my upcoming projects, and I feel like it’s helped me determine feature priorities more objectively and taken a lot of the fuzziness and hand-waving out of the equation.

Shameless self-promotion: Pragmatic prioritization is a feature of my project scheduling and estimation tool, Rogue Prognosticator.

Assumptions and Unit Tests

I’ve written previously on assumptions and how they affect software development. Taking this as a foundation, the value proposition of unit testing becomes much more apparent: it offers objective reassurance that certain assumptions are valid. By mechanically testing components for correctness, you’re allowing yourself the freedom to safely assume that code which passes its tests is highly unlikely to be the cause of an issue, so long as there is a test in place for the behavior you’re using.

This can be a double-edged sword: it’s important to remember that a passing test is not a guarantee. Tests are written by developers, and developers are fallible. Test cases may not exercise the behavior in precisely the same way as the code you’re troubleshooting. Test cases may even be missing for the particular scenario you’re operating under.
By offering a solid foundation of trustworthy assumptions, along with empirical evidence of their validity, unit tests let you eliminate many possible points of failure while troubleshooting, so you can focus on what remains. You must still take steps to verify that you do have test coverage for the situation you’re looking at, in order to have confidence in the test results. If you find missing coverage, you can add a test to fill the gap; this will either pass, eliminating another possible point of failure, or it will fail, indicating a probable source of the issue.

Just don’t take unit test results as gospel; tests must be maintained just like any other code, and just like any other code, they’re capable of containing errors and oversights. Trust the results, but not the tests, and learn the difference between the two: results will reliably tell you whether the test you’ve written passed or failed. It is the test, however, that executes the code and judges passing or failing. The execution may not cover everything you need, and the judgement may be incorrect, or may not check every aspect of the result.
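
To make the distinction between results and tests concrete, here is a small contrived sketch. The `apply_discount` function and its test are hypothetical, invented for this example; the point is that the result ("the test passed") is reliable, while the test itself never exercises the case that is actually broken.

```python
def apply_discount(price, percent):
    """Hypothetical function under test; contains a deliberate bug."""
    # Bug: the discount is never capped, so percent > 100 gives a negative price.
    return price * (1 - percent / 100)


def test_apply_discount():
    # Only exercises the happy path; says nothing about percent > 100.
    assert apply_discount(200, 25) == 150


# The test passes, and that result can be trusted for what it checks.
test_apply_discount()

# The uncovered edge case still misbehaves: this prints a negative price.
print(apply_discount(200, 150))
```

If you were troubleshooting a negative total, the green test would tempt you to rule this function out. Adding a test for the over-100 case would fail and point straight at the source of the issue.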

Teaching a Developer to Fish

I write a lot about development philosophy here, and very little about technique. There are reasons for this, and I’d like to explain.

In my experience, often what separates an easy problem from an intractable one is method and mindset. How you approach a problem tends to be more important than the implementation you end up devising to solve it.

Let’s say you’re given the task of designing a recommendation engine – people like you were interested in X, Y, and Z. Clearly this is an algorithmic problem, and a relatively difficult one at that. How do you solve it? 

The algorithm itself isn’t significant; as a developer, the algorithm is your output. The process you use to achieve the desired output is what determines how successful you’ll be. I could talk about an algorithm I wrote, but that’s giving a man a fish. I’d much rather teach a man to fish.

So how do you fish, as it were, for the perfect algorithm? You follow solid practices, you iterate, and you measure. That means you start with a couple of prototypes, you measure the results, you whittle down the candidate solutions until you have a best candidate, and then you refine it until it’s as good as it can get. Then you deploy it to production, you continue to measure, and you continue to refine it. If you code well, you can A/B test multiple potential algorithms, in production, and compare the results.
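
As a toy illustration of "prototype, then measure", here is a sketch that scores two deliberately naive candidate recommenders against the same held-out data. Everything in it is an assumption made for the example: the `history` data, both candidate functions, and the choice of hit rate as the metric. A real evaluation would use far more data and would keep the held-out items out of what the candidates can see.

```python
from collections import Counter

# Hypothetical history: each user's viewed items; the last one is held out.
history = {
    "ann": ["x", "y", "z"],
    "bob": ["x", "y", "w"],
    "cat": ["y", "z", "x"],
    "dan": ["x", "z", "y"],
}


def most_popular(seen, others):
    """Candidate A: recommend the most common item the user hasn't seen."""
    counts = Counter(i for items in others for i in items if i not in seen)
    return counts.most_common(1)[0][0] if counts else None


def first_unseen(seen, others):
    """Candidate B: recommend the first unseen item encountered."""
    for items in others:
        for i in items:
            if i not in seen:
                return i
    return None


def hit_rate(candidate):
    """Fraction of users whose held-out item the candidate recommends."""
    hits = 0
    for user, items in history.items():
        seen, held_out = items[:-1], items[-1]
        others = [v for u, v in history.items() if u != user]
        hits += candidate(seen, others) == held_out
    return hits / len(history)


# Measure every candidate the same way, then keep the best one to refine.
scores = {c.__name__: hit_rate(c) for c in (most_popular, first_unseen)}
best = max(scores, key=scores.get)
print(scores, "-> keep", best)
```

The candidates themselves are throwaway; what carries over to real work is the harness that lets you compare any number of them on equal terms.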

How do you fish for a fix to a defect? You follow solid practices, you iterate, and you measure. You start with visual inspection, checking for code quality and doing light refactoring to simplify the code, eliminate points of failure, and narrow down the possibilities. Often this alone will bring the root cause of the defect to the surface quickly, or even solve it outright. If it doesn’t, you add logging, and you watch the results as you recreate the error in different ways to assess the boundaries of the defect. If this is an edge case, what exactly defines the “edge” that’s affected? What happens during each step of execution when it happens? Which code is executing and which code isn’t? What parameters are being passed around?

In my experience, logging tends to be a far more effective debugging tool than a step-wise debugger in most cases. With a strong logging framework, you can leave your logging statements in place in production with negligible performance impact (with debug logging disabled), and fine-grained controls let you turn up verbosity for the code you’re inspecting without turning all logging on and destroying the signal-to-noise ratio of your output.
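
As one example of what those fine-grained controls can look like, here is a minimal sketch using Python’s standard `logging` module. The logger names `app.billing` and `app.search` and the `charge` function are hypothetical; other logging frameworks offer similar per-logger levels.

```python
import logging

# Default for the whole application: only warnings and above are emitted.
logging.basicConfig(
    level=logging.WARNING,
    format="%(name)s %(levelname)s %(message)s",
)

# Hypothetical logger names for two parts of an application.
billing_log = logging.getLogger("app.billing")
search_log = logging.getLogger("app.search")

# Turn up verbosity only for the code under inspection.
billing_log.setLevel(logging.DEBUG)


def charge(amount):
    # %-style arguments are only formatted if the message will be emitted,
    # which keeps disabled debug statements cheap.
    billing_log.debug("charging amount=%s", amount)
    return amount


charge(42)                             # emitted: app.billing is at DEBUG
search_log.debug("this stays silent")  # filtered: app.search inherits WARNING
```

The debug statements stay in the code permanently; changing one logger’s level is all it takes to get detail from the area you’re investigating while everything else stays quiet.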

You follow solid practices, you iterate, and you measure. If you use the right process, with the right mindset, you’ll wind up at the right solution.

That’s why I tend to wax philosophical instead of writing about concrete solutions I’ve implemented. Chances are I wrote the solution to my problem, not your problem; and besides, I’d much rather teach a man to fish than give a man a fish.

Code Patterns as Microevolution

Code patterns abide by survival of the fittest, within the gene pool of the code base. Patterns reproduce through repetition, sometimes with small mutations along the way. Patterns can even mate, after a fashion, when developers combine them, taking elements of each to form a new whole. This is the natural evolution of source code.

The first step to taming a code base is to realize the importance of assessing fitness and taking control over what patterns are permitted or encouraged to continue to reproduce. Code reviews are your opportunity to thin the herd, to cull the weak, and allow the strong to flourish.


Team meetings, internal discussions, training sessions, and learning investments are then your opportunity to improve both the quality of the new patterns and mutations that emerge and the group’s ability to effectively manage the evolution of your source, to correctly identify the weak and the strong, and to have a lasting impact on the overall quality of the product.

If you think about it, the “broken windows” problem could also be viewed as bad genes being allowed to perpetuate. As the bad patterns continue to reproduce, their number grows, and so does their impact on the overall gene pool of your code. Given the opportunity, you want to do everything you can to make sure that it’s the good code that’s continuing to live on, not the bad.

Consider a new developer joining your project. A new developer will look to existing code as an example to learn from, and as a template for their own work on the project, perpetuating the “genes” already established. That being the case, it seems imperative that you make sure those genes are good ones.

They will also bring their own ideas and perspectives to the process, establishing new patterns and mutating existing ones, bringing new blood into the gene pool. This sort of cross-breeding is tremendously helpful to the overall health of the “code population” – but only if the new blood is healthy, which is why strong hiring practices are so critical.

Building a Foundation

It’s been said that pharmaceutical companies produce drugs for pennies per pill – except the first pill, which costs millions. Things aren’t so different in the land of software development: the first usage of some new functionality might take hours, building the foundation and related pieces. But it could be re-used a hundred times trivially, and usually expanded or modified with little effort as well (assuming it was well-written to start with).

This is precisely what you should be aiming for: take the time to build a foundation that will turn complex tasks into trivial ones as you progress. This is the main purpose behind design concepts like the single responsibility principle, the Hollywood principle, encapsulation, DRY, and so on.

This isn’t to be confused with big upfront design; in fact, it’s especially important to keep these concepts in mind in an agile process, where you’re building the architecture as you go. It can be tempting to just hack together what you need at the moment. That’s exactly what you should be doing for a prototype, but not for real development. For lasting functionality, you should assemble a foundation to support the functionality you’re adding now, and similar functionality in the future.

It can be difficult to balance this against YAGNI – you don’t want to build what you don’t need, but you want to build what you do need in such a way that it will be reusable. You want to save yourself time in the future, without wasting time now.

To achieve a perfect balance would require an extraordinary fortune teller, of course. Experience will help you get better at determining what foundation will be helpful, though. The more experience you have and the more projects you work on, the better sense you’ll have of what can be done now to help out future you.