I’ve written previously on assumptions and how they affect software development. Taking this as a foundation, the value proposition of unit testing becomes much more apparent: it offers a subjective reassurance that certain assumptions are valid. By mechanically testing components for correctness, you’re allowing yourself the freedom to safely assume that code which passes its tests is highly unlikely to be the cause of an issue, so long as there is a test in place for the behavior you’re using.
Teaching a Developer to Fish
I write a lot about development philosophy here, and very little about technique. There are reasons for this, and I’d like to explain.
In my experience, often what separates an easy problem from an intractable one is method and mindset. How you approach a problem tends to be more important than the implementation you end up devising to solve it.
Let’s say you’re given the task of designing a recommendation engine – people like you were interested in X, Y, and Z. Clearly this is an algorithmic problem, and a relatively difficult one at that. How do you solve it?
The algorithm itself isn’t significant; as a developer, the algorithm is your output. The process you use to achieve the desired output is what determines how successful you’ll be. I could talk about an algorithm I wrote, but that’s giving a man a fish. I’d much rather teach a man to fish.
So how do you fish, as it were, for the perfect algorithm? You follow solid practices, you iterate, and you measure. That means you start with a couple of prototypes, you measure the results, you whittle down the candidate solutions until you have a best candidate, and then you refine it until it’s as good as it can get. Then you deploy it to production, you continue to measure, and you continue to refine it. If you code well, you can A/B test multiple potential algorithms, in production, and compare the results.
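To make that last point concrete, here is a minimal sketch, in Python, of what the A/B plumbing might look like; the candidate algorithms and the metric being recorded are placeholders, not a real recommendation engine:

```python
import random

def recommend_by_similarity(user_id):
    # Placeholder for one prototype algorithm.
    return ["item-1", "item-2"]

def recommend_by_popularity(user_id):
    # Placeholder for a competing prototype.
    return ["item-3"]

CANDIDATES = {
    "similarity-v1": recommend_by_similarity,
    "popularity-v1": recommend_by_popularity,
}

def recommend(user_id, metrics):
    """Route each request to a random candidate and record which algorithm
    produced the result, so outcomes (clicks, purchases) can be compared."""
    name = random.choice(list(CANDIDATES))
    results = CANDIDATES[name](user_id)
    metrics.append({"algorithm": name, "user": user_id, "returned": len(results)})
    return results
```

The point isn't the routing code itself; it's that every result is tagged with the algorithm that produced it, so the measurement step has something to measure.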
How do you fish for a fix to a defect? You follow solid practices, you iterate, and you measure. You start with visual inspection, checking for code quality and doing light refactoring to simplify the code and eliminate points of failure, narrowing down the possibilities. Often this alone will bring the root cause of the defect to the surface quickly, or even solve it outright. If it doesn’t, you add logging and watch the results as you recreate the error, trying to recreate it in different ways to assess the boundaries of the defect. If it’s an edge case, what exactly defines the “edge” that’s affected? What happens during each step of execution when it occurs? Which code is executing and which code isn’t? What parameters are being passed around?
In my experience, logging tends to be a far more effective debugging tool than a step-wise debugger in most cases. With a strong logging framework, you can leave your logging statements in place in production with negligible performance impact (as long as debug logging is disabled), and fine-grained controls let you turn up verbosity for just the code you’re inspecting without turning all logging on and destroying the signal-to-noise ratio of your logging output.
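For instance, with Python’s standard logging module (used here purely for illustration; the module and logger names are hypothetical), debug statements can stay in the code permanently while verbosity is raised for a single area under investigation:

```python
import logging

# Module code keeps its debug statements in place permanently.
logger = logging.getLogger("billing.invoices")

def apply_credit(invoice_id, amount):
    logger.debug("applying credit of %s to invoice %s", amount, invoice_id)
    # ... the actual work would happen here ...

# Production default: warnings and above only, so the debug calls above
# cost little more than a level check.
logging.basicConfig(level=logging.WARNING)

# While investigating a defect in this one area, raise verbosity for just
# this logger instead of flooding the output with every module's noise.
logging.getLogger("billing.invoices").setLevel(logging.DEBUG)

apply_credit("INV-1001", 25)  # emits its debug line; other modules stay quiet
```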
You follow solid practices, you iterate, and you measure. If you use the right process, with the right mindset, you’ll wind up at the right solution.
That’s why I tend to wax philosophical instead of writing about concrete solutions I’ve implemented. Chances are I wrote the solution to my problem, not your problem; and besides, I’d much rather teach a man to fish than give a man a fish.
Code Patterns as Microevolution
The first step to taming a code base is to realize the importance of assessing fitness and taking control over what patterns are permitted or encouraged to continue to reproduce. Code reviews are your opportunity to thin the herd, to cull the weak, and allow the strong to flourish.
If you think about it, the “broken windows” problem could also be viewed as bad genes being allowed to perpetuate. As the bad patterns continue to reproduce, their number grows, and so does their impact on the overall gene pool of your code. Given the opportunity, you want to do everything you can to make sure that it’s the good code that’s continuing to live on, not the bad.
Consider a new developer joining your project. A new developer will look to existing code as an example to learn from, and as a template for their own work on the project, perpetuating the “genes” already established. That being the case, it seems imperative that you make sure those genes are good ones.
They will also bring their own ideas and perspectives to the process, establishing new patterns and mutating existing ones, bringing new blood into the gene pool. This sort of cross-breeding is tremendously helpful to the overall health of the “code population” – but only if the new blood is healthy, which is why strong hiring practices are so critical.
Building a Foundation
This is precisely what you should be aiming for: take the time to build a foundation that will turn complex tasks into trivial ones as you progress. This is the main purpose behind design concepts like the single responsibility principle, the Hollywood principle, encapsulation, DRY, and so on.
This isn’t to be confused with big upfront design; in fact, it’s especially important to keep these concepts in mind in an agile process, where you’re building the architecture as you go. It can be tempting to just hack together what you need at the moment. That’s exactly what you should be doing for a prototype, but not for real development. For lasting functionality, you should assemble a foundation that supports the functionality you’re adding now and similar functionality in the future.
It can be difficult to balance this against YAGNI – you don’t want to build what you don’t need, but you want to build what you do need in such a way that it will be reusable. You want to save yourself time in the future, without wasting time now.
To achieve a perfect balance would require an extraordinary fortune teller, of course. Experience will help you get better at determining what foundation will be helpful, though. The more experience you have and the more projects you work on, the better sense you’ll have of what can be done now to help out future you.
My Take on "Collective Ownership"/"Everyone is an Architect"
I love the idea of “collective ownership” in a development project. I love the idea that in a development team, “everyone is an architect”. My problem is with the cut-and-dried “Agile” definition of these concepts.
The Importance of Logging
- Block out the overall process, step by step, in comments.
- For any complex step (more than five or ten lines of code), replace the comment with a clearly named method or function call, and create a stub method/function.
- Replace comments with equivalent logging statements.
- Implement functionality.
- Give all functions, methods, classes, parameters, properties, and variables clear, concise names, so that the code ends up in some semblance of readable English.
- Use thorough sanity checking, by means of assertions or simple if blocks. When using if blocks, include logging for any failed checks, including what was expected and what was found. These should be warnings.
- Include logging in any error/exception handling code. These should be errors if recoverable, or fatal if not. This is all too often the only logging a developer includes!
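Putting those steps together, here is a minimal sketch of how the finished code might read; the task, names, and checks are hypothetical, chosen only to show the shape:

```python
import logging

logger = logging.getLogger(__name__)

def import_orders(path):
    logger.debug("reading orders from %s", path)
    orders = read_orders(path)

    # Sanity check: an empty file isn't fatal, but it's probably not intended.
    if not orders:
        logger.warning("expected at least one order in %s, found none", path)

    logger.debug("validating %d orders", len(orders))
    valid = [order for order in orders if validate_order(order)]

    logger.debug("saving %d valid orders", len(valid))
    try:
        save_orders(valid)
    except OSError:
        # Error handling must log too, but it should never be the only logging.
        logger.error("failed to save orders read from %s", path, exc_info=True)
        raise

def read_orders(path):
    # Stub grown from a comment; named so the caller reads like plain English.
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]

def validate_order(order):
    return bool(order)

def save_orders(orders):
    ...  # persistence elided in this sketch
```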
The Problem with Responsive Design
A huge problem I see with responsive/adaptive design today is that, all too often, it treats “small viewport” and “mobile” as being synonymous, when the two concepts are orthogonal. A mobile device can have a high-resolution display, just as a desktop user can have a small display, or just a small browser window.
Responsive designs need to target viewport size, and nothing more. It’s not mobile; it’s a small display. Repeat that to yourself about a thousand times.
Sanity Checks: Assumptions and Expectations
Assertions and unit tests are all well and good, but they’re too narrow-minded in my eyes. Unit tests are great for, well, testing small units of code to ensure they meet the basic requirements of a software contract – maybe a couple of typical cases, a couple of edge cases, and then additional cases as bugs arise and new test cases are created for them. No matter how many cases you create, however, you’ll never have a test case for every possible scenario.
Assertions are excellent for testing in situ; you can ensure that unacceptable values aren’t given to or by a piece of code, even in production (though there is a performance penalty to enabling assertions in production, of course). I think assertions are excellent, but not specific enough: any assertion that fails is automatically a fatal error, which is great, unless it’s not really a fatal error.
That’s where the concepts of assumptions and expectations come in. What assertions and unit tests really do is test assumptions and expectations. A unit test says “does this code behave correctly when given this data, all assumptions considered?” An assertion says “this code assumes this thing, and will not behave correctly if it gets another, so throw an error.”
When documenting an API, it’s important to document assumptions and expectations, so users of the API know how to work with your code. Before I go any further, let me define what I mean by these very similar terms: to me, code that assumes something operates as if its assumptions are correct, and will likely fail if its assumptions turn out to be incorrect. Code that expects something operates as if its expectations are met, but will likely still operate correctly even if they aren’t. It’s not guaranteed to work, or guaranteed to fail; it’s likely to work, but someone should probably know about it and look into it.
Therein lies the rub: these are basically two types of assertions, one fatal, one not. What we need is an assertion framework that allows for warning-level assertion failures. What’s more, we need an assertion framework that is performant enough to be regularly enabled in production.
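As a rough sketch of the idea in Python (the assume() and expect() names are invented here for illustration, not an existing library), it could be as simple as:

```python
import logging

logger = logging.getLogger("sanity")

def assume(condition, message):
    """Fatal: the code cannot behave correctly if this fails, so fail fast."""
    if not condition:
        logger.critical("failed assumption: %s", message)
        raise AssertionError(message)

def expect(condition, message):
    """Non-fatal: the code should still work, but someone should investigate."""
    if not condition:
        logger.warning("failed expectation: %s", message)
```

Both helpers cost little more than a boolean check when they pass, which is what makes leaving them enabled in production feasible.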
So, any code that’s happily humming along in production, that says:
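(as a sketch, using the hypothetical assume() helper above inside an invented discount function)

```python
def apply_discount(price, percentage):
    # Assumption: percentage is between 0 and 100. Outside that range this
    # calculation is meaningless, so fail fast, even in production.
    assume(0 <= percentage <= 100, f"percentage out of range: {percentage}")
    return price * (1 - percentage / 100)
```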
will fail immediately if percentage is outside those bounds. It’s assuming that percentage is between zero and one hundred, and if it assumes wrong, it will likely fail. Since it’s always better to fail fast, any case where percentage is outside that range should trigger a fatal error – preferably even if it’s running in production.
On the other hand, code that says:
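(again as a sketch, using the hypothetical expect() helper above inside an invented export function)

```python
def export_rows(rows):
    numRows = len(rows)
    # Expectation: under a thousand rows. More still works, but it's slower
    # and may mean something is off upstream, so warn and carry on.
    expect(numRows <= 1000, f"expected at most 1000 rows, got {numRows}")
    return "\n".join(",".join(str(field) for field in row) for row in rows)
```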
will trigger a warning if numRows is over a thousand. It expects numRows to be under a thousand; if it isn’t, it can still complete correctly, but it may take longer than normal, or use more memory than normal, or it may simply be that if it got more rows than that, something may be amiss with the query that got the rows or the dataset the rows came from originally. It’s not a critical failure, but it’s cause for investigation.
Any assumption or expectation that fails should of course be automatically and immediately reported to the development team for investigation. Naturally a failed assumption, being fatal, should take priority over a failed expectation, which is recoverable.
This not only provides greater flexibility than a simple assertion framework, it also provides more explicit self-documenting code.
Be Maxwell’s Demon
Source code tends to follow the second law of thermodynamics, with some small differences. In software, as in thermodynamics, systems tend toward entropy: as you continue to develop an application, the source will increase in complexity. In software, as well as in thermodynamics, connected systems tend toward equilibrium: in development, this is known as the “broken windows” theory, and is generally considered to mean that bad code begets bad code. People often discount the fact that good code also begets good code, but this effect is often hidden by the fact that the overall system, as mentioned earlier, tends toward entropy. That means that the effect of broken windows is magnified, and the effect of good examples is diminished.
In thermodynamics, Maxwell’s Demon is impossible in reality; it is purely a thought experiment. In software development, however, we’re in luck: any developer can play the demon, and should, at every available opportunity.
Maxwell’s demon stands between two connected systems, defeating the second law of thermodynamics by selectively allowing less-energetic particles through only in one direction, and more-energetic particles through only in the other direction, causing the two systems to tend toward opposite ends of the spectrum, rather than naturally tending toward entropy.
By doing peer reviews, you’re doing exactly that; you’re reducing the natural entropy in the system and preventing it from reaching its natural equilibrium by only letting the good code through, and keeping the bad code out. Over time, rather than tending toward a system where all code is average, you tend toward a system where all code is at the lowest end of the entropic spectrum.
Refactoring serves a similar, but more active role; rather than simply “only letting the good code through”, you’re actively seeking out the worse code and bringing it to a level that makes it acceptable to the demon. In effect, you’re reducing the overall entropy of the system.
If you combine these two effects, you can achieve clean, efficient, effective source. If your review process only allows code through that is as good or better than the average, and your refactoring process is constantly improving the average, then your final code will, over time, tend toward excellence.
Without a demon, any project will be on a continuous slide toward greater and greater entropy. If you’re on a development project, and it doesn’t have a demon, it needs one. Why not you?
Wait, wait, back up. When I think of a sprint, I think short and fast. That’s what sprinting means. You can’t sprint for a month straight; you’ll die. That’s a marathon, not a sprint.
There are numerous coding competitions out there. Generally, you get around 48 hours, give or take, to build an entire, working, functional game or application. Think about that. You get two days to build a complete piece of software from scratch. Now that’s what I call sprinting.
Of course, a 48-hour push is a lot to ask for on a regular basis. Your application isn’t in a competition; this is the real world, and you need to get real work done on an ongoing basis. You can’t expect your developers to camp out in sleeping bags under their desks. But that doesn’t mean turning a sprint into a marathon.
The key is instilling urgency while moderating burnout. This is entirely achievable, and can even make development more fun and engaging for the whole team. Since the term sprint has already been thoroughly corrupted, I’ll use the term “dash”. Consider this weekly schedule:
- Monday: Demo last week’s accomplishments for stakeholders, and plan this week’s dash. This is a good day to schedule any unavoidable meetings.
- Tuesday and Wednesday: your 48 hours to get it done and working. These are crunch days, and they will probably be pretty exhausting. These don’t need to be 18-hour days, but 10 hours wouldn’t be unreasonable. Let people get in the zone and stay there as long as they can.
- Thursday: Refactoring and peer reviews. After a run, athletes don’t just take a seat and rest; they slow to a jog, then a walk. They stretch. They cool off slowly. Developers, as mental athletes, should do the same.
- Friday: Testing. QA goes through the application with a fine-toothed comb. The developers are browsing the web, playing games, reading books, propping their feet up, and generally being lazy bums, with one exception: they’re available at a moment’s notice if QA has any questions or finds any issues. Friday is a good day for your development book club to meet.
By the end of the week, your application should be ready again for Monday’s demo, and by Tuesday, everyone should be well-rested and ready for the next dash.
Think about it, though. Developers aren’t factory workers; they can’t churn out X lines of code per hour, 40 hours per week. That’s not how it works. A really talented developer might achieve 5 or 6 truly productive hours per day, but at that rate, they’ll rapidly burn out. 4 hours a day might be sustainable for longer. Now, mind you, in those four hours a day, they’ll get more done, better, with fewer defects, than an army of incompetent developers could do in a whole week. But the point stands: you can’t run your brain at maximum capacity eight hours straight, five days a week. You just can’t – not for long, anyway.
The solution is to plan to push yourself, and to plan to relax, and to keep the cycle going to maximize the effectiveness of those productive hours. It’s also crucial not to discount refactoring as unproductive; it sets up the following weeks’ work and reduces the effort required for development over the rest of the life of the application. It’s a critical investment in the future.
Spending a third of your development time on refactoring may seem excessive, and if it were that simple, I’d agree. But if you really push yourself for two days, you can get a lot done – and write a lot of code to be reviewed and refactored. In one day of refactoring, you can learn a lot, get important work done, and still start to cool off from the big dash.
That lazy Friday really lets you relax, improve your craft, and get your product ready for next week, when you get to do it all over again.