My Take on "Collective Ownership"/"Everyone is an Architect"

I love the idea of “collective ownership” in a development project. I love the idea that in a development team, “everyone is an architect”. My problem is with the cut-and-dried “Agile” definition of these concepts.

What I’ve been reading lately is a definition of “collective ownership” that revolves around the idea of distributing responsibility, primarily to shift the focus away from finger-pointing and blaming. A defect isn’t “your fault”, it’s “our fault”, and “we need to fix it.” That’s all well and good, but distributing blame isn’t the same as distributing ownership; and ignoring the source of an issue is a blatant mistake.
The latter point first: identifying the source of an issue is important. I see no need for blame, or calling people out, and certainly no point in trying to use defects as a hard metric in performance analysis. However, a development team isn’t a factory; it’s a group of individuals who are constantly continuing their education and honing their craft, and in that endeavor they need the help of their peers and managers to identify their weaknesses so they know what to focus on. “Finding the source of an issue” isn’t about placing blame or reprimanding someone; it’s about providing a learning opportunity so that a team member can improve, and the team as a whole can improve through the continuing education of each member.
In regard to distributing ownership, it’s all too rare to see it discussed in a positive way. I see plenty of people writing about eliminating blame, but very few speaking of a team wherein every member looks at the entire code base and says “I wrote that.” And why should they? They didn’t write it alone, so they can’t make that claim. For the product, they can say “I had a hand in that,” surely. But it’s unlikely they feel like they had a hand in the development of every component.
That brings us around to the idea that “everyone is an architect.” In the Agile sense, this is generally taken to mean that every developer is given relatively free rein to architect the component they’re working on at any given moment, without bowing down to The Architect for their product. I like this idea, in a sense – I’m all for every developer doing their own prototyping, their own architecture, learning their own lessons, and writing their own code. Up to a point.
There is a level of architecture that the entire team must agree on. This is where many teams, even Agile teams, tend to fall back on The Architect to keep track of The Big Picture and ensure that All The Pieces Fit Together. This is clearly the opposite of “everyone is an architect”. So where’s the middle ground?
If a project requires some level of architecture that everyone has to agree on – language, platform, database, ORM, package structure, whatever applies to a given situation – then the only way to have everyone be an architect is design by committee. Panning design by committee has become a cliché at this point, but it has its uses, and I feel this is one of them.
In order to achieve collective ownership, you must have everyone be an architect. In order for everyone to be an architect, and feel like they gave their input into The Product as a whole – or at least had the opportunity to do so – you must make architectural decisions into group discussions. People won’t always agree, and that’s where the project manager comes in; as a not-an-architect, they should have no bias and no vested interest in what choices are made, only that some decision is made on each issue that requires consideration. Their only job in architectural discussions is to help the group reach a consensus or, barring that, a firm decision.
This is where things too often break down. A senior developer or two – or maybe a project manager with development experience – become de facto architects. They make the calls and pass down their decrees, and quickly everyone learns that if they have an architecture question, they shouldn’t try to make their own decision, they shouldn’t pose it to the group in a meeting, they should ask The Guy, the architect-pro-tem. Stand-up meetings sink into the doldrums of pointless status updates, and discussion of architecture is left out entirely.
Luckily, every team member can change this. Rather than asking The Guy when a key decision comes up, ask The Group. Even better, throw together a prototype, do some research, and bring some options with pros and cons to the next stand-up meeting. Every developer can do their part to keep the team involved in architecture, and in ownership, and to slowly shift the culture from having The Architect to having Everyone Is An Architect.

The Importance of Logging

Add more logging. I’m serious.
Logging is what separates an impossible bug report from an easy one. Logging lets you replace comments with functionality. I’d even go so far as to say good logging separates good developers from great ones.
Try this: replace your inline comments with equivalent logging statements. Run your program and tail the log file. Suddenly, you don’t need a step-through debugger for the vast majority of situations, because you can see, in the log, exactly what the program is doing, what execution path it’s taking, where in the source each logging statement is coming from, and where execution stopped in the event of a crash.
My general development process focuses on clean, readable, maintainable, refactorable, self-documenting code. The process is roughly like this (a brief sketch in code follows the list):
  1. Block out the overall process, step by step, in comments.
  2. For any complex step (more than five or ten lines of code), replace the comment with a clearly-named method or function call, and create a stub method/function.
  3. Replace the comments from step 1 with equivalent logging statements.
  4. Implement functionality.
    • Give all functions, methods, classes, parameters, properties, and variables clear, concise names, so that the code ends up in some semblance of readable English.
    • Use thorough sanity checking, by means of assertions or simple if blocks. When using if blocks, include logging for any failed checks, including what was expected and what was found. These should be warnings.
    • Include logging in any error/exception handling code. These should be errors if recoverable, or fatal if not. This is all too often the only logging a developer includes!
  5. Replace any inline comments added during implementation with equivalent logging statements. These should be debug or info/trace level; major section starts should be logged at a higher level, mid-process statements at a lower level.
  6. Add logging statements to the start of each method/function. These should also be debug or info/trace level. Use higher-level logging statements for higher-level procedures, and lower-level logging statements for more deeply-nested calls.
  7. For long-running or resource-intensive processes, particularly long loops, add logging statements at regular intervals to provide progress and resource utilization details.
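Here’s a minimal sketch of steps 3 through 7 in Java, using SLF4J (any leveled logging framework works the same way); the class, method, and threshold names are purely illustrative:

import java.util.List;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class OrderImporter {
    private static final Logger log = LoggerFactory.getLogger(OrderImporter.class);

    public void importOrders(List<String> lines) {
        // Method entry, logged at a higher level (step 6).
        log.info("Importing {} order lines", lines.size());

        // Sanity check logged as a warning, with expected vs. found (step 4).
        if (lines.size() > 10_000) {
            log.warn("Expected at most 10000 lines, found {}", lines.size());
        }

        for (int i = 0; i < lines.size(); i++) {
            // Progress at regular intervals for long loops (step 7).
            if (i > 0 && i % 1_000 == 0) {
                log.debug("Processed {} of {} lines", i, lines.size());
            }
            try {
                // Was an inline comment: "parse this line" (steps 3 and 5).
                log.trace("Parsing line {}", i);
                parse(lines.get(i));
            } catch (RuntimeException e) {
                // Error handling always logs (step 4).
                log.error("Failed to parse line {}: {}", i, lines.get(i), e);
            }
        }
        log.info("Import complete");
    }

    private void parse(String line) { /* implementation elided */ }
}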
Make good use of logging levels! Production systems should only output warnings and higher by default, but it should always be possible to enable deeper logging in order to troubleshoot any issues that arise. However, keep the defaults in mind, and ensure that any logging you have in place to catch defects will provide enough information in the production logs to at least begin an investigation.
Your logging messages should be crafted with dual purpose in mind: first, to provide useful, meaningful outputs to the log files during execution (obviously), but also to provide useful, meaningful information to a developer reading the source – i.e., the same purpose served by comments. After a short time with this method you’ll find it’s very easy to craft a message that serves both purposes well.
Good logging is especially useful in an agile environment employing fast iteration and/or continuous integration. It may not be obvious why at first, but all the advantages of good logging (self-documenting code, ease of maintenance, transparency in execution) do a lot to facilitate agile development by making code easier to work with and easier to troubleshoot.
But wait, there’s more! Good logging also makes it a lot easier for new developers to get up to speed on a project. Instead of slogging through code, developers can execute the program with full logging, and see exactly how it runs. They can then review the source code, using the logging statements as waypoints, to see exactly how the code relates to the execution.

If you need a tool for tailing log files, allow me a shameless plug: try out my free log monitor, Rogue Informant. It’s been in development for several years now, it’s stable, it’s cross-platform, and it’s completely free to use privately or commercially. It allows you to monitor multiple logs at once, filter and search logs, and float a log monitoring window on top of other applications, making it easy to watch the log and see exactly what’s going on behind the scenes while you use the program. Give it a try, and if you find any issues or have feature suggestions, feel free to let me know!

The Problem with Responsive Design

A huge problem I see with responsive/adaptive design today is that, all too often, it treats “small viewport” and “mobile” as being synonymous, when the two concepts are orthogonal. A mobile device can have a high-resolution display, just as a desktop user can have a small display, or just a small browser window.

Responsive designs need to respond to viewport size, and nothing more. It’s not mobile, it’s a small display. Repeat that to yourself about a thousand times.

What’s holding back single-design philosophies isn’t display size, it’s user interface; for decades, web designers have counted on there being a mouse cursor to generate events – mouseovers, clicks, drags. That’s not how it works on touchscreen devices, and we need some facility – JavaScript checks, CSS media queries – to cater to touch-based devices as opposed to cursor-based devices.

Sanity Checks: Assumptions and Expectations

Assertions and unit tests are all well and good, but they’re too narrow-minded in my eyes. Unit tests are great for, well, testing small units of code to ensure they meet the basic requirements of a software contract – maybe a couple of typical cases, a couple of edge cases, and then additional cases as bugs arise and new test cases are created for them. No matter how many cases you create, however, you’ll never have a test case for every possible scenario.

Assertions are excellent for testing in-situ; you can ensure that unacceptable values aren’t given to or by a piece of code, even in production (though there is a performance penalty to enabling assertions in production, of course.) I think assertions are excellent, but not specific enough: any assertion that fails is automatically a fatal error, which is great, unless it’s not really a fatal error.

That’s where the concepts of assumptions and expectations come in. What assertions and unit tests really do is test assumptions and expectations. A unit test says “does this code behave correctly when given this data, all assumptions considered?” An assertion says “this code assumes this thing, and will not behave correctly if it gets another, so throw an error.”

When documenting an API, it’s important to document assumptions and expectations, so users of the API know how to work with your code. Before I go any further, let me define what I mean by these very similar terms: to me, code that assumes something operates as if its assumptions are correct, and will likely fail if its assumptions turn out to be incorrect. Code that expects something operates as if its expectations are met, but will likely still operate correctly even if they aren’t. It’s not guaranteed to work, or guaranteed to fail; it’s likely to work, but someone should probably know about it and look into it.

Therein lies the rub: these are basically two types of assertions, one fatal, one not. What we need is an assertion framework that allows for warning-level assertion failures. What’s more, we need an assertion framework that is performant enough to be regularly enabled in production.

So, any code that’s happily humming along in production, that says:

Assume.that(percentage).isBetween(0, 100);

will fail immediately if percentage is outside those bounds. It’s assuming that percentage is between zero and one hundred, and if it assumes wrong, it will likely fail. Since it’s always better to fail fast, any case where percentage is outside that range should trigger a fatal error – preferably even if it’s running in production.

On the other hand, code that says:

Expect.that(numRows).isLessThan(1000);

will trigger a warning if numRows is over a thousand. It expects numRows to be under a thousand; if it isn’t, it can still complete correctly, but it may take longer or use more memory than normal – or a larger result may simply mean something is amiss with the query that fetched the rows, or with the dataset they came from. It’s not a critical failure, but it’s cause for investigation.

Any assumption or expectation that fails should of course be automatically and immediately reported to the development team for investigation. Naturally a failed assumption, being fatal, should take priority over a failed expectation, which is recoverable.

This not only provides greater flexibility than a simple assertion framework, it also provides more explicit self-documenting code.
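For the sake of illustration, here’s a minimal sketch of what such a framework might look like in Java, matching the Assume/Expect examples above; everything about it is hypothetical, and a real implementation would also report failures to the team rather than just throwing or logging:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Shared check logic: assumptions are fatal, expectations warn and continue.
final class Check {
    private static final Logger log = LoggerFactory.getLogger(Check.class);
    private final long actual;
    private final boolean fatal;

    Check(long actual, boolean fatal) {
        this.actual = actual;
        this.fatal = fatal;
    }

    public Check isBetween(long min, long max) {
        if (actual < min || actual > max) {
            fail("expected a value in [" + min + ", " + max + "], got " + actual);
        }
        return this;
    }

    public Check isLessThan(long limit) {
        if (actual >= limit) {
            fail("expected a value below " + limit + ", got " + actual);
        }
        return this;
    }

    private void fail(String message) {
        if (fatal) {
            throw new AssertionError(message); // failed assumption: fail fast
        }
        log.warn(message);                     // failed expectation: warn and continue
    }
}

final class Assume {
    public static Check that(long actual) { return new Check(actual, true); }
}

final class Expect {
    public static Check that(long actual) { return new Check(actual, false); }
}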

Be Maxwell’s Demon

Source code tends to follow the second law of thermodynamics, with some small differences. In software, as in thermodynamics, systems tend toward entropy: as you continue to develop an application, the source will increase in complexity. In software, as well as in thermodynamics, connected systems tend toward equilibrium: in development, this is known as the “broken windows” theory, and is generally considered to mean that bad code begets bad code. People often discount the fact that good code also begets good code, but this effect is often hidden by the fact that the overall system, as mentioned earlier, tends toward entropy. That means that the effect of broken windows is magnified, and the effect of good examples is diminished.

In thermodynamics, Maxwell’s Demon is impossible to realize – it is purely a thought experiment. However, in software development, we’re in luck: any developer can play the demon, and should, at every available opportunity.

Maxwell’s demon stands between two connected systems, defeating the second law of thermodynamics by selectively allowing less-energetic particles through in only one direction, and more-energetic particles through in only the other, causing the two systems to tend toward opposite ends of the temperature spectrum, rather than naturally tending toward equilibrium.

By doing peer reviews, you’re doing exactly that; you’re reducing the natural entropy in the system and preventing it from reaching its natural equilibrium by only letting the good code through, and keeping the bad code out. Over time, rather than tending toward a system where all code is average, you tend toward a system where all code is at the lowest end of the entropic spectrum.

Refactoring serves a similar, but more active role; rather than simply “only letting the good code through”, you’re actively seeking out the worse code and bringing it to a level that makes it acceptable to the demon. In effect, you’re reducing the overall entropy of the system.

If you combine these two effects, you can achieve clean, efficient, effective source. If your review process only allows code through that is as good or better than the average, and your refactoring process is constantly improving the average, then your final code will, over time, tend toward excellence.

Without a demon, any project will be on a continuous slide toward greater and greater entropy. If you’re on a development project, and it doesn’t have a demon, it needs one. Why not you?

Real Sprints

Agile methodologies talk about “sprints” – workloads organized into one- to four-week blocks. You schedule tasks for each sprint, you endeavor to complete all of them by the end of the sprint, and then you look back and see how close your expectations (the schedule) were to reality (what actually got done).

Wait, wait, back up. When I think of a sprint, I think short and fast. That’s what sprinting means. You can’t sprint for a month straight; you’ll die. That’s a marathon, not a sprint.

There are numerous coding competitions out there. Generally, you get around 48 hours, give or take, to build an entire, working, functional game or application. Think about that. You get two days to build a complete piece of software from scratch. Now that’s what I call sprinting.

Of course, a 48-hour push is a lot to ask for on a regular basis; sure, your application isn’t in a competition, this is the real world, and you need to get real work done on an ongoing basis. You can’t expect your developers to camp out in sleeping bags under their desks. But that doesn’t mean turning a sprint into a marathon.

The key is instilling urgency, while moderating burnout. This is entirely achievable, and can even make development more fun and engaging for the whole team. Since the term sprint has already been thoroughly corrupted, I’ll use the term “dash”. Consider this weekly schedule:

  • Monday: Demo last week’s accomplishments for stakeholders, and plan this week’s dash. This is a good day to schedule any unavoidable meetings.
  • Tuesday and Wednesday: your 48 hours to get it done and working. These are crunch days, and they will probably be pretty exhausting. These don’t need to be 18-hour days, but 10 hours wouldn’t be unreasonable. Let people get in the zone and stay there as long as they can.
  • Thursday: Refactoring and peer reviews. After a run, athletes don’t just take a seat and rest; they slow to a jog, then a walk. They stretch. They cool off slowly. Developers, as mental athletes, should do the same.
  • Friday: Testing. QA goes through the application with a fine-toothed comb. The developers are browsing the web, playing games, reading books, propping their feet up, and generally being lazy bums, with one exception: they’re available at a moment’s notice if QA has any questions or finds any issues. Friday is a good day for your development book club to meet.
By the end of the week, your application should be ready again for Monday’s demo, and by Tuesday, everyone should be well-rested and ready for the next dash.
Ouch. That’s a tough sell. The developers are only going to spend two days a week implementing features? And one basically slacking off? Balderdash! Poppycock!

Think about it, though. Developers aren’t factory workers; they can’t churn out X lines of code per hour, 40 hours per week. That’s not how it works. A really talented developer might achieve 5 or 6 truly productive hours per day, but at that rate, they’ll rapidly burn out. 4 hours a day might be sustainable for longer. Now, mind you, in those four hours a day, they’ll get more done, better, with fewer defects, than an army of incompetent developers could do in a whole week. But the point stands: you can’t run your brain at maximum capacity eight hours straight, five days a week. You just can’t – not for long, anyway.

The solution is to plan to push yourself, and to plan to relax, and to keep the cycle going to maximize the effectiveness of those productive hours. It’s also crucial not to discount refactoring as unproductive; it sets up the following week’s work, and reduces the effort required for development over the rest of the life of the application. It’s a critical investment in the future.

Spending a third of your development time on refactoring may seem excessive, and if it were that simple, I’d agree. But if you really push yourself for two days, you can get a lot done – and write a lot of code to be reviewed and refactored. In one day of refactoring, you can learn a lot, get important work done, and still start to cool off from the big dash.

That lazy Friday really lets you relax, improve your craft, and get your product ready for next week, when you get to do it all over again.

Zen Templates Development Journal, Part 2

Having my concept complete (see Part 0) and my simple test case working (see Part 1), I was ready to start on my moderate-complexity test case. This would use more features than the simple test case, and more pages. I really didn’t want to have to build a complete site just for the proof of concept, so I decided to use an existing site, and I happened to have one handy: rogue-technologies.com.

The site is currently built in HTML5, using server-side includes for all of the content that remains the same between pages. It seemed like a pretty straightforward process to convert this to my template engine, so I got to work: I started with one page (the home page), and turned it into the master template. I took all of the include directives and replaced them with the content they were including. I replaced all of the variable references with model references using injection or substitution. I ID’d all the elements in the master template that would need to be replaced by child templates. I then made another copy of the homepage, and set it up to derive from the master template.

I didn’t want to convert the site to use servlets, since it wasn’t really a dynamic site; I just wanted to be able to generate usable HTML from my template files. So I created a new class that would walk a directory, parse the templates, and write the output to files in an output directory. Initially, it set up the model data statically by hand at the start of execution.
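A rough sketch of what that batch generator might have looked like (the ZenTemplates.render call here is a hypothetical stand-in for the engine’s actual entry point):

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class SiteGenerator {
    public static void main(String[] args) throws IOException {
        Path in = Paths.get("site-src");
        Path out = Paths.get("site-out");
        try (Stream<Path> paths = Files.walk(in)) {
            // Render every template in the input tree into a mirrored output tree.
            for (Path src : (Iterable<Path>) paths.filter(p -> p.toString().endsWith(".html"))::iterator) {
                Path dest = out.resolve(in.relativize(src));
                Files.createDirectories(dest.getParent());
                String rendered = ZenTemplates.render(src); // hypothetical engine call
                Files.write(dest, rendered.getBytes(StandardCharsets.UTF_8));
            }
        }
    }
}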

All was well, but I needed a way for the child template to add elements to the page, rather than just replacing elements from the parent template. I came up with the idea of appending elements using a data attribute – data-z-append="before:id" or "after:id" – to cause an element to be inserted into the page either before or after the parent’s element with the specified ID. This worked perfectly, allowing me to add the Google Webmaster Tools meta tag to the homepage.

With this done, I set to work converting the remaining pages. Most of the pages were pretty straightforward, being handled just like the homepage; I dumped the SSI directives, added some appropriate IDs and classes, and all was well. However, the software pages presented some challenges. For one thing, they used a different footer than the rest of the site. It was time to put nested derivation to the test.

I created a software page template, which derived from the master template and appended the additional footer content. I then had the software pages derive from this template instead of the master template, and – by some stroke of luck – it worked perfectly on the first try. I wasn’t out of the woods yet, though.

The software pages also used SSI directives to dynamically insert the file size for downloadable files next to the links to download them. I wasn’t going to reimplement this functionality; instead, I was prepared to replace these directives with file size data stored in the model. But I wanted to keep the model data organized, so I needed to support nesting. The software pages also used include directives to include a Google+ widget on the pages; this couldn’t be added to the template, as it was embedded in the body content, so it seemed like a perfect case for snippets – which meant I needed to implement snippet support.

Snippet support was pretty easy – find the data attribute, look up the snippet file, parse it as an HTML fragment, and replace the placeholder element with the snippet. It was simple to implement and worked quite well.
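In rough outline, the snippet pass might look something like this, assuming a jsoup-style DOM API and a hypothetical data-z-snippet attribute (the engine’s actual parser and attribute names may differ):

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;

public class SnippetProcessor {
    public void apply(Document page, Path snippetDir) throws IOException {
        for (Element placeholder : page.select("[data-z-snippet]")) {
            // Look up the snippet file named by the data attribute.
            String name = placeholder.attr("data-z-snippet");
            String html = new String(
                    Files.readAllBytes(snippetDir.resolve(name + ".html")),
                    StandardCharsets.UTF_8);
            // Parse it as an HTML fragment and splice its nodes in
            // where the placeholder element sat.
            for (Node node : Jsoup.parseBodyFragment(html).body().childNodesCopy()) {
                placeholder.before(node);
            }
            placeholder.remove();
        }
    }
}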

Nested properties, I thought, would be a breeze, as I had assumed they were natively supported by StrSubstitutor. Unfortunately they weren’t, so I had to write my own StrLookup. I decided that, since I was already doing some complex property lookups for injection, I’d build a unified model lookup class that could act as my StrLookup and could be used elsewhere. I wanted nested scope support as well, for my project list: each project had an entry in the model that consisted of a name, latest version, and so on. I wanted the engine to iterate this list and, for each entry, rather than replacing the entire content of the element with the text value of the model entry, find sub-elements and replace each with an appropriate property of the model entry. This meant I needed nested scoping support.

I implemented this using a scope stack and a recursive lookup. Basically, every time a nested scope was entered (e.g., content injection using an object or map, or iteration over a list), I would push the current scope onto the stack. When the nested scope was exited (i.e., the end of the element that added the scope), I popped the scope off. When iterating a loop, at the start of the iteration, I’d push the current index, and at the end of the iteration, I’d pop it back off.
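A simplified sketch of the idea, using Commons Lang’s StrLookup and a stack of map-based scopes (the real implementation handles more cases, such as list indices and bean properties):

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

import org.apache.commons.lang3.text.StrLookup;

public class ModelLookup extends StrLookup<Object> {
    // Innermost scope on top; pushed and popped as nested elements are processed.
    private final Deque<Map<String, Object>> scopes = new ArrayDeque<>();

    public void pushScope(Map<String, Object> scope) { scopes.push(scope); }
    public void popScope() { scopes.pop(); }

    @Override
    public String lookup(String key) {
        // Search scopes innermost-first, so nested entries shadow outer ones.
        for (Map<String, Object> scope : scopes) {
            Object value = resolve(scope, key);
            if (value != null) {
                return value.toString();
            }
        }
        return null; // unresolved: StrSubstitutor leaves the token in place
    }

    // Walk a dotted path like "project.version" through nested maps.
    @SuppressWarnings("unchecked")
    private Object resolve(Map<String, Object> scope, String key) {
        Object current = scope;
        for (String part : key.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<String, Object>) current).get(part);
            if (current == null) {
                return null;
            }
        }
        return current;
    }
}

Hooking this up for substitution is then just a matter of new StrSubstitutor(lookup).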

This turned out to be very complex to implement, but after some trial and error, I got it working correctly. I then re-tested against my simple test case, having to fix a couple of minor defects introduced there with the new changes. But, at last, both my simple and moderate test cases were working.

I didn’t like the static creation of model data – it wasn’t very flexible – so I decided to swap it out for JSON processing (a sketch of the idea follows below). This introduced a couple of minor bugs, but it wasn’t all that difficult to get everything working. The main downside was that it added several additional dependencies, and dependency management was getting more difficult. I wasn’t too concerned on that front, though, since I was already planning for the real product to use Maven for dependency tracking; I was just beginning to wish I had used Maven for the prototype as well. Oh well, a lesson for next time. For now, I was ready for my complex test case – I just had to decide what to use.
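The JSON swap itself might be as simple as something like this, assuming a Jackson-style mapper (the actual library and file layout are details I’m glossing over); the nested maps and lists it produces line up naturally with the scoped lookup shown earlier:

import java.io.File;
import java.io.IOException;
import java.util.Map;

import com.fasterxml.jackson.databind.ObjectMapper;

public class ModelLoader {
    // Read the whole model file into nested maps, lists, and primitives.
    @SuppressWarnings("unchecked")
    public static Map<String, Object> load(File modelFile) throws IOException {
        return (Map<String, Object>) new ObjectMapper().readValue(modelFile, Map.class);
    }
}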