What if? driven development

Mar 07, 2022

When programming, I tend to follow a typical flow, which roughly goes: problem definition → thinking → typing. This is a summary of how I approach problems. For a catchy title, I’ve gone with “What if? driven development”.

The five whys

The five whys is a technique for discovering the root cause of a problem from its symptoms. Simply put: given a problem, if you ask the reporter “why?” five times, you will discover the root of the issue.

An example, taken from Wikipedia:

The problem: the vehicle will not start.

Why? – The battery is dead. (First why)
Why? – The alternator is not functioning. (Second why)
Why? – The alternator belt has broken. (Third why)
Why? – The alternator belt was well beyond its useful service life and not replaced. (Fourth why)
Why? – The vehicle was not maintained according to the recommended service schedule. (Fifth why, a root cause)

When fixing the car, you probably only need to get to the third why: at that point, you know that the alternator belt is broken and the battery is dead. But to prevent the car breaking down in future, you need to proceed to the fourth and fifth whys, to discover that the car needed more regular maintenance. This information won’t help fix this particular car, but it will prevent the same problem happening again, assuming the owner heeds the warning.

Programming

We can apply the same lesson to programming: if we encounter a bug once, perhaps we’d put a temporary fix in place. Imagine you have a fault where you’re trying to extract `id={id}` from a string, but for some reason it’s picking up everything after the `=`. You might have a regex like this:
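As a rough Python sketch: the pattern below is an assumption chosen to reproduce the behaviour described, and `extract_id` is a hypothetical helper name.

```python
import re

# Hypothetical pattern: "id=" followed by everything up to the end of the string.
NAIVE_ID_PATTERN = re.compile(r"id=(.+)")


def extract_id(params: str) -> str:
    """Return whatever follows "id=" in the parameter string."""
    match = NAIVE_ID_PATTERN.search(params)
    if match is None:
        raise ValueError("no id found in: " + params)
    # The greedy ".+" happily swallows "&name=a" as part of the id.
    return match.group(1)
```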

Which gives you the following two cases:

id=1, correctly parsed to 1
id=1&name=a, incorrectly parsed to 1&name=a

Following the five whys, you’d start with the problem statement: ids are parsed incorrectly.

Why? — It should only contain the value 1
Why? — It should ignore values past &
Why? — Because & is the field separator

Here, we landed at the root cause after 3 whys. Our parser did not correctly handle field separators. We could fix that pretty easily, with a regex like:
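As a sketch in Python, assuming `&` is the only field separator (the pattern and the `extract_id` helper are illustrative assumptions):

```python
import re

# Hypothetical fix: capture only the characters before the next "&".
ID_PATTERN = re.compile(r"id=([^&]+)")


def extract_id(params: str) -> str:
    """Return the value of the first id field, stopping at the separator."""
    match = ID_PATTERN.search(params)
    if match is None:
        raise ValueError("no id found in: " + params)
    return match.group(1)
```

This pattern still invites further what ifs: because it searches anywhere in the string, it would also match the tail of a key such as `uid=5`.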

Let’s imagine you might have multiple ids provided, but only want to get the last id. In Python you can use the `findall` function to repeat a search. But as the complexity grows, you probably want a proper parser. This is the fundamental idea behind what if?: start small, with code that passes some conditions, then question yourself on edge cases. It’s not a revolutionary idea, and approaches like test-driven development are very similar. But I like to think about it at a higher level than code: rather than “what test can I make?”, I prefer to frame it as “what weird things might happen?”
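For the multiple-id case, a minimal sketch of both options might look like this. The function names are hypothetical, and using `urllib.parse.parse_qs` as the "proper parser" is an assumption about what would fit here.

```python
import re
from typing import Optional
from urllib.parse import parse_qs

ID_PATTERN = re.compile(r"id=([^&]+)")


def last_id_with_findall(params: str) -> Optional[str]:
    """Repeat the regex search and keep only the final match."""
    ids = ID_PATTERN.findall(params)
    return ids[-1] if ids else None


def last_id_with_parser(params: str) -> Optional[str]:
    """Let a query-string parser split the fields, then pick the last id."""
    ids = parse_qs(params).get("id", [])
    return ids[-1] if ids else None
```

Both return `None` when no id is present, but only the parser version matches the key `id` exactly rather than any text ending in `id=`.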

Language choice

Some languages naturally support this workflow. When I’m coding, I tend to begin with types. Consider the above problem, but each id is attached to a request. In Derw, it might look something like:
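A rough Python analogue of such a type, as a sketch rather than actual Derw: the dataclass stands in for a Derw type, and the `url` field is a hypothetical addition for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """A request carrying the id we want to parse out of its parameters."""
    id: int
    url: str  # hypothetical field, not taken from the original post
```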

Now that we know how a request might look, let’s think about the id parser.

The naive approach might be to assume that we will always be able to parse an id from a parameter string, giving us something like:
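Sketched in Python instead of Derw (the name `get_id` and the regex are assumptions), the naive version promises an `int` no matter what it is given:

```python
import re


def get_id(params: str) -> int:
    """Naive parser: assumes an id is always present and always numeric."""
    match = re.search(r"id=([^&]+)", params)
    # If there is no match, match is None and this line raises AttributeError.
    # If the value is not numeric, int() raises ValueError.
    return int(match.group(1))
```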

Now there should be a couple of what ifs that come to your mind. First, what if there isn’t an id in the string? Well, we can deal with that with a Maybe - an optional type which represents something either existing, with the value it has, or something not existing.
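In Python, the closest stand-in for a Maybe is `Optional`, where `None` plays the role of Nothing. A minimal sketch, with `get_id` and the regex as illustrative assumptions:

```python
import re
from typing import Optional


def get_id(params: str) -> Optional[int]:
    """Return the id if one is present, or None if the string has no id."""
    match = re.search(r"id=([^&]+)", params)
    if match is None:
        return None  # the "Nothing" case: no id in the string
    # Still assumes the value is numeric; int() raises ValueError otherwise.
    return int(match.group(1))
```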

The second what if should be what if the id is there but isn’t a number? You could represent that case with the same Nothing value from the Maybe, but that doesn’t give you great error messages. Consider the following function for handling request parsing:
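A Python sketch of that situation, assuming a hypothetical `parse_request` function and a `Request` with an illustrative `url` field, where both failure cases collapse into `None`:

```python
import re
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Request:
    id: int
    url: str  # hypothetical field


def get_id(params: str) -> Optional[int]:
    """Return None both when the id is missing and when it is not a number."""
    match = re.search(r"id=([^&]*)", params)
    if match is None:
        return None
    try:
        return int(match.group(1))
    except ValueError:
        return None


def parse_request(url: str, params: str) -> Union[Request, str]:
    """Build a Request, or return an error message as a plain string."""
    request_id = get_id(params)
    if request_id is None:
        # The caller cannot tell a missing id from a malformed one.
        return "Failed to parse id"
    return Request(id=request_id, url=url)
```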

Here, the error message for a failed attempt to parse the id is opaque: it’s not clear what went wrong. We can move the error message handling into the getId function, using a Result type, so it looks something like:
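Python has no built-in Result type, so this sketch defines a small one by hand. The `Ok` and `Err` classes, the error wording, and `get_id` are all assumptions for illustration.

```python
import re
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Ok:
    value: int


@dataclass(frozen=True)
class Err:
    message: str


Result = Union[Ok, Err]


def get_id(params: str) -> Result:
    """Return Ok(id) on success, or Err with a message saying what went wrong."""
    match = re.search(r"id=([^&]*)", params)
    if match is None:
        return Err("No id found in parameters")
    raw = match.group(1)
    try:
        return Ok(int(raw))
    except ValueError:
        return Err(f"Expected a number for id but got {raw!r}")
```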

Here we have useful error messages, but what if we wanted to handle the error states from a calling function? Perhaps we should have different behavior for a missing id and for an invalid id value. We could raise this to the top scope instead of handling it in this function, representing it with a union type, giving us the complete program as below. This will help us handle the different error states, should we want some fallback behaviors instead of purely giving a string error.
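A Python sketch of such a program, with the union modelled as two small dataclasses. All names here (`MissingId`, `InvalidId`, `parse_request`, `describe`, the `url` field) are hypothetical stand-ins, not the Derw code.

```python
import re
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class MissingId:
    """No id parameter was present at all."""


@dataclass(frozen=True)
class InvalidId:
    """An id parameter was present but was not a number."""
    raw: str


@dataclass(frozen=True)
class Request:
    id: int
    url: str  # hypothetical field


IdError = Union[MissingId, InvalidId]


def get_id(params: str) -> Union[int, IdError]:
    """Return the id, or a value describing which error occurred."""
    match = re.search(r"id=([^&]*)", params)
    if match is None:
        return MissingId()
    raw = match.group(1)
    try:
        return int(raw)
    except ValueError:
        return InvalidId(raw)


def parse_request(url: str, params: str) -> Union[Request, IdError]:
    """Pass errors upwards untouched so the caller decides what to do."""
    result = get_id(params)
    if isinstance(result, (MissingId, InvalidId)):
        return result
    return Request(id=result, url=url)


def describe(url: str, params: str) -> str:
    """Top-level function: each error state gets its own handling."""
    result = parse_request(url, params)
    if isinstance(result, MissingId):
        return "No id was provided"
    if isinstance(result, InvalidId):
        return f"Expected a number for id but got {result.raw!r}"
    return f"Request {result.id} for {result.url}"
```

Because the error cases are distinct values rather than strings, the top-level function could just as easily substitute a fallback for `MissingId` while still rejecting `InvalidId`.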

Note that this whole program was guided by the types - first we begin thinking about what a valid Request is, then how our parsing function will handle the error cases, and finally how our function that users and developers will use works.

All this, without touching any tests. I didn’t need to, because the types provided the direction. I had to account for the error case, so I had to use Maybe. I had multiple errors and wanted to provide useful output, so I used a Result. Then I wanted to lift the error messages up to the top-level function that will be exposed, so I represented the errors as a union type. Programming in this way is great in ML-family languages like Derw, and can be done in languages like TypeScript or Python with a bit more boilerplate and less guidance from the compiler. But generally, I push other developers to think in types, rather than implementations.
