[An earlier version of this note appeared in Verilab's internal newsletter. It was written just after version 6 of Fog Creek's FogBugz had come out. I hope the observations still make sense.]
The new "intellectual" content of version 6 of FogBugz, from Fog Creek (Joel Spolsky's place), was "Evidence Based Scheduling". I thought it an interesting development beyond What I Already Believe In (TM), so I'll give it a plug.
What do I believe in about planning/scheduling?
People hate doing it. Less is more. (That said, chip people are among those most dedicated to/skilled at this onerous task.)
Schedule estimates are almost always short... There are a thousand possible reasons. Let's move right along...
People are bad at estimating how long something will take, and it gets worse the bigger the task. Any estimate for a task much beyond a couple of days: all noise, no signal.
On the other hand, we estimate bite-sized tasks pretty well. "Can you have that letter done in a half hour?" Not a problem.
We estimate "I have never done that before" tasks completely appallingly. Fair enough.
People vary: some estimate tasks well, others poorly. Some will learn to do it better; others won't.
The main relevant data we have (at least in our line of work) is "What were your time estimates for previous tasks and how long did they really take?"
An approach to schedule estimates that draws on only those Great Truths is interesting to me.
One that is now quite well known is Scrum. (There are many related schemes in the "agile family".)
In Scrum, you have a project "backlog" of Things We Might Do. For each project iteration (a "sprint" of a week or two...), items are drawn from the backlog and refined into "stories", and stories are given estimates of how long they will take to complete. There is a maximum story size, say four or eight hours. Estimates are in "points", with one point often equaling one hour.
Scrum only cares about completed stories, and only cares about team results -- there is no tracking of individuals' completion rates. In fact, the only things that are tracked are: (a) During the sprint, how many hours of work are still left? (the "burndown chart"); and (b) How many points' worth of completed stories did we finish in the sprint?
In setting out on a new sprint, a team is only allowed to take on as many points as it finished in the previous sprint. There is no living in hope -- "This time we'll do better!"
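To make the "no living in hope" rule concrete, here is a minimal sketch. The function names, the story list and the choice to stop at the first story that doesn't fit are my own assumptions for illustration; Scrum itself doesn't prescribe any of this code.

```python
def next_sprint_capacity(completed_story_points):
    """Points the team may take on: exactly what it finished last sprint."""
    return sum(completed_story_points)


def pick_stories(backlog, capacity):
    """Take stories in backlog order until the next one would exceed capacity.

    backlog is a list of (story_name, points) pairs, highest priority first.
    Stopping at the first story that doesn't fit is an assumption of this
    sketch; a real team might skip it and try a smaller one.
    """
    chosen, total = [], 0
    for name, points in backlog:
        if total + points > capacity:
            break
        chosen.append(name)
        total += points
    return chosen, total


# Hypothetical numbers: last sprint the team completed stories worth 20 points.
capacity = next_sprint_capacity([4, 8, 2, 6])
backlog = [("login", 8), ("search", 8), ("export", 6), ("docs", 2)]
chosen, total = pick_stories(backlog, capacity)
```

Unfinished stories simply don't appear in the completed list, so they earn nothing towards the next sprint's capacity.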
The Scrum planning notion is brainlessly simple. Maybe that's why I like it.
OK, what about Fog Creek's Evidence Based Scheduling (EBS)? Well, we track individuals' efforts for a start. For each person, we have a list of tasks they've done (bite-sized, as before). For each of those, we know the estimate that was made at the beginning (by the person about to do it) and how long the task actually took. We can then trivially compute the ratio of estimated time to actual time for each task. Spolsky calls this velocity. Exactly one is perfect estimation, below one means the task took longer than estimated, which is "bad" (but normal), and above one is unheard of. In some ways, consistency is what counts: you can plan around someone whose velocity is always low. Someone whose velocity wanders all over the shop is a planning problem.
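A small sketch of the velocity bookkeeping, assuming each history entry is an (estimated hours, actual hours) pair. The function name and the sample numbers are made up for illustration, and using the standard deviation as the measure of "wandering" is my choice, not something taken from Spolsky.

```python
from statistics import pstdev


def velocities(history):
    """Turn (estimated_hours, actual_hours) pairs into estimate/actual ratios."""
    return [estimated / actual for estimated, actual in history]


# Someone who always takes twice as long as they say: low but consistent.
steady = velocities([(4, 8), (2, 4), (6, 12)])

# Someone who is sometimes spot on and sometimes wildly off.
erratic = velocities([(4, 4), (2, 8), (6, 30)])

# A spread of zero means the velocity never moves, so it is easy to plan around.
steady_spread = pstdev(steady)
erratic_spread = pstdev(erratic)
```

The steady person has the worse-looking velocity, but the erratic one is the planning problem.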
How do we estimate a new batch of (bite-sized) tasks? First, we assign tasks to people. (Not the only way forward, but it'll do for this discussion.) Next, we let those people make their estimates for how long their tasks will take. Also, for each of those people, we have a list of their velocities on recent work. (Maybe we get clever and drop the best and worst; I don't think it really matters.)
Finally, the cool-sounding part -- a "Monte Carlo simulation"! What this means is: for each task, pick a random velocity item from the task-doer's recent task-completion ratios, and divide their estimate for this task by that velocity. (Velocity is estimated/actual, so dividing turns the estimate back into a plausible actual time.) Do this for all tasks, add up the times, and that's one simulated answer for when the job will be done.
But the Monte Carlo bit is that you repeat this random-picking exercise many times (let's say a hundred), and you plot all of the (hundred) estimates on a graph, as follows.
The x-axis is time, as in "We'll be finished by then". The y-axis is "Likelihood that...". So the smallest estimate you got from the 100 simulations would be the "1% likely" time, the second-smallest the "2% likely", and so on.
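Here is a minimal sketch of the simulation and the curve, under a few assumptions of mine: tasks are (person, estimated hours) pairs, each person's history is a list of estimate/actual velocities, and task times are simply summed as in the description above. Because velocity is estimate divided by actual, a simulated time is the estimate divided by a randomly picked velocity. All names and numbers are hypothetical.

```python
import random


def simulate_once(tasks, history, rng):
    """One pass: divide each estimate by a random past velocity, then sum."""
    return sum(estimate / rng.choice(history[person]) for person, estimate in tasks)


def completion_curve(tasks, history, runs=100, seed=0):
    """Return (total_hours, likelihood) points, sorted by total_hours.

    The k-th smallest simulated total is paired with k/runs, i.e. the
    fraction of simulations that finished by then.
    """
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    totals = sorted(simulate_once(tasks, history, rng) for _ in range(runs))
    return [(total, (k + 1) / runs) for k, total in enumerate(totals)]


# Hypothetical data: two people, one task each.
history = {"ann": [1.0, 0.5], "bob": [0.5, 0.25]}
tasks = [("ann", 4), ("bob", 6)]
curve = completion_curve(tasks, history)
```

Plotting the first element of each pair on the x-axis and the second on the y-axis gives the graph described above.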
(Note: Spolsky's got nicer pictures and is more fun to read.)
What you want is a nice steep curve, indicating that no matter how you dip into the velocities, you get a very similar answer. A gently-sloping curve says "We don't have a clue when we'll finish."
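One rough way to put a number on "steep" versus "gently sloping" is the width of the band between two points on the curve, say the 10% and 90% marks. This is my own illustrative measure, not something from Spolsky's article, and the function name and sample totals are invented.

```python
def spread(totals, low_pct=10, high_pct=90):
    """Hours between the low_pct and high_pct points of the simulated totals.

    A small result means a steep curve (the simulations mostly agree);
    a large one means a gently sloping curve.
    """
    ordered = sorted(totals)
    last = len(ordered) - 1
    low = ordered[last * low_pct // 100]
    high = ordered[last * high_pct // 100]
    return high - low


# Hypothetical simulation results, in hours.
confident = [100, 101, 101, 102, 102, 102, 103, 103, 104, 104, 105]
clueless = [60, 80, 100, 130, 170, 220, 280, 350, 430, 520, 620]

confident_spread = spread(confident)
clueless_spread = spread(clueless)
```

With these made-up numbers the first team's answers sit within a few hours of each other, while the second team's range over hundreds of hours.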
Summary: I mostly see similarities between the old Scrum way of estimating and the new-fangled Evidence-Based Scheduling. Both require the heavy lifting of breaking work down into realistically small pieces that can be plausibly estimated. Both also require keeping fairly light timesheet information on time per task.
The EBS twirl with the Monte Carlo simulation is a nice touch. Let's say just one out of forty tasks went completely pear-shaped (estimate-wise) last time. Scrum would say, "Tough cookies, it still directly affects how many points you can do this time." EBS would say, "Calm down -- on any given task there's only a 1-in-40 chance of the disaster repeating itself; factor it into the graph but don't let it dominate."
(Finally: read Spolsky's original if you have not done so before. There's more there that I haven't discussed here.)
