Issue: metrics for tester productivity?
In response to my Baseline columns on metrics (Part 1, Part 2, and Part 3), I received the following e-mail:
I read your column with great interest as I’m involved on an IT project to measure productivity. May I ask you a quick question? Are there any mature metrics that can measure tester productivity improvement month by month and accurate to 1%?
Here’s the response I sent back:
Well, for starters, you have to define what you mean by “tester productivity.” Number of test scripts run? Number of defects found? Number of defects closed? Number of defects reopened? (And do you weight the “defects found/closed/reopened” by criticality and/or severity?) Number of reported defects replicated? Number of hard-to-replicate, yet critical/severe defects that can now be replicated (and thus fixed)? Some combination (possibly a weighted function) of all of the above?
In other words, what is it exactly that you’re trying to accomplish? To make your testing team more effective? More efficient? To shorten the test cycle? To spend less on testing? To close more defects (and defer fewer open ones) for each system release? To have fewer defects discovered after a system release? Jerry Weinberg says that “quality is value to some person.” Who are the people you’re worrying about, what qualities — functionality, performance, reliability, etc. — do they value, and to what extent?
Once you’ve defined all that, the question remains whether you can measure it to 1% accuracy (or even 10% accuracy) month over month and still preserve any meaning in the measurement. It’s possible (and common) in metrics to have “false accuracy”: you believe you’re measuring something to a certain precision, but at that level you’re mostly just reading random or insignificant noise.
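To make the “false accuracy” point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: a team that runs 2,000 test scripts a month, each with a fixed 10% chance of turning up a defect, and the function names themselves. The team’s real productivity never changes in this toy model, so any month-over-month movement in the defect count is pure chance.

```python
import random


def simulate_monthly_defects(months, scripts_per_month, defect_rate, seed=0):
    """Simulate defect counts for a team whose true productivity is constant.

    Each test script independently finds a defect with probability
    `defect_rate`. All parameters are made-up illustrative values.
    """
    rng = random.Random(seed)
    return [
        sum(rng.random() < defect_rate for _ in range(scripts_per_month))
        for _ in range(months)
    ]


def month_over_month_changes(counts):
    """Return the percent change between each pair of consecutive months."""
    return [100.0 * (curr - prev) / prev for prev, curr in zip(counts, counts[1:])]


if __name__ == "__main__":
    counts = simulate_monthly_defects(months=24, scripts_per_month=2000, defect_rate=0.10)
    changes = month_over_month_changes(counts)

    # Count how many "changes" exceed 1% even though nothing really changed.
    above_one_percent = sum(abs(c) > 1.0 for c in changes)

    print("Monthly defect counts:", counts)
    print("Month-over-month changes (%):", [round(c, 1) for c in changes])
    print(f"{above_one_percent} of {len(changes)} changes exceed 1% by chance alone")
```

Under these assumptions the count averages about 200 defects a month with a standard deviation of roughly 13, so swings of several percent in either direction are routine. A reported 1% “improvement” would be indistinguishable from that background variation.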
Finally, we come back (as always) to Weinberg’s law of metrics: that which can be measured can be fudged (or exploited). For example, read this story over at the Daily WTF: The Defect Black Market.
Hope this is of some help, though I tend to doubt it. 🙂
Thoughts from the rest of you? ..bruce..
Bruce, this is a super answer, and not just to this specific question, but to an enormous number of similar questions I receive almost every week (and I’m sure you do, too).
I would add one question for your questioner, up front. It’s implicit in your answer, but I like to ask it explicitly:
Suppose you had this measurement, what do you intend to do with it?
Frequently, I get answers that amount to:
– If I had this measurement, I’d know which testers to punish (or reward, which amounts to the same thing, since the unrewarded testers will feel punished).
I don’t bother to answer these people.
Most of the other answers amount to:
– If I had this measurement, I’d know how to improve testing.
To these people, I can usually look at what they’re doing now (eyeball measurement) and give them at least half-a-dozen suggestions that will improve their testing noticeably before they start looking for 1% improvements.
Gerald M. Weinberg
http://www.geraldmweinberg.com
Jerry:
Thanks for the observations and even more for the kind words. And your observations, as usual[*], are dead-on.
One of the most important things I’ve learned from your writings — starting with my original copy of The Psychology of Computer Programming, which I bought in the late 1970s as a recently minted CS graduate — is the profoundly human nature of software engineering and IT project management. Organizations keep trying to treat it like a manufacturing or chemical engineering process; I think it’s more like a sports team or an orchestra.
Oh, and I hope you don’t mind the “Weinberg’s Law of Metrics” that I coined for my column and re-used in the posting above. 🙂 ..bruce..
[*] Except, of course, when I choose to disagree or pick a fight with you. 🙂