Field Notes: Getting Started With Performance Testing, Step 734

Sometimes I’m fortunate to work with customers and teams who are interested in setting up performance testing, either of their own products or ours. If they have little experience, they may be unsure of how to begin. Usually these conversations start with basic project management questions:

  • What are the goals?
  • What is the timeframe?
  • What are your skills?
  • What equipment can you use?
  • What is the budget?
  • Etc.

Recently I was working with a group that was well on their way towards performance testing nirvana. (Such a thing exists. Some of us have seen it. It’s blue at the edges.) This group and I had a different type of conversation which centered around how to ensure that the great work they were doing would be relevant and meaningful to their customers.

I think we all have examples of this: We show a detailed graph to someone and they can only comment on the colors, completely missing the revolutionary earth shattering facts you’re trying to prove.

This team I spoke with is a very technical team. They have a very technical product. It’s not deployed causally, and folks who use it are well-aware they’re making an investment in setting it up and getting results from it.

Right now they’re working towards a set of automated performance tests (Yay!) and a corresponding set of expected result ranges. The test is very specific: “A batch of X actions complete in Y time.” This is useful for the team because if the results vary (Y increases or decreases) they know something has changed. Right now this test is useful for them.

I suggested that they consider trying to restate this test in the context of what it might mean to a customer. When might a customer have X of these actions? Would this be a customer just starting out, or a group well into the tool’s adoption? Is this a normal task, or might it be something done infrequently? Why should this test and the results give a customer confidence?

A car manufacturer may giggle with delight that a particular component rotates at 74,000 rpm, but the consumer might care more that the component contributes to overall reliability and that the car will start and stop when required.

(We know car makers have those glossy brochures with torque specs and so forth, but really, how many of us, the first thing we ask, is what colors can I get?)

Back to our technical example, “A batch of X actions complete in Y time” could be expressed as: “50 people’s builds execute in less than one hour,” or “The average work produced by a development team of 50 engineers can run overnight.” Etc.

I also suggested that now would be a good time to capture as much detail about the circumstances of the tests they’re doing: the hardware, how much data was moved back and forth, average CPU percentage consumption, maximum memory consumption.

There may be a point at which they may have to to extrapolate that test’s behavior to another environment (“Will this different hardware and different deployment conditions still achieve X actions completing in Y time?”) and having all the extra data might mean not having to rerun a test. Or perhaps it might identify another dimension to this particular successful test that was overlooked.

As much as I suggested defining a set of measures for the test, I warned against forgetting how the customer might perceive the test.

Another car example: Cargo area is most always indicated in those glossy brochures. While we can tell that 70 cubic feet is bigger than 68 cubic feet, the actual measure of cubic feet is not something most of us encounter frequently enough to know just how big 70 cubic feet really is, and just what fits in such a space. With the rear seats down my car purports to hold 70 cubit feet. However. when I’m loading up at the hardware store, all I care about is whether I can fit a few 2x4s inside.

(A 2×4 is a piece of standard piece of lumber, usually measuring 1-1/2″ x 3-1/2″ x 8′. The problem is that it’s eight feet long and shouldn’t bend, but should somehow fit diagonally. I can’t tell this from a brochure measurement. Only from real life. Of course my family is the one that brought an empty full-size cello case to see how it would fit, but that’s another story.)

As much as I think and dream about software performance, at the end of the day, any technical thoughts and results must be translated in to true customer-comprehensible meaning. “A batch of X actions complete in Y time” is exciting for the performance professional, but such measures have to be translated into meaningful customer statments.