Field notes: Unreasonable performance tests seen in the wild

Enterprise software deployments are complicated, and every customer goes about the process differently. For some, performance testing in a pre-production environment is a requirement before deployment. These customers are generally very mature and recognize that performance testing provides the data needed to qualify an installation for production.

There are some customers who, for whatever reason, have invested in performance testing in their pre-production environments but haven’t done so consistently, or have ignored some of the basic tenets of performance testing.

It looks good on paper…

I worked with a customer who assigned some Business Analysts the task of documenting a production workflow. These BAs didn’t fully understand the application or the domain, but went about their work anyway. They created a nice-looking spreadsheet listing the workflow steps and an estimate of how long they thought the workflow would take to execute.

Actually, they didn’t estimate the start-to-finish elapsed time of the workflow. They documented the number of times they thought the workflow could reasonably be completed within an hour.

There’s a difference in measurement and intention if I say:

“I can complete Task A in 60 seconds.”

or if I say:

“I can complete Task A 60 times in an hour.”

I may think I can complete Task A faster and perhaps shave a few seconds off my 60-second estimate. Or maybe I think I can execute more tasks in an hour.

In this particular case, the count of tasks within an hour was nudged upwards by successive layers of review and management. Perhaps the estimate was checked against existing metrics for validity; in any case, the results were presented in a table, something like:

Transaction    Count    Duration
Triage         600      10

At first glance this doesn’t seem unreasonable. Except that no units are provided with the values, and it’s unclear how the numbers relate to anything. Expressed as a sentence, here is what the workflow actually intended:

“Team leads will triage 600 defects per hour for 10 hours a day”

This data was taken without question by the team writing and executing the performance tests. They were told there would be several people triaging defects, so they created several virtual users working at this rate. You guessed it: the test and the system failed. Lots of folks got upset and the production deployment date slipped.

… but not in a calculator

Expecting any human to execute a triage task with software at a rate of one every 6 seconds (10 per minute) for 10 hours without a break is madness. It may be possible that the fastest a person could triage a defect is 6 seconds, but that pace is not sustainable. Once the test requirement was translated into natural language and folks realized that the rate was humanly impossible, the test was redesigned, executed, and produced meaningful results.
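
To make the arithmetic concrete, here is a minimal sanity check, a hypothetical Python sketch (not part of any real test suite), that converts an hourly transaction count into the pacing a single person would have to sustain:

    def pacing_seconds(count_per_hour: float) -> float:
        """Seconds available for each transaction at the given hourly rate."""
        return 3600.0 / count_per_hour

    # The workload from the table above: 600 triages per hour, 10 hours a day.
    rate = pacing_seconds(600)   # 6.0 seconds per triage
    daily_total = 600 * 10       # 6,000 triages per team lead per day

    print(f"{rate:.1f} seconds per triage, {daily_total} triages per day")

Running a throwaway calculation like this against each row of a proposed workload table is often enough to expose a rate that no human could sustain.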

How did the workflow rate get this insane? Was it successive layers of management review increasing or doubling a value? Maybe values were unintentionally copied and pasted, or multiplied, when the table was created. Regardless, the values were never checked against reality.

Are any tests better than no tests?

I worked with another customer who spent the time and effort to create performance tests in pre-production, but didn’t follow through on the other tenets of good performance testing. Performance work needs to run in a stable, well-understood environment and be executed as repeatably as possible.

Ideally, the work is done in an isolated lab and each test starts from a known, documented base state. After a test runs, the environment is reset so that the test can be run again and produce the same results. (In our tests, we use an isolated lab and filer technology so that the underlying storage blocks can be reset to the base state.) Variances are reduced. If something has to change, make one change at a time.
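
As a rough sketch of that cycle, assuming hypothetical reset_environment and run_load_test helpers (the shell scripts named here are placeholders; the actual reset mechanism, a filer snapshot restore in our lab, will vary by environment):

    import subprocess
    import time

    def reset_environment() -> None:
        # Hypothetical helper: restore the environment to its documented base
        # state, for example via a filer snapshot restore.
        subprocess.run(["./restore-base-snapshot.sh"], check=True)

    def run_load_test(run_id: int) -> str:
        # Hypothetical wrapper around whatever load-testing tool is in use.
        results = f"results/run-{run_id}.log"
        subprocess.run(["./run-load-test.sh", "--output", results], check=True)
        return results

    # Run the same test several times, always from the same base state,
    # so that results can be compared run to run.
    for run_id in range(1, 4):
        reset_environment()
        start = time.time()
        results = run_load_test(run_id)
        print(f"Run {run_id} finished in {time.time() - start:.0f}s -> {results}")

The tooling itself doesn’t matter; what matters is that every run begins from the same documented base state so that successive runs are comparable.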

This customer didn’t reset the database or the environment between runs (so the database grew continuously), and they didn’t operate on an isolated network (so they were exposed to whatever corporate traffic came along). Their results varied wildly. Even starting a test 30 minutes earlier from one day to the next produced significantly different results.

This customer wasn’t bothered by the resulting variance, even though some of us watching were extremely anxious and wanted to root out the details, lock down the environment, and understand every possible variable. For good reasons of their own, the customer was not interested in examining the variance. We struggled to explain that what they were doing wasn’t really performance testing.

What can we learn from these two examples?

#1: Never blindly put your faith in a workload. Always question it. Ask stakeholders to review it. Calculate both the hourly counts and the actual per-transaction rate. Use common sense. Compare your workload against reference data where you can.

#2: Effective performance testing is repeatable and successive tests should have little variance. The performance testing process is simple: Document environment before test (base state), run test, measure and analyze, reset to base state, repeat. If changes need to be made, make them one at a time.
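
One way to make “little variance” concrete is to quantify the run-to-run variation across repeated tests. A minimal sketch, with purely illustrative numbers and an arbitrary threshold:

    from statistics import mean, stdev

    # Hypothetical throughput results (transactions per hour) from four runs
    # that each started from the same base state; the values are illustrative.
    runs = [5520.0, 5490.0, 5610.0, 5545.0]

    cv = stdev(runs) / mean(runs)   # coefficient of variation across runs
    print(f"Run-to-run variation: {cv:.1%}")

    # An agreed-upon threshold (5% here, purely illustrative) flags when the
    # environment is too unstable for results to be compared meaningfully.
    if cv > 0.05:
        print("Too much variance: stabilize the environment before comparing runs.")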


Wet Paint: Behind the Jazz.net Deployment wiki

The Deployment wiki

As previously alluded to, we’re trying to change the model for how we communicate CLM and Rational product information specific to deployment. We could debate what we mean by deployment, but the jazz.net Deployment wiki says it best:

“… a single repository for Rational deployment guidance and best practices for customers, partners, and Rational staff beyond the basic product and support documentation. The scope of the wiki is the full range of Rational products and solutions, but it focuses on advice for those that are most commonly deployed. The wiki assembles deployment information from many locations into a single community-developed and community-managed knowledge base.”

But really, what’s it for?

Having spent some time behind the curtain helping to get the wiki in order, I can speak to some of the behaviors we’re trying to improve:

  1. Bring accurate, relevant information to customers and practitioners
  2. Create a community where qualified experts can share their knowledge

We’re trying to consolidate the many different ways customers can find Rational product information, improve the shelf life of that information, and find a way to incorporate practitioner experience.

What’s wrong with the old model? (Well, nothing really…)

Back in the day, you’d open the box with the product diskettes and get a thick tome or two explaining how to install and use the software. Printed documentation has gone the way of cassettes and eight-track tapes; online documentation is the current trend. Yes, trees are saved and cycle time is reduced. I’m not completely sure that wikis represent the state of the art for documentation and knowledge management, but this isn’t the place for that discussion.

Going about this the right way

No surprise that those of us working on the wiki are some of the most articulate of the bunch. Still, we are a very self-conscious crowd, even as we try to impose our view of the world. We’re constantly asking each other to review what we write and collate. We want to get the facts right. We want to help our customers.

As we keep our goals in mind, there are a few things we worry about:

  1. Too-frequent, spurious changes
  2. Readers misunderstanding or misusing what we say
  3. Making statements which are just plain wrong

Readers arrive at wikis well aware that the content may be in flux. Still, we want to minimize trivial changes. On the Deployment wiki there’s a wrapper of legal statements and terms of use that protects all of us. Much of what we publish is driven by common sense. We’re also constantly keeping an eye on what each of us does, not to police, but to be sure we’re doing our best work everywhere we can.

Humble beginnings

My work on the wiki started with the Troubleshooting Performance section. We assembled an excellent international team straddling development and customer support. At the end of February 2013, the Performance Troubleshooting Team locked itself in a room for one week. Starting from a clean sheet of paper on a conceptually difficult topic was perhaps overreaching, but we got an immense amount of work done. Supplied with generous servings of coffee, donuts, and bagels, the team defined the wiki structure, authored dozens of pages, and figured out several necessary behaviors that would prove useful for the wiki’s more straightforward pages.

You might have thought we’d start on more direct material. That is, begin with “How to make a peanut butter and jelly sandwich,” rather than jump right into “Let’s figure out why you’re having trouble making a sandwich.”

Very often, the best instructions are sequential:

  1. Desire a specific result
  2. Assemble ingredients
  3. Follow sequential steps
  4. Arrive at result

Inventing theory

Troubleshooting presumes that something, anything, may have gone wrong, and it requires a diagnostic vocabulary. We had to write pages on how to solve problems before we even picked the problems we wanted to solve. We had to agree on how to frame issues so that readers could recognize their own concerns in our pages and use them for help.

  1. Desire a specific result
  2. Assemble ingredients
    1. How do you know if you don’t have all the ingredients?
    2. Might some of the ingredients be faulty?
  3. Follow sequential steps
    1. How do you know if you missed a step?
  4. Arrive at result
    1. How might you determine you haven’t achieved the result?

We settled on situations as the name for the things we were trying to help solve and write about. We tried to come up with reasonably narrow situations and applied some of our Agile techniques to frame them:

“As a <user> trying to do <X action>, I encounter <Y performance problem>.”

We discovered that situations fell into two categories: those administrators were most likely to encounter (not necessarily product specific, more likely environmental) and those users were most likely to encounter (more likely product specific, not necessarily environmental).

Given that performance problems often appear to end users as the culmination of several problems, we decided that the situations we discussed had to be discrete and specific: “How to troubleshoot performance of <feature>,” or “How to diagnose performance problems in <non-product component within an environment>.”

Because the pages are on an easily discoverable public wiki, the material had to be reasonably self-contained with clearly defined scope. We also had to add hints for where readers would go next should our pages not actually prove to be helpful.

We thought a lot about who our audience might be: The administrator situations and user situations seem obvious now, but they weren’t our original design.

Admittedly, maybe I worried more about theory than most. Or at least, the team let me worry about it while they were doing lots of research and writing.

Actual practice

Many of the situations required researching and writing background pages to explain key concepts even before troubleshooting started. A page on troubleshooting ETL performance required explaining how to read ETL log files. A page on slow performance in web browsers required material about how to measure browser performance independently of CLM. A discussion of troubleshooting virtualized configurations required an introduction to virtualization concepts.

Each situation proposes an initial assessment to identify symptoms, impact/scope, timing/frequency, and any other environmental changes. These are the basics of troubleshooting. Then there might be suggestions about how to collect data, use tools, or complete an analysis. Finally, there are the paragraphs readers expect, where possible causes and solutions are itemized. Not every cause and solution applies to everyone. We didn’t want to create a greater burden for support by suggesting things that were wrong, irrelevant, or confusing.

As mentioned above, we also wanted to make sure readers could find their way to support, to other IBM material, and to generally available material. As much as we worried about a top-down structure (entry pages, secondary pages, and so on), we are well aware that folks might arrive at a page from an Internet search.

For consistency, we created a template for writing our pages. At the end of our troubleshooting template are Ground Rules. They look simple, but they are easily internalized and effective.

Ready right now, and preparing for the future

We’re desperate for feedback. Each page has a comments section that we monitor. We’re considering updating the wiki platform with some “like” capability, as well as tying it more closely to jazz.net’s forum.

We created a page where in-progress articles can be found and suggestions for potential troubleshooting topics can be made. So please visit the wiki and let us know what you think.