Performance Driven Development, footnotes and asides

Ever since I wrote up some thoughts on Performance Driven Development, I have been treating it like a real thing. It’s reassuring to see it climb towards the top in Google search results, but that may be the result of folks in my organization trying to figure out what I’m talking about.

I can’t lay claim to being the first to string together “performance,” “driven” and “development,” even if I did have a specific intent. Those three words have been grouped together before, sometimes even in the way that I intended. Most prior authors would seem to coalesce around the idea that PDD is about developers creating performant products. However, there are two subtle but distinct camps: 1) Those who think that PDD is achieved through late-phase tooling and testing (in my opinion, this is noble, sometimes meaningful work, but antiquated and inefficient); and 2) Those who envision redefining a whole organization’s development goals by positioning performance design at the center of what they do. Yes, forget features and function: No one will use it if it isn’t fast, and you can’t fix slowness unless you’ve already built in the ability to measure.

In an article published in 2009, Rajinder Gandotra nicely expresses the latter point, anticipating my definition of PDD: “The need is to move to the next paradigm of software engineering that can be called as ‘performance driven development.’

Performance focus through a performance engineering governance office is suggested which would help govern performance right from the business requirements gathering stage till go-live stage in production.” (SETLabs Briefings, Vol 7, No. 1, 2009, pp. 75-59.) Ironically, the easily accessible PDF is neither dated nor useful as a citation.

A performance engineering governance office might not imply actually producing instrumented code, but it does suggest end-to-end performance requirements. Given that I’m thinking of PDD as part of TDD and BDD, the idea of an office suggests that performance governance occurs in parallel or from the outside, not within and integral, as we might try to do with Agile.

An unsigned article (I gave up looking for an author) from May 7, 2012, relates PDD to BDD and TDD as I have done, and proposes iteratively exposing a product’s internal KPIs so that they are visible and the organization can make conscious decisions about what needs improvement or investigation. Sign me up. This goes back to the point that developers will tend to investigate outliers or strange patterns if they are concretely measured or made visible. There’s also a blog post out there (dated the same day) with a nice Max Headroom graphic that summarizes the anonymous author’s points, including creating and updating a dashboard that exposes KPIs. There is a lot of similarity to what I have been proposing in this post. A whole organization thinking about performance can be pretty powerful, and if a team creates KPI dashboards, that could be the way to ensure that software performance remains foremost.

(Speaking of outliers, what’s the chance that these two items showing the same date are not somehow related?)

Others who have invoked Performance Driven Development seem content to think about applying it in isolation, as a way to improve an individual’s process which in turn would lead (idealistically) to product improvements as a whole. For example, Michael J. A. Smith uses “Performance-Driven Development” as a title to describe a process of improving code through models and simplification. Honestly, this single-page PDF is more like a poster illustrating an idea than a specific design pattern for creating software where performance considerations are foremost. I admit that I may be missing its point entirely.

And there’s a tool out there, ngrinder, whose descriptive materials take the phrase “Performance Driven Development” out for a walk. I am not clear on what ngrinder actually does or how it works.

I can’t lay claim to Performance Driven Development as an original thought, but I maintain it is a very compelling way to tackle performance in software. Stay tuned for specific examples that have worked for our organization.


Performance Driven Development

Say what you will about Test Driven Development (TDD), but it was an eye-opener for some of us the first time we heard about it. Sure, in practice it’s not always going to happen despite the best of intentions, but it did reframe how some developers went about their work and how some organizations ensured a modest, or even slightly increased, amount of testing earlier in the lifecycle. At an abstract level, thinking about TDD improved quality outcomes regardless of the tactical steps.

Recently, trying to describe the characteristics and skills needed for a new development team, I joked that we needed Performance Driven Development (PDD). I got some light agreement (mostly of the smiling emoticon variety), and at least one “Aha, that’s an awesome idea!” for each “That’s totally stupid. You clearly don’t understand either development or performance.” Yeah, OK. Sometimes good ideas seem stupid at first.

(Of course sometimes stupid ideas seem stupid at first but smart later, but then again, I probably shouldn’t point that out.)

Well, as a thought experiment, Performance Driven Development is a chance to express several qualities I want in the new organization we’re trying to build. We want performance engineering integrated into the product creation experience. Not a throw-it-over-the-wall afterthought. We do not want a separate performance team, self-organizing, working on a not-current release, using a synthetic workload, and employing custom-made testing tools. Yeah, we’ve all done that. Some of us got really good at it. There’s a lot of work to do in that environment, lots of good problems to find and solve. But it’s frustrating after a while.

Let’s consider the end result in those cases. Performance really was an afterthought. Code was baked and delivered, and then a team would come along later to “test at scale” or do “load testing.” A necessary, scary bunch of tasks was delayed to the end, when it was perhaps too late to do anything about it. Kind of like the way testing was before TDD.

If we’re trying to improve performance, TDD becomes an instructive model. We want the performance specialists (no longer an independent team) to be integrated into the development team, we want them to work on the most current release, we want them to use real customer workloads, and we want the tools they use to be part of the development experience or easy (easy-ish) to use off-the-shelf tools.

Performance Driven Development means that at every step, performance characteristics are noted and then used to make qualitative decisions. This means that there is some light measurement at the unit level, at the component level, etc., and that measurements occur at boundaries where one service or component hands off to another, and that the entire end-to-end user experience is measured as well.

(Skeptics will note that there’s no mention of load, scale or reliability yet, so bear with me.)

The first goal is that the development team must build not just features but the capability to lightly measure. By measure I mean timestamps or durations spent in calls, and where appropriate and relevant, payload size (how much data passed through, and/or at what rate, etc.). By light, I mean really simple logging or measurement which occurs automatically, and which doesn’t require complex debug flags being set in an .ini file or a special recompilation. The goal is that from release to release, as features change, it will be possible to note whether execution has slowed down or sped up, and, relatedly, whether more or less data is being transferred.
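To make “light” concrete, here is a minimal sketch of the kind of always-on instrumentation I have in mind, written in Python with nothing but standard logging: one decorator, no debug flags, no special build. The function and log field names are hypothetical, not from any product.

    import functools
    import logging
    import time

    log = logging.getLogger("perf")  # ordinary logging; always on, no special build flags

    def measured(fn):
        """Log duration and approximate payload size for every call."""
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            size = len(result) if hasattr(result, "__len__") else -1
            # One compact line per call: easy to grep, diff, and trend across releases.
            log.info("%s duration_ms=%.2f payload_items=%d", fn.__name__, elapsed_ms, size)
            return result
        return wrapper

    @measured
    def fetch_work_items(query):
        # Stand-in for real feature code; returns something with a length.
        return [f"item-{i}" for i in range(10)]

The point is not this particular decorator; it is that the log line exists in every build, so release-over-release comparisons are a grep away.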

Unit tests would not just indicate { pass / fail }, but when run iteratively { min / mean / max / std } values for call time or response time. Of course we’re talking about atomic transactions, but then again, tasks that are lengthy should be measured as well. (Actually, lengthy tasks demand light measurement to understand what’s occurring underneath: the rate of a single-threaded transformation, the depth of a queue and how it changes over time, etc.)
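A sketch, in Python, of how a plain test harness could produce those numbers; the unit under test here is a throwaway stand-in:

    import statistics
    import time

    def timed_runs(fn, iterations=30):
        """Run fn repeatedly and summarize call times in milliseconds."""
        samples = []
        for _ in range(iterations):
            start = time.perf_counter()
            fn()                      # the unit test body; raises on failure
            samples.append((time.perf_counter() - start) * 1000.0)
        return {"min": min(samples),
                "mean": statistics.mean(samples),
                "max": max(samples),
                "std": statistics.stdev(samples)}

    def parse_config(text):          # stand-in for the real unit under test
        return dict(line.split("=", 1) for line in text.splitlines())

    if __name__ == "__main__":
        stats = timed_runs(lambda: parse_config("a=1\nb=2\nc=3"))
        print(stats)                 # compare against the previous release's numbers

Whether those numbers gate the build or simply land on a dashboard is a team decision; the point is that they exist for every release.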

I’ve also been talking of performance boundaries. These are the points where different systems interact, where perhaps data crosses an integration point, where a database hands off to an app server, where a mobile client transmits to a web server, or in complex systems where ownership may change. These boundaries need to be understood and documented, and crucially, light measurement must occur at those points (timestamps, payload, etc.).
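For an HTTP-shaped boundary, the measurement can be as small as a middleware wrapper. This is a hypothetical sketch (Python, WSGI); the boundary name and log format are invented:

    import logging
    import time

    log = logging.getLogger("perf.boundary")

    class BoundaryTimer:
        """WSGI middleware: timestamp and size every hand-off across one boundary."""
        def __init__(self, app, boundary_name):
            self.app = app
            self.boundary_name = boundary_name   # e.g. "web tier -> app server"

        def __call__(self, environ, start_response):
            start = time.perf_counter()
            in_bytes = int(environ.get("CONTENT_LENGTH") or 0)
            out_bytes = 0
            for chunk in self.app(environ, start_response):
                out_bytes += len(chunk)
                yield chunk
            log.info("%s %s duration_ms=%.1f in_bytes=%d out_bytes=%d",
                     self.boundary_name, environ.get("PATH_INFO", "/"),
                     (time.perf_counter() - start) * 1000.0, in_bytes, out_bytes)

Other boundaries (message queues, database drivers, batch hand-offs) get the moral equivalent: a timestamp and a payload size on each side of the hand-off.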

Finally, the end-to-end experience must be measured. This may align more typically with the majority of performance tools in the market (like LoadRunner and Rational Performance Tester) which record and play back HTTP/HTTPS traffic, simulating end-user behavior in a browser. There are many ways to measure the end-to-end user experience.

In the PDD context, the performance engineer is responsible for ensuring that light measurement exists at the unit, component and boundary levels. The performance engineer assists the team in building the tools and communal expertise to capture and collect such measurements. The performance engineer leads the charge to identify when units or components become slower or the amount of data that passes through them has increased. The performance engineer will also need to use tools like Selenium, JMeter and Locust to produce repeatable actions that can be measured, of either the single-user or multiple-user variety.
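As one example of “repeatable actions,” here is roughly what a minimal Locust script looks like; the paths are placeholders, not any real product’s API:

    from locust import HttpUser, task, between

    class DashboardUser(HttpUser):
        """A scripted user whose actions are repeatable and therefore measurable."""
        wait_time = between(1, 3)        # think time between actions

        @task
        def open_dashboard(self):
            self.client.get("/dashboard")            # placeholder path

        @task
        def run_query(self):
            self.client.get("/api/query?id=42", name="/api/query")

Run it with a single user to get a clean per-action baseline, or with hundreds to watch how those same numbers move under load.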

I know plenty of really good developers who, once you show them conclusive evidence that the routine they just finished polishing now adds 0.01 second to a common transaction, will go back to the drawing board to get that 0.01 second back and more. They’ll grumble at first, but they’ll be grateful afterwards. Building that type of light measurement can’t be done after the fact. It must be designed in from the beginning. And if I’m being idealistic about a super-developer discovering extra time in code, then we all realize that enhancing features intrinsically gets expensive. Adding more features generally makes things slower; that’s a given tradeoff.

And so what about load and reliability and scalability and all those other things? I’ve spent a lot of time working with complex and messy systems. Solving performance problems and building capacity planning models invariably involves defining and building synthetic workloads. I don’t know how many times I’ve been in situations where the tested synthetic workload is nothing at all like what occurs in real life. I have grown convinced that dedicating a team of smart people to devising a monstrous synthetic workload is misdirected. Yes, interesting problems will be discovered, but many of them will not be relevant to real life. Sure this work creates interesting data sheets and documents, but rarely does it yield useful tools for solving problems in the wild.

(You could argue that the performance engineer quickly learns the tools to solve these problems in the wild, which actually is a benefit of having a team doing this work. However, you’re not improving the root cause, you’re actually training a SWAT team to solve problems you’ve made difficult to diagnose to begin with.)

So then how do you simulate real workloads in a test context? I honestly think that’s not the problem to solve anymore. I know this terrifies some people:

“What do you mean you can’t tell me if it will support 100 or 1000 users?!”

I know that some executives will think that without these user numbers the product won’t sell, that customers won’t put money down for something that will not support some imagined load target. (I’ve muttered about how silly it is to measure performance in terms of users elsewhere.)

Given that most of our workloads involve the cloud, and that even when they don’t, increasing either virtual or physical hardware capacity is relatively inexpensive, the problem becomes:

“How does overall performance change as I go from 100 to 200 users, and what can monitoring tell me about what that change looks like so I can predict the growth from N to N+100 users?”
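One lightweight way to start answering that question is to let monitoring of the real workload supply a few (users, response time) points and extrapolate the trend. A deliberately crude sketch with invented numbers:

    # Invented measurements: (concurrent users, mean response time in ms),
    # taken from monitoring the real workload at a few points of organic growth.
    observed = [(100, 310.0), (150, 345.0), (200, 395.0)]

    def predict_next(points, step=100):
        """Linear extrapolation of response time for `step` more users.
        Crude on purpose: the goal is to watch the trend, not to model queueing."""
        (x1, y1), (x2, y2) = points[-2], points[-1]
        slope = (y2 - y1) / (x2 - x1)    # ms of response time per added user
        return x2 + step, y2 + slope * step

    users, ms = predict_next(observed)
    print(f"At ~{users} users, expect roughly {ms:.0f} ms if the trend holds")

The prediction is only as good as the monitoring behind it, which is exactly the point: the measurement has to be there first.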

Yes you want your car to go fast, but if you mean to keep that car a long time, you learn how it behaves (the sounds, the responsiveness) as it moves from second gear to third gear, or fourth gear to fifth gear.

The problem we need to solve in the performance domain isn’t about how to drive more load (although it can still be fun to knock things over), the problem to solve is:

“How do you build a complex system which reports meaningful measurement and monitoring information, and is also serviceable, so that as usage grows, light measurement provides a clear indication of how the system is behaving in real time under a real workload?”

This is what I’m trying to get at with the concept of Performance Driven Development. (I won’t lay a claim to creating the phrase. It’s been out in the wild for a while. I’m just putting forth some definitions and concepts which I think it should stand for.)

Supermarket Circulars in Vegas

The IBM Rational user conference formerly known as Innovate was folded into a new IBM conference this year called InterConnect. During the last week of February customers and IBMers descended upon Las Vegas. We weren’t there to talk about connecting hi-fi components, like turntables to amplifiers (those sorts of interconnects, remember?), but about software and hardware and business and all the sorts of things IBM is into.

This was your author’s first Las Vegas experience. Way back when, Las Vegas was a thing to study and consider in the abstract, the “architecture of disruption.” Needless to say, nothing quite prepared me for the thing itself. Navigating the conference hotels and weaving through the Vegas strip was like trying to play Twister on a constantly changing supermarket circular. (Yeah, I know that makes no sense, but it illustrates my point.)

This year I spoke on Performance, Monitoring, and Capacity Planning for the Rational CLM Solution. The discussion highlighted performance updates delivered to the CLM products across the last few releases, and then moved on to a topic which I’ve been talking about in various contexts for quite some time. Customers use products and they want to be happy. If they use the products, that generally means that there will be a gradually increasing population with a gradually increasing amount of assets. As usage increases, capacity can dwindle, and sometimes performance suffers as a result. To address capacity planning, you must understand current usage and behavior, which requires monitoring the system at hand. In essence, ensuring a top-quality performance experience requires monitoring.

There was a detour to talk about our CLM Sizing Strategy, which I’ve pointed to here before and which took a long time to document (primarily because it contains a well-researched list of caveats and details on why sizing can be so difficult), but I’m pleased with how it finally came together.

Discussing product improvements and fixes is important, as it shows our commitment to releasing excellent products and, when we don’t get it quite right the first time, our ability to react and improve. We’re also encouraging customers to tackle performance in a proactive way, by understanding existing system behavior and noting how a complex system changes over time.

See you next year? Well, more on that fairly soon. I admit there’s a certain sense of satisfaction to having “survived” Vegas’ absurd scale, disorienting patterns, and intentional obfuscation. Yeah, I’ll try it again if asked.

Looking ahead to InterConnect 2015

InterConnect 2015, in Las Vegas, NV, is a week away. The official conference starts on Monday, February 23, and pre-meetings and whatnot commence the day before on Sunday, February 22.

I’m looking forward to being there. One reason is that I’m sitting in a Boston suburb wearing extra layers, including one of my thickest wool sweaters, sheepskin slippers and a scarf. I rarely ever wear a scarf in the house, but it is so cold. Maybe you saw in the news that we’re having a little problem with snowfall this year. Colleagues comfort me by pointing out how unseasonably warm it is for them in the great states of Washington, Texas, Florida and Colorado, to name a few. They mention two-digit temperatures well above freezing, which have become a rarity ’round here. But no one wants to hear someone complain about seven-foot snow drifts, so let me get back on topic.


My main-tent session this year is DRA-2104: Performance, Monitoring, and Capacity Planning for the Rational CLM Solution. It’s on Monday, Feb. 23 at 3:30 pm in the Islander Ballroom B at Mandalay Bay. Because this is the first year the formerly-known-as-Rational-Innovate-conference has become part of a larger conference and moved from Orlando, FL, to Las Vegas, NV, I will get there early. Because I don’t know where anything is. And the helpful messages from the conference management folks suggest that it takes at least 30 minutes to get from site to site.

DRA-2104: Performance, Monitoring, and Capacity Planning for the Rational CLM Solution will talk about all the awesome performance improvements that are in the CLM 5.0.2, 5.0.1 and 5.x releases. Some improvements made back in 4.x are so good, they will bear mentioning again.

I’ll also talk about our CLM Sizing Strategy and how proper monitoring of an existing system can lead towards an understanding of capacity planning. There will be time for questions, and discussions about the local weather.

My presentation is part of the larger Rational Deployment for Administrators track. If you are attending InterConnect and can log in to the event portal, you can find the entire track schedule there.

We made slides for ourselves to cross-promote each other’s sessions.



On Wednesday, I’ll be moderating DRA-1970A: Best Practices for Using IBM Installation Manager in the Enterprise which has a great line-up of Installation Manager practitioners who are dying to share their experiences.

It’s time for me to get outside and shovel a bit. I hope to see you in Vegas next week. Be sure to say “Hi!”


JTSMon 5.0.2 is here

JTSMon 5.0.2 is now available for downloading from the Deployment Wiki FAQ site.

Some facts to note with this new release:

  • The appearance in CLM 5.0.2 of “scenario” (client-side use-case) based web service reporting will cause problems with earlier releases of JTSMon. The new version reads the new-format reports accurately, though it does not yet take advantage of scenario data.
  • A new 5.0.2 jazzdev baseline is included for comparison to user-collected data.
  • RQM web service data is now broken down better for system component reporting.
  • Post-monitoring analysis can now be focused on a subset of the collected data.
  • Several additional defects have been fixed.

If you have questions or comments, please ask them at the forum. We’re using the jazzmon tag there.

CLM 5.0.1 datasheets

Collaborative Lifecycle Management (CLM) 5.0.1 was announced and released this week. For Rational Team Concert (RTC) 5.0.1 there were substantial improvements made to the Plan Loading capability which are outlined in Plan Loading Performance Improvements in RTC 5.0.1.

Other 5.0.1 datasheets include:


For the CLM 5.0.1 release, performance testing validated that there were no regressions for RTC and RQM between the 5.0 and 5.0.1 releases; therefore, there was no need to create 5.0.1 datasheets for those products. For 5.0.1 performance information for those products, please consult the 5.0 datasheets.

Two new CLM 5.0 datasheets on the Deployment wiki

A quick note about two recent performance datasheets and sizing guidelines published on the Deployment wiki:

Just posted is the Rational Engineering Lifecycle Manager (RELM) performance report 5.0 by Aanjan Hari. This article presents the results of the team’s performance testing for the RELM 5.0 release and deployment suggestions derived from the tests.


Also recently posted: Sizing and tuning guide for Rational DOORS Next Generation 5.0 by Balakrishna Kolla. This article offers guidance based upon the results of performance tests that the team ran on several hardware and software configurations.


Field Notes: Getting Started With Performance Testing, Step 734

Sometimes I’m fortunate to work with customers and teams who are interested in setting up performance testing, either of their own products or ours. If they have little experience, they may be unsure of how to begin. Usually these conversations start with basic project management questions:

  • What are the goals?
  • What is the timeframe?
  • What are your skills?
  • What equipment can you use?
  • What is the budget?
  • Etc.

Recently I was working with a group that was well on their way towards performance testing nirvana. (Such a thing exists. Some of us have seen it. It’s blue at the edges.) This group and I had a different type of conversation which centered around how to ensure that the great work they were doing would be relevant and meaningful to their customers.

I think we all have examples of this: we show a detailed graph to someone and they can only comment on the colors, completely missing the revolutionary, earth-shattering facts we’re trying to prove.

This team I spoke with is a very technical team. They have a very technical product. It’s not deployed casually, and folks who use it are well aware they’re making an investment in setting it up and getting results from it.

Right now they’re working towards a set of automated performance tests (Yay!) and a corresponding set of expected result ranges. The test is very specific: “A batch of X actions complete in Y time.” This is useful for the team because if the results vary (Y increases or decreases), they know something has changed. Right now, that is useful for them.
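A sketch of what such a check might look like in an automated suite; the batch, the expected range and the history file are all invented for illustration:

    import json
    import pathlib
    import time

    EXPECTED_SECONDS = (0.2, 0.8)      # invented range standing in for "X actions in Y time"
    HISTORY = pathlib.Path("perf_history.json")

    def run_batch():
        """Stand-in for the real batch of X actions."""
        time.sleep(0.5)

    def check_batch_duration():
        start = time.perf_counter()
        run_batch()
        elapsed = time.perf_counter() - start
        history = json.loads(HISTORY.read_text()) if HISTORY.exists() else []
        history.append({"when": time.time(), "seconds": elapsed})
        HISTORY.write_text(json.dumps(history, indent=2))   # keep the trend, not just pass/fail
        low, high = EXPECTED_SECONDS
        assert low <= elapsed <= high, f"batch took {elapsed:.2f}s, expected {low}-{high}s"

    check_batch_duration()

Note that the check fires when the batch gets faster as well as slower; either way, something changed and someone should ask why.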

I suggested that they consider trying to restate this test in the context of what it might mean to a customer. When might a customer have X of these actions? Would this be a customer just starting out, or a group well into the tool’s adoption? Is this a normal task, or might it be something done infrequently? Why should this test and the results give a customer confidence?

A car manufacturer may giggle with delight that a particular component rotates at 74,000 rpm, but the consumer might care more that the component contributes to overall reliability and that the car will start and stop when required.

(We know car makers have those glossy brochures with torque specs and so forth, but really, for how many of us is the first question “What colors can I get?”)

Back to our technical example, “A batch of X actions complete in Y time” could be expressed as: “50 people’s builds execute in less than one hour,” or “The average work produced by a development team of 50 engineers can run overnight.” Etc.

I also suggested that now would be a good time to capture as much detail as possible about the circumstances of the tests they’re doing: the hardware, how much data was moved back and forth, average CPU percentage consumption, maximum memory consumption.

There may be a point at which they have to extrapolate that test’s behavior to another environment (“Will this different hardware and different deployment conditions still achieve X actions completing in Y time?”), and having all the extra data might mean not having to rerun a test. Or perhaps it might identify another dimension to this particular successful test that was overlooked.
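Capturing that context doesn’t have to be elaborate. Here is a hypothetical sketch of a result record, with invented values, that travels alongside the measurement itself:

    import json
    import platform
    import time

    def test_record(result_seconds, bytes_transferred, cpu_avg_pct, mem_peak_mb):
        """Bundle a result with enough context to compare or extrapolate later."""
        return {
            "captured_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
            "host": {
                "machine": platform.machine(),
                "processor": platform.processor(),
                "os": platform.platform(),
            },
            "workload": {"bytes_transferred": bytes_transferred},
            "resources": {"cpu_avg_pct": cpu_avg_pct, "mem_peak_mb": mem_peak_mb},
            "result_seconds": result_seconds,
        }

    # Invented numbers: a one-hour batch that moved ~2.5 GB at 42% average CPU.
    print(json.dumps(test_record(3600.0, 2_500_000_000, 42.0, 8192.0), indent=2))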

As much as I suggested defining a set of measures for the test, I warned against forgetting how the customer might perceive the test.

Another car example: cargo area is almost always indicated in those glossy brochures. While we can tell that 70 cubic feet is bigger than 68 cubic feet, the actual measure of cubic feet is not something most of us encounter frequently enough to know just how big 70 cubic feet really is, and just what fits in such a space. With the rear seats down, my car purports to hold 70 cubic feet. However, when I’m loading up at the hardware store, all I care about is whether I can fit a few 2x4s inside.

(A 2×4 is a standard piece of lumber, usually measuring 1-1/2″ x 3-1/2″ x 8′. The problem is that it’s eight feet long and shouldn’t bend, but should somehow fit diagonally. I can’t tell this from a brochure measurement. Only from real life. Of course my family is the one that brought an empty full-size cello case to see how it would fit, but that’s another story.)

As much as I think and dream about software performance, at the end of the day, any technical thoughts and results must be translated into true customer-comprehensible meaning. “A batch of X actions complete in Y time” is exciting for the performance professional, but such measures have to be translated into meaningful customer statements.


CLM 5.0 performance documents and datasheets

CLM 5.0 was announced June 2, 2014, at the IBM Innovate 2014 conference in Orlando, FL. Lots of good things made it into the release. You can get all the details here.

Over on the deployment wiki we have published 11 — yes eleven! — datasheets detailing performance aspects of the CLM 5.0 release. Find them all here.

First the product-specific reports: Collaborative Lifecycle Management performance report: RTC 5.0 release compares RTC 5.0 against the prior release 4.0.6 to verify that there are no performance regressions in 5.0.


RTC 5.0 provides new Web caching technology that can improve application performance and scalability. The new technology stores cacheable information locally on the client. Web caching performance improvements in Rational Team Concert 5.0 details the changes and demonstrates response time improvements ranging from 8% to 2x.

The Collaborative Lifecycle Management performance report: RDNG 5.0 compares RDNG 5.0 against the prior release 4.0.6 to verify that there are no performance regressions. Additionally, several user actions, such as opening a project, have become faster. Note that the RDNG 5.0 architecture is different from prior releases in that the RDNG repository is now separate from the JTS server.

Similarly, Collaborative Lifecycle Management performance report: Rational Quality Manager 5.0 release compares RQM 5.0 to the prior release 4.0.6 to verify that there are no performance regressions in 5.0. The results show that in some cases, page response times are slower.

The CLM reliability report: CLM 5.0 release puts the CLM applications together and runs them under load for seven days to evaluate their performance.


Rational Team Concert for z/OS Performance Improvement of Source Code Data Collection and Query in 5.0 shows improvements in source code data collection service and source code data query.

Since release 4.0.1 there have been gradual improvements in releases of RTC for z/OS. Rational Team Concert For z/OS Performance Comparison Between Releases details the improvements, which include enterprise build time improving 45% from the 4.0.1 release to 5.0.

Rational Team Concert Enterprise Extension zIIP Offload Performance in 5.0 documents how zIIP can offload application workload, saving time and expense.

Enterprise Extensions promotion improvements in Rational Team Concert version 5.0 on z/OS compares the ‘finalize build maps’ activity with the ‘publish build map links’ option selected between releases 4.0.6 and 5.0, where an improvement of 70% was observed.


In the reporting space, Collaborative Lifecycle Management performance report: Export Transform Load (ETL) 5.0 release shows that the DM and Java ETL for 5.0 have throughput similar to 4.0.6. The RM part of the DM and Java ETL has an approximate 8% improvement due to the performance optimization of the RM publish service.

CLM 5.0 introduces a new reporting technology and Collaborative Lifecycle Management performance report: Data Collection Component Performance (DCC) 5.0 release compares the old technology with the new.

For the particular topology and dataset tested:

  1. Compared with the Java ETL, DCC performance improved significantly: the overall duration dropped from 48 hours to 16 hours. The durations for specific applications also improved substantially: about 30% for QM loading, 20% for RM loading, 60% for CCM loading and 90% for the Star job.
  2. Compared with the DM ETL, the DCC duration also improved significantly, dropping from 139 hours to 16 hours. The major improvements are in the RRC and RQM ETLs: RQM loading improved about 60% and RRC loading about 85%.

I’m sure you’ll agree with me that a lot of good work went into CLM 5.0. If you have comments or questions on the reports, please use the comments box on the actual reports.


IBM Innovate 2013 is long gone. And so is IBM Innovate 2014

Some of my colleagues have been diligent and have already provided technical summaries of their IBM Innovate 2014 activities. I’m writing only a few days after the show has been packed up, and still think there’s plenty of time for that. This was the year I thought I’d make better use of social media. I had expected to be tweeting about the wonderful things I spied on the show floor, the cool stuff my colleagues were talking about and demonstrating, and maybe, just maybe, hype the sessions I took part in. It didn’t happen.

Innovate can be a whirlwind. We’re totally immersed in our topics, and we’ll enthusiastically chat with customers at every occasion. Ask me a performance question and you’ve got my undivided attention. There’s really little time for self-marketing. Well, for me there wasn’t, probably because I hadn’t made it a habit before. Yes, I could have preloaded TweetDeck (way before that XSS blip). I jokingly suspect most of my colleagues who were more active in real time had a few pre-written remarks ready to go. Maybe they were on their phones and iPads running into people. I mean, literally, running into people. Personally, it’s not a habit for me to switch gears so quickly, trying to be pithy for the semi-anonymous crowd, and then trying to be attentive to where I actually was.

I tried a few new strategies this year:

  • I tried to listen better. Really, when someone was talking to me, I tried to pay better attention. Really, I did.
  • I tried to take better care of my voice. This meant more tea + honey + lemon. The only problem is that I cut back on coffee which had a YAWN detrimental effect on my YAWN ability to stay awake as the YAWN day went on. At least I slept soundly on the flight home.


For many of us, IBM Innovate is a family reunion. The first week in June might be the only time all year to see colleagues and customers we work with throughout the year, winter and summer, rain or shine. There are those I only see at Innovate even though we talk nearly weekly. And there’s always someone with whom I have collaborated for years, but whom I will meet for the first time at Innovate.

IBM Innovate usually coincides with my birthday. My family thinks I spend the week eating cake and lounging poolside, but given the long hours in the halls, all I get is a fluorescent tan. My pedometer surprised me by revealing that I climbed zero stairs (changes in elevation were achieved via elevators and escalators) and hardly walked at all. But for next year, all this is set to change. After many seasons in Orlando, Florida, Innovate moves west to Las Vegas, Nevada, February 22 to 26, 2015.

See you next year?