Looking ahead to InterConnect 2015

InterConnect 2015, in Las Vegas, NV, is a week away. The official conference starts on Monday, February 23, with pre-meetings and whatnot commencing the day before, on Sunday, February 22.

I’m looking forward to being there. One reason is that I’m sitting in a Boston suburb wearing extra layers, including one of my thickest wool sweaters, sheepskin slippers and a scarf. I rarely wear a scarf in the house, but it is that cold. Maybe you saw in the news that we’re having a little problem with snowfall this year. Colleagues comfort me by pointing out how unseasonably warm it is for them in the great states of Washington, Texas, Florida and Colorado, to name a few. They mention two-digit temperatures well above freezing, which have become a rarity ’round here. But no one wants to hear someone complain about seven-foot snow drifts, so let me get back on topic.


My main-tent session this year is DRA-2104: Performance, Monitoring, and Capacity Planning for the Rational CLM Solution. It’s on Monday, Feb. 23 at 3:30 pm in the Islander Ballroom B at Mandalay Bay. Because this is the first year the conference formerly known as Rational Innovate has become part of a larger event and moved from Orlando, FL, to Las Vegas, NV, I will get there early: I don’t know where anything is, and the helpful messages from the conference management folks suggest that it takes at least 30 minutes to get from site to site.

DRA-2104: Performance, Monitoring, and Capacity Planning for the Rational CLM Solution will talk about all the awesome performance improvements in the CLM 5.0.2, 5.0.1 and 5.0 releases. Some improvements made back in 4.x are so good that they bear mentioning again.

I’ll also talk about our CLM Sizing Strategy and how proper monitoring of an existing system can lead towards an understanding of capacity planning. There will be time for questions, and discussions about the local weather.

My presentation is part of the larger Rational Deployment for Administrators track. If you are attending InterConnect and can log in to the event portal, https://ibm.biz/BdEC4B will take you to the entire track schedule.

We made slides for ourselves to cross-promote each other’s sessions.


On Wednesday, I’ll be moderating DRA-1970A: Best Practices for Using IBM Installation Manager in the Enterprise which has a great line-up of Installation Manager practitioners who are dying to share their experiences.

It’s time for me to get outside and shovel a bit. I hope to see you in Vegas next week. Be sure to say “Hi!”

 

JTSMon 5.0.2 is here

JTSMon 5.0.2 is now available for downloading from the Deployment Wiki FAQ site (https://jazz.net/wiki/bin/view/Deployment/JTSMonFAQ).

Some facts to note with this new release:

  • The appearance in CLM 5.0.2 of “scenario” (client-side use-case) based web service reporting will cause problems with earlier releases of JTSMon. The new version reads the new format reports accurately, though it does not yet take advantage of scenario data.
  • A new 5.0.2 jazzdev baseline is included for comparison to user collected data.
  • RQM web service data is better broken down now for system component reporting.
  • Post-monitoring analysis can now be focused on a subset of the collected data.
  • Several additional defects have been fixed.

If you have questions or comments, please ask them at the jazz.net forum. We’re using the jazzmon tag there.

CLM 5.0.1 datasheets

Collaborative Lifecycle Management (CLM) 5.0.1 was announced and released this week. For Rational Team Concert (RTC) 5.0.1, substantial improvements were made to the Plan Loading capability, which are outlined in Plan Loading Performance Improvements in RTC 5.0.1.

Other 5.0.1 datasheets include:

 

For the CLM 5.0.1 release, performance testing validated that there were no regressions for RTC and RQM between the 5.0 and 5.0.1 releases; therefore, there was no need to create 5.0.1 datasheets for those products. For 5.0.1 performance information for those products, please consult the 5.0 datasheets.

Two new CLM 5.0 datasheets on the jazz.net Deployment wiki

A quick note about two recent performance datasheets and sizing guidelines published on the Jazz.net Deployment wiki:

Just posted is the Rational Engineering Lifecycle Manager (RELM) performance report 5.0 by Aanjan Hari. This article presents the results of the team’s performance testing for the RELM 5.0 release and deployment suggestions derived from the tests.


Also recently posted: Sizing and tuning guide for Rational DOORS Next Generation 5.0 by Balakrishna Kolla. This article offers guidance based upon the results of performance tests that the team ran on several hardware and software configurations.

 

Field Notes: Getting Started With Performance Testing, Step 734

Sometimes I’m fortunate to work with customers and teams who are interested in setting up performance testing, either of their own products or ours. If they have little experience, they may be unsure of how to begin. Usually these conversations start with basic project management questions:

  • What are the goals?
  • What is the timeframe?
  • What are your skills?
  • What equipment can you use?
  • What is the budget?
  • Etc.

Recently I was working with a group that was well on their way towards performance testing nirvana. (Such a thing exists. Some of us have seen it. It’s blue at the edges.) This group and I had a different type of conversation which centered around how to ensure that the great work they were doing would be relevant and meaningful to their customers.

I think we all have examples of this: we show a detailed graph to someone and they can only comment on the colors, completely missing the revolutionary, earth-shattering facts we’re trying to prove.

This team I spoke with is a very technical team. They have a very technical product. It’s not deployed casually, and folks who use it are well aware they’re making an investment in setting it up and getting results from it.

Right now they’re working towards a set of automated performance tests (Yay!) and a corresponding set of expected result ranges. The test is very specific: “A batch of X actions complete in Y time.” This is useful for the team because if the results vary (Y increases or decreases) they know something has changed. Right now this test is useful for them.
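To make that concrete, here is a minimal sketch (mine, not the team’s actual harness) of what such an automated check might look like. The batch size, the expected range, and perform_action() are all hypothetical stand-ins:

```python
import time

# A minimal, hypothetical sketch of an automated performance check with an
# expected result range. perform_action() stands in for whatever operation
# the team is actually measuring; the range comes from earlier baseline runs.

BATCH_SIZE = 500                        # "X actions"
EXPECTED_RANGE_SECONDS = (40.0, 60.0)   # "Y time", as a tolerated range

def perform_action(i):
    time.sleep(0.01)  # placeholder for the real operation under test

def run_batch():
    start = time.monotonic()
    for i in range(BATCH_SIZE):
        perform_action(i)
    return time.monotonic() - start

if __name__ == "__main__":
    elapsed = run_batch()
    low, high = EXPECTED_RANGE_SECONDS
    if low <= elapsed <= high:
        print(f"OK: batch of {BATCH_SIZE} actions took {elapsed:.1f}s")
    else:
        # Outside the range -- faster or slower -- means something changed
        # and is worth investigating before it reaches customers.
        print(f"WARNING: batch took {elapsed:.1f}s, expected {low}-{high}s")
```

The mechanics matter less than the idea: the expected range comes from baseline runs, so any drift outside it gets flagged automatically.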

I suggested that they consider trying to restate this test in the context of what it might mean to a customer. When might a customer have X of these actions? Would this be a customer just starting out, or a group well into the tool’s adoption? Is this a normal task, or might it be something done infrequently? Why should this test and the results give a customer confidence?

A car manufacturer may giggle with delight that a particular component rotates at 74,000 rpm, but the consumer might care more that the component contributes to overall reliability and that the car will start and stop when required.

(We know car makers have those glossy brochures with torque specs and so forth, but really, for how many of us is the first question, “What colors can I get?”)

Back to our technical example, “A batch of X actions complete in Y time” could be expressed as: “50 people’s builds execute in less than one hour,” or “The average work produced by a development team of 50 engineers can run overnight.” Etc.

I also suggested that now would be a good time to capture as much detail as possible about the circumstances of the tests they’re doing: the hardware, how much data was moved back and forth, average CPU utilization, maximum memory consumption.

There may be a point at which they have to extrapolate that test’s behavior to another environment (“Will this different hardware and these different deployment conditions still achieve X actions completing in Y time?”), and having all the extra data might mean not having to rerun the test. Or it might reveal another dimension of this particular successful test that was overlooked.
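Here is a small, hypothetical sketch of what capturing that context might look like. The field names and sample values are illustrative; the resource figures would come from whatever monitoring the team already runs alongside the test:

```python
import json
import os
import platform
import time

# Hypothetical sketch: append one JSON record per test run, pairing the result
# with the circumstances it was measured under, so the numbers can be
# interpreted (or extrapolated to other environments) later without rerunning.

def capture_run_record(batch_size, elapsed_seconds, avg_cpu_pct, max_mem_mb,
                       bytes_transferred):
    return {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "host": platform.node(),
        "os": platform.platform(),
        "cpu_count": os.cpu_count(),
        "batch_size": batch_size,
        "elapsed_seconds": elapsed_seconds,
        "avg_cpu_percent": avg_cpu_pct,   # from whatever monitoring is in place
        "max_memory_mb": max_mem_mb,
        "bytes_transferred": bytes_transferred,
    }

if __name__ == "__main__":
    record = capture_run_record(500, 52.3, 61.0, 4096, 1_200_000_000)
    with open("perf_run_log.jsonl", "a") as f:
        f.write(json.dumps(record) + "\n")
```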

As much as I suggested defining a set of measures for the test, I warned against forgetting how the customer might perceive the test.

Another car example: Cargo area is almost always indicated in those glossy brochures. While we can tell that 70 cubic feet is bigger than 68 cubic feet, the actual measure of cubic feet is not something most of us encounter frequently enough to know just how big 70 cubic feet really is, and just what fits in such a space. With the rear seats down my car purports to hold 70 cubic feet. However, when I’m loading up at the hardware store, all I care about is whether I can fit a few 2x4s inside.

(A 2×4 is a standard piece of lumber, usually measuring 1-1/2″ x 3-1/2″ x 8′. The problem is that it’s eight feet long and shouldn’t bend, but should somehow fit diagonally. I can’t tell this from a brochure measurement, only from real life. Of course my family is the one that brought an empty full-size cello case to see how it would fit, but that’s another story.)

As much as I think and dream about software performance, at the end of the day, any technical thoughts and results must be translated into true customer-comprehensible meaning. “A batch of X actions complete in Y time” is exciting for the performance professional, but such measures have to be translated into meaningful customer statements.

 

CLM 5.0 performance documents and datasheets

CLM 5.0 was announced June 2, 2014, at the IBM Innovate 2014 conference in Orlando, FL. Lots of good things made it into the release. You can get all the details here.

Over on the deployment wiki we have published 11 — yes eleven! — datasheets detailing performance aspects of the CLM 5.0 release. Find them all here.

First the product-specific reports: Collaborative Lifecycle Management performance report: RTC 5.0 release compares RTC 5.0 against the prior release 4.0.6 to verify that there are no performance regressions in 5.0.


RTC 5.0 provides new Web caching technology that can improve application performance and scalability. The new technology stores cacheable information locally on the client. Web caching performance improvements in Rational Team Concert 5.0 details the changes and demonstrates response time improvements ranging from 8% to 2x.
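As a generic illustration of why keeping cacheable information on the client helps (this shows standard HTTP validation caching, not RTC’s specific implementation, and the endpoint is hypothetical):

```python
import requests

# Generic illustration of client-side HTTP validation caching; this is standard
# HTTP (ETag / If-None-Match), not a description of RTC's internal mechanism,
# and the endpoint below is hypothetical.

URL = "https://clm.example.com/web/resource"

first = requests.get(URL)
etag = first.headers.get("ETag")

# On a repeat visit, send the validator. A 304 Not Modified reply means the
# client can reuse its locally cached copy instead of downloading the body
# again, which is where much of the response-time improvement comes from.
if etag:
    second = requests.get(URL, headers={"If-None-Match": etag})
    if second.status_code == 304:
        print("Cached copy still valid; no body transferred.")
    else:
        print(f"Resource changed; {len(second.content)} bytes downloaded.")
```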

The Collaborative Lifecycle Management performance report: RDNG 5.0 compares RDNG 5.0 against the prior release 4.0.6 to verify that there are no performance regressions. Additionally, several user actions, such as opening a project, have become faster. Note that the RDNG 5.0 architecture differs from prior releases in that the RDNG repository is now separate from the JTS server.

Similarly, Collaborative Lifecycle Management performance report: Rational Quality Manager 5.0 release compares RQM 5.0 to the prior release 4.0.6 to verify that there are no performance regressions in 5.0. The results show that in some cases, page response times are slower.

The CLM reliability report: CLM 5.0 release puts the CLM applications together and runs them under load for seven days to evaluate their performance.


Rational Team Concert for z/OS Performance Improvement of Source Code Data Collection and Query in 5.0 shows improvements in the source code data collection service and in source code data queries.

Since release 4.0.1 there have been gradual improvements in releases of RTC for z/OS. Rational Team Concert For z/OS Performance Comparison Between Releases details the improvements, which include enterprise build time improving 45% from the 4.0.1 release to 5.0.

Rational Team Concert Enterprise Extension zIIP Offload Performance in 5.0 documents how zIIP can offload application workload, saving time and expense.

Enterprise Extensions promotion improvements in Rational Team Concert version 5.0 on z/OS compares the ‘finalize build maps’ activity, with the ‘publish build map links’ option selected, between releases 4.0.6 and 5.0; an improvement of 70% was observed.


In the reporting space, Collaborative Lifecycle Management performance report: Export Transform Load (ETL) 5.0 release shows that the DM and Java ETLs for 5.0 have throughput similar to 4.0.6. The RM portion of the DM and Java ETLs shows an approximate 8% improvement due to the performance optimization of the RM publish service.

CLM 5.0 introduces a new reporting technology and Collaborative Lifecycle Management performance report: Data Collection Component Performance (DCC) 5.0 release compares the old technology with the new.

For the particular topology and dataset tested:

  1. DCC shows a significant improvement over the Java ETL, based on performance test data: the total duration dropped from 48 hours to 16 hours. The durations for specific applications also improved significantly: about 30% for QM loading, 20% for RM loading, 60% for CCM loading and 90% for the Star job.
  2. DCC also shows a significant improvement over the DM ETL, based on performance test data: the total duration dropped from 139 hours to 16 hours. The major improvements are in the RRC and RQM ETLs: RQM loading improved about 60% and RRC loading improved about 85%. (The quick arithmetic behind the hour figures is sketched after this list.)
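For readers who prefer the hour figures expressed the same way as the percentages, here is the quick arithmetic behind those two items. Only the before-and-after hours come from the report; the percentages and speedup factors are derived from them:

```python
# Quick arithmetic: express the quoted ETL durations as reductions and speedups.
comparisons = {
    "Java ETL vs. DCC": (48.0, 16.0),   # hours before, hours after
    "DM ETL vs. DCC":   (139.0, 16.0),
}

for name, (before, after) in comparisons.items():
    reduction = 100.0 * (before - after) / before
    speedup = before / after
    print(f"{name}: {before:.0f}h -> {after:.0f}h "
          f"({reduction:.0f}% shorter, {speedup:.1f}x faster)")
```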

I’m sure you’ll agree with me that a lot of good work went into CLM 5.0. If you have comments or questions on the reports, please use the comments box on the actual reports.

 

IBM Innovate 2013 is long gone. And so is IBM Innovate 2014

Some of my colleagues have been diligent and have already provided technical summaries of their IBM Innovate 2014 activities. I’m writing only a few days after the show has been packed up, and still think there’s plenty of time for that. This was the year I thought I’d make better use of social media. I had expected to be tweeting about the wonderful things I spied on the show floor, the cool stuff my colleagues were talking about and demonstrating, and maybe, just maybe, hype the sessions I took part in. It didn’t happen.

Innovate can be a whirlwind. We’re totally immersed in our topics, and we’ll enthusiastically chat with customers at every opportunity. Ask me a performance question and you’ve got my undivided attention. There’s really little time for self-marketing. Well, for me there wasn’t, probably because I hadn’t made it a habit before. Yes, I could have preloaded TweetDeck (way before that XSS blip). I jokingly suspect most of my colleagues who were more active in real time had a few pre-written remarks ready to go. Maybe they were on their phones and iPads running into people. I mean, literally, running into people. Personally, it’s not a habit for me to switch gears so quickly, trying to be pithy for the semi-anonymous crowd and then trying to be attentive to where I actually was.

I tried a few new strategies this year:

  •  I tried to listen better. Really, when someone was talking to me, I tried to pay better attention. Really I did.
  • I tried to take better care of my voice. This meant more tea + honey + lemon. The only problem is that I cut back on coffee which had a YAWN detrimental effect on my YAWN ability to stay awake as the YAWN day went on. At least I slept soundly on the flight home.


For many of us IBM Innovate is a family reunion. The first week in June might be the only time all year to see colleagues and customers we work with throughout the year, winter and summer, rain or shine. There are those I only see at Innovate even though we talk nearly weekly. And there’s always someone with whom I have collaborated for years but will meet for the first time at Innovate.

IBM Innovate usually coincides with my birthday. My family thinks I spend the week eating cake and lounging poolside, but given the long hours in the halls, all I get is a fluorescent tan. My pedometer surprised me by revealing that I climbed zero stairs (changes in elevation were achieved via elevators and escalators) and hardly walked at all. But for next year, all this is set to change. After many seasons in Orlando, Florida, Innovate moves west to Las Vegas, Nevada, February 22 to 26, 2015.

See you next year?

 

Software sizing isn’t easy

I’m going to quote pretty much the entirety of an introduction I wrote to an article just posted at the jazz.net Deployment wiki on CLM Sizing (https://jazz.net/wiki/bin/view/Deployment/CLMSizingStrategy):

Whether new users or seasoned experts, customers using IBM Jazz products all want the same thing: they want to use the Jazz products without worrying that their deployment implementation will slow them down, and they want it to keep up with them as they add users and grow. A frequent question we hear, whether it’s from a new administrator setting up Collaborative Lifecycle Management (CLM) for the first time or an experienced administrator tuning their Systems and Software Engineering (SSE) toolset, is “How many users will my environment support?”

Back when Rational Team Concert (RTC) was in its infancy we built a comprehensive performance test environment based on what we thought was a representative workload. It was in fact based upon the workload the RTC and Jazz teams themselves used to develop the product. We published what we learned in our first Sizing Guide. Later sizing guides include: Collaborative Lifecycle Management 2011 Sizing Guide and Collaborative Lifecycle Management 2012 Sizing Report (Standard Topology E1). As features were added and the product grew, we started to hear about what folks were doing in the field. The Jazz products, RTC especially, are so flexible that customers were using them with wonderfully different workloads than we had anticipated.

Consequently, we stepped back from proclaiming a one-size-fits-all approach and moved to presenting case studies and specific test reports about the user workload simulations and the loads we tested. We have published these reports on the jazz.net Deployment wiki at Performance datasheets. We have tried to make a distinction between performance reports and sizing guides: performance reports document a specific test with defined hardware, datashape and workload, whereas sizing guides suggest patterns or categories of hardware, datashape and workload. Sizing guides are not specific; they are general descriptions of topologies and estimates of the workloads they may support.

Throughout the many 4.0.x release cycles, we were still asked “How many users will my environment support?” Our reluctance to answer this apparently straightforward question frustrated customers new and old. Everyone thinks that as the Jazz experts we should know how to size our products. Finally, after some analysis and messing up countless whiteboards, we would like to present some sizing strategies and advice for the front-end applications in the Jazz platform: Rational Team Concert (RTC), Rational Requirements Composer (RRC)/Rational DOORS Next Generation (DNG) and Rational Quality Manager (RQM). These recommendations are based upon our product testing and analysis of customer deployments.


The article talks about how complex estimating a software sizing can be. Besides the obligatory disclaimer, there’s a pointer to the CLM and SSE recommended topologies and a discussion of basic definitions. There’s also a table listing many of the non-product (or non-functional) factors which can wreak havoc with the ideal performance of a software deployment.

Most importantly, the article provides some user sizing basics for Rational Team Concert (RTC), Rational Requirements Composer (RRC)/Rational DOORS Next Generation (DNG) and Rational Quality Manager (RQM). Eventually we’ll talk a bit more about the strategies and concepts needed to determine whether you may need two CCMs or multiple application servers in your environment.

For now, I hope we’re taking a good step towards answering the perennial question, “How many users will my environment support?”, and explaining why it’s so hard to answer that question accurately.

As always, comments and questions are appreciated.

CLM 4.0.6 performance reports

A fresh batch of 4.0.6 datasheets was posted to the deployment wiki coincident with the 4.0.6 release on February 28, 2014. Yes, that was a month or so ago, so I’m late in mentioning these timely performance reports, our largest batch yet. The team worked hard to get the data and compile the reports.

For those keen on migrating data from ClearCase to Rational Team Concert, the ClearCase Version Importer performance report: Rational Team Concert 4.0.6 release report shows basic data on how long an import may take.

The Collaborative Lifecycle Management performance report: RTC 4.0.6 release shows that 4.0.6 RTC performance is comparable to 4.0.5.


For the RTC for z/OS 4.0.6 release, the Rational Team Concert for z/OS Performance in 4.0.6 report shows that 4.0.6 performance is similar to 4.0.5. For 4.0.6, there were enhancements made to RTC for z/OS queries so that they use the Java JFS API: “In the scenario of the incremental build (request build after changing all the copybook files), the ‘Collecting buildable files’ activity in preprocessing time improved about 25%, and result in an about 5% improvement in total run time.”

The Collaborative Lifecycle Management performance report: RRC 4.0.6 release report shows that RRC 4.0.6 performance is comparable to 4.0.5.

Similarly, the Collaborative Lifecycle Management performance report: Rational Quality Manager 4.0.6 release shows that RQM 4.0.6 performance is comparable to 4.0.5.

The CLM Reliability report for the 4.0.6 release demonstrates the capability of a Standard Topology (E1) configuration under sustained 7-day load.

The Collaborative Lifecycle Management performance report: Export Transform Load (ETL) 4.0.6 release report demonstrates that there are no regressions in 4.0.6 ETL performance compared to 4.0.5. The 4.0.6 ETL functionality is more comprehensive than it was for 4.0.5, so there are situations where ETL times may have increased, although the data now indexed is more complete and accurate.

Comments, questions? Use the comments box on the actual reports themselves.

 

Field notes: It isn’t always about virtualization, except when it is

Talking recently with some customers, we discussed the fallacy of always trying to solve a new problem with the same method that solved the preceding problem. Some might recall the adage, “If you only have a hammer, every problem looks like a nail.”


I am often asked to help comprehend complex performance problems our customers encounter. I don’t always have the right answer, and usually by the time the problem gets to me, a lot of good folks who are trained in solving problems have spent a lot of time trying to sort things out. I can generally be counted upon for a big-picture perspective, some non-product ideas and a few other names of folks to ask once I’ve proved to be of no help.

A recent problem appeared to have no clear solution. The problem was easy to repeat and thus demonstrable. Logs didn’t look out of the ordinary. Servers didn’t appear to be under load. Yet transactions that should be fast, say well under 10 seconds, were taking on the order of minutes. Some hands-on testing had determined that slowness increased proportionally with the number of users attempting to do work (a single user executing a task took 30-60 seconds, two users at the same time took 60-90 seconds, three users took 2-3 minutes, etc.).

So I asked whether the environment used virtualized infrastructure, and if so, could we take a peek at the settings.

Yes, the environment was virtualized. No, they hadn’t looked into that yet. But yes, they would. It would take a day or two to reach the folks who could answer those questions and explain to them why we were asking.

But we never did get to ask them those questions. Their virtualization folks took a peek at the environment and discovered that the entire configuration of five servers was sharing the processing power customarily allocated to a single server. All five servers were sharing 4 GHz of processing power. They increased the resource pool to 60 GHz and the problem evaporated.
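Stepping out of the story for a moment: the root cause here was found by the customer’s virtualization team, but if you want an early, generic hint from inside a Linux guest that the hypervisor is starving a VM of CPU, steal time is worth watching. The sketch below is an illustration of that idea, not the diagnostic actually used in this case:

```python
import time

# A generic sketch (not the diagnostic actually used in this story): from
# inside a Linux guest, persistently high "steal" time is an early hint that
# the hypervisor is giving the VM less CPU than it wants. Field order on the
# aggregate "cpu" line of /proc/stat:
#   user nice system idle iowait irq softirq steal guest guest_nice

def read_cpu_counters():
    with open("/proc/stat") as f:
        fields = f.readline().split()
    return [int(value) for value in fields[1:]]

def steal_percent(interval_seconds=5.0):
    before = read_cpu_counters()
    time.sleep(interval_seconds)
    after = read_cpu_counters()
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    return (100.0 * deltas[7] / total) if total else 0.0

if __name__ == "__main__":
    print(f"CPU steal over the last 5 seconds: {steal_percent():.1f}%")
```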

I can’t take credit for solving that one. It was simply a matter of time before someone else would have stepped back and asked the same questions. However, I did write it up for the deployment wiki. And I got to mention it here.