Let's start with a key question:
What is a performance test?
Performance tests try to reduce the risks of downtime or outages on multi-user systems by conducting experiments that use load to reveal limitations and errors in the system. Testing usually refers to assessing the performance and capacity of systems that were expensive and time-consuming to build.
Very few software projects are delivered early (in my experience, none are), so there are usually significant time pressures. The findings of a performance test inform tactical and strategic decisions that have even more at stake: the wrong decision to go live with a website or application could damage the financial results, brand, or even viability of the company.
The stakes for performance testing are high.
In a short period of time, we need to gather information that helps stakeholders make decisions that will affect businesses. As performance testers, we have a responsibility to report reliable information about the systems we test.
All of the steps in global performance testing matter to delivering successful projects and making good decisions. These steps include (but aren’t limited to):
- Developing Scripts
- Executing Tests
Interpreting these results and reporting them properly is where the value of an experienced performance engineer is proven.
Data needs analysis to become information.
Most load testing tools have some graphing capability, but you should not mistake graphs for reports. You don't want to send a canned report without properly analyzing the results. Graphs are just a tool. The most effective way to use graphs is to aid in visualization to guide stakeholders in consuming actionable information.
As an aside, here’s an example of a graph showing how averages lie. Good visualizations help expose how data can be misleading.
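A small numeric sketch makes the same point. The two sets of response times below are invented for illustration, and the `percentile` helper is a hypothetical nearest-rank implementation rather than the output of any particular load testing tool.

```python
import math
import statistics


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Invented response times in seconds, for illustration only.
steady = [2.0] * 10              # every request takes 2 seconds
uneven = [1.0] * 8 + [6.0] * 2   # most requests are fast, two are very slow

for name, times in (("steady", steady), ("uneven", uneven)):
    print(
        name,
        "avg:", statistics.mean(times),
        "median:", statistics.median(times),
        "90th:", percentile(times, 90),
        "max:", max(times),
    )
```

Both sets average 2.0 seconds, yet in the second set one request in five takes 6 seconds. A report that shows only the average would describe the two systems as identical.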
The performance tester should form hypotheses, draw tentative conclusions, determine what information is needed to confirm or disprove those conclusions, and prepare key visualizations that give insight into system performance and bottlenecks and that support the narrative of the report.
Some of the skills necessary for doing this are understanding the following:
- Hard and Soft Resources
- Garbage Collection Algorithms
- Database Performance
- Message Bus Characteristics
- Other Components of Complex Systems
Understanding that a system slows down after a certain load is surpassed is valuable information. Understanding the limiting resource (i.e., the reason for the system slowing down) is actionable information. The ability to recognize these patterns can take years to acquire, and developing it is an ongoing and changing process.
Other skills are socio-political in nature.
A large part of reporting is knowing how to talk to stakeholders. These are the things that are important to think about:
- Who needs to know these results?
- What do they need to know?
- How do they want to be told?
- How can we form and share the narrative so that everyone on the team can make good decisions that will help us all succeed?
It is our job to guide stakeholders by 1) revealing information, 2) identifying actionable items, and 3) turning our findings into a solid plan.
How can you grow these skills?
The good news is that you don’t have to do this all by yourself. The subject matter experts you are working with (developers, operations, DBAs, help desk techs, business stakeholders, and your other teammates) all have information that can help you unlock the full value of a performance test.
This is a complex process and can be difficult to teach. To address this challenge, my former consulting partner and mentor Dan Downing came up with a six-step process called CAVIAR, which stands for:
1. Collecting: gathering results from tests to help weigh the validity of the results.
Are there errors? What kind, and when did they occur? What are the patterns? Can you get error logs from the application?
One important component of collecting is granularity. Measurements from every few seconds can help you spot trends and transient conditions. One tutorial attendee shared how he asked for access to monitor servers during a test and was instead sent resource data with five-minute granularity.
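To see why five-minute data is a poor substitute, here is a minimal sketch with invented numbers. It assumes CPU utilization was sampled every 5 seconds and then averages the same samples into coarser buckets; the `downsample` helper and the data are hypothetical.

```python
def downsample(samples, bucket_size):
    """Average consecutive samples in groups of bucket_size (the last group may be shorter)."""
    buckets = []
    for start in range(0, len(samples), bucket_size):
        group = samples[start:start + bucket_size]
        buckets.append(sum(group) / len(group))
    return buckets


# Invented CPU utilization (%) sampled every 5 seconds for 5 minutes (60 samples):
# a steady 30% with a 20-second burst at 100%.
cpu_5s = [30.0] * 28 + [100.0] * 4 + [30.0] * 28

cpu_30s = downsample(cpu_5s, 6)     # one average per 30 seconds
cpu_5min = downsample(cpu_5s, 60)   # one average per 5 minutes

print("peak at 5-second granularity:", max(cpu_5s))
print("peak at 30-second granularity:", round(max(cpu_30s), 1))
print("peak at 5-minute granularity:", round(max(cpu_5min), 1))
```

The 20-second saturation is obvious in the 5-second samples, muted at 30 seconds, and nearly invisible in the single five-minute average.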
2. Aggregating: summarizing measurements using various levels of granularity to provide tree and forest views, and comparing with consistent granularities.
Another component of proper reporting is describing results with meaningful statistics and views of the data, such as scatter plots, ranges, variance, and percentiles, which show how the data is distributed. Reporting is more accurate when multiple metrics are used to "triangulate" or confirm the hypotheses.
3. Visualizing: graphing key indicators to help understand what occurred during the test.
Here are some key comparisons to start with:
- Errors vs. Load (“results valid?”)
- Bandwidth throughput over Load (“system bottleneck?”)
- Response Time vs. Load (“how does system scale?”)
  - Business process end-to-end
  - Page level (min-avg-max-SD-90th percentile)
- System resources (“how’s the infrastructure capacity?”)
  - Server CPU vs. Load
  - JVM heap memory/GC
  - DB lock contention, I/O Latency
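Before graphing Response Time vs. Load, the raw samples need to be grouped by load level. The sketch below is one way to do that, assuming each sample is a hypothetical `(concurrent_users, response_seconds)` pair; the data is invented and the 90th percentile uses a simple nearest-rank rule.

```python
import math
import statistics
from collections import defaultdict


def summarize_by_load(samples):
    """Group (users, seconds) samples by load level and compute min-avg-max-SD-90th percentile."""
    by_load = defaultdict(list)
    for users, seconds in samples:
        by_load[users].append(seconds)

    summary = {}
    for users in sorted(by_load):
        times = sorted(by_load[users])
        rank = max(1, math.ceil(90 * len(times) / 100))  # nearest-rank 90th percentile
        summary[users] = {
            "min": times[0],
            "avg": statistics.mean(times),
            "max": times[-1],
            "sd": statistics.pstdev(times),
            "p90": times[rank - 1],
        }
    return summary


# Invented page response times (seconds) at three load levels, for illustration only.
raw = (
    [(50, t) for t in (0.8, 0.8, 0.9, 0.9, 1.0, 1.0, 1.1, 1.1, 1.2, 1.3)]
    + [(100, t) for t in (0.9, 0.9, 1.0, 1.0, 1.1, 1.1, 1.2, 1.2, 1.3, 1.4)]
    + [(200, t) for t in (1.5, 1.8, 2.0, 2.5, 3.0, 4.0, 5.0, 6.0, 8.0, 9.0)]
)

summary = summarize_by_load(raw)
for users, stats in summary.items():
    print(users, {name: round(value, 2) for name, value in stats.items()})
```

In this made-up data the page barely changes between 50 and 100 users, then both the 90th percentile and the spread jump at 200 users. That is the kind of shape a Response Time vs. Load graph should make visible.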
4. Interpreting: drawing conclusions from observations and hypotheses.
Interpreting data requires you to evaluate your data and test your hypotheses:
- Make objective, quantitative observations from graphs and data. What can you observe from this data?
- Compare your observations. Where are the consistencies in your observations? Where are the inconsistencies?
- Develop hypotheses based on your observations.
- Test your hypotheses.
- Turn validated hypotheses into conclusions: “From observations A and B, corroborated by C, I conclude that…”
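One lightweight way to test a hypothesis such as “response time degrades because the CPU saturates” is to check how strongly each candidate resource tracks response time across the test. The sketch below computes a plain Pearson correlation by hand; the per-interval measurements are invented, and a high correlation only corroborates a hypothesis, it does not prove the cause.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented measurements, one value per test interval, for illustration only.
response_s = [1.0, 1.1, 1.2, 1.5, 2.4, 5.0]   # 90th percentile response time
cpu_pct = [20, 35, 50, 65, 80, 95]            # application server CPU
db_lock_waits = [3, 4, 3, 5, 4, 3]            # database lock waits

print("CPU vs. response time:       ", round(pearson(cpu_pct, response_s), 2))
print("Lock waits vs. response time:", round(pearson(db_lock_waits, response_s), 2))
```

Here CPU tracks response time closely while lock waits do not, which supports the CPU hypothesis and weakens the database one. The next step is to confirm it with another observation, such as a retest with more CPU capacity.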
5. Assessing: checking whether objectives were met and deciding what action to take as a result.
Determine remediation options at the appropriate level (business, middleware, application, infrastructure, network) and retest.
At this stage, you should generate recommendations that are specific and actionable at a business or technical level. It never hurts to involve more people to make sure your findings add up.
The benefits, costs, and risks of your recommendations should be as transparent as possible. Remember that a tester illuminates and describes the situation, but the final outcome is up to the judgment of your stakeholders, not you. If you provide good information and well-supported recommendations, you’ve done your job.
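Checking results against objectives can be as simple as a table of criteria and measurements. The sketch below assumes every acceptance criterion is an upper limit; the criterion names, limits, and measured values are all hypothetical.

```python
def assess(measured, limits):
    """Return one row per criterion: (name, measured value, limit, met?).

    Assumes every limit is an upper bound, i.e. lower measurements are better.
    """
    return [
        (name, measured[name], limit, measured[name] <= limit)
        for name, limit in limits.items()
    ]


# Hypothetical acceptance criteria and test results, for illustration only.
limits = {"p90_response_s": 3.0, "error_rate_pct": 1.0, "cpu_peak_pct": 80.0}
measured = {"p90_response_s": 2.4, "error_rate_pct": 0.3, "cpu_peak_pct": 92.0}

results = assess(measured, limits)
for name, value, limit, met in results:
    print(f"{name}: {value} (limit {limit}) -> {'met' if met else 'NOT met'}")

unmet = [name for name, _, _, met in results if not met]
```

The unmet criteria are the starting point for remediation options and retesting. Deciding what to do about them still belongs to the stakeholders.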
6. Reporting: aggregating and presenting your recommendations, risks, costs, and limitations.
Note the “-ing”. The process of reporting is much more than dropping a massive report into an email and walking away.
A report, whether it is a written report, a presentation of results, an email summary, or an oral report, should be delivered in one of the following formats:
- A short elevator summary
- A three-paragraph email
- A narrative
These are the report formats that people prefer to consume, so it's worthwhile to spend time on getting these right. Doing this well means that you should write the report yourself rather than risking the errors that often stem from a third party interpreting your work.
Good reporting conveys recommendations in stakeholders’ terms. You should identify the audience(s) for the report, and write and talk in their language. What are the things you need to convey? What information is needed to support these things?
How to write a test report
A written report is still usually the key deliverable, even if most people won’t read it (and fewer will read the whole report).
One way to construct the written report might be like this:
1. Executive Summary (3 pages max, 2 is better)
- The primary audience is usually executive sponsors and the business; write the summary at the front of the report for them.
- Keep language simple, and avoid acronyms and jargon. If you need to use complicated terms, fully explain them in the report.
- Include only pertinent details. If it doesn't relate to the audience, you do not need to include it.
- Correlate recommendations to business objectives.
- Summarize objectives, approach, target load, and acceptance criteria.
- Cite factual observations.
- Draw conclusions based on observations.
- Make actionable recommendations.
2. Supporting Detail
- A Supporting Detail document should include all of the information necessary for repeating your tests.
- Include rich technical detail like observations and annotated graphs.
- Include feedback from technical teams, quoted accurately.
- Include test parameters (date/time executed, business processes, load ramp, think-times, hardware configuration, software versions/builds, etc.).
- Consider sections for errors, throughput, scalability, and capacity.
- In each section: annotated graphs, observations, conclusions, recommendations.
3. Associated Documents (if appropriate)
Associated documents should include a full set of graphs, workflow detail, scripts, and test assets, placed at the end of the report to document what was done.
When thinking about what to include, think about the following questions. Who is the audience? Why would they want to see 50 graphs and 20 tables? What will they be able to see?
Remember: Data + Analysis = Information
4. Present the Results
The best presentations usually consist of about 5-10 slides with visual aids and little text (graphs, bullets, etc.). Be as clear as possible in explaining your recommendations and in describing the risks and benefits of each solution.
Caveats and takeaways
This methodology isn’t appropriate for every context. Your project may be small, or you may have a charter to run a single test and report to only a technical audience. There are other reasons to decide to do things differently in your project, and that’s OK. Keep in mind that your expertise as a performance tester is what turns numbers into actionable information.
For more on diagnosing performance problems, check out the Solve Front-End Performance Problems in Real Time With mPulse blog post.
For more on improving the overall performance of your site, check out the Adaptive Acceleration blog post.