Blog

The Next Step in Web Performance ROI Calculations: What-If Analysis v3

November 20, 2018 · by Simon Hearne

The Akamai 2018 October Release includes a major change to Akamai mPulse: a new methodology for calculating predicted conversion rates.

The What-If analysis feature has been a key part of mPulse for many years and has helped countless organizations predict future conversion rates and revenues as their website speed improves.

The What-If v2 dashboard in Akamai mPulse

What-If v2 allows you to modify projected revenue, conversion rate, and session length, as well as the average session load time. Since What-If v2 was released, we’ve been hard at work adding new business metrics and site-speed timers. However, What-If v2 doesn’t support the full range of metrics and timers, nor does it support creating predictions across dimensions such as page groups or device types. We saw this as an opportunity to develop a new version, What-If v3, from scratch: essentially, a What-If analysis for anything.

The Science

The ideal method to determine the impact of site speed on business metrics would be through A/B testing: we would measure user experience and business metrics (using a solution such as mPulse) in controlled conditions while slowing down the speed of web pages. Unfortunately, it has historically been hard to convince organizations to slow down their sites to see what happens, but there are many case studies showing that improving speed has multiple benefits. WPO Stats is an open-source collection of some of these case studies and can be a great reference when selling speed to a business. If only it were easy to improve site speed and measure the impact.

The web is variable. While we normally see this variability as a bad thing, the What-If algorithm relies on it to build a statistical model of site speed and its impact on user behavior. This data can give us a model with very high confidence, given enough time and user sessions. The new What-If analysis requires at least three days of data to achieve this confidence level, and we recommend using at least four weeks if possible.

The following eight sections will walk you through the new What-If algorithm.

1. Extracting performance distributions

The first step in the process is to summarize the real data that mPulse has collected in the given time range. We summarize the performance distributions as cumulative distribution functions (CDFs) and probability distribution functions (PDFs). We extract one CDF / PDF pair for each day in the time range, combine CDFs that share the same median load time (based on the selected page load timer), and extract the best, worst, and middle cases for further analysis.

Figure 1. Three extracted CDFs and PDFs (best, worst, middle) from a set of customer data
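
To make this concrete, here is a minimal sketch in Python of how daily distributions could be summarized and the best, worst, and middle cases selected. The bucket width, helper names, and synthetic data are illustrative assumptions, not mPulse internals.

```python
import numpy as np

def daily_cdf_pdf(load_times, buckets):
    """Summarize one day of load times as a PDF / CDF pair over fixed buckets."""
    counts, _ = np.histogram(load_times, bins=buckets)
    pdf = counts / counts.sum()          # share of sessions in each bucket
    cdf = np.cumsum(pdf)                 # cumulative share of sessions
    return pdf, cdf

def median_of_cdf(cdf, buckets):
    """Load time at which the CDF first reaches 0.5 (the median bucket edge)."""
    return buckets[np.searchsorted(cdf, 0.5) + 1]

buckets = np.arange(0, 30_000, 100)      # 100ms buckets up to 30s (assumed)
days = [np.random.lognormal(7.5, 0.4, 10_000) for _ in range(28)]  # fake data
cdfs = [daily_cdf_pdf(day, buckets)[1] for day in days]
order = np.argsort([median_of_cdf(cdf, buckets) for cdf in cdfs])
best, middle, worst = cdfs[order[0]], cdfs[order[len(order) // 2]], cdfs[order[-1]]
```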

2. Mapping the performance distribution to a goal speed

We then create a function that maps the middle CDF to any given median load time, using the best, worst, and middle CDFs we generated in step one. This function is used to generate a CDF whose median load time should equal the provided goal speed.

Figure 2. Creating a predicted distribution for a goal speed of 1,624ms with a result of 1,636ms (0.74% error)
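
One way to build such a mapping, continuing the sketch above, is to blend the middle CDF toward the best or the worst CDF and bisect on the blend weight until the blended median hits the goal. This is an illustrative approach; the actual mPulse mapping function may differ.

```python
def cdf_for_goal(cdf_middle, cdf_best, cdf_worst, goal_ms, buckets):
    """Blend the middle CDF toward the best or worst CDF until the blend's
    median load time matches the goal speed (illustrative bisection)."""
    faster = goal_ms < median_of_cdf(cdf_middle, buckets)
    target = cdf_best if faster else cdf_worst
    lo, hi = 0.0, 1.0
    for _ in range(40):                            # bisect on the blend weight
        w = (lo + hi) / 2
        blended = (1 - w) * cdf_middle + w * target
        # Push the weight toward whichever side brings the median closer.
        if (median_of_cdf(blended, buckets) > goal_ms) == faster:
            lo = w                                 # need more of the target CDF
        else:
            hi = w
    return blended
```

A convex combination of two CDFs is itself a valid CDF, which keeps the blended curve well-behaved before the fix-up described in the next step.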

3. Validating the new distribution

The new CDF is first adjusted to ensure that the numbers are within bounds and that the CDF remains monotonically increasing (i.e. each bucket is greater than or equal to the previous bucket). The CDF is also scaled so that its maximum value is exactly 1.0. Since these adjustments could move the median away from the goal speed, we then verify that the median is still within a small margin of the goal.

We then compare the generated CDF against a real CDF generated in step one using the Kolmogorov-Smirnov two-sample test. Once we know that our generated CDF produces the desired goal speed and is a good match for a CDF from the real data (KS-statistic < 1.0), we can use it to produce a new distribution and map your business metrics.

Figure 3. Calculating the KS-statistic (0.0354) to compare the predicted distribution with a 200ms improvement against the middle observed distribution
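
Continuing the sketch, the fix-up and comparison might look like the following. The 2% median tolerance is an assumed value, and the KS statistic is computed directly as the largest vertical gap between the two CDFs over the shared buckets.

```python
def validate_cdf(cdf, reference_cdf, goal_ms, buckets, tolerance=0.02):
    """Fix up a generated CDF and compare it against an observed one.
    The tolerance is an illustrative value, not the mPulse default."""
    fixed = np.clip(cdf, 0.0, 1.0)            # keep probabilities in bounds
    fixed = np.maximum.accumulate(fixed)      # enforce monotonic increase
    fixed = fixed / fixed[-1]                 # scale so the maximum is 1.0
    # Verify the median is still within a small margin of the goal speed.
    if abs(median_of_cdf(fixed, buckets) - goal_ms) > tolerance * goal_ms:
        raise ValueError("fix-up moved the median away from the goal speed")
    # Two-sample KS statistic: the largest vertical gap between the CDFs.
    ks_stat = np.max(np.abs(fixed - reference_cdf))
    return fixed, ks_stat
```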

4. Mapping business metrics to the new distribution

Now that we’ve generated and validated our CDF, we use it to generate the full session distribution for the next processing steps. We take the generated distribution and compare it to the middle distribution from step one. Iterating through each bucket in the distributions, we shuffle the sessions from the middle distribution to fit the generated one. This lets us predict the business outcome (such as conversion rate) for each bucket, based on real user sessions. The business outcomes are then summed or averaged over the whole distribution to produce the summary metrics presented in the report.
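
A simplified sketch of this step, assuming per-bucket session and conversion counts from the real data. Where mPulse shuffles individual sessions between buckets, this version reduces the idea to proportional reweighting:

```python
def predict_conversion_rate(pdf_generated, sessions_per_bucket,
                            conversions_per_bucket):
    """Reweight real per-bucket outcomes onto the generated distribution."""
    total_sessions = sessions_per_bucket.sum()
    # Observed conversion rate in each load-time bucket, from real sessions.
    rate = np.divide(conversions_per_bucket, sessions_per_bucket,
                     out=np.zeros(len(sessions_per_bucket)),
                     where=sessions_per_bucket > 0)
    # Sessions each bucket would hold under the generated distribution.
    predicted_sessions = pdf_generated * total_sessions
    # Apply the observed per-bucket rate, then summarize over all buckets.
    return (predicted_sessions * rate).sum() / total_sessions
```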

5. Estimating the ROI of improvement

The new What-If algorithm provides a suggested goal speed for each metric. These goals indicate the page speed that gives the best predicted return for the given metric. We calculate them by simulating progressively faster site speeds until the metric stops improving at an accelerating rate, i.e. until getting faster no longer improves the metric by more than the previous speed increment did. While the metric will (most probably) continue to improve as speed improves, the goal speed marks the point at which the returns start to diminish.

Figure 5. Determining the best goal speed by incremental impact
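
As an illustration of the diminishing-returns search (not the exact mPulse heuristic), the goal can be found by walking the simulated speeds from slowest to fastest and stopping at the first point where the incremental gain shrinks:

```python
import numpy as np

def suggest_goal_speed(speeds_ms, metric_values):
    """speeds_ms is ordered slowest to fastest, with metric_values the
    simulated metric at each speed. Returns the first speed at which the
    metric stops improving at an accelerating rate."""
    gains = np.diff(metric_values)        # improvement per speed step
    for i in range(1, len(gains)):
        if gains[i] <= gains[i - 1]:      # returns have started to diminish
            return speeds_ms[i]
    return speeds_ms[-1]                  # still accelerating: fastest speed
```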

6. Validating the business metrics

We’re telling you how much your business metrics will improve, so we want to be as accurate as possible. To do this, we generate metrics for a set of predicted goal speeds. The same metrics are also calculated from the real data used in step one. We then fit lines of the business metrics against load time and ensure that the y-intercept and gradient of the predicted line are within 5% of those of the observed line.

Figure 6. Comparing conversion rate trends to validate predictions against observed data points
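
A minimal sketch of that check, assuming a simple least-squares fit and the 5% margin described above:

```python
import numpy as np

def lines_agree(load_times, observed, predicted, margin=0.05):
    """Fit metric-vs-load-time lines for observed and predicted values and
    check that gradient and y-intercept agree within the margin."""
    grad_obs, icpt_obs = np.polyfit(load_times, observed, 1)
    grad_pred, icpt_pred = np.polyfit(load_times, predicted, 1)
    within = lambda a, b: abs(a - b) <= margin * abs(b)
    return within(grad_pred, grad_obs) and within(icpt_pred, icpt_obs)
```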

7. What-If across a dimension

The steps outlined above allow us to predict business metrics for a given set of data. In mPulse you can also add a “Group By” filter to What-If, which does something special: for the provided dimension (for example A/B Test, Device Type, Country, or Page Group) we run the analysis across up to ten groups of sessions, grouped by that dimension, as well as across the overall data set. This allows you to modify the session groups independently while generating What-If predictions across the full set of sessions.
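
Conceptually, this is the same pipeline run once per group and once overall. A sketch, where run_what_if stands in for steps one through six and session records are assumed to be dicts keyed by dimension name:

```python
from collections import Counter

def what_if_by_dimension(sessions, dimension, run_what_if, max_groups=10):
    """Run the What-If analysis for the overall data set and for up to ten
    of the most popular values of the chosen dimension (illustrative)."""
    counts = Counter(s[dimension] for s in sessions)
    results = {"overall": run_what_if(sessions)}
    for value, _ in counts.most_common(max_groups):
        results[value] = run_what_if([s for s in sessions if s[dimension] == value])
    return results
```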

8. Presenting the results

Now for the fun part! Steps one through six give us a set of goal speeds and goal metrics. We can use this data to predict business metrics for any (reasonable) goal speed using linear regression. The new UI presents this data up-front, so you can see exactly how the metrics vary with speed. We also show all of the available metrics in the UI, including conversion rate and revenue if those metrics were added to the widget or dashboard filters in mPulse.
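
The regression itself is straightforward; an illustrative version, reusing numpy’s least-squares fit over the goal speed and goal metric pairs:

```python
import numpy as np

def metric_at(goal_ms, goal_speeds, goal_metrics):
    """Predict a business metric for any (reasonable) goal speed by linear
    regression over the goal speed / metric pairs from the earlier steps."""
    gradient, intercept = np.polyfit(goal_speeds, goal_metrics, 1)
    return gradient * goal_ms + intercept
```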

Getting started

In order to use What-If v3, you’ll need an mPulse Enterprise account. You can access the What-If dashboard by selecting it from your mPulse Home page, or by adding the What-If v3.0 widget to a dashboard from the widget sidebar.

There are three sections in the What-If v3 user interface in mPulse:

1. Summary

The first screen shows a summary of your data from the given time period, including key metrics and the iconic sessions vs conversion distribution chart. The initial view is of The Reality, showing the key metrics as measured in mPulse. The second view, The Possibility, shows predicted metrics given a recommended goal speed.

2. Load Time & Metrics

The second screen is Load Time & Metrics. This view shows a set of charts which present the relationship between your business metrics and your chosen site speed timer. The charts share a common speed x-axis which starts at the lowest / fastest speed on the left.

A goal speed will initially be selected for you, based on the metric goal speed closest to the observed average speed. The load time slider will be set to this value, and a Goal Speed marker will echo the value on the charts.

There are two ways to modify the speed and metric values. Each metric has its own slider, which can be adjusted between the minimum and maximum values in the data. A metric value can also be entered manually by clicking on it on the right-hand side, which is useful for setting a precise goal.

Each metric has an information tooltip which provides a little extra information on how the metric value is calculated.

3. Dimension Breakdown

The final screen is optional and is only displayed if a Group By filter has been set on the dashboard. For each of the top ten dimension values (there may be fewer; e.g. Device Type results in three values: Mobile, Tablet & Desktop), a graph is rendered showing the actual and projected distributions for sessions and conversion rate (where configured). The dimension values are ordered by popularity: groups with more sessions render first.

Individual load time sliders can be used to adjust each group independently of the others. The summary metrics at the top of the screen will update in real-time to show the impact on the whole site.

Click the Reset button at the bottom of the screen to set all group speeds automatically from the goal speed set in the Load Time & Metrics screen; conversely, the site-wide goal speed is updated every time an individual group is updated. You can also download an Excel file of the results, broken down by the chosen dimension.

Conclusion

The new What-If dashboard has been re-written from scratch to support multiple new metrics and timers, and a new breakdown section has been added for improved flexibility and granularity.

The data used to generate the new What-If report is processed by our Data Science pipeline, which includes additional filtering for improved accuracy. The algorithms that make the predictions have also been re-written from the ground up. This means the results will not exactly match those of the previous What-If dashboard, because we’re continuously improving our data cleansing and calculations. This processing also takes a little more time than the v2 dashboard, so we send a status notification to keep you updated as we process the data and generate the results.

Simon Hearne is a principal software engineer at Akamai Technologies.