
Improve Performance and SEO by Tuning for Crawlers

February 25, 2020 · by Tim Vereecke

A little background on search engine crawlers:

You may know how search engine crawlers work, but you might not know how to help crawlers get the information they need faster - and, as a result, improve SEO results. 

In case you’re not familiar, search engine crawlers collect a list of URLs from your site, cataloging dead links and changes to information architecture. Crawling helps search engines build an index of old and new content on the internet so that your relevant pages can be discovered.

Improving performance on your site is the key to maximizing your crawl budget. Simply put (very simply), crawl budget is the number of pages on your site that a crawler can fetch in a given period of time. The faster each page is served, the more pages fit into that budget.

In this article, we’ll walk you through how you can use Akamai Bot Manager to improve performance and get your updates into search results faster.

Now let’s dive into how you can add intelligence at the Edge to serve HTML as fast as possible to the crawlers you care about. 

This post will cover how to:

  1. Use the Akamai Bot Manager to detect and classify bots at the Edge.

  2. Toggle features designed for real users but suboptimal for bots.

  3. Increase insights into your bot traffic.

Step 1: Select which bots you want to manage in Property Manager

To use Bot Manager inside Property Manager, follow these four steps:

  1. Create a custom variable: {{PMUSER_TRIGGERED_RULES}} and set it to “Hidden.”

  2. Assign the built-in variable: {{builtin.AK_FIREWALL_TRIGGERED_RULES}} to your custom variable.
     


  3. Use the custom variable to trigger rules based on the Rule ID corresponding to your bots of choice.
     

    Criteria matching rules 3991006 (Search engine bots) and 3991011 (E-Commerce Search engine bots).

     

  4. Select “Metadata Stage” in the dropdown and set it to “client-request” so that the rule works with both Bot Manager Standard and Bot Manager Premier.
     


     

Here are the rule IDs for some common bot categories:

  • 3991005 Site Monitoring and Web Development Bots

  • 3991006 Web Search Engine Bots

  • 3991008 SEO, Analytics or Marketing Bots

  • 3991009 Social Media or Blog Bots

  • 3991010 Online Advertising Bots

  • 3991011 E-Commerce Search Engine Bots

  • 3991018 News Aggregator Bots
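The matching in step 3 happens inside Property Manager, but the idea is easy to sketch in a few lines of Python. This is only an illustration: it assumes the triggered-rules variable holds rule IDs separated by some non-digit delimiter (the exact format is not described here), and the names `SEARCH_CRAWLER_RULES`, `triggered_rule_ids` and `is_search_crawler` are hypothetical.

```python
import re
from typing import Set

# Rule IDs taken from the list above. Grouping these two together is a
# hypothetical choice for this sketch.
SEARCH_CRAWLER_RULES = {"3991006", "3991011"}


def triggered_rule_ids(raw_value: str) -> Set[str]:
    """Extract every run of digits from the raw variable value.

    Assumption: rule IDs appear as digit runs separated by a
    non-digit delimiter (space, comma, colon, ...).
    """
    return set(re.findall(r"\d+", raw_value))


def is_search_crawler(raw_value: str) -> bool:
    """Return True when at least one search-crawler rule was triggered."""
    return bool(triggered_rule_ids(raw_value) & SEARCH_CRAWLER_RULES)
```

Matching on whole IDs rather than substrings avoids accidental hits if a longer rule ID happens to contain one of the target IDs.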

Step 2: Toggle features to improve crawler performance

Once we have identified the bots, we can use that classification to switch off features that are suboptimal for crawler performance.

All bot-related feature toggles are grouped in Property Manager.

Disable prefetching

Prefetching delays the time to first byte (TTFB) in favor of faster delivery of embedded resources. This is great for long-tail content where embedded resources are not at the Edge and need to be requested from the origin.

Although search engine crawlers generate a lot of traffic for long-tail content, they typically do not download the images straight away (e.g. Google has a separate crawler for images).

Disabling prefetching can have a positive impact on performance, reduce origin load and reduce your traffic.


Disable mPulse Edge injection

Adding mPulse (lite or enterprise) at the Edge gives plenty of additional visibility. Crawlers, however, do not send beacons to mPulse, so the injection can be removed for them. If they do start sending beacons in the future, disabling RUM for bot traffic keeps your data clean.

Disabling mPulse Edge Injection for the identified bots reduces impact on TTFB.


Disable Adaptive Acceleration

Adaptive Acceleration offers great benefits for rendering time in the browser. Bots crawling a website gain nothing from it, so it can be disabled for them.
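Taken together, the three toggles in this step amount to a simple decision: real users get every feature, identified crawlers get none of the three. The Python sketch below only illustrates that decision. It is not Property Manager configuration, and the names `EdgeFeatures` and `features_for` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeFeatures:
    """The three features discussed in this step; all enabled by default."""
    prefetching: bool = True
    mpulse_injection: bool = True
    adaptive_acceleration: bool = True


def features_for(is_crawler: bool) -> EdgeFeatures:
    """Return the feature set to apply to a request.

    Crawlers only need the HTML as fast as possible, so every
    browser-oriented feature is switched off for them.
    """
    if is_crawler:
        return EdgeFeatures(
            prefetching=False,
            mpulse_injection=False,
            adaptive_acceleration=False,
        )
    return EdgeFeatures()
```

Keeping the defaults enabled means that any request not positively identified as a crawler is treated as a real user.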


Improve bot traffic visibility

Bot Manager gives you plenty of traffic insights out of the box. Assigning bot traffic to a dedicated CPcode adds further insights that let you validate the quality of service from a technical SEO perspective, such as:

  1. Redirects seen by crawlers

  2. 404s seen by crawlers

  3. Errors seen by crawlers

  4. Cache hits seen by crawlers

Set up one or more dedicated CPcodes for your good bots.

Errors as seen by search engine crawlers.

With the above approach, you get alerting for free whenever one of the thresholds is exceeded. A separate CPcode also opens up the option to query the reporting API filtered to bot traffic only.

As an alternative to CPcodes, or on top of them, you can even get real-time data using DataStream by sending categorized traffic to specific Stream IDs.
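Whichever source you use, the four technical SEO checks listed earlier (redirects, 404s, errors and cache hits) come down to counting per-request records. Here is a minimal Python sketch of that aggregation. The record layout is an assumption made for illustration: the field names `status` and `cache_hit` are hypothetical and do not reflect the actual DataStream or reporting API schema, and counting only 5xx responses as errors is a choice made for this example.

```python
from typing import Dict, Iterable, Mapping


def summarize_crawler_traffic(
    records: Iterable[Mapping[str, object]],
) -> Dict[str, int]:
    """Count redirects, 404s, server errors and cache hits.

    Each record is assumed to carry an HTTP status code under
    "status" and a boolean under "cache_hit" (hypothetical names).
    """
    summary = {
        "requests": 0,
        "redirects": 0,
        "not_found": 0,
        "errors": 0,
        "cache_hits": 0,
    }
    for record in records:
        status = int(record["status"])
        summary["requests"] += 1
        if 300 <= status < 400:
            summary["redirects"] += 1
        elif status == 404:
            summary["not_found"] += 1
        elif status >= 500:
            summary["errors"] += 1
        if record.get("cache_hit"):
            summary["cache_hits"] += 1
    return summary
```

Feeding it only the records already classified as crawler traffic gives the crawler's view of your site rather than the blended view across all visitors.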

DataStream aggregated stream, pulled and displayed using ChartJS.

 

Step 3: Measuring impact

Google Search Console has a rather rudimentary graph showing the average download time per day:


Although it is not clearly visible in Google Search Console, taking the moving average of the daily values reveals a clear improvement in download time after the change was made in the second week of December.
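If you want to reproduce the smoothing on your own numbers, a simple moving average is enough. The Python sketch below assumes you have exported the daily average download times into a list; the function name `moving_average` and the window size are choices made for this example, not something Search Console provides.

```python
from typing import List, Sequence


def moving_average(values: Sequence[float], window: int) -> List[float]:
    """Return the simple moving average over `window` consecutive values.

    The result has len(values) - window + 1 entries, or none at all
    when there are fewer values than the window size.
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Usage with your own exported data (one value per day), for example:
# smoothed = moving_average(daily_download_times_ms, window=7)
```

A seven-day window is a reasonable starting point because it evens out weekday and weekend differences in crawl activity.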


Summary

Performance optimizations that are great for real users do not necessarily benefit bots.

Intelligently disabling these optimizations for bots can even make your site faster for crawlers. Side benefits are lower traffic and better insights.

  • Disable prefetching

  • Disable mPulse Edge Injection

  • Disable Adaptive Acceleration

  • Assign crawler traffic to a separate CPcode

  • Assign crawler traffic to a different DataStream

  • Set up alerting (e.g. increased errors for search engine crawlers)