Using Screaming Frog

Configuration

Use this configuration file to get the required results: SEO Spider Config.seospiderconfig.

Manual configuration:

🚧 If you are using the configuration file above, manual configuration is not required.

After downloading Screaming Frog:

  1. Navigate to Configuration > Robots.txt > Settings > Ignore and select the settings below.

  2. Navigate to Configuration > Spider > Extraction > Structured Data and select the settings below.

  3. Navigate to Configuration > Spider > Rendering, select JavaScript, and enter a 10-second timeout.

Tests to Perform before Launch

Images

  • All images should be hosted by f.shgcdn

  • Check for wrong CDNs

    1. Scrape subdomain

    2. Click on Images Tab

    3. Type the following regex into the Search bar:

      /(?!.*(quality|afterpay|airtable|dropinblog|facebook|getelevar|ggpht|googleapis|googletagmanager|instagram|klarna|klaviyo|pinterest|stamped|tiktok|yotpo|ytimg))/
    4. Export results

    5. Fix as needed

  • Images missing height/width attributes

    1. Scrape subdomain

    2. Click on Custom Extraction Tab

    3. Filter by Images missing height/width attributes

    4. Export results

    5. Fix it =)
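
To sanity-check the wrong-CDN regex before pasting it into the Search bar, it can be run against a few image URLs locally. This is a minimal sketch, not part of Screaming Frog. It assumes the leading and trailing "/" in the pattern are treated as literal characters, so the pattern matches the "//" of a URL only when none of the listed words appears later in that URL. The function name and sample URLs are hypothetical.

```python
import re

# Pattern copied from the wrong-CDN step above. Assumption: both "/" are
# literal characters, so a match means "this URL contains none of the words".
WRONG_CDN = re.compile(
    r"/(?!.*(quality|afterpay|airtable|dropinblog|facebook|getelevar|ggpht"
    r"|googleapis|googletagmanager|instagram|klarna|klaviyo|pinterest"
    r"|stamped|tiktok|yotpo|ytimg))/"
)


def flag_wrong_cdn(urls):
    """Return the URLs that the pattern matches, i.e. the ones to review."""
    return [url for url in urls if WRONG_CDN.search(url)]


# Hypothetical image URLs, for illustration only.
sample = [
    "https://cdn.example.com/hero.png",
    "https://f.shgcdn.example/abc/-/quality/lighter/",
    "https://i.ytimg.com/vi/abc/0.jpg",
]
print(flag_wrong_cdn(sample))
```

Only the first sample URL is flagged, because the other two contain one of the listed words ("quality" and "ytimg").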

Structured Data

  1. Scrape subdomain

  2. Click Structured Data Tab

  3. Filter by Missing

  4. If any URLs appear, make note of them here.

  • Missing Structured Data: These are URLs that do not contain any structured data.

  • Validation Errors: These are URLs that contain validation errors. The errors can be either Schema.org, Google rich result features, or both depending on your configuration. Schema.org issues will always be classed as errors, rather than warnings. Google rich result feature validation will show errors for missing required properties or problems with the implementation of required and recommended properties. Google’s required properties must be included and be valid for content to be eligible for display as a rich result.

  • Validation Warnings: These are URLs that contain validation warnings for Google rich result features. These will always be for recommended properties rather than required properties. Recommended properties can be included to add more information about content, which could provide a better user experience, but they do not disqualify you from being eligible for rich snippets. There are no warnings for Schema.org validation issues; however, there is a warning for using the older data-vocabulary.org schema.

  • Parse Errors: These are URLs containing structured data that failed to parse correctly. This is often due to incorrect mark-up. If you’re using Google’s preferred format, JSON-LD, then the JSON-LD Playground is an excellent tool to help debug parsing errors.

  • Microdata URLs: These are URLs that contain structured data in microdata format.

  • JSON-LD URLs: These are URLs that contain structured data in JSON-LD format.
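
For URLs listed under Parse Errors, a quick local check can show whether the JSON-LD blocks are at least valid JSON. The sketch below uses only the Python standard library. It checks JSON syntax only, not Schema.org or Google rich result validity, and the class and function names are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_jsonld = False

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self.blocks.append("")

    def handle_data(self, data):
        # Script content arrives here as plain text.
        if self._in_jsonld:
            self.blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False


def jsonld_parse_errors(html):
    """Return one error message per JSON-LD block that is not valid JSON."""
    collector = JsonLdCollector()
    collector.feed(html)
    errors = []
    for raw in collector.blocks:
        try:
            json.loads(raw)
        except json.JSONDecodeError as exc:
            errors.append(str(exc))
    return errors


# Hypothetical page fragment with a trailing comma, a common mark-up mistake.
bad = '<script type="application/ld+json">{"@type": "Product",}</script>'
print(jsonld_parse_errors(bad))
```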

Identify 4xx and 5xx Links

  1. Scrape subdomain

  2. Click Bulk Export (top)

  3. Response Codes

  4. 4xx Inlinks

  5. 5xx Inlinks
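
Once the inlinks are exported, grouping them by the page that contains the broken link makes the fixes easier to hand out. The sketch below assumes the exported CSV has "Source", "Destination" and "Status Code" columns. Those names are an assumption, so check them against the header row of your own export. The function name is hypothetical.

```python
import csv
import io
from collections import defaultdict


def broken_links_by_source(csv_text):
    """Map each source page to the 4xx/5xx destinations it links to.

    Assumes "Source", "Destination" and "Status Code" columns; adjust the
    names if your export differs.
    """
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        status = int(row["Status Code"])
        if 400 <= status < 600:
            grouped[row["Source"]].append((row["Destination"], status))
    return dict(grouped)


# Hypothetical export content, for illustration only.
sample_csv = (
    "Source,Destination,Status Code\n"
    "https://example.com/,https://example.com/old,404\n"
    "https://example.com/,https://example.com/api,500\n"
    "https://example.com/about,https://example.com/,200\n"
)
for source, links in broken_links_by_source(sample_csv).items():
    print(source, links)
```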

Check that all pages have canonical links

  1. Scrape subdomain

  2. Click Canonicals Tab

  3. Click All filter and select Missing

  4. If any URLs appear, make note of them here.

Check for Headings (H1 and H2)

Run this check for both H1 and H2 tags. Every page must have one H1 tag.

  1. Scrape subdomain

  2. Click H1/H2 Tab

  3. Filter by Missing/Duplicate

  4. If any URLs appear, make note of them.
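
To spot-check a single page outside the crawler, the H1 rule can be sketched with the Python standard library. This reads "one H1 tag" as exactly one, which is an assumption, and it inspects raw HTML only, so headings injected by JavaScript are not counted. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class H1Counter(HTMLParser):
    """Count the opening <h1> tags in a document."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.count += 1


def h1_status(html):
    """Return 'missing', 'multiple' or 'ok' for the page's H1 usage."""
    counter = H1Counter()
    counter.feed(html)
    if counter.count == 0:
        return "missing"
    if counter.count > 1:
        return "multiple"
    return "ok"


# Hypothetical page fragments, for illustration only.
print(h1_status("<h1>Shop</h1><h2>New arrivals</h2>"))
print(h1_status("<h2>New arrivals</h2>"))
```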

Check for Title and Meta Description on all pages

  1. Scrape subdomain

  2. Click on the Page Titles tab then click on the All filter and select Missing

  3. Click on the Meta Description tab then click on the All filter and select Missing

  4. Report any URLs with missing titles or descriptions.

  5. Also note the length of each title and meta description.

Actions:

  • If a title or description is missing from a page that is not tied to a Template Page, fix it at the page level under Page Settings in the XM.

  • If a title or description is missing from a page that IS tied to a Template Page, make sure the Page Settings are configured correctly on the Template Page.

Page Title Length

  • Over 60 characters – Any pages which have page titles over 60 characters in length. Characters over this limit might be truncated in Google’s search results and carry less weight in scoring.

  • Below 30 characters – Any pages which have page titles under 30 characters in length. This isn’t necessarily an issue, but you have more room to target additional keywords or communicate your USPs.

Meta Description Length

  • Over 155 characters – Any pages which have meta descriptions over 155 characters in length. Characters over this limit might be truncated in Google’s search results.

  • Below 70 characters – Any pages which have meta descriptions below 70 characters in length. This isn’t strictly an issue, but an opportunity. There is additional room to communicate benefits, USPs or calls to action.
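
The length thresholds above can be written as a small classifier, which is handy when reviewing an exported list of titles and descriptions. This is a minimal sketch. The function names and labels are hypothetical, and it counts characters only.

```python
def length_issue(text, low, high):
    """Classify text as 'missing', 'short', 'long' or 'ok' by character count."""
    if not text or not text.strip():
        return "missing"
    length = len(text.strip())
    if length > high:
        return "long"
    if length < low:
        return "short"
    return "ok"


def audit_page(title, description):
    """Apply the thresholds above: titles 30 to 60, descriptions 70 to 155."""
    return {
        "title": length_issue(title, low=30, high=60),
        "description": length_issue(description, low=70, high=155),
    }


# Hypothetical page values, for illustration only.
print(audit_page("Home", ""))
```

A "short" result is an opportunity rather than an error, as noted above, so it is worth reporting separately from "missing" and "long".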

Check the CLS score

Why is it important? → https://web.dev/cls/

  1. Scrape subdomain

  2. Click PageSpeed Tab

  3. Report any URLs with a Cumulative Layout Shift value greater than 0.1.
