Everything you need to know about Googlebot’s changes to JavaScript SEO in 2019.

What is it? How important is it? 

You want to learn ‘how rocks form’, so you open your encyclopedia – but where do you look?  The index of course! Search engines “crawl” the internet and make copies of websites to create their own index. 

Before we look at JavaScript SEO, let’s take a look at the Google search engine and how it crawls. 

“Crawlers look at webpages and follow links on those pages, much like you would if you were browsing content on the web. They go from link to link and bring data about those webpages back to Google’s servers. We take note of key signals — from keywords to website freshness — and we keep track of it all in the Search index.” 

Google 

Now that we know how Google crawls and indexes, let’s take a look at:

The Basics of Optimizing Crawling  

1. Site & Link Structure

Internal: Crawlers are looking to visit every page of your website. To make their job easier, link your pages to each other in a web-like fashion so that every page is easy to discover.

External: Crawlers are looking for support that the pages on your website are of sufficient quality. This is best achieved with external links to your pages to aid discoverability. 

  • UX design, usability & navigation. Use a breadcrumb trail to leave links reflecting the structure of your website. According to Google’s documentation, XML sitemaps also work in your favour.
  • Avoid pages that are too similar in content to each other. Near-duplicate pages prevent Google from figuring out which page is most important.
  • Fresh content. Keep up with new fresh content and remove outdated and redundant content. (Content: Products, blogs, videos etc…).
  • Categorising & using tags
  • Avoid overusing keywords. 
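
The internal-linking advice above can be checked with a small script. This is a minimal sketch, assuming you already have a map of which page links to which; the `SITE_LINKS` data and the `click_depths` helper are hypothetical names made up for illustration. It walks the link graph from the home page the way a crawler would and reports pages that no internal link leads to.

```python
from collections import deque


def click_depths(links, start):
    """Breadth-first walk over an internal link graph.

    links maps each page to the list of pages it links to.
    Returns {page: number of clicks needed to reach it from start}.
    """
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site structure, for illustration only.
SITE_LINKS = {
    "/": ["/blog", "/products"],
    "/blog": ["/blog/javascript-seo", "/"],
    "/products": ["/products/widget"],
    "/blog/javascript-seo": ["/products/widget"],
    "/products/widget": [],
    "/old-landing-page": ["/"],  # no other page links here
}

depths = click_depths(SITE_LINKS, "/")
orphans = sorted(set(SITE_LINKS) - set(depths))
print(depths)
print("Not reachable from the home page:", orphans)
```

Pages that show up as unreachable, or that sit many clicks deep, are candidates for extra internal links.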

A 2019 Moz case study of 27,000 competitive keywords found a clear indication that the nature and quality of links on a site influence page ranking. Total links, Domain Authority and Page Authority, as measured by the MozBar Chrome extension, had correlation factors of 0.293, 0.327 and 0.321 respectively, where a correlation factor above 0.2 is taken to indicate a strong relationship to page ranking.

When it comes to fresh content, the following factors matter:

  • The initial date of the post
  • How recently it was updated
  • Changes made in the core content of the page matter the most
  • The rate of changes in the content
  • The addition of the pages & fresh links
  • The traffic to one’s site
  • Anchor text should remain fairly constant

2. Looped Redirects & Server Errors

The first thing a crawler looks at is the HTTP response header of your page. Here it will find a status code such as 404, 305 or 202. Googlebot uses this to determine the health of your page, so the goal is to keep your pages returning healthy status codes and to avoid redirect loops and server errors. You can view what each status code means at HTTPstatuses.com and track the health of a link using MozBar.
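
To make the idea of a looped redirect concrete, here is a rough sketch that follows redirects through a table of canned responses instead of a live site. The `RESPONSES` table and the `follow_redirects` helper are assumptions made up for this example; a real audit would fetch the status codes over HTTP.

```python
REDIRECT_CODES = (301, 302, 303, 307, 308)


def follow_redirects(responses, url, max_hops=10):
    """Follow redirects in a table of canned responses.

    responses maps URL -> (status_code, redirect_target_or_None).
    Unknown URLs are treated as 404. Returns (chain_of_urls, verdict).
    """
    chain = [url]
    seen = {url}
    for _ in range(max_hops):
        status, target = responses.get(url, (404, None))
        if status in REDIRECT_CODES and target:
            if target in seen:
                # We have been here before: the redirects form a loop.
                return chain + [target], "redirect loop"
            chain.append(target)
            seen.add(target)
            url = target
            continue
        if 200 <= status < 300:
            return chain, "ok"
        if 400 <= status < 500:
            return chain, "client error %d" % status
        if status >= 500:
            return chain, "server error %d" % status
        return chain, "unhandled status %d" % status
    return chain, "too many redirects"


# Hypothetical responses, for illustration only.
RESPONSES = {
    "/a": (301, "/b"),
    "/b": (302, "/a"),
    "/old": (301, "/new"),
    "/new": (200, None),
    "/broken": (500, None),
}

for start in ("/a", "/old", "/broken", "/missing"):
    print(start, follow_redirects(RESPONSES, start))
```

Here `/a` and `/b` redirect to each other, so a crawler could never reach real content through either of them.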

3. Scripts and Technology Factors

JavaScript is essential in the modern web
image credit: Google

Sites receive a crawl budget: the time a search engine allocates to crawling the site each day. The trust and authority of a site determine how much time is allocated. Best practice is to lead the crawler to your “most targeted” pages so they are crawled more often. Crawlers cannot submit forms, and technologies such as Flash or Ajax may hide content from them. On July 18th 2019, Google added a JavaScript SEO basics guide to its Search developer documentation. A well-designed website will:

Most crawling & rendering is mobile
image credit: Google

4. Blocking Web Crawler Access

Robots.txt is a file that tells crawlers which pages they may not access. It can be used, for example, to hide certain parts of a website so that crawlers are led to fresh content. On July 1st, 2019, Google announced that it is working towards making the robots.txt protocol an Internet standard.
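
As a quick illustration of how a disallow rule is applied, the sketch below uses `urllib.robotparser` from Python’s standard library. Note that this is Python’s own parser, not Google’s, so edge cases such as wildcards may be handled differently. The robots.txt contents, the example.com URLs and the `is_crawlable` helper are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, user_agent="Googlebot"):
    """True if the rules above allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


print(is_crawlable("https://example.com/blog/javascript-seo"))
print(is_crawlable("https://example.com/private/report"))
```

The first URL is allowed and the second is blocked, because its path starts with `/private/`.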

An early 2019 case study by tl:dr SEO found that Google treats disallow as a directive: Googlebot will not crawl a disallowed page. An important note to remember is that “disallowing” a URL via robots.txt does not remove it from the index, whereas “noindex” removes the URL from the index entirely, although the page can still be crawled.

In an announcement on the Google Webmaster blog, Google said that from September 1st 2019 it will stop supporting unsupported and unpublished rules in the robots exclusion protocol. Best practice is to update robots.txt files that use nofollow or crawl-delay rules. The most up-to-date validations of robots.txt can be found here.
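
A robots.txt file can be scanned for such rules with a few lines of code. This is a minimal sketch: the `find_unsupported_rules` helper and the sample file are hypothetical, and the rule list is the two rules named above plus noindex, which is included here as an assumption.

```python
# Rules to flag: nofollow and crawl-delay come from the text above,
# noindex is added as an assumption.
UNSUPPORTED = ("noindex", "nofollow", "crawl-delay")


def find_unsupported_rules(robots_txt):
    """Return (line_number, line) pairs for lines that use a flagged rule."""
    hits = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field in UNSUPPORTED:
            hits.append((number, line))
    return hits


# Hypothetical robots.txt, for illustration only.
SAMPLE = """\
User-agent: *
Disallow: /tmp/
Crawl-delay: 10
Noindex: /drafts/
"""

hits = find_unsupported_rules(SAMPLE)
for number, line in hits:
    print("line %d: %s" % (number, line))
```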

Now that we have the basics down pat, let’s look at JavaScript SEO & web applications.

In 2019, Google made a few changes to how Googlebot processes JavaScript. JavaScript is used as a client-side programming language by 95.2% of all websites. You may have used libraries such as jQuery, Backbone.js and Underscore.js in WordPress to create web applications. A website called thirdpartyweb shows the average cost, total impact and popularity of common third-party scripts, which helps you judge their effect on performance and JavaScript SEO.

What is Caffeine?

It’s not just something you find in your coffee. It’s also the name of Google’s “indexer”, which processes the pages Googlebot crawls. Caffeine is also in charge of rendering JavaScript web pages.

The Hulu Case Study & JavaScript SEO:

In an analysis by Onely, the popular media platform Hulu was found to have a major SEO problem. Hulu was a primarily JavaScript-based website, and it suffered major indexing problems because Googlebot failed to receive basic content from the site.

Hulu’s internal search algorithm was found to interfere with Google’s crawler, making it impossible for the crawler to see the same content a visitor would see. Another issue was that displaying content such as titles and descriptions required JavaScript to be enabled.

Proposed solutions include prerendering. Prerender.io is a service that renders your JavaScript in a browser and saves the static HTML, which is then served to crawlers. In the case of Hulu it was far more complex due to the architecture of their site. Prerendering does have costs, however: it needs computing power, it can suffer downtime, and it requires a knowledgeable dev team.
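
The serving step can be sketched as follows. This is not Prerender.io’s API. It only illustrates the idea of handing saved static HTML to crawlers and the normal JavaScript app to everyone else; the crawler token list, the `choose_response` helper and the sample HTML are assumptions made up for the example.

```python
# Illustrative, not exhaustive: substrings that identify crawler user agents.
CRAWLER_TOKENS = ("googlebot", "bingbot")


def is_crawler(user_agent):
    """Very rough user-agent check; real setups need a maintained list."""
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def choose_response(user_agent, path, snapshots, app_shell):
    """Return the saved static HTML for crawlers, the JS app shell otherwise."""
    if is_crawler(user_agent) and path in snapshots:
        return snapshots[path]
    return app_shell


# Hypothetical content, for illustration only.
SNAPSHOTS = {"/show/1": "<html><title>Show 1</title><h1>Show 1</h1></html>"}
APP_SHELL = "<html><div id='app'></div><script src='/app.js'></script></html>"

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/75.0 Safari/537.36"

print(choose_response(GOOGLEBOT_UA, "/show/1", SNAPSHOTS, APP_SHELL))
print(choose_response(BROWSER_UA, "/show/1", SNAPSHOTS, APP_SHELL))
```

The snapshot must contain the same content a visitor would see after JavaScript runs, otherwise crawlers and users end up with different pages.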

The best way to overcome such issues is to follow Google’s JavaScript SEO tips.

10 Tips Google recommends for JavaScript SEO:

  • Avoid user permission requests. Let’s say you have a microphone API on your page. Googlebot can’t provide a microphone, hence you should provide an alternative route. 
  • Avoid data persistence to serve content. There are two key points to take away from this. When Googlebot crawls your website:
    1. Local Storage and Session Storage data are cleared across page loads.
    2. HTTP Cookies are cleared across page loads.
    This means your website should not rely on stored user data to serve content.
  • Make your applications search-friendly. You can achieve this by organising the components into shadow DOM and light DOM. This maximises compatibility with clients that may not support web components or execute JavaScript.
  • Describe your page with unique titles and snippets.
    1. Make sure every page on your site has a title specified in the <title> tag.
    2. Keep titles descriptive and concise. Avoid generic titles like “Home” and “Profile”.
    3. Avoid keyword stuffing.
    4. Avoid repeated or boilerplate titles. In other words, do not reuse the same title and description across pages.
    5. Brand your titles. Give them some unique, catchy additional information.
    6. Use the noindex directive. Google can still find your page through external links even if you block it in robots.txt. Using ‘noindex’ will keep it out of the index.
  • Write compatible code.
  • Use meaningful HTTP status codes. As previously mentioned, using the correct HTTP status will ensure crawlability.
Advanced image embedding, responsive images and lazy loading (image credit: Google)
  • Fix images and lazy-loaded content. Images are taxing on bandwidth, so using the WebP image format and “lazy loading” will improve load times. To ensure Googlebot sees all the lazy-loaded content on your page, you can look at these guides:
    1. IntersectionObserver API and a polyfill
    Always make sure to test your implementation using Puppeteer.
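
The title guidelines in the list above lend themselves to an automated check. This is a rough sketch using Python’s standard-library HTML parser: the `audit_titles` helper and the sample pages are hypothetical, and the generic-title list contains only the two examples named above.

```python
from collections import Counter
from html.parser import HTMLParser

GENERIC_TITLES = {"home", "profile"}  # the two examples from the list above


class TitleParser(HTMLParser):
    """Collects the text inside the <title> tag of one page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_title(html):
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def audit_titles(pages):
    """pages maps URL -> HTML. Returns {url: [problems]} for pages with issues."""
    titles = {url: extract_title(html) for url, html in pages.items()}
    counts = Counter(t.lower() for t in titles.values() if t)
    report = {}
    for url, title in titles.items():
        problems = []
        if not title:
            problems.append("missing title")
        elif title.lower() in GENERIC_TITLES:
            problems.append("generic title")
        if title and counts[title.lower()] > 1:
            problems.append("duplicate title")
        if problems:
            report[url] = problems
    return report


# Hypothetical pages, for illustration only.
PAGES = {
    "/": "<html><head><title>Home</title></head><body></body></html>",
    "/rocks": "<html><head><title>How Rocks Form | Example Encyclopedia</title></head></html>",
    "/minerals": "<html><head><title>How Rocks Form | Example Encyclopedia</title></head></html>",
    "/about": "<html><head></head><body>About us</body></html>",
}

report = audit_titles(PAGES)
for url, problems in report.items():
    print(url, problems)
```

Because the check reads raw HTML, it only sees titles that exist before JavaScript runs. For a JavaScript-rendered site, it would need to be fed the rendered HTML instead.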

Site crawlability and JavaScript SEO are among the many factors that affect the SEO of your content.
