Google’s John Mueller says content needs to be in the HTML in order to be indexed quickly.
This is especially true for sites that frequently produce new and/or updated content.
Mueller chimed in with his advice during a discussion on Twitter about Google’s two-pass indexing system.
When Google crawls and indexes content, it does two passes. The first pass looks at the raw HTML only. Then, sometime later, it does a second pass on the fully rendered page, JavaScript included.
Mueller says there is “no fixed timeframe” between the first and second pass.
In some cases rendering happens quickly; in others it can take days or even weeks.
Yeah, there's no fixed timeframe — the rendering can happen fairly quickly in some cases, but usually it's on the order of days to a few weeks even. If your site produces new / updated content frequently & you want it indexed quickly, you need that content in the HTML.
— 🍌 John 🍌 (@JohnMu) September 13, 2018
This is a big deal for the SEO of web pages that rely heavily on client-side JavaScript for rendering.
Content that is rendered client-side is invisible to Googlebot during the first pass, meaning a JavaScript-heavy page will not be indexed in its entirety until the second pass.
As Mueller says, that could take weeks.
So a piece of content might not get fully indexed in Google Search until weeks after it has been published.
That’s obviously not ideal, which is why it’s critical for Googlebot to see the main content on the first pass.
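To make the difference concrete, compare a page that ships its content in the server’s HTML response with one that builds the same content client-side. (The markup and the /api/articles endpoint below are hypothetical, for illustration only.)

```html
<!-- Indexed on the first pass: the content ships in the HTML itself. -->
<article>
  <h1>New Product Announcement</h1>
  <p>The full article text is right here in the server response.</p>
</article>

<!-- Deferred to the second pass: the HTML is an empty shell, and the
     content only exists after client-side JavaScript has run. -->
<div id="app"></div>
<script>
  // Hypothetical endpoint; the article body arrives only after this fetch.
  fetch('/api/articles/123')
    .then((res) => res.json())
    .then((data) => {
      document.getElementById('app').innerHTML =
        '<h1>' + data.title + '</h1><p>' + data.body + '</p>';
    });
</script>
```

In the first snippet, the article text is present in the raw HTML the first pass reads. In the second, it exists only after the fetch completes in a rendered browser environment, which is exactly the work Googlebot defers.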
Veteran SEO Alan Bleiweiss added his expertise to the discussion, saying he recently audited a site that took a big hit after switching critical pages to all client-side JavaScript rendering.
And if it takes weeks to even crawl the whole site, all time bets are off with JavaScript. Just did a review audit on a site that went all client side JS rendering on critical pages. It's a mess and they took a big hit. Two more such audits coming up.
— Alan Bleiweiss (@AlanBleiweiss) September 14, 2018
Why Doesn’t Googlebot Crawl A Whole Page At Once?
The reason Googlebot doesn’t crawl and index a whole web page on the first pass comes down to resources.
Rendering JavaScript-powered web pages takes processing power and memory, and Googlebot does not have infinite resources.
When a page relies on JavaScript, rendering is deferred until Googlebot has resources available to process the client-side content.
So Googlebot may initially index a page using only its raw HTML, and complete the rendering some time later.
When the final render does arrive, Google performs a second wave of indexing on the client-side rendered content.
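The practical takeaway is to get critical content into the server’s HTML response so it is captured in the first wave. As a minimal server-side rendering sketch (the Node.js/Express setup and the getArticle() helper below are illustrative choices, not anything Google prescribes):

```js
// A minimal server-side rendering sketch using Node.js and Express.
// Express is an illustrative choice, not something the article specifies.
const express = require('express');
const app = express();

// Hypothetical data source; a real app would query a CMS or database.
async function getArticle(slug) {
  return { title: slug, body: 'Full article text, ready for the first-pass crawl.' };
}

app.get('/article/:slug', async (req, res) => {
  const article = await getArticle(req.params.slug);
  // The content is embedded directly in the HTML response, so the
  // HTML-only first pass sees everything without running JavaScript.
  res.send(`<!doctype html>
<html>
  <head><title>${article.title}</title></head>
  <body>
    <h1>${article.title}</h1>
    <article>${article.body}</article>
  </body>
</html>`);
});

app.listen(3000);
```

Any approach that puts complete HTML in the initial response, whether server-side rendering, pre-rendering, or dynamic rendering, addresses the same problem.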
To hear this topic discussed in much more detail, see this 40-minute talk from Google I/O about how to deliver search-friendly JavaScript-powered websites.