Monday, June 23, 2014

Drastically improving 'First Byte' and 'Page Load' (for SEO)


Improving your 'first byte' speed, and your 'page load' time in general, can be crucial for SEO. Google favors pages that render faster for the user and, in some cases, will rank them higher than other pages in search results.

If you're not familiar with this, here are some articles on the subject:
http://googlewebmastercentral.blogspot.co.il/2010/04/using-site-speed-in-web-search-ranking.html
http://blog.kissmetrics.com/speed-is-a-killer/
http://www.quicksprout.com/2012/12/10/how-load-time-affects-google-rankings/

Improving your site's performance can be a daunting task. There are probably a few easy wins that will improve speed a little, but you will quickly realize that bigger gains take much longer. Some improvements can take days, weeks or even months of infrastructure changes.

But why should your SEO suffer from this? Why not be a step ahead of Google?
Your site doesn't really need to be fast for you to get good SEO scores; you just need Google to think your site is fast!

But how do you do that?
Google will crawl your site once every few days or weeks and cache the results for indexing. So let's beat Google at its own game.
Why don't we crawl our site first, cache the results (even to plain text files), and when Google comes around, serve it the static pages we cached without any server-side computation?

You can easily build a crawler using Selenium, PhantomJS, ZombieJS or pure Node.js. You don't even need to implement all the logic of a general-purpose crawler, since you already know your site's structure.
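
As a rough illustration, a minimal PhantomJS script for capturing the rendered HTML of a single page could look something like this (the script name and URL argument are just placeholders):

// render.js -- a minimal sketch; run as: phantomjs render.js <url>
// Prints the fully rendered HTML of the page to stdout.
var page = require('webpage').create();
var url  = require('system').args[1];

page.open(url, function (status) {
    if (status !== 'success') {
        console.log('Failed to load ' + url);
        phantom.exit(1);
    } else {
        // page.content holds the DOM after the page's JavaScript has run
        console.log(page.content);
        phantom.exit();
    }
});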

For a real-world example:
If your site is a big e-commerce site, then you know the structure of all your product pages. They probably look something like this:
http://www.YourCommerceSite.com/product/Product-Name/:Product-ID:

You can invoke this endpoint for each product, iterating over all the product IDs in your DB,
and save each response to a text file, e.g.:
Product_<Product-ID>.txt
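
A minimal Node.js sketch of that loop might look like this (the IDs, host name and 'Product-Name' segment are placeholders; in practice you'd pull the IDs from your DB, and pages that rely on client-side rendering would need a headless browser like PhantomJS instead of a plain HTTP fetch):

// cache-products.js -- a rough sketch: fetch each product page and cache it to disk.
var http = require('http');
var fs   = require('fs');

// Placeholder IDs; in a real setup these would come from your database.
var productIds = [101, 102, 103];

productIds.forEach(function (id) {
    var url = 'http://www.YourCommerceSite.com/product/Product-Name/' + id;

    http.get(url, function (res) {
        var html = '';
        res.on('data', function (chunk) { html += chunk; });
        res.on('end', function () {
            // One cached file per product, e.g. Product_101.txt
            fs.writeFile('Product_' + id + '.txt', html, function (err) {
                if (err) console.log('Failed to cache product ' + id, err);
            });
        });
    }).on('error', function (err) {
        console.log('Failed to fetch product ' + id, err);
    });
});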

When Googlebot comes around (which you can easily detect by its 'User-Agent' header) and requests a product page, quickly hand it the cached copy you stored on disk.
This copy might be stale by a few hours or days (depending on how often you re-crawl), but it will still be good enough for indexing (Google's indexing isn't realtime anyway) and it should be served super fast!
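
A rough sketch of that serving step, assuming an Express app and the Product_<Product-ID>.txt cache files from above (the route pattern and the simple 'Googlebot' substring check are just for illustration):

// A minimal sketch, assuming Express and the Product_<Product-ID>.txt cache files from above.
var express = require('express');
var fs      = require('fs');
var app     = express();

app.get('/product/:name/:id', function (req, res, next) {
    var userAgent = req.headers['user-agent'] || '';

    // Real users keep getting the normally rendered page.
    if (userAgent.indexOf('Googlebot') === -1) {
        return next();
    }

    // Googlebot gets the pre-rendered snapshot straight from disk.
    fs.readFile('Product_' + req.params.id + '.txt', 'utf8', function (err, html) {
        if (err) {
            return next(); // no cached copy yet -- fall back to normal rendering
        }
        res.send(html);
    });
});

app.listen(3000);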

6 comments:

  1. User agent sniffing is against Google's guidelines! You will shoot yourself in the foot by applying that method. But you do have the right idea. You can use those tools to spider your site to create a rendered version of it and then serve those pages. The equivalent of doing that would be to set up memcached or Varnish with long expire times.
