This post is a follow-up to a webinar we held on SEO and the page generator. The changes are to the core code, and therefore affect all SuiteCommerce sites, regardless of the version of SCA they are running.
So, while we're confident most shoppers will get the experience and data they need when they visit one of our sites, we're not sure that search engine crawlers will. If a search engine doesn't get the whole picture, this could negatively affect a site's search engine ranking, and that's not something anyone wants.
Now, that's all well and good — but what's changed?
Well, while the solution's conception is fine, we were aware of a number of limitations that were causing issues for a number of customers. Thus, we've decided to swap out the application that produces the HTML for a newer and better one.
Goodbye, V8 and Envjs; hello, Prerender.
Before we talk more about Prerender, let's take a look at two of the biggest issues that we were facing with the existing setup: poor performance and memory inefficiency.
Rendering Time (and Timeouts)
In comparison, the old system was slow, which caused two problems:
- Search engines may penalize you if a page takes what they consider too long to load
- Your page may take so long to load that the request times out and an incomplete page is served
This effectively put a limit on the complexity of your web pages. In some cases, sites had to compromise on the functionality they implemented on their site, just so that they would output decent HTML.
Out of Memory
In a similar vein, there were issues when a page would grow so large that the page generator could not build the whole page.
Again, this put a limit on your site, but this time with the size of your web pages. If you wanted to generate a lot of HTML, use a lot of scripts, handle a lot of data, etc, then you would frequently find yourself running out of memory.
Our solution to these woes is software called Prerender.
Prerender is a more advanced, more modern page generator. It is open source and super fast, and is the ideal replacement for our old offering.
A good thing about our implementation is that it is a direct swap of the technology behind the scenes. You don't need to upgrade your bundles or add in new code to take advantage of it. In fact, a few of our customers have been testing it and we're starting to get an idea of the benefits:
- Pages are rendered about twice as fast
- Time to the first byte is faster
- Memory use is more efficient
- More pages are covered
Our old method was to run a standard V8 JS engine with Envjs for DOM APIs. This really wasn't cutting it. In comparison, Prerender is a complete, proper solution for what we're trying to achieve with full support for ECMAScript5 and, for you massive nerds out there, the DOM4 API.
Testing and Debugging
Before we look at what we've improved, let's go over some basics for testing and debugging.
If you're already familiar with this, then you'll be pleased to know that you can still trigger the page generator's output by attaching ?seodebug=T to the end of your page's URL. We also strongly recommend that you attach a unique URL parameter at the end as well, as this will ensure that a freshly generated, uncached version is supplied.
NOTE: until this functionality is rolled out to all sites, you may have to attach seoprerender=T to the end of the URL as well. This forces the application to use the Prerender engine, rather than the old one.
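As a rough illustration of the parameters described above, here is a minimal sketch that builds a debug URL. The function name buildSeoDebugUrl and its options are hypothetical and only for illustration; the parameter names seodebug, seoprerender and preview are the ones mentioned in this post.

```javascript
// Hypothetical helper: builds a URL that requests the page generator's debug output.
// `cacheBuster` can be passed in explicitly, which makes the result predictable.
function buildSeoDebugUrl(pageUrl, options = {}) {
  const { forcePrerender = false, cacheBuster = Date.now() } = options;
  const url = new URL(pageUrl);

  // Ask the page generator for its debug output
  url.searchParams.set('seodebug', 'T');

  // Only needed while Prerender has not been rolled out to every site
  if (forcePrerender) {
    url.searchParams.set('seoprerender', 'T');
  }

  // A unique, non-reserved parameter so that an uncached page is requested
  url.searchParams.set('preview', String(cacheBuster));

  return url.toString();
}

console.log(buildSeoDebugUrl('https://www.example.com/product123', { forcePrerender: true }));
```

By default the unique value is the current timestamp, so each call should produce a URL that has not been requested before.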
As things haven't changed too much, you can refer to an article we wrote a while ago about coding and debugging with the SEO page generator (note, however, that some of the information around error messaging and debug output has changed). We also have three important documents for you to examine:
- SEO Page Generator Best Practices
- SEO Page Generator Performance Statistics
- Troubleshooting Your Website
Hiding or Showing Content To or From Search Engines
SC global variable:
The classic example is not showing links to the quick view modals on items in product lists. But there are smarter things to do, for advanced users; for example, if you would normally make a call to load an external script or service (eg for a livechat program) then you can (and should) exclude this from the page generator.
For testing the page generator output, you can wrap some code in a conditional that evaluates this and see what's returned.
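To make the idea concrete, here is a minimal sketch of wrapping code in such a conditional. The helper runInBrowserOnly is hypothetical, and the check itself is passed in as a function so that the sketch does not assume any particular global. On a real site you would pass whatever check the SC global variable exposes (often SC.isPageGenerator in SuiteCommerce Advanced source, but treat that name as an assumption and confirm it in your own code).

```javascript
// Hypothetical helper: runs `task` only when the page generator is NOT rendering the page.
// `isPageGenerator` is a function that returns true inside the page generator.
function runInBrowserOnly(isPageGenerator, task) {
  if (isPageGenerator()) {
    return false; // skipped: this content is hidden from the page generator
  }
  task();
  return true;
}

// Stand-in usage: the "task" pretends to load an external live chat script
const loaded = [];
runInBrowserOnly(() => false, () => loaded.push('livechat')); // normal browser: runs
runInBrowserOnly(() => true, () => loaded.push('livechat'));  // page generator: skipped
console.log(loaded); // the task ran once, for the browser case only
```

The same shape works in reverse if you want to log something only when the page generator is running.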
One of the new changes is improved output from the page generator when you do log data. Let's take a look at some of the changes.
After attaching the URL parameter and a random string, take a look at this log output:
[02:48:47.622] [ +2 ms ] Requested URL with SEO generator relevant params: https://www.example.com/?seodebug=T
[02:48:47.622] [ +0 ms ] Source URL: https://www.example.com/DEMO/shopping.ssp?seodebug=T
[02:48:47.622] [ +0 ms ] Rewrite Path: /s.nl?sitepath=/DEMO/shopping.ssp
[02:48:47.773] [ +151 ms ] Generated the frame page for the requested URL
[02:48:49.701] [ +1928 ms ] Got a response from Prerender
[02:48:49.702] [ +1 ms ] Memory usage: 17.671875MB
[02:48:49.703] [ +1 ms ] CPU usage: 0.290000s
[02:48:49.703] [ +0 ms ] Sub request total: 1.201000s
[02:48:49.703] [ +0 ms ] Details of 8 sub requests:
GET https://www.example.com/c.12345/DEMO/shopping.environment.ssp?lang=en_US&cur=USD&X-SC-Touchpoint=shopping&t=1518557601992 [status 200]
Requested at 2018-02-16T10:48:48.086Z and responded by 2018-02-16T10:48:48.173Z (which took 87ms)
GET https://www.example.com/c.12345/DEMO/languages/shopping_en_US.js?t=1518557601992 [status 200]
Requested at 2018-02-16T10:48:48.086Z and responded by 2018-02-16T10:48:48.287Z (which took 201ms)
Requested at 2018-02-16T10:48:48.087Z and responded by 2018-02-16T10:48:48.288Z (which took 201ms)
GET https://www.example.com/cms/2/assets/js/postframe.js [status 200]
Requested at 2018-02-16T10:48:48.087Z and responded by 2018-02-16T10:48:48.285Z (which took 198ms)
GET https://www.example.com/cms/2/cms.js [status 200]
Requested at 2018-02-16T10:48:48.088Z and responded by 2018-02-16T10:48:48.286Z (which took 198ms)
GET https://www.example.com/api/cms/session/domain [status 200]
Requested at 2018-02-16T10:48:48.472Z and responded by 2018-02-16T10:48:48.686Z (which took 214ms)
GET https://www.example.com/api/cms/versions?site_id=2&c=12345 [status 200]
Requested at 2018-02-16T10:48:48.688Z and responded by 2018-02-16T10:48:48.803Z (which took 115ms)
GET https://www.example.com/api/cms/pages/contents?c=12345&n=2&page_type=home-page&path=%2F&version_id=312&site_id=2&c=12345 [status 200]
Requested at 2018-02-16T10:48:48.913Z and responded by 2018-02-16T10:48:49.071Z (which took 158ms)
[02:48:49.704] [ +1 ms ] *** All requested URLs with headers (begin)
[02:48:49.704] [ +0 ms ] Header count: 9
[02:48:49.704] [ +0 ms ]
Let's take a look at some of these sections.
Timings and Timeout
Firstly, this line:
[02:48:49.701] [ +1928 ms ] Got a response from Prerender
The timing in square brackets indicates how long it took for Prerender to render the page (ie, 1.928 seconds). This can be a useful diagnostic. For example, if your page is timing out, then you might see this error:
[09:17:25.701] [ +22065 ms ] SeoGenerator:prerender:Error in SEO Page Generation. The SEO page rendered for the URL https://www.example.com/?preview=22847&seodebug=T&seonojscache=T&seoprerender=T can be incomplete.
Here you can see that the page took over 22 seconds to render, and thus timed out.
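If you save the debug output, you could scan it for slow steps rather than reading it by eye. The following is a minimal sketch, assuming the log lines keep the format shown above; parseLogLine and findSlowSteps are hypothetical names and not part of any NetSuite tooling.

```javascript
// Matches lines such as "[02:48:49.701] [ +1928 ms ] Got a response from Prerender"
const LOG_LINE = /^\[(\d{2}:\d{2}:\d{2}\.\d{3})\] \[ \+(\d+) ms \] ?(.*)$/;

// Returns { time, elapsedMs, message } for a timed log line, or null for any other line
function parseLogLine(line) {
  const match = LOG_LINE.exec(line);
  if (!match) {
    return null;
  }
  return { time: match[1], elapsedMs: Number(match[2]), message: match[3] };
}

// Returns the timed entries whose step took at least `thresholdMs` milliseconds
function findSlowSteps(logText, thresholdMs) {
  return logText
    .split(/\r?\n/)
    .map(parseLogLine)
    .filter((entry) => entry !== null && entry.elapsedMs >= thresholdMs);
}

const sampleLog = [
  '[02:48:47.773] [ +151 ms ] Generated the frame page for the requested URL',
  '[02:48:49.701] [ +1928 ms ] Got a response from Prerender',
  '[02:48:49.702] [ +1 ms ] Memory usage: 17.671875MB'
].join('\n');

console.log(findSlowSteps(sampleLog, 1000));
```

With a threshold of 1000 ms, only the "Got a response from Prerender" entry from the sample is reported.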
One thing to note about our implementation of Prerender (compared to V8) is that the timeout limit has been lowered. Previously it was 30 seconds and there are a few reasons for dropping it. Part of the reason for this is that we expect it to generate pages much faster, so if your page is taking about 20 seconds to render, then you've got serious problems with your page and you need to re-evaluate it.
There's another reason, which relates to HTTPS. In general, HTTPS connection attempts to NetSuite will time out after waiting for 30 seconds (regardless of whether it is by a user or the page generator), so we need to make sure that a response is given within that time (even if it's a bad one). Specific to the page generator, both HTTP and HTTPS connection attempts will time out after 20 seconds.
Now look at this line (and the indented lines below it):
[02:48:49.703] [ +0 ms ] Details of 8 sub requests:
If one of the sub requests fails, you'll get something like this:
GET https://www.example.com/shopping.environment.ssp?lang=en_US&t=1516304321167 [status 502]
Requested at 2018-02-08T14:03:41.693Z and responded by 2018-02-08T14:03:41.784Z (which took 91ms)
Note that the status is a 502, which is the 'bad gateway' error (but it could easily be a 404, for example).
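Failed sub requests can also be picked out of saved debug output automatically. This is a minimal sketch, assuming the sub request lines keep the "GET <url> [status <code>]" format shown above; findFailedSubRequests is a hypothetical helper name.

```javascript
// Matches lines such as "GET https://www.example.com/cms/2/cms.js [status 200]"
const SUB_REQUEST = /^\s*(GET)\s+(\S+)\s+\[status (\d{3})\]\s*$/;

// Returns every sub request whose HTTP status is outside the 200-399 range
function findFailedSubRequests(logText) {
  const failures = [];
  for (const line of logText.split(/\r?\n/)) {
    const match = SUB_REQUEST.exec(line);
    if (!match) {
      continue; // not a sub request line
    }
    const status = Number(match[3]);
    if (status < 200 || status >= 400) {
      failures.push({ method: match[1], url: match[2], status });
    }
  }
  return failures;
}

const sampleSubRequests = [
  'GET https://www.example.com/cms/2/cms.js [status 200]',
  'Requested at 2018-02-16T10:48:48.088Z and responded by 2018-02-16T10:48:48.286Z (which took 198ms)',
  'GET https://www.example.com/shopping.environment.ssp?lang=en_US&t=1516304321167 [status 502]',
  'Requested at 2018-02-08T14:03:41.693Z and responded by 2018-02-08T14:03:41.784Z (which took 91ms)'
].join('\n');

console.log(findFailedSubRequests(sampleSubRequests));
```

For the sample above, only the shopping.environment.ssp request with status 502 is reported.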
Also note that as of writing this, only GET requests are supported by Prerender. In the vast majority of cases, this won't affect you but in testing we found one site that would perform a POST on page load, which returned some data that they used on the page. I mean, that's not a terrible thing to do in itself, but Prerender couldn't perform the POST and so the data was not being called in (and therefore not being included).
Alternatively, you can use CLI commands like curl. For example:
curl -O 'https://www.example.com/product123?seodebug=T&preview=23456'
When pages are requested from the page generator, they may be served from the cache. If this is the case, then the output from the debugger will be the cached output; in other words, you will be served data from the first time this URL was requested, and not necessarily the latest. What we need to do is trigger what is called a 'cache miss'; ie, get a page direct from the server, missing the cache entirely.
As mentioned earlier, you can get around this by appending some unique parameters to the end of your request URL. Just make sure you don't accidentally append a NetSuite-reserved parameter to the end; go with something like preview=<someRandomNumber>.
Disabling the Page Generator
One final thing: if you wish — and you're sure — we can disable the page generator on a per-site basis.
Generally speaking, we advise against this, but there are some specific circumstances where customers may wish to disable the page generator from firing:
- You do not want your site indexed by search engines
- You do not care about your search engine rankings
- You have access restrictions on your site, such as a password-protected site
- Your site is used for something like intracompany procurement
In other words, disabling the page generator will negatively affect your search engine rankings but may provide some small performance benefits, if your pages are complex or otherwise taking a long time to load.
If you are certain you want to disable the page generator, contact support.
OK, so let's review some of the changes that are coming in with the Prerender implementation:
- No more (hopefully) 'out of memory' errors
- Improved rendering time
- Improved time to first byte
- Improved memory efficiency (so pages can be larger and more complex)
- Improved page coverage
- Improved logging
- Seamless upgrade — no need to update bundles or code
I think that's a pretty solid list for a new feature.
In addition to what was said above, here are some other small best practices our developers want to share:
- Ensure that any “server polling”-type behavior is hidden from Prerender
- Be aware that asynchronous XHR is supported, and server requests may complete out of order
- console.log() is supported, and is visible in the output enabled by ?seodebug=T
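On the point about requests completing out of order, one way to avoid depending on completion order is to store each response by its original position. This is a minimal sketch of that idea; createOrderedCollector is a hypothetical helper and the responses are simulated with plain function calls instead of real XHR.

```javascript
// Hypothetical helper: collects `count` results by their original index,
// then calls `onComplete` once with the results in request order.
function createOrderedCollector(count, onComplete) {
  const results = new Array(count);
  const seen = new Array(count).fill(false);
  let remaining = count;

  return function resolve(index, value) {
    if (seen[index]) {
      return; // ignore a duplicate response for the same request
    }
    seen[index] = true;
    results[index] = value;
    remaining -= 1;
    if (remaining === 0) {
      onComplete(results);
    }
  };
}

// Simulate three responses arriving out of order
let finalResults = null;
const resolveRequest = createOrderedCollector(3, (results) => {
  finalResults = results;
});
resolveRequest(2, 'footer');
resolveRequest(0, 'header');
resolveRequest(1, 'body');
console.log(finalResults); // header, body, footer, in request order
```

In real code each call to resolveRequest would sit in the success callback of its own request.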
Generally speaking, if your site is working fine then you won't notice anything different in your day-to-day operation. The changes are there to aid search engine crawling as well as debugging issues that arise while crawling. Still, it's a good opportunity to dust off your testing cap and get clicking around your site with different browsers (including the page generator) and seeing what doesn't work.