Check for Missing Product Images with the Item Search API

A frustrating thing for shoppers is when they visit a site, look up an item they are interested in, and find that the associated image is missing. As the item search API can return just about any data about your site’s products, this means we can write a little script that returns our site’s items, and then compare each item’s matrix options against its images.

The item search API is a REST API that you can use to query NetSuite about your item inventory, returning a JSON object of your results; it's what drives the search results page, the PDP, and just about any page that uses item data.

Given the completeness of the data it can return, it also means that, if you're a developer, you can use it to quickly get back specific item data, which might be more time-consuming to get from the NetSuite UI or SuiteScript.

One such use case forms the basis of this article. A customer once asked for advice about item images. They were confident that the majority of their matrix items had associated images for each color, but were worried that they had missed a few. They were having trouble getting the information out of the NetSuite UI, and short of clicking through every color option on every PDP, what could they do to get the information back? Well, I said, you could use the item search API.

What data do we need? Well, when you make a request to the API, you tell it what field data you want back. When it comes to matrix items and their associated images, there are two that are key to our task: itemoptions_detail and itemimages_detail.

The first returns all data held about the options each item has, and the second returns all data about the images. Rather handily, the images are returned grouped by color, so it should be rather easy to write some code that goes through all the color options of an item and then checks whether there are images associated with each color, right?

In this blog post, we're going to do just that: use the item search API to find missing images by comparing matrix color options to images objects returned.

I'm going to use this opportunity to talk more about a number of things, namely:

Search parameters
Field sets, and specifying only the fields you need to improve performance
Page limits and pagination
The use of callbacks instead of (the now deprecated) synchronous AJAX flag

There'll be other stuff too. While a lot of these topics have already been covered before in other blog posts, I feel like it might be a good opportunity to go over a few of them again, which may be useful for beginners or those unfamiliar with the search API.

By the end of this post, we will have learned more about the API and written a script that will let us find what items on our site are missing images!

An animated GIF of a web browser's developer console. It runs the code defined in this article, and then outputs an array of items that are missing images.

Note that this script works by traversing through pages of results from the item search API. Therefore, there is a hard limit of 5,000 items that can be returned by a query.
If your web store has more than 5,000 items, you will not be able to check your entire inventory unless you use an alternative query. For example, if your site is neatly segmented into a taxonomy, you can query each of your commerce categories to build up a complete picture.

The full code for this sample can be found on GitHub.

Item Search API Basics

The API is well-documented. You can see in the docs that there are a number of areas that we cover, namely the items that are available, the base URL, the input parameters, and the request/response. There isn't much for me to say about this stuff — the recommendation is that you read all of the pages involved if you want to know more.

The short story is:

The API has access to your inventory items
It is RESTful, which means that it responds to HTTP requests (specifically GET requests)
So, you hit a particular URL and depending on the content of the URL, it will send back data as a response
The URL is formed of two parts: the base URL (which is your site's home touchpoint followed by /api/items) and URL parameters (which specify the type of data you want back)
The data returned is a JSON object with two main properties: a count of the number of items that match the query (total), and an array of inventory items (items)
Each item returned is an object with all the fields that are included in the specified field set or fields specified in the URL
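To make that response shape concrete, here is a hand-written sketch of what a trimmed-down response might look like. The item IDs and the total are invented for illustration, and a real response will carry whatever fields you asked for.

```javascript
// A hypothetical, trimmed-down response (all values invented for illustration)
var sampleResponse = {
  total: 245, // how many items match the query overall
  items: [    // only the items returned in this particular response
    { itemid: 'EXAMPLE-SHIRT', itemimages_detail: {}, itemoptions_detail: {} },
    { itemid: 'EXAMPLE-HAT', itemimages_detail: {}, itemoptions_detail: {} }
  ]
};

// total counts every match; items holds only what came back this time
console.log(sampleResponse.total);        // 245
console.log(sampleResponse.items.length); // 2
```

The gap between those two numbers is the reason the rest of the script needs pagination.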

The proper use of field sets is something we've looked at before because it affects performance. Using a field set that has a lot of fields in it takes longer to run because it must grab more data from NetSuite before sending it to you; typically, therefore, you specify only the fields you need, or make efforts to slim down your field set.

If you want to make a call to the items API in your code, you typically do not need to write the call to the API yourself: we have code you can use. For example, you can use our models and collections to get that data.

However, if you want to experiment, or use the API for something novel, like looking for missing images, then you can use your browser's console to construct the request URL and work with the response JSON. And that's precisely what we're going to do.

Set Basic Variables

OK, so, let's look at the code.

To start, I'm going to set some variables that I'm going to use throughout the script. Remember, we're not coding a module or extension here: this is just a simple script, so you can store it wherever you want.

// Useful variables
var limit = 100 // The maximum results we can get back at once from the API is 100 items, so we force it to that value
, offset = 0 // This is to create pages in the results, so we can step through the results
, itemCount = 0
, noImages = [] // Where we're going to store the results

So, we start with the limit. The search API will only return a maximum of 100 items in the items object, even if there are more (keep in mind the total property does not always match the number of items in the items object). The default is much lower, so we're going to force it to 100 so that we can process more items at once.

Next, we set the offset to 0. The offset and the limit are linked — it's effectively a page marker to indicate where we got to. Thus if we return the first 100 results as one 'page' of results, we set the offset to 100 and make the call again to show us the second page of results.

The item count will be used to store the total number of results reported by the API response. We need this information to determine whether we need to make an additional call (ie call the next page of results) or whether we've exhausted the inventory and can return the list of items with missing images.

Finally, we have an empty array that we're going to use to store details of the items that are missing images.
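As a rough illustration of how the limit and offset work together, the sketch below lists the offsets a script would need to request for a given total. The pageOffsets helper and the total of 245 are made up for this example and are not part of the final script.

```javascript
// List the offsets needed to page through `total` results, `limit` at a time
function pageOffsets (total, limit)
{
  var offsets = [];

  for (var offset = 0; offset < total; offset += limit)
  {
    offsets.push(offset);
  }

  return offsets;
}

console.log(pageOffsets(245, 100)); // [ 0, 100, 200 ]
```

So 245 items at 100 per page means three calls, with the last one returning only 45 items.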

Construct the URL

Next, we need to think about the query we want to construct, which will be in the form of the URL. We need a base URL plus some URL parameters.

Below the variables, add in:

// Build the URL
, locale = SC.ENVIRONMENT.currentLanguage.locale.split("_")
, url = SC.ENVIRONMENT.siteSettings.touchpoints.home
+ '/api/items'
+ '?q=' + '' // We're providing no query string so this will return all results, although you could use it and narrow down results
+ '&language=' + locale[0]
+ '&country=' + locale[1]
+ '&currency=' + SC.ENVIRONMENT.currentCurrency.code
+ '&c=' + SC.ENVIRONMENT.companyId
+ '&limit=' + limit
+ '&fields=' + 'itemid,itemimages_detail,itemoptions_detail'
+ '&timestamp=' + Date.now();

We're setting two variables here, but I've split the second across multiple lines to make it easier to visualize.

First, we're getting the user's current language: this will come in two bits, the first half contains their language and the second is their country.

Then we construct the URL. You can see that we start with the site's home touchpoint, which should be the base URL for your site (eg, something like https://www.examplestore.com) and then we tack on the bit that accesses the item search API. Then come the URL parameters we're going to use.

We start with an empty keyword query. We're passing an empty string with q, but you could modify it to pass a particular keyword — thus allowing you to filter the results for particular products.

We then do a number of boring tasks: set the language, country, currency and company. Depending on your site, you may not need to include these at all but they will become necessary if you operate multiple companies, currencies, languages, etc. You may also need to include the region parameter if you operate multiple subsidiaries and n if you operate multiple SuiteCommerce sites. Modify as necessary.

Then we set the limit as previously mentioned. Again, this will tell the API to return up to this many items.

Next, the fields parameter specifies the precise data we want back from NetSuite. As previously mentioned, limiting the fields will improve the performance of the request; for our uses, we're only interested in the item ID, the images each item has, and its item options. We use this parameter instead of the fieldset parameter, but we could use an existing field set (if we're lazy) or create a whole new field set just for this purpose.

Finally, we append a timestamp to the URL which is the current time in milliseconds. Strictly speaking, we don't need to do this — and I should point out that this is not an official field that will be stripped out by the search service — but I'm attaching it as a curiosity. Repeated requests to the same URL may cause a cache hit, resulting in the first response being sent each time (either from the CDN, NetSuite or your browser's cache) — by attaching a unique parameter you are making a unique request each time, so you know the results are fresh. (This is fine for tinkering like we're doing, but it shouldn't be used in production.)
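If you would rather not concatenate the URL by hand, one possible alternative is to build it from a plain object. The sketch below is only an illustration: buildItemSearchUrl is a hypothetical helper (not something SuiteCommerce provides), the parameter values are hard-coded stand-ins for the SC.ENVIRONMENT lookups, and it assumes the API decodes percent-encoded values (such as the commas in fields) the usual way.

```javascript
// Join a base URL and a plain object of parameters into a request URL.
// buildItemSearchUrl is a hypothetical helper, not part of SuiteCommerce.
function buildItemSearchUrl (baseUrl, params)
{
  var query = Object.keys(params).map(function (key)
  {
    // encodeURIComponent stops spaces, commas and ampersands in values from breaking the URL
    return encodeURIComponent(key) + '=' + encodeURIComponent(params[key]);
  }).join('&');

  return baseUrl + '/api/items?' + query;
}

// Hard-coded example values; in the console you would read these from SC.ENVIRONMENT
var exampleUrl = buildItemSearchUrl('https://www.examplestore.com', {
  q: '',
  language: 'en',
  country: 'US',
  currency: 'USD',
  limit: 100,
  fields: 'itemid,itemimages_detail,itemoptions_detail'
});

console.log(exampleUrl);
```

The main benefit shows up if you pass a keyword to q: a search term containing a space or an ampersand is escaped rather than corrupting the rest of the query string.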

And that's the URL. So, if you like, you can actually test it now by doing something like this:

jQuery.ajax(url).then(
  function (data)
  {
    console.log(data)
  }
)

Which returns something like this:

A screenshot of a web browser's developer console. It shows the above command being run and then returning the promise object and the data object containing the items and their data.

You can click through the items array and look at the fields returned for each item. From here, you should be able to get an idea about what we're going to do.

Processing the Results

So, we have our images and our item options. If you look at the data, you'll see that it's structured: if you look at the media objects in the images object, you can see that we organize them by color; if you look at the values of the color fields, you'll see they're labeled. We just need to check them off against each other.

Add the following function to your code:

// After the call is made, we will use this function to process the results
function processResults (data)
{
  // Iterate all items returned
  data.items.forEach(function (item)
  {
    // First does a simple check: are there *any* images? No, then push that item to the array
    if (_.isEmpty(item.itemimages_detail))
    {
      noImages.push(item.itemid + ' (all colors)')
    }
    // Otherwise, move on to a more complicated check
    else
    {
      // Iterate all item options
      item.itemoptions_detail.fields && item.itemoptions_detail.fields.forEach(function (field)
      {
        // Find the one we use for color
        if (field.label == 'Color')
        {
          // Iterate over every color value
          field.values.forEach(function (value)
          {
            // First check that it's a legitimate color
            // Then try to read the images object for that color value; if it returns `undefined` then we know it's missing (so push that item and its color to the array)
            if (value.internalid && !item.itemimages_detail.media[value.label])
            {
              noImages.push(item.itemid + ' (' + value.label + ')')
            }
          });
        }
      });
    }
  });
}

We're skipping over making the call for a moment, and instead we're looking at what we're going to do with the data once we have it.

We start with a loop, because we want to go through every item and check it. The first check is simply whether there are any images at all, using the isEmpty() Underscore method, which checks whether the object has no values. If it is empty, we push an entry into our array that says that <product name> is missing images for all colors.

If there are images, then we need to associate them with their colors. So, we then need to iterate through each item option until we find the one we use for color. I'm doing the matching by checking the field label for the word 'Color'; this is obviously in English and is perhaps not that rigorous. You can, of course, use the internal ID for the field (ie the custom column) instead.

When we have those, we can start another loop to go through each of the color values. There are two important checks that we perform here:

Is this a legitimate color value?
If we try to call the images object for this color, does it not return values?

For the first check, there is a quirk of NetSuite: for every option, we return a dummy value for dropdowns, which simply prompts shoppers to make a selection. We obviously don't want to include these in our list of missing images, so we just ignore any value that doesn't have an internal ID.

The second check works on the theory that if we try to read a color property on the media object that does not exist, it will return undefined. Thus, if we negate that with a !, it will return true.

Thus, when both of these conditions are true, we know we have found an item with missing color option images — so we push it into the array.
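To see the two checks in isolation, here is a small sketch that runs them against a single hand-made item. The item, its colors, and the contents of the Red entry are all invented; only the property names (itemoptions_detail.fields, values, internalid, label, and itemimages_detail.media) come from the function above.

```javascript
// A hypothetical item: three color values (one is the dummy prompt), but images for only one color
var sampleItem = {
  itemid: 'EXAMPLE-SHIRT',
  itemimages_detail: {
    media: {
      Red: { urls: ['red.jpg'] } // the inner shape is invented; only the key matters here
    }
  },
  itemoptions_detail: {
    fields: [{
      label: 'Color',
      values: [
        { label: '- Select -' },           // dummy prompt: no internalid
        { label: 'Red', internalid: '1' },
        { label: 'Blue', internalid: '2' }
      ]
    }]
  }
};

var missing = [];

sampleItem.itemoptions_detail.fields.forEach(function (field)
{
  if (field.label !== 'Color') { return; }

  field.values.forEach(function (value)
  {
    var isRealColor = Boolean(value.internalid);                        // check 1
    var hasNoImages = !sampleItem.itemimages_detail.media[value.label]; // check 2

    if (isRealColor && hasNoImages)
    {
      missing.push(sampleItem.itemid + ' (' + value.label + ')');
    }
  });
});

console.log(missing); // [ 'EXAMPLE-SHIRT (Blue)' ]
```

The dummy prompt fails the first check, Red fails the second, and only Blue passes both.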

Make the Call

And that's it for the processing: as we saw, it's just a case of matching the two data points to each other. Now, we need to focus on three things:

Making the API calls
Calling the data processor (and monitoring/reporting its status)
Deciding whether we need to call the API again for another page of results

So, I'm going to split the final function into these three parts. Start with this:

function findMissingImages (preserveCount, oS)
{
  // If no parameters are provided, we can assume that this is the first time it's being run
  itemCount = preserveCount ? itemCount : 0;
  offset = oS ? oS : 0;
  noImages = preserveCount ? noImages : [];

  // Perform the search
  jQuery.ajax(url + '&offset=' + offset)

This is our wrapper function that you will call when you want to find missing images. It accepts two parameters: a boolean for whether we want to preserve the counts, and the second is the offset value. The idea is that when we make calls to the API, we need to know if this is a fresh API call (ie start from 0 and get the first page of results) or if we need to paginate.

This plays out in the three variables we defined previously: the item count, the offset, and our array of missing images. We use ternary operators to determine each of their values, which operate as shorthands for conditional statements. Basically, we're asking whether we should keep their values or reset them, based on the parameters we've been passed.

We then make the call using the URL plus the offset number; obviously, if this is the first call then the offset will be set to 0 and we will get the first page of results; if it's not, it'll be incremented to load the next page of results.

Callback #1: Feedback, Status Update, and Data Processor

As we're potentially going to be making multiple calls to the API for the various pages of results, and the default is now to perform AJAX calls asynchronously, we're going to chain them using then(), which will allow us to perform them one after another. It also means that we can tell the code to run only when we know that we've got a result back from the server.

For more information on this, take a look at the article on jQuery promises and deferred objects.

Add the following to your script:

// The first callback provides console feedback, increases the offset, and calls the processor function
.then(function (data)
{
  // We need the item count to know whether we need to make further API calls, and for status reporting
  itemCount = data.total;

  // (Optional console feedback)
  var min = offset + 1; // Computers start at 0; humans at 1
  var max = offset + limit > itemCount ? itemCount : offset + limit;
  console.log('Processing ' + itemCount + ' items: ' + min + ' to ' + max + ' done.');

  // Increase the offset count
  offset += limit;

  // Call the data processor
  processResults(data);
})

So, once a (successful) response comes back from the server, we set the item count based on the total number of items. This is going to help inform us on whether we need to make another call for another page of results.

Then we do some optional feedback, just to keep the user (you) informed with progress updates. It's quite simple, but it fires after each successful response from the API. It will tell us the total number of items, as well as what items in that amount it is about to process.

We then increase the offset by adding the current limit to the current offset value. Thus, if we set the limit to 100, and we're currently on the first page of results then it will be 0 + 100 = 100. If we need to make another call straight afterwards, then we know where to start from.

Then, finally, as you might imagine, we process the results using our function above.
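The arithmetic behind the status message is easy to check by hand. The sketch below pulls it out into a standalone helper; pageRange is a made-up name, and the totals are example numbers only.

```javascript
// Work out the human-readable range of items covered by one page of results
function pageRange (offset, limit, total)
{
  var min = offset + 1;                      // offsets start at 0, people count from 1
  var max = Math.min(offset + limit, total); // the last page may be only partly full

  return min + ' to ' + max;
}

console.log(pageRange(0, 100, 245));   // 1 to 100
console.log(pageRange(200, 100, 245)); // 201 to 245
```

Math.min() does the same job here as the ternary in the callback: it caps the upper bound at the total on the final page.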

Callback #2: Call Again?

The second callback is chained to the first and determines whether we need to process more results.

Add the following to the end:

  // The second callback decides whether we need to call the API again for the next page of results
  .then(function ()
  {
    if (itemCount > offset)
    {
      // If we do, we preserve the current item count and provide the position from which to start from
      findMissingImages(true, offset)
    }
    else
    {
      // If we don't, we dump the results into the console
      console.log('Search complete: ' + noImages.length + ' items with missing images', noImages)
    }
  });
}

The simple check we do is comparing the item count to the offset value. If the item count is higher, then we know that there are still items that need checking. So what do we do? Well, we just call the whole function again, but this time we tell it to preserve the counts and pass it the offset value. The process then starts again.

If the offset is equal to or higher than the item count, then we know that we're done, and we can log the results to the console.
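If you want to try the paging logic without touching a real site, the following sketch swaps the API for a stand-in. fakeItemSearch and fetchAllPages are hypothetical names, the 250 numbered items are invented, and native promises are used in place of jQuery.ajax() so it can run anywhere.

```javascript
// fakeItemSearch is a hypothetical stand-in for the real API: it serves 250 numbered items
function fakeItemSearch (offset, limit)
{
  var total = 250
  , items = [];

  for (var i = offset; i < Math.min(offset + limit, total); i++)
  {
    items.push({ itemid: 'ITEM-' + i });
  }

  return Promise.resolve({ total: total, items: items });
}

// Request a page, record it, then recurse only if the total says there is more to fetch
function fetchAllPages (offset, limit, seen)
{
  return fakeItemSearch(offset, limit).then(function (data)
  {
    var collected = seen.concat(data.items)
    , nextOffset = offset + limit;

    return data.total > nextOffset ? fetchAllPages(nextOffset, limit, collected) : collected;
  });
}

var allPages = fetchAllPages(0, 100, []);

allPages.then(function (items)
{
  console.log('Fetched ' + items.length + ' items'); // Fetched 250 items
});
```

One difference from the script in this article: this version returns the promise from each recursive call and passes its state along as arguments rather than keeping it in shared variables, which is a design choice for the sketch and not a requirement.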

And that completes the function and the script.

Test It Out

As I said, this isn't a module or extension that you add to your site: this is just something that you run from your browser's console. So, do that now — just copy and paste the whole script into the console and run findMissingImages()! You should get something like this (assuming you have missing images, that is):

An animated GIF of a web browser's developer console. It runs the code defined in this article, and then outputs an array of items that are missing images.

If you have no missing images, the final message will report 0 items and the array will be empty.