This is the fourth in a series of blog posts looking at the parallels between locations and products.
Thus far, we’ve looked at the parallels between locations and products in terms of unique(ish) identifiers like latitude/longitude and GTIN, and in terms of other ways to call out specific locations or products, like addresses and product titles. But the funny thing about language and the way we talk about nearly everything is that we have other, less specific ways to indicate places and things. We tend to use these less specific ways when we’re looking for things, and it has to do with how our brains are wired for categorization.
Human brains form categories for three primary reasons. First, people notice similar features. Second, people notice similar functions. Finally, language provides cues that a category exists that people can identify with. All three of these reasons overlap to form categories.
Because language and brains form categories, machines need to be able to understand categories. When kids want Alexa to play Christmas music, Alexa must be able to find the “Christmas music” category and play the appropriate songs. If you’re looking for an Italian restaurant, bots must also be able to find restaurants that fall into the “Italian restaurants” category, so that you can find your favorite nearby.
These examples involve both location and product categorization. We categorize locations into residences, businesses, points of interest, or general areas. Then we further categorize residences by type (single-family home, apartment building, etc.). We categorize businesses by general type (e.g., restaurant, bank, store) and then sub-categorize each type (e.g., Italian restaurant).
Product categorization is also hierarchical. If I’m looking for a blue sweater for a birthday present for my mother, I’m looking for clothes as my major category. Beyond that, I’m looking for women’s clothing, and then tops, and then sweaters. Once I find sweaters, I look for blue ones. Interestingly, different high-level product categories have different sub-categorizations, just as residences and businesses have different sub-categorizations. We don’t look for women’s automotive parts, and we certainly don’t try to find glass video games.
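To make the hierarchy concrete, here is a minimal Python sketch of a category tree in which the available filters depend on the top-level category. The tree, the filter names, and the helper functions are all hypothetical and chosen only to mirror the sweater example; they are not taken from any real retailer.

```python
# Hypothetical category tree; the names are illustrative only.
CATEGORY_TREE = {
    "Clothing": {
        "Women": {"Tops": {"Sweaters": {}}},
    },
    "Automotive Parts": {
        "Brakes": {},
    },
}

# Assumed filters that make sense under each top-level category.
FACETS = {
    "Clothing": {"color", "size"},
    "Automotive Parts": {"vehicle make", "vehicle model"},
}


def is_valid_path(path):
    """Return True if every step in `path` is a child of the previous step."""
    node = CATEGORY_TREE
    for name in path:
        if name not in node:
            return False
        node = node[name]
    return True


def allowed_facets(path):
    """Return the filters available for a valid path, based on its top-level category."""
    if not path or not is_valid_path(path):
        return set()
    return FACETS.get(path[0], set())


sweater_path = ["Clothing", "Women", "Tops", "Sweaters"]
print(is_valid_path(sweater_path))                    # True
print("color" in allowed_facets(sweater_path))        # True
print(is_valid_path(["Automotive Parts", "Women"]))   # False
```

The last line is the "women’s automotive parts" case: the path simply does not exist in the tree, so there is nothing to search under it.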
For categories to be useful, they must be at least somewhat standardized. If I asked Alexa to find me stores that sell fluzelblogget, it wouldn’t know where to start, but if I asked for stores that sell women’s shoes, I’d get some results. Unfortunately for those of us who work with products for a living, location and product standards have diverged significantly in how they are developed and used.
The government plays a huge role in how location standards develop. Governmental postal organizations govern address formats and postal coding regulations. Each country has a standardized way to talk about places down to the city, state, and county level. The United States government established the Standard Industrial Classification (SIC) system in 1937 for use in economic analysis; since 1997 it has largely been replaced by the North American Industry Classification System (NAICS). Over these standards, we overlay a system of sub-categorization that allows us to be more specific when talking about places. The sub-categorization may have to do with types (e.g., Chinese restaurants or dive bars) or brands (e.g., Nordstrom department store or Paramount Theater).
Unlike locations, products have had limited governmental standardization. GS1, the same organization that administers UPCs and GTINs, maintains Global Product Classification (GPC) standards. However, even the most popular online retailers do not follow these standards. Let’s take the blue sweater for mom as an example. Here’s what the GPC looks like for it:
Contrast this with the way that Amazon and Nordstrom categorize women’s sweaters:
Obviously, neither retailer follows the GPC standards. GPC’s hierarchy is:
Clothing -> Clothing -> Upper Body Wear/Tops -> Sweaters/Pullovers -> Gender -> Female
Amazon has a simple hierarchy in their fashion department:
Women -> Clothing -> Sweaters -> <style>
Nordstrom has a shorter hierarchy, but with more filters at the last level:
Women -> Sweaters -> <style, brand, size>
Given the complexity and awkward readability of the GPC standard, is it any mystery why retailers create their own? Unfortunately, the linguistic awkwardness of the GS1 standard means that retailers do not converge on a single standard, and the same product ends up in a different place from one site to the next.
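One way to see how far apart the three hierarchies are is to parse each path and look for the labels they share. The Python sketch below does that for the paths quoted above, leaving off the trailing filter placeholders. The alias table is a hypothetical hand-written mapping, not part of GPC or of either retailer’s system.

```python
def parse_path(text):
    """Split an arrow-delimited category path into a list of clean labels."""
    return [part.strip() for part in text.split("->") if part.strip()]


# Paths as quoted in this post, minus the trailing filter placeholders.
PATHS = {
    "GPC": "Clothing -> Clothing -> Upper Body Wear/Tops -> Sweaters/Pullovers -> Gender -> Female",
    "Amazon": "Women -> Clothing -> Sweaters",
    "Nordstrom": "Women -> Sweaters",
}

# Hypothetical alias table mapping source-specific labels to a shared vocabulary.
ALIASES = {
    "sweaters/pullovers": "sweaters",
    "female": "women",
}


def normalize(path):
    """Lowercase each label, apply aliases, and return the resulting set of labels."""
    return {ALIASES.get(label.lower(), label.lower()) for label in path}


def shared_labels(paths):
    """Return the labels that appear in every normalized path."""
    normalized = [normalize(parse_path(p)) for p in paths.values()]
    return set.intersection(*normalized)


print(sorted(shared_labels(PATHS)))  # ['sweaters', 'women']
```

Even with the aliases, only two labels line up, and their order and depth differ in every path. Someone has to write and maintain that alias table for every pair of sources, which is the work a shared standard would otherwise do.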
Without standards, the concept of accuracy in product categorization is basically nonexistent. This is both good and bad in comparison with location. Because postal standards have been widely adopted, an inaccurately categorized piece of mail (e.g., one with an incorrect ZIP code) may never reach its destination. Inaccurately categorized products, by contrast, are considered more of a fact of life, even at Amazon. I found a product inaccurately categorized under Vitamin E on the first page of Amazon’s search results. Jojoba Oil is among the first six products listed:
Adding to the accuracy issues is the rapid change of the internet. Even in the most aggressive urban development, new locations (or location changes) only occur occasionally—the rate of change is relatively slow. Contrast this with the rapid development of new products plus the rapid proliferation of ecommerce sites; maintaining categorization accuracy with so many products and sites is nearly impossible.
Nevertheless, we will continue to use product categorization. As bots and other AI assistants become more prevalent, categorization will continue to become more important and errors will be even more jarring. Standardization would be valuable, but the GS1 standards are too far from “standard” human language to be useful. Category standardization is only one of the complex problems that we as an industry need to solve, but complexity is a topic for another day.
Also published on Medium.