Netflix has done a masterful job of giving structure to all its data about movies through tags, attributes and categories. As a result, if you wanted to watch a ‘post-apocalyptic romantic comedy’, Netflix would probably serve up Shaun of the Dead or Zombieland.
Imagine having the type of meta data described in the article available for every product on the planet. It would not only be very useful for recommending products that you might like, but would also help target ads more accurately and deliver more relevant search results. As mobile technology and the Internet of Things become increasingly influential in how we shop, this type of meta or derived product information will become critical. But tagging and classifying billions of products is a daunting task. Netflix’s manual-plus-algorithmic approach would likely need to be almost entirely algorithmic when dealing with billions of products.
Data can be gathered or derived. Derived meta data is what will enable the intuitive, targeted shopping experiences described above. Gathering information from all over the web is one thing; the harder part is what happens next. You need to answer questions like: If two products are sold at different stores, how can you programmatically tell that one is the same as the other? Is it a variant in size or color? What trends have been seen in the pricing of this product? How is this product classified? What is the narrowest sub-category that it can fit into? What are all the ways that this product can be described?
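To make the first two questions concrete, here is a minimal sketch in Python of how one might compare two product titles from different stores. It is an illustration only, not a description of any production system: the `COLORS` and `SIZES` vocabularies, the `compare` function, the 0.6 threshold and the product names are all hypothetical, and a real matcher would likely rely on learned models and far richer attributes than title tokens.

```python
import re

# Hypothetical attribute vocabularies, for illustration only.
COLORS = {"black", "white", "red", "blue", "green", "silver"}
SIZES = {"small", "medium", "large", "xl"}
VARIANT_WORDS = COLORS | SIZES


def tokens(title: str) -> set:
    """Lowercase a title and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def jaccard(a: set, b: set) -> float:
    """Share of tokens the two sets have in common (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def compare(title_a: str, title_b: str, threshold: float = 0.6) -> str:
    """Classify a pair of listings as 'same', 'variant' or 'different'.

    The threshold is an assumed value, not a tuned one.
    """
    ta, tb = tokens(title_a), tokens(title_b)
    # Compare the titles with color and size words set aside.
    core_a, core_b = ta - VARIANT_WORDS, tb - VARIANT_WORDS
    if jaccard(core_a, core_b) < threshold:
        return "different"
    # Same core product; differing color or size words suggest a variant.
    if (ta & VARIANT_WORDS) != (tb & VARIANT_WORDS):
        return "variant"
    return "same"


if __name__ == "__main__":
    # Made-up listings for demonstration.
    print(compare("Acme Trail Backpack 30L Blue", "ACME trail backpack 30l - blue"))
    print(compare("Acme Trail Backpack 30L Blue", "Acme Trail Backpack 30L Red"))
    print(compare("Acme Trail Backpack 30L Blue", "Zenith Coffee Grinder"))
```

Even this toy version shows why the problem is hard at scale: titles vary wildly between stores, attribute vocabularies are open-ended, and a fixed threshold will not hold across billions of products.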
This is where machine learning techniques become important. Deriving meta data from a vast trove of product information is one of the areas we are investing in. For us, it’s important not just to collect high-quality product information from wherever we can find it, but also to derive meta data that can be used for deeper analysis and understanding. In this day and age, data must be combined with algorithms to derive ever more actionable insights.