For many years, effective voice-based search has eluded businesses trying to bring next-generation input methods to their customers. Command-based speech systems have been perceived as ineffective and hard for viewers to use. However, the widespread adoption of smartphones and tablets, with their small on-screen keyboards, has renewed interest in this type of technology. For example, Apple’s Siri and Amazon’s Alexa have progressed beyond basic menu navigation functions. In fact, any device with a microphone has the potential to accept spoken commands, and can become an intelligent discovery system that uses a sophisticated entertainment brain to understand what customers want.

Sue Couto, Senior Vice President, APAC Sales, TiVo

This technology is important and under-explored by the TV industry, which often appears to have been left behind in terms of intuitive discovery functionality. For content providers, voice-based search and recommendation should be a core part of customer service, giving viewers easy access to their favourite shows and genres.


Speaking the viewer’s language


With so much content available today, consumers have their own preferences across cast, plot and genre. Conversational interfaces mimic natural conversation and remove the need to navigate hierarchical menu structures. Most importantly, the technology must recognise when a user is drilling into a particular genre in detail, and when they have lost interest and switched topics completely.


To be successful, natural language search needs to address several points, each of them essential:


  • Disambiguation: Natural language technology must understand and interpret the user’s intent. For example, the phonetic sound “Kroos” could refer to Tom Cruise or Penelope Cruz, and the system should be able to work out which one the user means from the context of the original query.


  • Statefulness: During a dialogue with a user, the system should be able to maintain context, and understand that people change their minds quickly. For example, the user could say that they are “in a mood for thrillers,” then jump to “Bond” and then to “old ones”. Ideally, the system should understand these requests, and serve up a series of older James Bond films for the viewer to select from.


  • Personalisation: Conversational systems need to understand their users on an individual basis. For example, the system should learn that a user based in New Zealand who asks “when is the game tonight” wants to know about their local team, and that when they say “when is the Blacks game” they mean the All Blacks rugby team.
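
To make the three points above more concrete, here is a minimal Python sketch of how they might fit together in a single session. It is an illustration only, not a description of any vendor's system: the `Session` class, the `CATALOGUE`, `SOUNDS_LIKE` and `LOCAL_TEAMS` tables, the keyword matching and the 1980 cut-off for "old" are all assumptions made for the example. A production system would rely on speech recognition, a full metadata service and learned user profiles.

```python
from dataclasses import dataclass, field

# Hypothetical toy catalogue; genre tags are simplified for illustration.
CATALOGUE = [
    {"title": "Dr. No", "year": 1962, "genre": "thriller", "franchise": "bond", "cast": ["Sean Connery"]},
    {"title": "Goldfinger", "year": 1964, "genre": "thriller", "franchise": "bond", "cast": ["Sean Connery"]},
    {"title": "Skyfall", "year": 2012, "genre": "thriller", "franchise": "bond", "cast": ["Daniel Craig"]},
    {"title": "Collateral", "year": 2004, "genre": "thriller", "franchise": None, "cast": ["Tom Cruise"]},
    {"title": "Volver", "year": 2006, "genre": "drama", "franchise": None, "cast": ["Penelope Cruz"]},
]

# Assumed lookup tables: phonetic keys to candidate names, and regional team nicknames.
SOUNDS_LIKE = {"kroos": ["Tom Cruise", "Penelope Cruz"]}
LOCAL_TEAMS = {("NZ", "blacks"): "All Blacks"}


@dataclass
class Session:
    region: str  # personalisation: where the viewer is based
    filters: dict = field(default_factory=dict)  # statefulness: context kept across turns

    def results(self):
        """Return titles matching every filter collected so far."""
        hits = CATALOGUE
        if "genre" in self.filters:
            hits = [m for m in hits if m["genre"] == self.filters["genre"]]
        if "franchise" in self.filters:
            hits = [m for m in hits if m["franchise"] == self.filters["franchise"]]
        if "before" in self.filters:
            hits = [m for m in hits if m["year"] < self.filters["before"]]
        return [m["title"] for m in hits]

    def say(self, utterance):
        """Refine the running query with one utterance and return current results."""
        text = utterance.lower()
        if "thriller" in text:
            self.filters["genre"] = "thriller"
        if "bond" in text:
            self.filters["franchise"] = "bond"
        if "old ones" in text:
            self.filters["before"] = 1980  # arbitrary cut-off for "old"
        return self.results()

    def resolve_person(self, sound):
        """Disambiguate a phonetic key using the genre already in context."""
        candidates = SOUNDS_LIKE.get(sound, [])
        genre = self.filters.get("genre")
        in_context = [
            name for name in candidates
            if any(name in m["cast"] and m["genre"] == genre for m in CATALOGUE)
        ]
        # When context does not help, return every candidate so the system can ask.
        return in_context or candidates

    def resolve_team(self, nickname):
        """Map a nickname to the viewer's local team, or None if unknown."""
        return LOCAL_TEAMS.get((self.region, nickname.lower()))


session = Session(region="NZ")
session.say("I'm in a mood for thrillers")
session.say("Bond")
print(session.say("old ones"))          # ['Dr. No', 'Goldfinger']
print(session.resolve_person("kroos"))  # ['Tom Cruise']
print(session.resolve_team("Blacks"))   # All Blacks
```

Each utterance narrows the same set of filters rather than starting a new search, which is what allows “old ones” to be read as older Bond films. A fuller version would also need to clear those filters when the viewer changes topic, which this sketch does not attempt.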


Taking understanding to the next level