Let’s talk! All about Bixby conversations

Voice assistants that are more conversational deliver improved experiences and greater user engagement. Bixby has a built-in conversational model designed to allow users to interact with the assistant in a natural, conversational way. This blog explains the various features of Bixby’s conversational model using specific implementation examples from the earthquakeFinder sample capsule.

Asking the user for missing information

There are times when Bixby needs additional information in order to complete a task. In those cases, Bixby prompts the user in order to elicit the required information. Prompts are an integral part of Bixby’s conversational model. Bixby will automatically prompt the user in either of the following scenarios:

  1. If some information is required, but missing – this is called a value prompt. In this case, Bixby might say: I need a <value> to continue.
  2. If there is more than one option in context, but at most one is allowed – this is called a selection prompt. In such a case, Bixby might ask the user: Which of these <values> do you want?

To generate a value prompt, add the constraint min(Required) to an action’s input. If that input value is missing, Bixby will generate a prompt asking the user for it.

To generate a selection prompt, add the constraint max(One) to an action’s input. If more than one value is present in context, max(One) will cause Bixby to ask which of the contextual values is desired.

Let’s look at a real-world example

The Earthquake Finder capsule allows users to search for earthquakes around the globe based on various criteria, such as geographic location, date-time, and minimum magnitude. The findEarthquakes action accepts several inputs that are used as search parameters. One of these is the minimum magnitude (minMagnitude). For example, a user can say “find earthquakes with a minimum magnitude of 3.0 or greater,” and 3.0 will be used as an input search parameter for the API call. If minMagnitude is defined as min(Optional), then by definition it is not a required input: if the user doesn’t specify it, Bixby simply omits it from the list of search terms. Conversely, if minMagnitude is defined as min(Required), the user will be prompted for a value whenever it is not included in the natural language request. See below for the code snippet and UI screen for the “find earthquakes” command with minMagnitude set as min(Required) in the findEarthquakes action model.
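As a sketch, the relevant part of the action model can look like the following (names follow the blog; the exact file in the sample capsule may differ slightly). Note that the same input also carries max (One), which is what triggers a selection prompt if several magnitudes are in context:

```
action (findEarthquakes) {
  type (Search)
  collect {
    // min (Required): Bixby value-prompts when the magnitude is missing.
    // max (One): Bixby selection-prompts if several values are in context.
    input (minMagnitude) {
      type (MinMagnitude)
      min (Required)
      max (One)
    }
  }
  output (EarthquakeInfo)
}
```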

Filling in inputs by default

Every turn in a conversation has a cost, and developers should keep the number of turns to a minimum. You can accomplish this by setting a default value. If your capsule always requires a certain input, such as minMagnitude, but you do not want to prompt the user every time, one option is to provide a default initialization for that input, which sets it to a specific default value. This is implemented with the default-init block. See below for the minMagnitude example, where minMagnitude defaults to 3.0 if the user does not provide a value.
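A sketch of such a default-init (the concept names are taken from the blog; check the sample capsule source for the exact names and intent form):

```
input (minMagnitude) {
  type (MinMagnitude)
  min (Optional)
  max (One)
  // If the user gives no magnitude, plan this intent to supply 3.0
  // instead of prompting.
  default-init {
    intent {
      goal: MinMagnitude
      value: MinMagnitude (3.0)
    }
  }
}
```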

Inputs that can take multiple values

The previous example described inputs with single cardinality, meaning each input holds a single value. Alternatively, inputs can be defined as multi-cardinal using max(Many). These inputs can take multiple values. In the findEarthquakes action, there is an input called eventType defined as max(Many), which allows users to search for earthquakes based on one or more specific types as defined in the USGS API. Some examples of event types include earthquakes, quarry blasts, explosions, and ice quakes. When users specify multiple event types in their search query, for example “what quarry blasts and explosions happened this year,” the eventType input will be passed to the JavaScript code as a List. The developer then needs to iterate over all of the list values and make multiple API calls, one for each eventType input. See below for the eventType definition and JavaScript code snippet.
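A minimal sketch of that looping logic in the capsule’s JavaScript, with the real USGS request replaced by a stub (queryUsgs and the result shape are illustrative, not the sample capsule’s actual code):

```javascript
// Stand-in for the real USGS API request; illustrative only.
function queryUsgs(eventType) {
  return [{ eventType: eventType, magnitude: 3.2 }];
}

// A max(Many) input arrives as a list, so we issue one (stubbed)
// query per event type and merge the results.
function findEarthquakes(eventTypes) {
  // The input may also arrive as a single value; normalize to an array.
  var types = Array.isArray(eventTypes) ? eventTypes : [eventTypes];
  var results = [];
  types.forEach(function (type) {
    results = results.concat(queryUsgs(type));
  });
  return results;
}
```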

Having to loop through multiple API calls can become increasingly complicated, especially when your model has several multi-cardinal inputs. Bixby has a built-in feature called iterable that simplifies both the modeling and the JavaScript. It is implemented by marking an input as iterable, rather than max(Many), in the action model. This causes the action to be called multiple times, once for each value of that input. In the case of findEarthquakes, eventType is marked as iterable and defined as max(One). Modeled this way, the JavaScript does not need to loop through the eventType inputs manually, which simplifies the code. See below for the eventType definition.
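A sketch of what that definition can look like; the exact spelling and placement of the iterable marker in the .bxb file is an assumption here, so consult the earthquakeFinder source for the authoritative form:

```
input (eventType) {
  type (EventType)
  min (Optional)
  // max (One) plus the iterable marker: Bixby plans the action once per
  // contextual eventType value, so the JavaScript sees a single value.
  max (One)
  iterable
}
```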

Conversationally replacing a prior input

Another built-in attribute of Bixby’s conversational model is the concept of replacement. More specifically, previous values for an action’s input will automatically be replaced if a user specifies a new input as a follow-up utterance. This replacement mechanism can be implemented using continuations in your training. For example, if a user says “find earthquakes with a minimum magnitude of 4.5 or greater” and then issues a follow-up command of “how about ones with minimum magnitude of 6.0 instead,” Bixby will automatically replace the minMagnitude input of 4.5 with 6.0 and rerun the search. In the case of a multi-cardinal input, all new contextual input(s) will replace any prior values.

See below for example continuation training from Bixby Developer Studio:
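Continuation training is written in Bixby’s Aligned NL notation; an illustrative sketch (the goal name and value annotation are assumptions, and fully qualified names would normally be used) might look like:

```
[g:EarthquakeInfo:continue] how about ones with minimum magnitude of (6.0)[v:MinMagnitude] instead
```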

Relaxing constraints: How to avoid zero results

Input constraints are designed to allow users to refine their search results, but as more and more constraints are contextually applied to a query, sometimes this can lead to an empty list of results. In order to handle these cases in a graceful way and avoid Bixby responding with I couldn’t find any results that meet your criteria, Bixby supports a feature called relaxation. When a search action returns no results, the search constraints can be relaxed by either dropping an input value or replacing an input value with a less-restrictive one. The user experience would then produce something like I couldn’t find any <objects> with <constraint1>, <constraint2> and <constraint3> but here are <objects> with <new constraint list>.

To implement relaxation, you must add an on-empty block on the output of an action. Three separate relaxation techniques are described below for the findEarthquakes action.

The first method utilizes drop-contextual-inputs. If the action returns zero results based on the search constraints, then context is cleared: all previous inputs are dropped and the most recent utterance is treated like a new, top-level query. For example, if the user says “find earthquakes in Los Angeles last week,” a list of earthquakes is returned. They then follow up with “with greater than 3.0 magnitude,” which filters the list to only show earthquakes in Los Angeles last week that match the magnitude constraint. A subsequent follow-up, “how about in San Francisco,” does not return any results, because there were no earthquakes in San Francisco greater than 3.0 last week. With the drop-contextual-inputs tag, Los Angeles, last week, and 3.0 magnitude are all removed from the search constraints, and the query/API call is reissued with only the San Francisco input, producing the Bixby response: I didn’t find any 3.0+ earthquakes last week in San Francisco, but here are earthquakes in San Francisco. See below for the syntax.
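A sketch of the corresponding on-empty block on the action’s output (the output concept name is an assumption):

```
output (EarthquakeInfo) {
  // If the search returns no results, drop all contextual inputs and
  // replan using only the most recent utterance as a fresh top-level query.
  on-empty {
    drop-contextual-inputs
  }
}
```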

Instead of dropping all contextual inputs, developers can selectively decide which inputs to drop, and in which order. For example, if the user says “search for earthquakes in Los Angeles with 6.0 magnitude or greater” and if there are no earthquakes with such a high magnitude, then Bixby will remove the minMagnitude search constraint and rerun the query/API call with Los Angeles as the search region. If instead no earthquakes were found in Los Angeles (although highly unlikely), then the searchRegion would also be dropped and a search would be rerun with no input constraints. See below for the implementation.
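One hedged way to express that ordering, relaxing the tightest constraint first (the conditional syntax inside on-empty is a sketch; verify against the Bixby reference and the capsule source):

```
output (EarthquakeInfo) {
  on-empty {
    // Drop minMagnitude first; only if no magnitude constraint exists,
    // fall back to dropping the search region.
    if (exists(minMagnitude)) {
      drop (minMagnitude)
    } else-if (exists(searchRegion)) {
      drop (searchRegion)
    }
  }
}
```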

Another way to implement relaxation is by replacing an input value with a less-restrictive one. For the findEarthquakes action, if the user specifies an earthquake search radius, such as “find earthquakes within 3 miles,” and no results are returned, the code below will replace the user’s specified search radius with 25 miles.
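A sketch of replacement-based relaxation, swapping the user’s radius for a wider one (the concept names and intent syntax are assumptions; see the sample capsule for the exact form):

```
output (EarthquakeInfo) {
  on-empty {
    // Replace a too-narrow radius with a broader 25-mile search.
    if (exists(searchRadius)) {
      replace (searchRadius) {
        intent {
          goal: SearchRadius
          value: SearchRadius (25)
        }
      }
    }
  }
}
```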

In general, relaxation is designed to prevent Bixby from returning an empty list of results. It is often better to return some kind of result rather than nothing, even if that result is not exactly aligned with the user’s specific search constraints, as long as the user is informed that what is being shown is not what they originally asked for.

Conclusion

This blog described the various features of Bixby’s conversational model, including:

  • Value prompts for required missing inputs or selection prompts for disambiguation of multiple contextual inputs;
  • Default initialization to prevent unnecessary user prompting for required inputs;
  • Handling of multi-cardinal inputs in either JavaScript or modeling using iterable;
  • Replacement functionality for action inputs; and,
  • Relaxing constraints when an empty list of results is returned, by dropping inputs or replacing inputs with less-restrictive ones.

Developers should factor in the conversational behaviors described above when designing their user interaction models. In the next blog, I will provide an alternate approach to Bixby conversations and overall context management that gives the developer even more control over how contextual inputs get managed during a conversation. You can download the complete sample capsule code for the earthquakeFinder capsule from GitHub here. Or, for more in-depth tutorials, sample capsules, guides, and videos, hop on over to the Bixby Developer Center.

If you’re a developer who thinks they have what it takes, and this tutorial helped you develop a killer capsule, we want to hear from you! Get in on the very first wave of the Bixby Marketplace and apply to the Premier Developer Program today.
