In this post, we’ll cover how chatbot designs address the particular needs of enterprise applications. We compare the traditional chatbot approach to the nl2api approach.
Most chatbots and AI-based virtual assistants, whether delivered via text or voice, are at their heart a combination of two technologies: NLP (natural language processing) and decision trees. A decision tree guides the conversation as a whole, while NLP is used at each step to interpret the user’s input.
To implement a chatbot, a developer first decides what tasks they’d like to support for their users. Each task may involve multiple steps and call upon various actions in their system. For example, ordering a pizza involves the user selecting a size, picking toppings, confirming an address and payment mechanism, and estimating a time for delivery.
For each task, the developer provides inputs for each of the two components of the chatbot:
Decision tree: The developer fully maps out the conversation and all its possible paths. At each node, they must specify what system calls are run, what is communicated to the user, and what to do with the input the user provides. This is the script the conversation must follow.
NLP: For each node in the decision tree, the developer provides example user response phrases and keywords to detect. This is the semantic recognition available at each step of the conversation’s script.
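The two components can be sketched in a few lines of Python. This is a minimal illustration rather than any particular product's implementation: the `Node` class, the `TREE` contents, and the `step` function are all hypothetical, and plain keyword matching stands in for real NLP.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    prompt: str                                   # what the bot says at this step
    keywords: dict = field(default_factory=dict)  # keyword -> name of the next node

# Hypothetical, heavily trimmed pizza-ordering script.
TREE = {
    "size": Node("What size would you like?",
                 {"small": "toppings", "medium": "toppings", "large": "toppings"}),
    "toppings": Node("Which topping?",
                     {"mushroom": "confirm", "pepperoni": "confirm"}),
    "confirm": Node("Thanks, your order is placed."),
}

def step(current: str, user_input: str) -> str:
    """Return the next node name, or stay on the current node if nothing matches."""
    words = user_input.lower().split()
    for keyword, next_node in TREE[current].keywords.items():
        if keyword in words:
            return next_node
    return current  # unscripted input: the bot can only repeat its prompt
```

Calling `step("size", "A large one please")` moves the conversation to `"toppings"`, while an unplanned request such as `step("size", "Can I change my address?")` leaves it stuck on `"size"`.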
Enterprise applications, however, are more complex than consumer apps, and that complexity creates challenges for traditional decision-tree chatbots, including action, context, versioning, and cross-service complexity.
Consumer application actions typically require a small number of scalar inputs from users. In our pizza example above, the user might have to choose from three or four sizes and a dozen toppings, then confirm address and payment info.
Enterprise application actions may employ far more inputs, often not simple scalars. Take AWS Lambda’s API. Its UpdateFunctionConfiguration action has 15 inputs. Of those 15 inputs, six are objects with sub-inputs. One argument, Environment, is even more complex: an object of an object with a variable number of scalars at its base level.
How do you deal with this complexity when using a decision-tree-based chatbot? You could write a decision tree branch for every leaf node input (at least 19 of them for the Lambda action we discussed). That manual labor scales poorly and results in large decision trees that are more likely to contain bugs. Or you could enforce simplified interactions by offering fewer choices to the user, which greatly limits what can be accomplished through the chatbot. Most enterprise chatbots today take the latter option: they provide only a handful of common interactions, which limits their usefulness.
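To see how quickly the branch count grows, the sketch below counts the leaf inputs of a nested input schema, since each leaf would need its own decision tree branch. The `SCHEMA` here is a made-up example and not the real Lambda API; `count_leaves` is likewise a hypothetical helper.

```python
def count_leaves(schema) -> int:
    """Count the scalar leaf inputs in a nested input schema."""
    if isinstance(schema, dict):
        return sum(count_leaves(child) for child in schema.values())
    return 1  # anything that is not a dict is treated as a single scalar input

# Hypothetical action schema for illustration only (NOT the real Lambda API).
SCHEMA = {
    "name": "str",
    "timeout": "int",
    "network": {"subnet_ids": "list", "security_group_ids": "list"},
    "environment": {"variables": {"key": "str", "value": "str"}},
}

leaf_count = count_leaves(SCHEMA)
```

Four top-level inputs already expand to six leaves here, and every additional level of nesting multiplies the scripting work.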
Understanding context is key to carrying on a conversation. Decision trees handle context poorly because they are scripted interactions. If a user wants to make a related request that isn’t planned for at the current decision tree node, the chatbot must interpret it as a completely separate action. If the user’s request could refer to more than one thing, the chatbot has little knowledge of the current context to help it interpret the user’s intent correctly.
Changelogs for new versions of an app are also more likely to be complex for an enterprise app than a consumer app. Even if many changes take place to the backend of a consumer service, frontend changes are often minimal to preserve user experience. But for an enterprise app, new versions are more likely to include extensive changes that affect user interactions. If you’ve ever been in an organization that’s updated to a new version of an enterprise software service, you’ve probably seen how much effort IT must expend to ensure the changeover is smooth.
The more changes there are, and the more user interactions are affected, the more decision trees will break and need updating. This highlights the lack of robustness of decision trees. Every branch may need to be checked for errors.
IT groups often run chatbots for automated internal technical support, and these chatbots frequently interact with multiple software services. Capturing the interactions among these services requires more decision tree branching, again resulting in larger, buggier trees. Setting up these trees also involves significantly more work to ensure all the interactions are handled appropriately.
nl2api (natural language to API translation) is a better approach to AI text and voice assistants for enterprise applications because it introduces strong modularity. Actions are defined independently for the AI, and combining them does not require predetermined conversation scripts. To sum up the difference from traditional chatbots: decision trees define the interactions a chatbot may have, while nl2api defines the capabilities a chatbot can leverage.
The benefits of this approach are:
Handling Action Complexity
Rather than asking the user for inputs according to a decision tree, nl2api chatbots generate input questions on the fly. They also consider whether argument values can be retrieved from connected actions in the service graph, which reduces the burden on the user. The provider does not have to code these checks themselves.
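The post does not describe nl2api's internals, so the following is only a rough sketch of the idea under simple assumptions: each action declares its input and output names, and an input can be fetched from any other action that produces a value with the same name. The `Action` class, `plan_inputs`, and the example actions are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    inputs: list   # names of the values this action requires
    outputs: list  # names of the values this action produces

def plan_inputs(action, known, other_actions):
    """Decide, for each input, whether to reuse it, fetch it, or ask the user."""
    plan = {}
    for arg in action.inputs:
        if arg in known:
            plan[arg] = "known"
            continue
        # Look for a connected action that can supply this value.
        provider = next((a.name for a in other_actions if arg in a.outputs), None)
        if provider:
            plan[arg] = f"fetch via {provider}"
        else:
            plan[arg] = f"ask: What is the {arg}?"  # question generated on the fly
    return plan

send = Action("send_invoice", ["customer_id", "amount", "currency"], ["invoice_id"])
lookup = Action("lookup_customer", ["email"], ["customer_id"])
plan = plan_inputs(send, {"currency": "USD"}, [lookup])
```

In this example only `amount` results in a question to the user; `currency` is already known and `customer_id` is routed to `lookup_customer`. No conversation script was written for any of the three.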
Handling Context Complexity
The distance between actions, inputs, and outputs in the graph provides context that can be leveraged in a conversation. Nodes closer to the current action are weighted more highly for matching against user intent, especially if the user’s phrasing is ambiguous. When a user asks a sidebar question in the context of a larger request, it’s easier to understand, handle, and incorporate into the current task.
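One simple way to picture distance-based weighting is to divide each candidate's text-match score by one plus its hop count from the current action. The formula, the `GRAPH` layout, and the function names below are assumptions made for illustration; the actual nl2api weighting scheme is not specified in this post.

```python
from collections import deque

def distances(graph, start):
    """Breadth-first search: hop count from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def best_match(graph, current, text_scores):
    """Pick the candidate whose text score, discounted by distance, is highest."""
    dist = distances(graph, current)

    def weighted(action):
        hops = dist.get(action)
        # Assumption: unreachable actions get zero weight.
        return text_scores[action] / (1 + hops) if hops is not None else 0.0

    return max(text_scores, key=weighted)

# Hypothetical service graph (adjacency lists).
GRAPH = {
    "create_invoice": ["list_customers"],
    "list_customers": ["create_invoice", "list_users"],
    "list_users": ["list_customers", "list_servers"],
    "list_servers": ["list_users"],
}
```

If the user says "list them" while creating an invoice, `best_match(GRAPH, "create_invoice", {"list_customers": 0.5, "list_servers": 0.6})` picks `"list_customers"`: it is one hop away instead of three, which outweighs its slightly lower text score.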
Handling Versioning Complexity
Since nl2api actions are defined independently, updating them to a new version doesn’t break interactions with the user. Any graph edges that touch updated action nodes are automatically recomputed.
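As a sketch of why independent definitions make this possible, suppose edges are derived purely from matching output and input names. Updating one action's definition and rerunning the derivation is then enough to refresh the graph. The matching rule, `compute_edges`, and the `ACTIONS` table are hypothetical, and this version recomputes every edge for simplicity rather than only those touching the updated node.

```python
def compute_edges(actions):
    """Link producer -> consumer wherever an output name matches an input name."""
    return {
        (producer, consumer)
        for producer, p_def in actions.items()
        for consumer, c_def in actions.items()
        if producer != consumer and set(p_def["outputs"]) & set(c_def["inputs"])
    }

# Hypothetical action definitions, version 1.
ACTIONS = {
    "lookup_customer": {"inputs": ["email"], "outputs": ["customer_id"]},
    "lookup_account": {"inputs": ["email"], "outputs": ["account_id"]},
    "send_invoice": {"inputs": ["customer_id", "amount"], "outputs": ["invoice_id"]},
}
edges_v1 = compute_edges(ACTIONS)

# Version 2 of send_invoice takes an account_id instead of a customer_id.
ACTIONS_V2 = dict(ACTIONS)
ACTIONS_V2["send_invoice"] = {"inputs": ["account_id", "amount"], "outputs": ["invoice_id"]}
edges_v2 = compute_edges(ACTIONS_V2)
```

Here `edges_v1` contains only `("lookup_customer", "send_invoice")`, and after the update `edges_v2` contains only `("lookup_account", "send_invoice")`. Nothing else had to be edited by hand.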
Handling Cross-service Complexity
nl2api actions are also independent among services. Service interoperability is achieved through a set of common definitions established by PowerUser to which each service refers. Cross-service connections are therefore also automatically established where warranted.