NLP vs NLU: What’s the Difference and Why Does it Matter? The Rasa Blog

According to Zendesk, tech companies receive more than 2,600 customer support inquiries per month. Using NLU technology, you can sort unstructured data (email, social media, live chat, etc.) by topic, sentiment, and urgency (among others). These tickets can then be routed directly to the relevant agent and prioritized. Natural language processing is a subset of AI, and it involves programming computers to process massive volumes of language data.

NLU design model and implementation

In short, prior to collecting usage data, it is simply impossible to know what the distribution of that usage data will be. In other words, the primary focus of an initial system built with artificial training data should not be accuracy per se, since there is no good way to measure accuracy without usage data. Instead, the primary focus should be the speed of getting a “good enough” NLU system into production, so that real accuracy testing on logged usage data can happen as quickly as possible. Obviously, the notion of “good enough”, that is, meeting minimum quality standards such as happy path coverage tests, is also critical. The Dual Intent and Entity Transformer (DIET) model for natural language processing (NLP) is implemented in Rasa, an open-source framework.

Make sure the distribution of your training data is appropriate

This dataset distribution is known as a prior, and will affect how the NLU learns. Imbalanced datasets are a challenge for any machine learning model, with data scientists often going to great lengths to correct the imbalance. To avoid this pain, use your prior understanding to balance your dataset.
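One way to see the prior before training is to count how the examples are spread across intents. The sketch below is only an illustration: the intent names, the sample utterances, and the `min_share` cutoff are assumptions, not values taken from any particular NLU platform.

```python
from collections import Counter

def intent_distribution(examples):
    """Return each intent's share of the training examples."""
    counts = Counter(intent for _, intent in examples)
    total = sum(counts.values())
    return {intent: n / total for intent, n in counts.items()}

def underrepresented(examples, min_share=0.1):
    """List intents whose share falls below min_share (an assumed cutoff)."""
    dist = intent_distribution(examples)
    return sorted(intent for intent, share in dist.items() if share < min_share)

# Hypothetical training data: (utterance, intent) pairs.
examples = [("I'd like a burger", "order_burger")] * 8 + [
    ("cancel my order", "cancel_order"),
    ("where is my food", "check_status"),
]

print(intent_distribution(examples))
print(underrepresented(examples, min_share=0.2))
```

Intents flagged this way are candidates for more examples, or for a deliberate decision that the skew matches what you expect from real users.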

NLU design model and implementation

When building conversational assistants, we want to create natural experiences for the user, assisting them without the interaction feeling too clunky or forced. To create this experience, we typically power a conversational assistant using an NLU. Both NLP and NLU aim to make sense of unstructured data, but there is a difference between the two. Dashbot is pivoting from a reporting tool to a data discovery tool focussing on analysing customer conversations and clustering those conversations into semantically similar clusters with a visual representation of those clusters. Cognigy has an intent analyser where intent training records can be imported. With a Human-In-The-Loop approach, records can be manually added to an intent, skipped or ignored.

NLU design: How to train and use a natural language understanding model

Note that the above recommended partition splits are for production usage data only. So in the case of an initial model prior to production, the split may end up looking more like 33%/33%/33%. On the other hand, natural language processing is an umbrella term to explain the whole process of turning unstructured data into structured data. NLP helps technology to engage in communication using natural human language. As a result, we now have the opportunity to establish a conversation with virtual technology in order to accomplish tasks and answer questions.
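A three-way partition like the 33%/33%/33% split can be sketched in a few lines. This is a minimal illustration, assuming the data is a flat list of utterances; the function name `split_dataset` and its parameters are hypothetical, and the fractions are left configurable because the production splits are not restated here.

```python
import random

def split_dataset(utterances, fractions=(1/3, 1/3, 1/3), seed=0):
    """Shuffle and partition utterances into train, validation, and test sets."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    items = list(utterances)
    # A seeded shuffle keeps the split reproducible between runs.
    random.Random(seed).shuffle(items)
    n_train = round(len(items) * fractions[0])
    n_val = round(len(items) * fractions[1])
    train = items[:n_train]
    validation = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # whatever remains goes to the test set
    return train, validation, test

utterances = [f"utterance {i}" for i in range(9)]
train, validation, test = split_dataset(utterances)
print(len(train), len(validation), len(test))
```

Fixing the seed means the same utterances land in the same set every time, which makes results comparable when the model is retrained.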

NLU design model and implementation

This way, the sub-entities of BANK_ACCOUNT also become sub-entities of FROM_ACCOUNT and TO_ACCOUNT; there is no need to define the sub-entities separately for each parent entity. We also include a section of frequently asked questions (FAQ) that are not addressed elsewhere in the document. A dialogue manager uses the output of the NLU and a conversational flow to determine the next step. With this output, we would choose the intent with the highest confidence, which is order burger.
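Choosing the highest-confidence intent can be sketched as follows. The response shape, the intent names, and the confidence values are assumptions made for illustration; real NLU platforms each return their own schema.

```python
def top_intent(nlu_output):
    """Return the (name, confidence) pair with the highest confidence."""
    best = max(nlu_output["intents"], key=lambda intent: intent["confidence"])
    return best["name"], best["confidence"]

# Hypothetical NLU output for a single user message.
nlu_output = {
    "text": "I'd like a burger please",
    "intents": [
        {"name": "order_burger", "confidence": 0.87},
        {"name": "order_drink", "confidence": 0.09},
        {"name": "cancel_order", "confidence": 0.04},
    ],
}

name, confidence = top_intent(nlu_output)
print(name, confidence)
```

A dialogue manager would then use the returned intent name, together with the conversational flow, to pick the next step.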

What is natural language understanding (NLU)?

This section provides best practices around creating artificial data to get started on training your model. In the data science world, Natural Language Understanding (NLU) is an area focused on communicating meaning between humans and computers. It covers a number of different tasks, and powering conversational assistants is an active research area. These research efforts usually produce comprehensive NLU models, often referred to as NLUs.

Understanding your end user and analyzing live data will reveal key information that will help your assistant be more successful. It breaks the train/test split that is recommended in data science, but in practice it creates an effective rule set for your model to follow. NLU helps computers to understand human language by analyzing and interpreting the basic parts of speech separately. NLU is an AI-powered solution for recognizing patterns in human language.

Entity spans

To run the code, you just need your dialogue manager key and a Python environment. Once you clone the GitHub repository, the README will outline the steps on how to do so. We’ll split this section into a general interface portion and a Voiceflow-specific implementation. However, most NLUs don’t have built-in functionality to run tests, so we have to write our own wrapper code, which we’ll cover in this section.
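The general interface for such a wrapper might look like the sketch below. It is not the code from the repository: `run_nlu_tests` is a hypothetical helper, and `keyword_predict` is a stand-in for a real NLU call, which would normally send the utterance to your platform and return the predicted intent name.

```python
from typing import Callable, List, Tuple

def run_nlu_tests(predict: Callable[[str], str], cases: List[Tuple[str, str]]):
    """Run each (utterance, expected_intent) case and collect the failures."""
    failures = []
    for utterance, expected in cases:
        actual = predict(utterance)
        if actual != expected:
            failures.append((utterance, expected, actual))
    accuracy = (len(cases) - len(failures)) / len(cases) if cases else 0.0
    return accuracy, failures

def keyword_predict(utterance: str) -> str:
    """Stand-in predictor (assumption): replace with a call to your NLU."""
    text = utterance.lower()
    if "burger" in text:
        return "order_burger"
    if "cancel" in text:
        return "cancel_order"
    return "fallback"

# Hypothetical test cases: (utterance, expected intent).
cases = [
    ("I want a burger", "order_burger"),
    ("please cancel that", "cancel_order"),
    ("two fries", "order_fries"),
    ("a cheeseburger please", "order_burger"),
]

accuracy, failures = run_nlu_tests(keyword_predict, cases)
print(f"accuracy: {accuracy:.2f}")
for utterance, expected, actual in failures:
    print(f"FAIL: {utterance!r} expected {expected}, got {actual}")
```

Because the predictor is passed in as a plain function, the same harness can be pointed at different NLUs by swapping in a different `predict` implementation.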

NLU design model and implementation

I explore and write about all things at the intersection of AI and language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces and more. The aim of this comparison is to explore the intersection of NLU design and the tools which are out there. Some of the frameworks are very much closed and there are areas where I made assumptions. Botium focusses on testing in the form of regression, end-to-end, voice, security and NLU performance.

Saga Natural Language Understanding (NLU) Framework

Whenever possible, design your ontology to avoid having to perform any tagging that is inherently very difficult. Designing a model means creating an ontology that captures the meanings of the sorts of requests your users will make. In this section we learned about NLUs and how we can train them using the intent-utterance model. In the next set of articles, we’ll discuss how to optimize your NLU using an NLU manager.

  • Create a story or narrative from the data by creating clusters which are semantically similar.
  • A regular list entity is used when the list of options is stable and known ahead of time.
  • It’s used to extract amounts of money, dates, email addresses, times, and distances.
  • With new requests and utterances, the NLU may be less confident in its ability to classify intents, so setting confidence thresholds will help you handle these situations.
  • If you don’t have an existing application which you can draw upon to obtain samples from real usage, then you will have to start off with artificially generated data.
  • Note that if the validation and test sets are drawn from the same distribution as the training data, then we expect some overlap between these sets (that is, some utterances will be found in multiple sets).
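The point above about handling low-confidence classifications can be sketched as a simple cutoff with a fallback. The `0.6` threshold, the `fallback` intent name, and the dictionary of scores are all assumptions chosen for illustration; a suitable cutoff depends on your NLU and your data.

```python
FALLBACK_INTENT = "fallback"  # hypothetical name for the "didn't understand" path

def classify_with_threshold(intents, threshold=0.6):
    """Pick the top intent, or fall back when its confidence is below threshold."""
    if not intents:
        return FALLBACK_INTENT
    name, confidence = max(intents.items(), key=lambda pair: pair[1])
    return name if confidence >= threshold else FALLBACK_INTENT

# Hypothetical confidence scores keyed by intent name.
print(classify_with_threshold({"order_burger": 0.87, "order_drink": 0.09}))
print(classify_with_threshold({"order_burger": 0.41, "order_drink": 0.38}))
```

In the second call no intent clears the cutoff, so the assistant can ask the user to rephrase instead of acting on a weak guess.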

This sounds simple, but categorizing user messages into intents isn’t always so clear cut. What might once have seemed like two different user goals can start to gather similar examples over time. When this happens, it makes sense to reassess your intent design and merge similar intents into a more general category. It is the only platform that provides access to different symbolic, hybrid and ML models, with full transparency and control over the design, development and deployment process. Leverage the full set of functionalities in any step of the workflow, from data ingestion and converting to tuning and development. An important part of NLU training is making sure that your data reflects the context of where your conversational assistant is deployed.

Run data collections rather than rely on a single NLU developer

Natural language processing has made inroads for applications to support human productivity in service and ecommerce, but this has largely been made possible by narrowing the scope of the application. There are thousands of ways to request something in a human language that still defies conventional natural language processing. “To have a meaningful conversation with machines is only possible when we match every word to the correct meaning based on the meanings of the other words in the sentence – just like a 3-year-old does without guesswork.” Integrating and using these models in business operations seem daunting, but with the right knowledge and approach, it proves transformative. Trained NLU models sift through unstructured data, interpret human language, and produce insights to guide decision-making processes, improve customer service, and even automate tasks. But with natural language processing and machine learning, this is changing fast.