How to Perform a Chatbot Review
Reviewing chatbots is part of our job.
We obviously have to make sure our own work is up to our standards. We are strict with ourselves and reviewing our chatbots is a big part of the client handover.
On the other hand, we also need to keep up with the competition. Reviewing chatbots they build is important in that regard.
Being able to review chatbots is a valuable skill to have. Whether you want to keep tabs on the industry's big players or check up on your chatbot company's work, the following will help.
The professional chatbot review checklist
Editor's note: After countless requests from our readers, we decided to release the internal checklist we go through when reviewing chatbots. We, along with our clients, use this checklist to audit the chatbot solutions their competitors may have put out there and to inform their own feature planning.
Our worksheet and its associated 'how-to' document are now available.
Grab the Professional Chatbot Review Checklist
Now, back to the article!
Basic vs. advanced
Before we jump in, I want to make it clear this is an introductory article to reviewing chatbots. There is a lot that goes into performing QA on AI-driven machines (as you can imagine).
You will see I introduce a rating system for each section. You will give each section a score out of five and do some simple math to determine the quality of the chatbot you are reviewing.
In the real world, the world of leading chatbot companies, we not only have more sections to review, we also have a unique weighting system. A 1/5 score in NLU is not equal to a 1/5 score in objective effectiveness or a 1/5 in conversational UX.
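To make the idea of weighting concrete, here is a minimal Python sketch. The weights are made-up placeholders (they are not our internal ones), and the section names are simply the four sections covered in this article. The function name `weighted_score` is also just an illustration.

```python
from typing import Mapping

# Hypothetical weights, for illustration only. They are NOT the real
# weighting system mentioned above; pick your own.
WEIGHTS = {
    "objective_effectiveness": 4,
    "nlu": 3,
    "bells_and_whistles": 1,
    "conversational_ux": 2,
}


def weighted_score(scores: Mapping[str, int],
                   weights: Mapping[str, int] = WEIGHTS) -> float:
    """Return the weighted average of the section scores, out of five."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same sections")
    for section, value in scores.items():
        if not 0 <= value <= 5:
            raise ValueError(f"{section}: score must be between 0 and 5")
    total_weight = sum(weights.values())
    return sum(scores[s] * weights[s] for s in weights) / total_weight


# Two chatbots that each score 1/5 in exactly one section.
weak_nlu = {"objective_effectiveness": 5, "nlu": 1,
            "bells_and_whistles": 5, "conversational_ux": 5}
weak_visuals = {"objective_effectiveness": 5, "nlu": 5,
                "bells_and_whistles": 1, "conversational_ux": 5}

print(weighted_score(weak_nlu))      # 3.8
print(weighted_score(weak_visuals))  # 4.6
```

With these placeholder weights, a 1/5 in NLU drags the result down further than a 1/5 in visual elements, which is the point of weighting in the first place.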
Now that we're clear on this, let's get started.
Our more avid readers will notice a similarity between this article and our earlier one on planning the best chatbot. Following that planning process should, indeed, give you a decent chatbot that passes all our tests.
Objective effectiveness
The first factor we are looking at is the simplest and most important one: does it do its job? Is it achieving its one true goal? If yes, how effectively is it achieving it?
A chatbot's one true goal is something most developers and companies forget about. There is a lot we can achieve with chatbots and it is easy to get distracted. In this section, you must go back to the root purpose of the chatbot.
The OTG (One True Goal) of the Domino's chatbot is to sell pizzas. That's it. Is it achieving that? Is it doing a good job whilst achieving it?
Natural language understanding (NLU)
I'm going to skip right over natural language processing and straight into NLU.
As a reminder, NLU is a subset of NLP. The difference is complex and outside the scope of this article. In the simplest terms, NLP refers to all the systems that allow machines and humans to communicate. NLU is a subset of that domain and refers to the machine's capacity to understand human input (text, voice, etc.), including mispronunciations, spelling mistakes, weird phrasing, etc.
In our basic chatbot review, we want to evaluate the NLU of the chatbot. Is it properly understanding what you are saying? How easily can you make it fall back on linguistic deflections, the canned replies a chatbot gives when it has not understood you?
Talk to the chatbot about what it is supposed to know. Keeping the conversation on topic is important here. If I started talking to the Domino's chatbot about farming in Eastern Europe, I'd easily 'break it', but that's not the point.
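If you want a number to back up your NLU score, one option is to tally how often the chatbot deflects on messages it should understand. The Python sketch below is only an illustration: `ask_bot` is a hypothetical stand-in for however you send a message to the chatbot and read its reply, and the deflection phrases are assumptions you would swap for the ones your chatbot really uses.

```python
from typing import Callable, Iterable

# Assumed deflection phrases, for illustration only. Replace them with
# the fallback replies of the chatbot you are reviewing.
FALLBACK_MARKERS = ("sorry, i didn't understand", "can you rephrase")


def is_deflection(reply: str) -> bool:
    """Return True if the reply contains one of the known fallback phrases."""
    text = reply.lower()
    return any(marker in text for marker in FALLBACK_MARKERS)


def deflection_rate(ask_bot: Callable[[str], str],
                    utterances: Iterable[str]) -> float:
    """Return the share of utterances (0.0 to 1.0) answered with a deflection."""
    replies = [ask_bot(utterance) for utterance in utterances]
    if not replies:
        raise ValueError("need at least one utterance")
    return sum(is_deflection(reply) for reply in replies) / len(replies)


# On-topic test messages, including a typo and some odd phrasing.
ON_TOPIC = [
    "I want a pizza",
    "i wnat a piza",
    "large pepperoni pizza please",
    "what's on the menu?",
]
```

Call `deflection_rate(ask_bot, ON_TOPIC)` with your own `ask_bot` function. A lower rate on on-topic messages suggests stronger NLU, but treat it as one input to your score rather than the score itself.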
Proper use of bells and whistles
Most platforms chatbots live on have fun display elements. Think of Facebook Messenger and its carousel.
Visual features often help the user find their way through the chatbot. Proper use of buttons can improve conversational UX a great deal. In this section, you want to review these features.
Is the chatbot all text? Is it using visual elements like images, GIFs, videos, buttons, slides? If it is, do these elements make sense?
You can easily picture chatbots that should not include fancy stuff (e.g. a chatbot that helps you fill out your taxes). The key here is to evaluate the proper use of these features, not whether the chatbot has them or not.
Conversational UX
Conversational UX is one of the most interesting topics to come out of this whole chatbot malarkey, at least to me. It refers to the way conversations flow between the chatbot and its users.
As you can imagine, there is a lot that brands can do. It opens a wide array of opportunities for brands to differentiate themselves from their competitors.
For this section, you are going to look at the conversation flow. Does it all make sense? Is the chatbot taking you down a path that is relevant to what you are trying to achieve? Can you divert in the middle of a conversation? Can you change the topic?
You want to evaluate not only the effectiveness of the chatbot but also its language. Is it using language that is appropriate for your conversation? Is it using a tone that makes sense? Does it sound robotic?
Chatbot review: scoring and evaluation
You have gone through the basic sections we use to review chatbots. Well done. You should now have a score out of five for each of these. A quick addition will give you the chatbot's final score.
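If you would rather not do the addition by hand, here is a minimal Python sketch of the tally. The section names match the four sections in this article; the example scores are invented purely for illustration.

```python
# The four sections reviewed in this article.
SECTIONS = (
    "objective_effectiveness",
    "nlu",
    "bells_and_whistles",
    "conversational_ux",
)
MAX_PER_SECTION = 5


def final_score(scores):
    """Add up the section scores and return (total, maximum possible)."""
    missing = [section for section in SECTIONS if section not in scores]
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")
    for section in SECTIONS:
        if not 0 <= scores[section] <= MAX_PER_SECTION:
            raise ValueError(f"{section}: score must be between 0 and 5")
    total = sum(scores[section] for section in SECTIONS)
    return total, MAX_PER_SECTION * len(SECTIONS)


# Invented example scores for an imaginary chatbot.
example = {
    "objective_effectiveness": 4,
    "nlu": 3,
    "bells_and_whistles": 5,
    "conversational_ux": 2,
}
total, maximum = final_score(example)
print(f"{total}/{maximum}")  # 14/20
```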
Here are a few notes to keep in mind:
- This is very basic. There is a lot that goes into reviewing a chatbot properly. This should give you a great head start, though.
- Scoring is hard. What makes good conversational UX for you might not work for your colleague. Keep that in mind.
- Chatbots are iterative creatures. Don't be too harsh on a company based on their chatbot's performance. Chances are they are still iterating through different phases.
If you are reviewing your company's own chatbot, make sure you test it with real users. Internal testing is never enough. Good luck, and send us your best (or worst) chatbots!