Evaluating the Usability of a Healthcare Chatbot
A team of researchers at Ulster University in Northern Ireland has recently carried out a study exploring ways in which developers can assess the usability of healthcare chatbots.
By Ingrid Fadelli
October 15, 2019
Over the past few years, chatbots have become increasingly popular, as they can offer valuable guidance, assistance, and information almost instantly. Evaluating how easy it is for humans to use chatbots, however, can be quite challenging. This is mainly because the tools typically used to assess the user experience (UX) of applications are often poorly suited to testing chatbots.
With this in mind, a team of researchers at Ulster University in Northern Ireland has recently carried out a study exploring ways in which developers can assess the usability of chatbots. They did so by assessing users' experiences with WeightMentor, a healthcare chatbot they developed as a motivational tool to assist individuals with weight loss maintenance.
The power of motivation
Dr. Anne Moorhead, one of the researchers involved in the study, has been pursuing research related to obesity and health for many years. Recently she has been investigating how motivational messages can persuade and encourage people with obesity to develop healthier eating and lifestyle habits.
“The traditional technology methods for sending motivational messages was through SMS and text messaging,” Dr. Moorhead told discover.bot. “However, with technology advancing and with the emergence of chatbots, there became a new opportunity to evolve this research into motivational conversations via artificially intelligent conversational agents.”
The findings she gathered in her past studies inspired Dr. Moorhead to develop a chatbot, called WeightMentor, in collaboration with PhD researcher Sam Holmes, Dr. Raymond Bond, Prof. Huiru Zheng, Prof. Vivien Coates, and Prof. Michael McTear at Ulster University. WeightMentor uses natural language processing (NLP) in the cloud to manage conversations between a human user and a computer.
Integrating social media
Users communicate with WeightMentor via Facebook Messenger, so they can initiate a dialogue with the bot from their smartphones, computers, and a variety of other devices. The healthcare bot is designed to imitate human dialogue styles, making conversations with users more credible and engaging.
“Perhaps this deceives the user to the point where they treat the chatbot as a human and not a computer even when the computer fully discloses that it's not a human,” Dr. Bond, another researcher involved in the study, told discover.bot. “To avoid ambiguity or mistakes in the dialogue, we rely heavily on the computer leading the conversation.”
Quick replies
The healthcare bot developed by the researchers also features quick replies, which allow a user to answer a question by selecting one of a series of options. For instance, when the bot asks a user what they would like to do, the user can pick from a list of the features the bot supports. According to the researchers, this quick reply option is not necessarily limiting, as it can also expedite the conversation and direct users more rapidly to the tool they are looking for.
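To give a sense of what a quick reply looks like to a developer, here is a minimal Python sketch that builds a message in the general shape Facebook Messenger expects for quick replies. The function name, the recipient placeholder, and the option titles and payload strings are all hypothetical illustrations, not taken from WeightMentor itself, and the sketch only constructs the message rather than sending it.

```python
import json


def build_quick_reply_message(recipient_id, prompt, options):
    """Build a Messenger-style message that offers `options` as quick replies.

    `options` is a list of (title, payload) pairs. The titles and payloads
    used below are hypothetical examples, not WeightMentor's real menu.
    """
    return {
        "recipient": {"id": recipient_id},
        "messaging_type": "RESPONSE",
        "message": {
            "text": prompt,
            "quick_replies": [
                {"content_type": "text", "title": title, "payload": payload}
                for title, payload in options
            ],
        },
    }


message = build_quick_reply_message(
    "USER_ID_PLACEHOLDER",  # placeholder, not a real Messenger user ID
    "What would you like to do?",
    [("Log a meal", "LOG_MEAL"), ("See my progress", "SHOW_PROGRESS")],
)
print(json.dumps(message, indent=2))
```

When the user taps an option, the bot receives the matching payload string instead of free text, which is one reason a bot-led conversation leaves less room for ambiguity.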
“Our chatbot also has a personality,” Dr. Bond said. “For example, we designed the chatbot to exhibit humor to increase engagement and we even use animated GIFs to simulate human-like multimedia conversations that you get today on common messaging applications, ensuring that the user does not get the same GIF twice.”
Engaging dialogue
To make dialogues with users more realistic and engaging, the researchers used a backend database to log each user's interactions with the chatbot. This database acts as WeightMentor’s long-term memory: by checking what it has already sent, the bot can avoid repeating the same expressions and never sends a user the same GIF twice, which introduces greater variability into the conversation.
“This is important because in real life, for example, the same person might change how they greet people, hence variability is likely to increase engagement and avoid being too robotic,” Dr. Bond said.
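The idea of a "long-term memory" that prevents repeats can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `GifMemory` class, the `sent_gifs` table, and the GIF identifiers are hypothetical, and the use of SQLite is a stand-in, since the article does not say which database or schema the researchers used.

```python
import random
import sqlite3


class GifMemory:
    """Hypothetical per-user log of GIFs that have already been sent."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS sent_gifs ("
            "user_id TEXT, gif_url TEXT, PRIMARY KEY (user_id, gif_url))"
        )

    def pick_unseen(self, user_id, candidates, rng=random):
        """Return a GIF this user has not seen yet, or None if none remain."""
        rows = self.db.execute(
            "SELECT gif_url FROM sent_gifs WHERE user_id = ?", (user_id,)
        ).fetchall()
        seen = {url for (url,) in rows}
        unseen = [url for url in candidates if url not in seen]
        if not unseen:
            return None  # caller could fall back to a text-only reply
        choice = rng.choice(unseen)
        # Record the choice so it is never offered to this user again.
        self.db.execute("INSERT INTO sent_gifs VALUES (?, ?)", (user_id, choice))
        self.db.commit()
        return choice


memory = GifMemory()
gifs = ["gif_a", "gif_b", "gif_c"]  # placeholder identifiers
picks = [memory.pick_unseen("user-1", gifs) for _ in range(4)]
print(picks)
```

The first three picks cover every GIF exactly once in a random order, and the fourth returns `None` because nothing unseen is left for that user.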
Facilitating long-term changes
WeightMentor is ultimately designed to help people who are overweight become more aware of their eating habits, encouraging them to consume healthier food and make wiser lifestyle choices. For instance, the bot asks users to input their energy intake and expenditure on a crude subjective scale. It then stores this data in its long-term memory and offers regular feedback on the user’s behaviour and patterns, displayed as a time series plot.
“This allows the user to discover patterns about themselves, since we as humans do not remember what our calorie intake was yesterday—let alone last week—and we are certainly unaware of any fluctuations or periodicities,” noted Dr. Bond. “This visual feedback is a kind of quantified self enabling the user to set goals that they want to reach.”
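As a rough illustration of how such self-reported entries could be turned into a time series, the Python sketch below stores daily ratings and derives a simple intake-minus-expenditure balance per day. The 1-to-5 scale, the `DailyEntry` type, and the text-based plot are all assumptions made for the example; the article does not describe the actual scale or charting used in WeightMentor.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DailyEntry:
    day: date
    intake: int       # assumed scale: 1 (very low) to 5 (very high)
    expenditure: int  # same assumed scale


def balance_series(entries):
    """Return (day, intake - expenditure) pairs in chronological order."""
    ordered = sorted(entries, key=lambda e: e.day)
    return [(e.day, e.intake - e.expenditure) for e in ordered]


def text_plot(series):
    """Render the series as one line per day, a stand-in for a real chart."""
    lines = []
    for day, balance in series:
        bar = "+" * balance if balance > 0 else "-" * -balance
        lines.append(f"{day.isoformat()} {balance:+d} {bar}".rstrip())
    return "\n".join(lines)


entries = [
    DailyEntry(date(2019, 10, 2), intake=3, expenditure=3),
    DailyEntry(date(2019, 10, 1), intake=4, expenditure=2),
    DailyEntry(date(2019, 10, 3), intake=2, expenditure=3),
]
series = balance_series(entries)
print(text_plot(series))
```

Even on a coarse scale, laying the days out in order makes it easier to spot runs of days where intake outpaced expenditure.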
The key feature of WeightMentor inspired by Dr. Moorhead’s research, however, is its ability to generate and send motivational messages. These messages are meant to support people in their path towards losing weight, encouraging them to make healthier eating choices in their day-to-day life.
“The motivational messages are tailored based on the user’s state and context in the conversation,” Dr. Bond said. “This is driven using a smart search of a database of messages that are semantically organised for rapid extraction for use in a relevant user context.”
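One simple way to picture context-based message selection is a lookup over tagged messages, sketched below in Python. This is not the researchers' method: the tags, the example messages, and the "most specific match wins" rule are hypothetical, and the article does not explain how the "smart search" or the semantic organisation of the database actually work.

```python
# Hypothetical message store: each message carries the context tags it suits.
MESSAGES = [
    ({"state": "struggling", "topic": "snacking"},
     "One snack does not undo a week of effort."),
    ({"state": "struggling"},
     "Tough days happen. What is one small step for tomorrow?"),
    ({"state": "on_track"},
     "You are on track. Keep doing what works."),
]


def select_message(context, messages=MESSAGES):
    """Return the message whose tags all match `context`, preferring the
    message with the most tags (the most specific match). Returns None
    when nothing matches."""
    best_text, best_score = None, 0
    for tags, text in messages:
        matches = all(context.get(key) == value for key, value in tags.items())
        if matches and len(tags) > best_score:
            best_text, best_score = text, len(tags)
    return best_text


print(select_message({"state": "struggling", "topic": "snacking"}))
print(select_message({"state": "struggling", "topic": "exercise"}))
```

A user who is struggling with snacking gets the snacking-specific message, while a user who is struggling with something else falls back to the more general one.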
Testing reliability
In their recent study, the researchers wanted to test the usability of the WeightMentor chatbot in a reliable way. To do this, they developed a new questionnaire for testing chatbot usability and compared its results with those obtained using conventional usability metrics.
The usability testing protocol adopted by Dr. Moorhead, Dr. Bond, and their team focused on a series of key aspects, including how easy it was for a user to complete a task, which unique usability errors showed up, and what users thought about the bot. The results of their usability test suggest that WeightMentor is very easy to use, while also highlighting some of the advantages of chatbots overall.
“Firstly, we found that chatbots are innately user friendly,” Dr. Bond said. “Chatbots are arguably more user friendly compared to graphical user interfaces because they do not have cumbersome menus, multiple navigation bars, and graphical items that compete for our attention. They are also very easy to use, as they communicate using natural language and they only require the user to read one utterance at a time and respond, as opposed to making sense of a cluttered graphical user interface.”
Moving forward
The usability study carried out by Dr. Moorhead, Dr. Bond, Mr. Holmes, and their colleagues suggests that a user’s computer literacy may have very little influence on their ability to engage with a chatbot. In fact, in contrast with other technological tools or platforms, chatbots are quite straightforward and intuitive, as users simply need to type messages and read the bot’s responses.
“We also investigated whether traditional usability measurement tools can be used to assess the usability of chatbots,” Dr. Bond added. “While researching usability testing methods, we found impetus to design a new tool to specifically assess the usability of a chatbot, as most usability instruments today were designed specifically for assessing graphical user interfaces and traditional software systems.”
The researchers validated the feasibility and effectiveness of their usability testing tool, called the Chatbot Usability Questionnaire, in a series of different scientific tests. They will be publishing the results of these validation studies in a forthcoming scientific paper.
Dr. Moorhead, Dr. Bond, and the rest of their team are now planning to conduct a trial of WeightMentor to measure how effective it is in helping people to lose weight, as well as to maintain or sustain weight loss over time.
References
Holmes, S., A. Moorhead, R. Bond, H. Zheng, V. Coates, and M. McTear. “Usability testing of a healthcare chatbot: Can we use conventional methods to assess conversational user interfaces?” Proceedings of the 31st European Conference on Cognitive Ergonomics, September 10–13, 2019. https://dl.acm.org/citation.cfm?id=3335094