
Improving Your Chatbot Performance Through Chatbot Testing Strategy

Because A/B testing allows you to experiment with almost every aspect of an interface, it is an important tool for understanding the effectiveness of a given feature. A/B testing your chatbot allows you to make smarter, data-informed choices that resonate with your users and help achieve business goals.

September 22, 2020

A/B testing allows you to experiment with almost every aspect of an interface, making it an important tool for understanding the effectiveness of a given feature. As a chatbot testing strategy, A/B testing allows you to make smarter, data-informed choices that can improve your chatbot’s performance.

A/B testing lets you compare the performance of two versions of an experience on a website or in an app. The method isn’t limited to websites and apps anymore: you can also use A/B testing as a chatbot testing strategy to measure and improve your chatbot’s performance.

There are three parts:

  • The problem. The issue you want to solve
  • The control. The existing feature or experience you’re testing against; without it, you wouldn’t know whether your experiment was successful
  • The proposed solution. The new approach you’ve come up with to solve the problem—your hypothesis

After you have those three things, the A/B test splits your traffic, showing the control to half of your visitors and showing the proposed solution to the other half.
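The 50/50 split described above can be sketched as a small function. This is a minimal illustration, not how any particular testing tool does it: the function name `assign_variant`, the experiment name `welcome-message`, and the hash-based bucketing are all assumptions made for the example.

```python
import hashlib


def assign_variant(user_id: str, experiment: str = "welcome-message") -> str:
    """Place a user in 'control' or 'treatment' with a 50/50 split.

    The experiment name and this bucketing scheme are hypothetical;
    they only illustrate one way to split traffic consistently.
    """
    # Hash the experiment name together with the user ID so the same user
    # always lands in the same group for this experiment.
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest, 16) % 100  # a stable bucket from 0 to 99
    return "control" if bucket < 50 else "treatment"


if __name__ == "__main__":
    for uid in ["user-1", "user-2", "user-3"]:
        print(uid, assign_variant(uid))
```

Hashing instead of picking a group at random on every visit means a returning user keeps seeing the same version, which keeps the two groups cleanly separated.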

Companies use A/B testing to drive decisions that increase conversion rates—that is, out of the total number of users, how many complete the desired action. For example, Fab used A/B testing to figure out what would happen if it added text to its cart button.

  • The problem. Not enough people were using the cart button.
  • The control. Fab’s existing cart button.
  • The proposed solution. Will adding text to the cart button get more customers to put products in their cart?

The experiment resulted in a 49 percent increase in the cart button’s click-through rate (CTR). Because of this outcome, Fab made an educated, data-driven decision to add text to its cart button.
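An increase like this is usually reported as a relative lift: the change in CTR divided by the control’s CTR. The sketch below shows the arithmetic. The click and impression counts are invented for illustration and are not Fab’s actual data, and the function names are hypothetical.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Share of impressions that resulted in a click."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return clicks / impressions


def relative_lift(control_rate: float, variant_rate: float) -> float:
    """Change in rate relative to the control, e.g. 0.49 for a 49% lift."""
    if control_rate <= 0:
        raise ValueError("control_rate must be positive")
    return (variant_rate - control_rate) / control_rate


# Illustrative numbers only (not taken from the Fab experiment).
control_ctr = click_through_rate(clicks=100, impressions=1000)
variant_ctr = click_through_rate(clicks=149, impressions=1000)
print(f"Control CTR: {control_ctr:.1%}, variant CTR: {variant_ctr:.1%}")
print(f"Relative lift: {relative_lift(control_ctr, variant_ctr):.0%}")
```

With these made-up counts, a gain of 4.9 percentage points on a 10 percent baseline works out to a relative lift of 49 percent, which is why it helps to state which kind of increase a result refers to.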

Companies all over the world use A/B testing to boost conversions and revenue. The beauty of A/B testing is that it encourages you to constantly tweak your chatbot for the better and to tackle challenges head-on.

With A/B testing, you can transform the way you develop and improve your chatbots. It allows you to challenge your assumptions and to back up your wild ideas with data. It’s the perfect marriage of creativity and insight. Read on to learn how you can use A/B testing as a form of chatbot performance testing to empower and improve your existing chatbot capabilities.

Why A/B testing matters for your chatbot

Chatbots on the frontlines of customer service determine the first impressions that consumers have of your business. And we’re all well aware of just how much first impressions matter.

According to research from Solvvy, 24% of customers who have a good first impression are likely to remain loyal for up to two years. In other words, the way that your chatbot interacts with people can make the difference between a loyal consumer and a one-time visitor.

You can use A/B testing to drive decisions that affect your chatbot design and chatbot conversations. When you know how chatbot design elements, physical location, and voice and tone affect your users, you can make better-informed decisions and, in turn, help your bot be more impactful.

A/B testing helps you understand how you can create a chatbot experience that pleases your users and grows your business. And it’s fun! A/B testing allows you to play mad scientist, except that, instead of building a Frankenstein monster, you get to tweak aspects of your bot until you create something users love.

How to choose an A/B testing tool

Just a few years ago, there weren’t many analytical tools and platforms for chatbots. Now there are several chatbot-focused analytics companies, giving you more options for A/B testing. To help you get started, we put together a list of pros and cons of popular platforms that can help you run A/B tests and evaluate the KPIs that measure chatbot performance.



Botanalytics

With Botanalytics, you’re able to analyze conversations that people have with your bot. The company’s goal is to help you build better conversational flows.

Pros

  • Integrates with all platforms
  • Uses sentiment analysis and conversational data analytics backed by machine learning
  • Filters user conversations based on how they interact with your bot
  • Tracks goals with funnels and event segments

Cons

  • Setup can be difficult for non-coders



Botlytics

Botlytics lets you keep count of all the messages your bot sends and receives.

Pros

  • Works with any platform
  • Uses queries to track messages that have certain keywords

Cons

  • Doesn’t give you much insight compared to other platforms



Chatbase

Chatbase is a chatbot analytics platform made by Google.

Pros

  • Creates custom funnels with user intent in mind
  • Shows active users, sessions, and retention
  • Uses session flow to help figure out when users are exiting your bot

Cons

  • Difficult to use for non-coders



Dashbot

Dashbot is an analytical tool for conversational interfaces, like Facebook, Alexa, Google Home, Slack, and Kik.

Pros

  • Integrates with many platforms
  • Shows sentiment analysis, conversational analytics, Slack teams, and multi-user sessions
  • Offers funnel analysis
  • Provides real-time transcripts of conversations
  • Stores every conversation between user and bot

Cons

  • Only available for conversational interfaces


Facebook Analytics for Messenger bots

Facebook has a built-in tool for measuring the effectiveness of a Facebook Messenger bot.

Pros

  • Allows customers to rate your bot and to leave reviews
  • Monitors spam and block rates
  • Integrates with your existing analytics provider
  • Shows comprehensive messaging metrics

Cons

  • Only works on Facebook Messenger


How to analyze results of your A/B test

A/B chatbot performance testing can answer a wide range of questions. The most common question is, “Which version has a higher conversion rate?” But don't test things just for the sake of testing. Use thought-out hypotheses, and have a focused metric you are looking to move for your chatbot. Make sure to track all KPIs, because many of your discoveries may not be the ones you were originally looking for.

As a chatbot owner, you should be tracking the following metrics to get a good sense of how your bot is performing. This information is helpful when you’re conducting other tests and experiments, too, even beyond A/B tests.

These bot performance metrics give you actionable insights to help you create a better bot experience for your users:

  • Active users. People who read a chatbot message within a specific time period
  • Engaged users. People who are conversing with the chatbot
  • New users. The number of new users using your chatbot
  • Total conversations. The conversations started and completed in a day (a metric commonly used in ecommerce)
  • Total users. The number of people using your chatbot

You also want to measure messages sent by the bot, messages sent by the user, and new conversations. When conducting A/B tests, pay close attention to the following metrics:

  • Activation rate. How often users respond to your chatbot with a relevant question or answer
  • Confusion triggers. The number of times your chatbot misinterprets a message or fails to understand a message
  • Fallback rate (FBR). How often your chatbot fails a task
  • Goal completion rate (GCR). How often a conversation with your chatbot ends with the user completing the intended goal
  • Retention rate. The number of users who return to a chatbot within a specific time period (varies, depending on the analytics tool you’re using)
  • Self-service rate. How well a chatbot can resolve a request without human intervention
  • User satisfaction. Survey results that show how pleased users are with your chatbot’s performance
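To compare these metrics between the two versions of a bot, you can compute them per variant from your conversation logs. The sketch below is only an illustration: the `Conversation` record, its field names, and the exact rate definitions (for example, fallbacks divided by bot messages) are assumptions, and your analytics tool may define each metric differently.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    """A hypothetical log record; field names are assumptions."""
    variant: str           # "control" or "treatment"
    bot_messages: int      # messages the bot sent in this conversation
    fallbacks: int         # bot replies that were "I didn't understand" fallbacks
    goal_completed: bool   # the user reached the intended goal
    handed_to_human: bool  # the conversation was escalated to a person


def summarize(conversations: list[Conversation], variant: str) -> dict[str, float]:
    """Compute a few A/B metrics for one variant."""
    rows = [c for c in conversations if c.variant == variant]
    if not rows:
        raise ValueError(f"no conversations for variant {variant!r}")
    total_bot_messages = sum(c.bot_messages for c in rows)
    return {
        "conversations": len(rows),
        # Assumed definition: share of bot messages that were fallbacks.
        "fallback_rate": (
            sum(c.fallbacks for c in rows) / total_bot_messages
            if total_bot_messages else 0.0
        ),
        # Share of conversations in which the user completed the goal.
        "goal_completion_rate": sum(c.goal_completed for c in rows) / len(rows),
        # Share of conversations resolved without a human stepping in.
        "self_service_rate": sum(not c.handed_to_human for c in rows) / len(rows),
    }


# Made-up sample data for illustration.
sample_log = [
    Conversation("control", 10, 2, True, False),
    Conversation("control", 10, 0, False, True),
    Conversation("treatment", 5, 1, True, False),
]

for name in ("control", "treatment"):
    print(name, summarize(sample_log, name))
```

Computing the same set of metrics for both variants side by side makes it easier to spot a change that improves one number while hurting another.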

Using one metric to test your chatbot gives you a limited view of your chatbot’s performance and behavior. Using several KPIs to measure your chatbot helps you determine the effectiveness of the solution you’re testing.

How to learn from A/B testing insights

A/B testing allows you to investigate your assumptions about how you can improve the experience for your users. Plus, the result of one A/B test feeds into the next A/B test and so on.

For example, let’s say your bot has a recommendation feature. You conduct an A/B test to see whether it’s better to recommend multiple products at once or one product at a time. The experiment shows that your users prefer to see multiple products at a time.

Your next A/B test should build on the insight gained from the first experiment. Because you know that your users prefer to see multiple products at the same time, your next test can focus on the way those products are organized. You could set up one variant to display similar products with the best reviews, while the other variant displays products that are frequently purchased with the item the user selected. Run A/B tests often so that you have a constant stream of feedback on the decisions you make.

How to make sure your A/B test is statistically significant

When you conduct chatbot performance testing, it’s not enough to run an A/B test for a day or two to figure out what changes you need to make. A test that runs for too short a time collects too little data, so random variation can skew your results. You need to make sure your A/B test insights are reliable. That’s where statistical significance comes in.

Statistical significance is defined as “a way of mathematically proving that a certain statistic is reliable. When you make decisions based on the results of experiments that you’re running, you will want to make sure that a relationship actually exists.”

How long you run your A/B test depends on how quickly it reaches your target confidence level. A confidence level of 95 percent or higher (that is, a significance level of 5 percent or lower) is a good goal. The higher the confidence level, the less likely it is that the difference you observed is due to chance alone. So, if your A/B test reaches a confidence level of 97 percent, a difference that large would show up only about 3 percent of the time if the two versions actually performed the same. You can use a calculator to find out the significance of your A/B test.
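One common calculation behind such calculators is a two-proportion z-test on the conversion counts of the two versions. The sketch below is a minimal version under that assumption; the function name and the example counts are made up, and a real calculator may use a different test.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a: conversions and users in the control group.
    conv_b / n_b: conversions and users in the proposed-solution group.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the assumption that both versions perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("cannot test: pooled conversion rate is 0 or 1")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Illustrative counts only.
z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"z = {z:.2f}, p-value = {p:.4f}")
print("Significant at the 95 percent level" if p < 0.05 else "Not significant yet")
```

A p-value below 0.05 corresponds to the 95 percent threshold mentioned above. The normal approximation used here is only reasonable when each group has a fair number of conversions and non-conversions.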

The sample size of your A/B test matters, too. You don’t want a small pool of people to determine major decisions. The ideal A/B testing sample size depends on your audience and goals, but a pool of at least 1,000 people is recommended.
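If you want a number tailored to your own bot rather than a rule of thumb, a standard approximation for comparing two conversion rates estimates how many users each group needs. The sketch below assumes a two-sided test, and the default significance level (5 percent) and power (80 percent) are conventional choices, not values from this article. The function name and example rates are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_group(
    baseline_rate: float,
    expected_rate: float,
    alpha: float = 0.05,
    power: float = 0.80,
) -> int:
    """Approximate users needed in EACH group to detect the given change."""
    if baseline_rate == expected_rate:
        raise ValueError("rates must differ to compute a sample size")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided threshold
    z_power = NormalDist().inv_cdf(power)
    p_bar = (baseline_rate + expected_rate) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_power * math.sqrt(
            baseline_rate * (1 - baseline_rate)
            + expected_rate * (1 - expected_rate)
        )
    ) ** 2
    return math.ceil(numerator / (expected_rate - baseline_rate) ** 2)


# Illustrative rates: a 10 percent baseline and a hoped-for 13 percent.
print(sample_size_per_group(0.10, 0.13))
# A smaller expected improvement needs far more users per group.
print(sample_size_per_group(0.10, 0.11))
```

The smaller the improvement you want to detect, the more users you need, so it helps to decide on the smallest change that would matter to your business before you start the test.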

A/B tests are a fun and powerful tool that can help you make important decisions. With the right analytics tool, metrics, and approach, they can be used as a chatbot testing strategy to improve chatbot performance, empowering you to delight your users and reach your business goals.