KeyReply Blog

Get productive with the KeyReply team

How should bots speak to humans?

TL;DR: Designing great bot dialog is tough but crucial for increasing engagement. We talked to over 200 Facebook Messenger bots and extracted their conversation dialogs for your reference: BotSpeak. Evaluating the bots' output, we took a purely data-driven approach to suggesting lexical improvements, and we offer anecdotal advice on better presentation of bot conversations.

BotSpeak database — 1 human, 200+ bots, 892 bot dialogs

How do you feel about conversations with bots so far? Are they engaging, disappointing or interesting?

For bot makers, there are lots of challenges in crafting more engaging bots. Among others, these two are the most obvious and painful:

Challenge 1: Getting inspiration for how to build your bot based on how others built theirs. You can’t just search for bots based on the meta details in conversations, because there isn’t such a database.

Challenge 2: Ensuring that the bot dialog is great and captivating. Other than what you know/think your audience will like, there’s no easy way to see best practices across a wide range of bots.

As you can imagine, it’s pretty hard to get to know other bots that well unless you spend lots of time talking to bots, and have your Messenger inbox entirely flooded with bot conversations.

So, to tackle these two problems head-on, we talked to 200+ bots for research and documented the conversations. We called this project BotSpeak, and we present all the conversations we had in a searchable table format on BotSpeak, so you don’t have to collect them yourself. Enjoy the 🎁!

You can contribute to the BotSpeak database by submitting up to 8 outputs from your bot’s conversations here: https://goo.gl/forms/TFy2UTEdSAFS3ZzN2

Bot selection

We chose bots based on their popularity, measured by total views of the bot on Botlist. Beyond the top bots by views, we also selected bots across a range of categories so that most major types of bots are represented in this study. The sample is therefore spread across categories while still filtered by popularity; the assumption is that popular bots have had more chance to develop best practices over time.

Bot category classification is taken from Botlist for each bot we tested

As you can see, “Personal” bots are heavily represented due to the proportion of popular bots that are classified as such.

Methodology

Every research project begins with scoping the right methodology. To do this well, we came up with a set of parameters that we felt would be useful to analyze. The starting point of the study was textual data from our conversations with bots. To get the most meaningful data, we engaged the bots in conversation, pressed buttons to receive output for recording and analysis, and went through every bot experience in general.

Collection of data

  • As there are no publicly available sources of data for bot conversations, we started out by using the collection of bots on Facebook Messenger indexed on Botlist (great database and resource for discovering the best bots across all platforms).

  • Knowing that we might be talking to over 200 bots and that Facebook might classify the activity as suspicious, we created a new account to talk to the bots. After about 20 conversations, we realized that Facebook had blocked the account from sending new messages to bots. We used that as a cue to 1) start recording the conversations, 2) deactivate the account, and 3) create a new account to talk to new bots.

  • For this exercise, we had to create about 10 new accounts and deactivate as many. (Sorry, Facebook, though you probably took no notice of this small blip in user churn anyway.)

One of the accounts which we spun up to test bots and shut down after

8 accounts created for bot testing under — Brian, Doug, Connor, Mathew, Al, Wayne to name a few

  • In recording the conversations, we left the cards, images and media that the bots displayed out of the analysis records in order to focus on the text. We approached it this way because we felt that creatives are unique to brands, and bot developers should use visuals appropriate to their brand or use case. Plus, it would complicate the recording in a Google Sheet table! However, we did note some general comments on creative presentation, and you can read about them later on.

  • As much as possible, we tried not to use too much free-form/natural language to allow the bots to provide us with the ideal experience as defined by the developer. Being harsh to bots is definitely not part of this exercise for recording and discovering best practices; but, in our own experience, some people are mean to bots.

Analysis of data

  • After the data was collected and organized in a table structure, we began analyzing the conversations in bulk with text analysis tools.

  • We visualized the data for anyone to be able to use and access on BotSpeak. There are filters for names and categories, which you can use as terms to search for what you need. Since categories were set by the developer (or Botlist posters), we chose not to reclassify the bots despite some cases where categories were not representative of the bot.

BotSpeak database of bot conversations logged

  • Limitations: There are many well-designed bots with multiple paths and possibilities. Due to time constraints, we did not do a full audit of all paths and may have missed some of them. In the future, we hope to dedicate more resources to obtaining more complete “Bot maps”. This limitation applies especially to story bots: their scripts can run up to 2500 dialog outputs, so we limited them to ~8 outputs. (Besides, you should try them yourself; most of them are pretty fun.)

  • For the purpose of this study, we felt that the number of conversations collected was sufficient to form a robust initial representative set.

Areas of improvement: Lexical/Semantics

Based on all the conversations with bots we’ve analyzed, here are some semantics-related improvements we feel could significantly improve bot experiences. Each of these points is also marked with the total number of observed instances across bots.

A. Take note of extra words / Modifier phrases / Grammaticality (347 instances)

E.g.: “Red sunglasses are very trending, would you like to see some?” — can be edited to be: “Red sunglasses are trending, want to see some?”
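If you want to scan your own bot script for this kind of padding, here is a minimal sketch in Python. It is not the tool we used for the study; the word list `FILLER_WORDS` and both helper functions are hypothetical and only meant as a starting point.

```python
import re

# Hypothetical list of modifier words; extend it for your own bot's scripts.
FILLER_WORDS = {"very", "really", "just", "actually", "basically", "quite"}


def find_fillers(message: str) -> list[str]:
    """Return the filler words found in a message, in order of appearance."""
    words = re.findall(r"[a-z']+", message.lower())
    return [word for word in words if word in FILLER_WORDS]


def strip_fillers(message: str) -> str:
    """Remove filler words together with the whitespace that follows them."""
    pattern = r"\b(?:" + "|".join(sorted(FILLER_WORDS)) + r")\b\s*"
    return re.sub(pattern, "", message, flags=re.IGNORECASE).strip()


original = "Red sunglasses are very trending, would you like to see some?"
print(find_fillers(original))
print(strip_fillers(original))
```

A blind strip like this can change the meaning of a sentence, so it is safer to treat the matches as a review list than as an automatic rewrite.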

B. Avoid rambling starts (44 instances)

E.g.: “To become a wiser human being by reading the secrets posted by my friends, please tap once on the menu button shown below and select READ A SECRET” can be edited to be: “Become wiser by reading secrets from friends. Tap READ A SECRET below.”

C. Use emojis for engagement (119 instances)

E.g.: “Sorry, didn’t get it” — can be edited to be: “Sorry, didn’t get it 😕”

D. Consider using active voice instead of passive voice (89 instances)

E.g.: “You have been matched to Colin.” — can be edited to be: “I’ve matched you to Colin.”
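Passive constructions can be flagged roughly with a pattern match. The sketch below is a crude heuristic we made up for illustration (a form of "to be" followed by something that looks like a past participle), not the analysis behind the instance count above. It will miss some passives and flag some false positives.

```python
import re

# Rough heuristic: a form of "to be" followed by a word ending in -ed or -en,
# or by one of a few hand-picked irregular participles. Treat every match as
# a candidate for human review, not as a confirmed passive.
BE_FORMS = r"(?:am|is|are|was|were|be|been|being)"
IRREGULAR = r"(?:sent|made|found|built|done|shown|sold|told)"
PASSIVE = re.compile(
    rf"\b{BE_FORMS}\s+(?:\w+ed|\w+en|{IRREGULAR})\b", re.IGNORECASE
)


def looks_passive(message: str) -> bool:
    """Return True if the message contains a likely passive construction."""
    return PASSIVE.search(message) is not None


for line in ["You have been matched to Colin.", "I've matched you to Colin."]:
    print(line, looks_passive(line))
```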

E. Replace rare/complex words with simple words (146 instances)

E.g.: “The least agonizing itinerary flies from SJC to HNL.” — can be edited to be: “The least painful itinerary flies from SJC to HNL.”

Areas of improvement: Presentation

The visual cues of a bot are just as important as the script and functionality. The bots that we talked to had some areas where they could improve in terms of visual hierarchy, contrast, and appropriateness. (These are our observations, and not part of the BotSpeak database.)

A. Use buttons where you can to aid users

People enjoy doing easy things over hard things. Hence, buttons can be (and usually are) the preferred mode for people interacting with bots. Instead of going through the hassle of typing common requests like “Yes” or “What else can you do”, users can just press a button. It’s also easier to lead users through the experience when their choices are constrained by the buttons shown.

B. Take advantage of the Messenger platform’s functionality

Options like quick replies and logins can be used to make the bot easier to use. Wherever it’s possible to simplify choices to just a few clickable options, you should do that. If you can streamline the experience by pulling in customer details through login, you may want to try it. There are lots of other features you can choose from, so explore them and make full use of them where it makes sense.
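To make the quick-reply idea concrete, here is a small Python sketch that builds a message body with two tappable options. The field names follow the Messenger Send API's `quick_replies` format as we understand it, so check the current platform documentation before relying on them. The helper `quick_reply_message`, the `<USER_ID>` placeholder and the `CHOICE_...` payload naming are our own assumptions for illustration. The sketch only builds and prints the body; it does not send anything.

```python
import json


def quick_reply_message(recipient_id: str, text: str, options: list[str]) -> dict:
    """Build a message body that offers each option as a tappable quick reply."""
    return {
        "recipient": {"id": recipient_id},
        "message": {
            "text": text,
            "quick_replies": [
                {
                    "content_type": "text",
                    "title": option,
                    # Hypothetical payload naming scheme, used to route the tap.
                    "payload": "CHOICE_" + option.upper().replace(" ", "_"),
                }
                for option in options
            ],
        },
    }


body = quick_reply_message(
    "<USER_ID>",
    "Red sunglasses are trending, want to see some?",
    ["Yes", "Not now"],
)
print(json.dumps(body, indent=2))
```

Keeping the option titles short helps them fit on a button and makes the choice quick to scan.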

C. Use attractive menu images

Many bots are heavily dependent on their menus, and this means that most users will end up seeing the menu about 30–50% of the time. Hence, menu images should be representative of your bot (and the company or product it represents), and made to be as attractive as possible.

D. Send interesting images as part of your responses

It can be really interesting for users when you send creative visuals as an attachment together with bot responses. You can even use gifs as part of your quick replies, adding small points of delight within your bot.

Poncho is a good source of inspiration for delightful visuals

E. Make sure you have a welcome message

Only 70% of bots we tried included a welcome message. What happened to everyone else? With bots still a relatively new interface for the majority of (non-tech, non-startup) users, it’s important to present the bot and its functionality clearly, so users are properly guided through the experience and achieve success in what they were trying to do.

Having a good, well-crafted welcome message sets your bot users up with the right expectations, so they know what not to expect from the bot, preventing disappointment or perceived failures.

Main takeaways

  1. Testing is crucial: You’ll need to test and continuously improve the bot scripts based on feedback from your users. Lots of bots are not designed to handle common edge cases, such as users swearing at bots (more common than we’d like to think) and common backstory questions like “Who made you?”

  2. Cross-category learnings are applicable: A database like BotSpeak is useful for developers designing bot dialogs as there are many common use cases, and lots of things can be applied from other bots. For example, a productivity bot can easily learn a few tricks from a social bot in scripting fun experiences.

  3. People like simplicity & familiarity: The best bots are designed with users in mind. Since bots are supposed to be a user-friendly, easy conversational interface, stick to easy and familiar words. The most used words by bots are visualized below (after removing stop words):

Word cloud of non-common words most used by the 200+ bots — fashioned as R2D2
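A word count like the one behind this cloud can be approximated in a few lines of Python. This is a sketch under our own assumptions, not the exact text analysis tooling used for the study: the stop-word list is deliberately tiny, and the sample dialogs are made up, not taken from BotSpeak.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {
    "a", "an", "and", "are", "can", "do", "i", "is", "it",
    "me", "the", "to", "what", "you", "your",
}


def top_words(dialogs: list[str], n: int = 5) -> list[tuple[str, int]]:
    """Count words across all dialogs, skipping stop words, and return the top n."""
    counts = Counter()
    for dialog in dialogs:
        for word in re.findall(r"[a-z']+", dialog.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts.most_common(n)


# Made-up sample dialogs, not taken from the BotSpeak data.
sample = [
    "Hi! I can help you find a book.",
    "Tap a button to find your next book.",
    "Sorry, I didn't get it. Tap Help to see what I can do.",
]
print(top_words(sample, 4))
```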

How to use BotSpeak

You can make use of the knowledge we’ve gathered on BotSpeak by filtering by category or by the bots that you want to reference.

With this, you can look at examples of how other bots have been scripted and what their various output options are. You can use this as a guide for your own bot planning process.

Also, a simple CMD+F (Ctrl+F on Windows) lets you search by actual dialog words. For example, we looked for “book” and found the ShelfJoy bot delivering hand-curated book suggestions.

Searching by name of bot on BotSpeak
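If you copy rows out of the table into your own script, the same filter-and-search flow is easy to reproduce. The sketch below assumes each row has `bot`, `category` and `dialog` fields; those names and the two sample rows are hypothetical, not BotSpeak's real schema or data.

```python
from typing import Optional

# Hypothetical rows: field names and contents are made up for illustration.
rows = [
    {"bot": "Example Reader Bot", "category": "Personal",
     "dialog": "Here are some book suggestions for you."},
    {"bot": "Example Flight Bot", "category": "Travel",
     "dialog": "Where would you like to fly?"},
]


def search(rows: list[dict], keyword: str, category: Optional[str] = None) -> list[dict]:
    """Return rows whose dialog contains the keyword, optionally within one category."""
    keyword = keyword.lower()
    return [
        row for row in rows
        if keyword in row["dialog"].lower()
        and (category is None or row["category"] == category)
    ]


for match in search(rows, "book"):
    print(match["bot"], "|", match["dialog"])
```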

Future work

How should bots behave?

There are two broad options: mimic human behavior (e.g. delays in conversation, having a name, being funny, using active voice) or offer a more command-line-like experience (i.e. instant responses, an organization or product name as the identity, a formal tone and vocabulary). From the BotSpeak database, one can find that of the 200 bots, about 10% told users their name.

Other platforms

We want to continue updating the database with additions from contributors, as well as adding bots from other platforms in the future. One obvious platform that we wish to figure out how to test is Slack. Given that a free team may not be the best way to do so, we’ll have to find a creative (but still reasonable) methodology to use. Perhaps an extension of the project could be used by Slack as a way to advise users on what bot flows should look like.

Comprehensive data on flows

The interaction steps recorded for this project are limited by the amount of time we spent with each bot. Some bots have over 100 programmed interactions as part of a story, while others are focused on a search experience that ends when user demands are fulfilled. Over time, with more data, we will get a better idea of how to craft the ideal experience for each category.

All a-bot the future

We are definitely only beginning to test and learn from different use cases of bots across multiple verticals. There are lots of other examples that you can see on BotSpeak, so go ahead and explore them yourself. With a database like this, which we hope you’ll add to, we can all start developing a sense of what would work best for bot-human interactions.

Keep improving your bots and if you like BotSpeak, share the link (and this piece) with your users and friends!

Shoutout to Joseph Tyler (Linguist), Erik Nilsen (Bot developer), Ben Tossell (Botlist), Philippe Dionne (Dialog Analytics) for reviewing this post and tool!

P.S. We also have another resource for building an enterprise chatbot strategy, which you can find here: https://keyreply.com/#ebook