ABSTRACT
Over the past few years the popularity of chatbots has grown steadily, and companies have focused on developing them more than ever. It is therefore not surprising that news about various aspects of chatbots, from design and development to commercialization and marketing, is published daily. Nevertheless, the topic of chatbot evaluation is very often neglected. The metrics that should be used to evaluate the success of a chatbot are neither systematized nor unified. One way to solve this problem is to align the metrics with the different perspectives of chatbot evaluation: the user experience perspective, information retrieval perspective, linguistic perspective, technology perspective and business perspective. In order to build the evaluation framework, the following categories should be analyzed: usability, performance, affect, satisfaction, accuracy, accessibility, efficiency, quality, quantity, relation, manner, grammatical accuracy, humanity and business value. This paper reviews the evaluation metrics available for measuring the success of efforts invested in chatbots and proposes a chatbot evaluation framework based on the five perspectives. The contribution of this paper is to help researchers identify opportunities for future research on the evaluation of chatbot performance.
Keywords: chatbot, chatbot assessment, chatbot performance, evaluation metrics
1. INTRODUCTION
"Chatbots are one specific type of conversational interface with no explicit goal other than engaging the other party in an interesting or enjoyable conversation." (Venkatesh et al., 2018) Chatbots can be used in numerous areas and for various purposes - from customer service to education and entertainment, and from personal to professional purposes. They are usually embedded in chatting applications or webpages, thus enabling simple completion of tasks through conversation with user. Some of the most famous chatbots are Watson, Siri and Messenger. Chatbots have recently become the focus of academic and industrial research due to several reasons, including the rise of digital assistants and socialbots, but also advances in artificial intelligence, machine learning and related technologies. According to a recent research (Radziwill & Benton, 2017), during the last 10 years chatbots were involved in more than a third of online conversations. Evaluation of chatbots is a challenging research problem that lacks a unique and widely accepted metric, and has remained largely unsolved. Many of the existing studies about chatbots are based on the technical...