Abstract
Consumers are increasingly using the web to find answers to their health-related queries. Unfortunately, they often struggle to formulate their questions, a burden compounded by having to traverse the long documents returned by a search engine in search of reliable answers. To ease these burdens, automated consumer health question answering systems try to simulate a human professional by refining the queries and returning the most pertinent answers. This article surveys state-of-the-art approaches, resources, and evaluation methods used for automatic consumer health question answering. We summarize the main achievements in the research community and industry, discuss their strengths and limitations, and offer recommendations for further improving these systems in terms of quality, engagement, and human-likeness.
INTRODUCTION
In traditional web search, the user submits a keyword-based query to a search engine, which returns a list of documents likely to contain the answer the user is seeking. However, traversing a long list of results to find the desired answer is cognitively demanding. Sometimes, users must reformulate the query several times until their wording matches the domain's vocabulary. Automatic question answering (QA) addresses these problems by providing direct and concise answers to user queries expressed in natural language. Additionally, it can correct spelling mistakes and reformulate queries to reduce semantic ambiguity. The task is nonetheless challenging, as natural language is often ambiguous; constructing a response requires a detailed understanding of the question asked, expert domain knowledge, and automatic ways to generate text using, for example, language generation models (Datla, Arora et al. 2017).
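The answer-selection step described above can be illustrated with a deliberately simplified sketch. The snippet below is not any system from the survey; it is a hypothetical, bag-of-words baseline in which the candidate sentence sharing the most word tokens with the question is returned as the "direct answer" (real QA systems use far richer semantic matching):

```python
import re

def tokenize(text):
    """Lowercase the text and extract word tokens as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))

def select_answer(question, candidates):
    """Return the candidate sentence with the highest word overlap
    with the question -- a crude stand-in for answer selection."""
    q_tokens = tokenize(question)
    return max(candidates, key=lambda s: len(q_tokens & tokenize(s)))

# Toy candidate pool standing in for retrieved document sentences.
candidates = [
    "Aspirin is commonly used to reduce fever and relieve mild pain.",
    "The clinic is open on weekdays from 9 am to 5 pm.",
    "Vaccines train the immune system to recognize pathogens.",
]
print(select_answer("What is aspirin used for?", candidates))
# → Aspirin is commonly used to reduce fever and relieve mild pain.
```

A lexical baseline like this fails exactly where the surveyed systems aim to succeed, namely when the consumer's vocabulary ("tummy ache") differs from the document's ("abdominal pain"), which motivates the query-reformulation and semantic-matching techniques discussed later.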
Automatic QA can be general or domain-specific, also known as open- and restricted-domain QA. Open-domain QA answers factoid or non-factoid questions from a wide range of domains, while restricted-domain QA answers questions from a specialized area, using domain-specific linguistic resources that enable more precise answers to a given question (Olvera-Lobo and Gutiérrez-Artacho 2011). Both open- and restricted-domain QA can operate in either a single-turn or multi-turn conversational manner. Single-turn QA answers questions in a one-off fashion, that is, one question and one answer at a time, so the context of one question is not carried over to the next. Therefore, the system understands...