Abstract

To advance multi-domain (cross-domain) dialogue modeling and to alleviate the shortage of Chinese task-oriented datasets, we propose CrossWOZ, the first large-scale Chinese Cross-Domain Wizard-of-Oz task-oriented dataset. It contains 6K dialogue sessions and 102K utterances across 5 domains: hotel, restaurant, attraction, metro, and taxi. The corpus also contains rich annotation of dialogue states and dialogue acts on both the user and system sides. About 60% of the dialogues have cross-domain user goals that favor inter-domain dependency and encourage natural transitions across domains in conversation. We also provide a user simulator and several benchmark models for pipelined task-oriented dialogue systems, which will help researchers compare and evaluate their models on this corpus. The large size and rich annotation of CrossWOZ make it suitable for investigating a variety of tasks in cross-domain dialogue modeling, such as dialogue state tracking, policy learning, and user simulation.

Details

1009240
Title
CrossWOZ: A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset
Volume
8
Pages
281-295
Publication year
2020
Publication date
2020
Publisher
MIT Press Journals, The
Place of publication
Cambridge
Country of publication
United States
ISSN
2307-387X
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Milestone dates
2019-10-01 (Received); 2020-01-01 (Revision Received); 2020-01-01 (Publication Date)
ProQuest document ID
2893885784
Document URL
https://www.proquest.com/scholarly-journals/crosswoz-large-scale-chinese-cross-domain-task/docview/2893885784/se-2?accountid=208611
Copyright
© 2020. This work is published under https://creativecommons.org/licenses/by/4.0/legalcode (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-11-08
Database
ProQuest One Academic