EVA2.0: Investigating Open-domain Chinese Dialogue Systems with Large-scale Pre-training

Yuxian Gu, Jiaxin Wen, Hao Sun, Yi Song, Pei Ke, Chujie Zheng, Zheng Zhang, Jianzhu Yao, Lei Liu, Xiaoyan Zhu, Minlie Huang

arXiv (2023)

Citations: 33 | Views: 153

Abstract

Large-scale pre-training has shown remarkable performance in building open-domain dialogue systems. However, previous work mainly focuses on presenting and evaluating the conversational performance of the released dialogue model, and neglects to discuss several key factors in building a powerful, human-like chatbot, especially in Chinese scenarios. In this paper, we conduct extensive experiments to investigate these under-explored factors, including data quality control, model architecture design, training approaches, and decoding strategies. We propose EVA2.0, a large-scale pre-trained open-domain Chinese dialogue model with 2.8 billion parameters, and will make our models and code publicly available. Automatic and human evaluations show that EVA2.0 significantly outperforms other open-source counterparts. We also discuss the limitations of this work by presenting some failure cases, and propose future research directions for large-scale Chinese open-domain dialogue systems.

Keywords

Natural language processing, deep learning (DL), large-scale pre-training, dialogue systems, Chinese open-domain conversational model
