MixKD: Towards Efficient Distillation of Large-scale Language Models
Liang, Kevin J; Hao, Weituo; Shen, Dinghan; Zhou, Yufan; Chen, Weizhu; et al. arXiv.org, Mar 17, 2021.