Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism

Shoeybi, Mohammad; Patwary, Mostofa; Puri, Raul; LeGresley, Patrick; Casper, Jared; et al. arXiv.org, Mar 13, 2020.