
SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification

Miao, Xupeng; Oliaro, Gabriele; Zhang, Zhihao; Cheng, Xinhao; Wang, Zeyu; et al. arXiv.org, Apr 1, 2024.
