SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification
Miao, Xupeng; Oliaro, Gabriele; Zhang, Zhihao; Cheng, Xinhao; Wang, Zeyu; et al. arXiv.org, Apr 1, 2024.