Nvidia announces TensorRT 8, slashes BERT inference times down to a millisecond

July 21, 2021 by admin

Offering over twice the precision and inference speed of the previous generation, Nvidia's new TensorRT 8 deep learning SDK clocked BERT-Large inference at 1.2 ms. Read more… (Neowin)