NVIDIA registers the world's quickest BERT training time and largest transformer-based model

August 13, 2019 by admin

The company's immensely powerful DGX SuperPOD trains BERT-Large in a record-breaking 53 minutes and trains GPT-2 8B, the world's largest transformer-based network, with 8.3 billion parameters.

Read more… Neowin