NVIDIA registers the world’s quickest BERT training time and largest transformer-based model

August 13, 2019 by admin

The company’s immensely powerful DGX SuperPOD trains BERT-Large in a record-breaking 53 minutes and trains GPT-2 8B, the world’s largest transformer-based network, with 8.3 billion parameters.

Read more… (Neowin)