Microsoft unveils Project Brainwave for real-time
Even on early Stratix 10 silicon, the ported Project Brainwave system ran a large GRU model, five times larger than ResNet-50, with no batching and achieved record-setting performance. The demo used Microsoft’s custom 8-bit floating point format (“ms-fp8”), which on average does not suffer accuracy losses across a range of models. We showed Stratix 10 sustaining 39.5 teraflops on this large GRU, running each request in under one millisecond. At that level of performance, the Brainwave architecture sustains execution of over 130,000 compute operations per cycle, driven by one macro-instruction being issued every 10 cycles. Running on Stratix 10, Project Brainwave thus achieves unprecedented levels of demonstrated real-time AI performance on extremely challenging models. As we tune the system over the next few quarters, we expect significant further performance improvements.
https://www.microsoft.com/en-us/research/blog...brainwave/
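
The figures quoted above can be cross-checked with some quick arithmetic. The sketch below is not from the Microsoft post: it assumes that one "compute operation" is counted as one floating-point operation in the 39.5 teraflops figure, and it treats "over 130,000" and "under one millisecond" as bounds rather than exact values. The function names are made up for illustration.

```python
# Figures quoted in the post.
SUSTAINED_FLOPS = 39.5e12        # 39.5 teraflops sustained on the GRU
OPS_PER_CYCLE = 130_000          # "over 130,000", so a lower bound
CYCLES_PER_INSTRUCTION = 10      # one macro-instruction every 10 cycles
REQUEST_SECONDS = 1e-3           # "under one millisecond", so an upper bound


def implied_clock_hz(flops: float, ops_per_cycle: float) -> float:
    """Clock rate implied by throughput / work per cycle.

    ASSUMPTION: each "compute operation" counts as one FLOP.
    """
    return flops / ops_per_cycle


def ops_per_macro_instruction(ops_per_cycle: int, cycles: int) -> int:
    """Operations driven by a single macro-instruction."""
    return ops_per_cycle * cycles


def max_ops_per_request(flops: float, seconds: float) -> float:
    """Upper bound on operations that fit in one request's time budget."""
    return flops * seconds


clock_hz = implied_clock_hz(SUSTAINED_FLOPS, OPS_PER_CYCLE)
ops_per_instr = ops_per_macro_instruction(OPS_PER_CYCLE, CYCLES_PER_INSTRUCTION)
ops_per_request = max_ops_per_request(SUSTAINED_FLOPS, REQUEST_SECONDS)

print(f"Implied clock: at most about {clock_hz / 1e6:.0f} MHz")
print(f"Operations per macro-instruction: at least {ops_per_instr:,}")
print(f"Operations per request: at most {ops_per_request:.3g}")
```

Under those assumptions the numbers imply a clock of roughly 300 MHz and more than a million operations dispatched per macro-instruction. Because the post gives only bounds, these are rough estimates and not published specifications.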