Microsoft's new AI chip, the Maia 200, is set to launch this week in an Iowa data center, where it will power AI services amid the rising cost of inference. The second-generation chip is designed specifically for serving AI responses, shifting focus from earlier models that primarily targeted training, and the launch answers growing demand for efficiency as AI chatbots and digital assistants gain popularity.
Building on the earlier Maia 100, the Maia 200 packs more than 100 billion transistors and delivers over 10 petaflops of compute at 4-bit precision, or roughly 5 petaflops at 8-bit precision. Those figures are tuned for real-world inference workloads, where speed and energy efficiency matter most.
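The reason 4-bit precision roughly doubles the 8-bit throughput figure is that each operation moves half as many bits, at the cost of coarser weight values. A minimal illustrative sketch (not Microsoft's implementation; the function and variable names here are invented for the example) shows the trade-off with symmetric integer quantization:

```python
import numpy as np

# Illustrative sketch: symmetric quantization of weights to 8-bit and
# 4-bit signed integers, the kind of low-precision format that lets
# inference chips trade a little accuracy for speed and energy savings.
def quantize(weights, bits):
    """Map float weights onto signed integers of the given bit width."""
    qmax = 2 ** (bits - 1) - 1          # 127 for 8-bit, 7 for 4-bit
    scale = np.abs(weights).max() / qmax
    q = np.clip(np.round(weights / scale), -qmax, qmax).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.standard_normal(8).astype(np.float32)

q8, s8 = quantize(w, 8)
q4, s4 = quantize(w, 4)

# Dequantize and compare reconstruction error: 4-bit halves the bits
# per weight (hence roughly doubles arithmetic throughput on hardware
# built for it), but each quantization step is coarser.
err8 = np.abs(w - q8 * s8).max()
err4 = np.abs(w - q4 * s4).max()
print(f"8-bit max error: {err8:.4f}")
print(f"4-bit max error: {err4:.4f}")
```

Since the 4-bit scale step is larger, the reconstruction error grows as precision drops, which is why chips advertise both figures rather than the 4-bit number alone.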
A second deployment of the chip is planned for Arizona. The Maia 200 also serves Microsoft's goal of reducing its dependence on NVIDIA hardware, positioning the company against cloud rivals Google and Amazon Web Services, which have introduced AI chips of their own. The design prioritizes rapid response times under heavy user traffic, in line with current trends in AI hardware development.