Microsoft just launched an AI diagnostic tool that beats human doctors at diagnosing disease by a wide margin. The system, called MAI Diagnostic Orchestrator (MAI-DxO), achieved 80% accuracy on medical cases, compared with just 20% for human physicians.
The launch follows tests using 304 case studies from the New England Journal of Medicine. MAI-DxO mimics a team of doctors by querying top AI models such as OpenAI’s GPT, Google’s Gemini, Anthropic’s Claude, Meta’s Llama, and xAI’s Grok. This “chain-of-debate” approach produces more precise diagnoses at 20% lower cost because the system selects cheaper tests.
Mustafa Suleyman, CEO of Microsoft’s AI division, called the system “a genuine step toward medical superintelligence.”
Microsoft’s project recruited top AI talent, including former Google AI researchers, signaling the fierce competition for AI experts.
The company has not yet decided whether to commercialize the tech. It might integrate the system with Bing for user self-diagnosis or build tools to aid doctors.
Dominic King, Microsoft VP on the project, said the model is both accurate and cost-effective.
“This orchestration mechanism, multiple agents that work together in this chain-of-debate style, that’s what’s going to drive us closer to medical superintelligence,” Suleyman said.
“Our model performs incredibly well, both getting to the diagnosis and getting to that diagnosis very cost effectively,” King said.
This approach improves on earlier AI diagnostic work by replicating how real doctors analyze symptoms, order tests, and refine their diagnosis step by step. It could also help cut US health care costs if adopted widely.
Expect Microsoft to expand real-world testing soon as it explores how far this AI-powered diagnosis can go.