Each year, millions of Americans walk out of a doctor’s office with a misdiagnosis. Physicians try to be systematic when identifying illness and disease, but bias creeps in. Alternatives are overlooked.
Now a group of researchers in the United States and China has tested a potential remedy for all-too-human frailties: artificial intelligence.
In a paper published on Monday in Nature Medicine, the scientists reported that they had built a system that automatically diagnoses common childhood conditions — from influenza to meningitis — after processing the patient’s symptoms, history, lab results and other clinical data.
The system was highly accurate, the researchers said, and one day may assist doctors in diagnosing complex or rare conditions.
The system was trained on the records of nearly 600,000 Chinese patients who had visited a pediatric hospital over an 18-month period. That vast collection of data highlights an advantage for China in the worldwide race toward artificial intelligence.
Because its population is so large — and because its privacy norms put fewer restrictions on the sharing of digital data — it may be easier for Chinese companies and researchers to build and train the “deep learning” systems that are rapidly changing the trajectory of health care.
On Monday, President Trump signed an executive order meant to spur the development of A.I. across government, academia and industry in the United States. As part of this “American A.I. Initiative,” the administration will encourage federal agencies and universities to share data that can drive the development of automated systems.
Pooling health care data is a particularly difficult endeavor. Whereas researchers went to a single Chinese hospital for all the data they needed to develop their artificial-intelligence system, gathering such data from American facilities is rarely so straightforward.
“You have to go to multiple places. The equipment is never the same. You have to make sure the data is anonymized,” said Dr. George Shih, associate professor of clinical radiology at Weill Cornell Medical Center and co-founder of MD.ai, a company that helps researchers label data for A.I. services. “Even if you get permission, it is a massive amount of work.”
After reshaping internet services, consumer devices and driverless cars in the early part of the decade, deep learning is moving rapidly into myriad areas of health care. Many organizations, including Google, are developing and testing systems that analyze electronic health records in an effort to flag medical conditions such as osteoporosis, diabetes, hypertension and heart failure.
Similar technologies are being built to automatically detect signs of illness and disease in X-rays, M.R.I.s and eye scans.
The new system relies on a neural network, a breed of artificial intelligence that is accelerating the development of everything from health care to driverless cars to military applications. A neural network can learn tasks largely on its own by analyzing vast amounts of data.
Using the technology, Dr. Kang Zhang, chief of ophthalmic genetics at the University of California, San Diego, has built systems that can analyze eye scans for hemorrhages, lesions and other signs of diabetic blindness. Ideally, such systems would serve as a first line of defense, screening patients and pinpointing those who need further attention.
Now Dr. Zhang and his colleagues have created a system that can diagnose an even wider range of conditions by recognizing patterns in text, not just in medical images. This may augment what doctors can do on their own, he said.
“In some situations, physicians cannot consider all the possibilities,” he said. “This system can spot-check and make sure the physician didn’t miss anything.”
The experimental system analyzed the electronic medical records of nearly 600,000 patients at the Guangzhou Women and Children’s Medical Center in southern China, learning to associate common medical conditions with specific patient information gathered by doctors, nurses and other technicians.
First, a group of trained physicians annotated the hospital records, adding labels that identified information related to certain medical conditions. The system then analyzed the labeled data.
Then the neural network was given new information, including a patient’s symptoms as determined during a physical examination. Soon it was able to make connections on its own between written records and observed symptoms.
When tested on unlabeled data, the software could rival the performance of experienced physicians. It was more than 90 percent accurate at diagnosing asthma; the accuracy of physicians in the study ranged from 80 to 94 percent.
In diagnosing gastrointestinal disease, the system was 87 percent accurate, compared with the physicians’ accuracy of 82 to 90 percent.
Able to recognize patterns in data that humans could never identify on their own, neural networks can be enormously powerful in the right situation. But even experts have difficulty understanding why such networks make particular decisions and how they teach themselves.
As a result, extensive testing is needed to reassure both doctors and patients that these systems are reliable.
Experts said Dr. Zhang’s system, in particular, now needs to go through clinical trials.
“Medicine is a slow-moving field,” said Ben Shickel, a researcher at the University of Florida who specializes in the use of deep learning for health care. “No one is just going to deploy one of these techniques without rigorous testing that shows exactly what is going on.”
It could be years before deep-learning systems are deployed in emergency rooms and clinics. But some are closer to real-world use: Google is now running clinical trials of its eye-scan system at two hospitals in southern India.
Deep-learning diagnostic tools are more likely to flourish in countries outside the United States, Dr. Zhang said. Automated screening systems may be particularly useful in places where doctors are scarce, including in India and China.
The system built by Dr. Zhang and his colleagues benefited from the large scale of the data set gathered from the hospital in Guangzhou. Similar data sets from American hospitals are typically smaller, both because the average hospital is smaller and because regulations make it difficult to pool data from multiple facilities.
Dr. Zhang said he and his colleagues were careful to protect patients’ privacy in the new study. But he acknowledged that researchers in China may have an advantage when it comes to collecting and analyzing this kind of data.
“The sheer size of the population — the sheer size of the data — is a big difference,” he said.