What human doctors do that AI can’t — at least, not yet

That diagnostic gut feeling? It’s not magic. It’s based on data drawn from the senses and experience that can be quantified.

Artificial intelligence has already entered the medical world. Are there aspects of medicine where humans still have the upper hand? (mast3r/Big Stock Photo)

This story is from The Pulse, a weekly health and science podcast.

Subscribe on Apple Podcasts, Spotify or wherever you get your podcasts.


Emergency room physician Avir Mitra first encountered artificial intelligence at his hospital in New York City, in a CT scan of a patient.

“I see the image looks OK, and then at the bottom it says, ‘This was an AI read, this is an AI interpretation, and will be verified by a radiologist like tomorrow,’ or something like that,” Mitra said. “I was like, ‘Wait, what?’”

Not only are intelligent machines already here, they sign their work.

“That’s apparently what we’re doing now at the hospital, some of our head CT reads are coming back through AI, and that definitely scared me,” Mitra said. “Every time I get one of these reads, I definitely go and look at the image myself, double-check the AI.”

The thing is, though, human double-check after human double-check, he’s never found a machine mistake.

“Every single one of those reads that I’ve seen from AI … has been right,” Mitra said.

A perfect batting average.

If machines can do what a radiologist does so well so often, how long until they start replacing radiologists — and other physicians? What is it that human doctors can offer that machines just don’t get?

When Mitra encounters patients in the ER, they are blank, scared slates.

“My first job is to stabilize unstable patients. So I have to be able to recognize when someone is, I guess you could say, actively dying,” Mitra said.

He has to be methodical. First step: figure out if this person is dying. If so, figure out why and how.

“I would say there’s a formula that we follow, and it’s good to stay consistent with every patient, that way you don’t miss anything. So there’s a lot of times where you can think pretty algorithmically,” Mitra said.

Avir Mitra is an emergency room physician in New York City. (Courtesy of Avir Mitra)

It’s the part of his job that’s the most machine-like. Gather data and follow it logically — the vital signs, the patient history, any current medications, physical exam. Crunch all those variables together: what medical emergency do they add up to?

“I would say there’s a good amount of the day that you could sort of be on autopilot,” Mitra said.

He said he sees a lot of the same, predictable stuff: shortness of breath; abdominal pain; vomiting. “And I could picture a computer walking through a decent amount of things,” Mitra said.

But then, there are what he calls the curve balls — when a patient checks off every logical box for one condition, but it turns out to be another.

“Like every time I’m on shift, there’s something that happens that you’re like, ‘Oh, well, that’s not how the textbook said it should go,’” Mitra said.

He trains fourth-year medical students and watches them struggle with this kind of thing.

“We had an older Asian man come in and he’s feeling basically OK, you know, we’re talking through a translator,” Mitra said.

The man didn’t feel all that bad: he had some abdominal pain, low and on the right side, which could point to appendicitis.

“But usually you should be pretty tender there. It should really hurt when you press on it. You know, he should be saying that … he doesn’t want to eat,” Mitra said. “He should be saying he’s very sick, but he’s not saying any of that, and when we’re pressing on [his belly], he’s kind of not grimacing much.”

It looked more like nothing, maybe a stomach bug, indigestion.

“So the student sees this patient and comes back and goes, ‘It’s not appendicitis because he’s still eating food and he’s not in that much pain. I think we can probably send them home,’” Mitra said.

For some reason, though, Mitra disagreed.

“And I’m probably overreacting and I’m probably being crazy, but there’s something about this guy,” Mitra said. “I want to do this whole work-up.”

Which, by the way, was no trifle. A work-up meant all night in the hospital and expensive tests, for what could just have been gas. Mitra asked for all the extra tests anyway, and looked the results over.

“And this guy not only has appendicitis, but it’s perforated. In other words, it’s kind of exploded and it’s like a serious emergency,” Mitra said. “And so, you know, surgery comes down and they rush him off.”

His student was doing everything by the book, by the obvious data, but his conclusion could have killed the man. What was Mitra picking up on that the student didn’t?


“I’m getting the sense that, you know, this, this is a guy who wouldn’t come to the hospital unless he was dying. He’s stoic,” Mitra said. “For him to feel bad enough to come to the ER and say, yeah, it hurts a little bit — like I’m realizing I probably would’ve gone to the ER five days earlier than this guy and been crying, you know what I’m saying?”

Mitra followed his instincts, his gut feelings, and they were right.

This is something machines don’t have yet.

More than a feeling

Mohammed Ghassemi, a computer scientist at Michigan State University who studies systems that combine human and artificial intelligence, said the first thing you have to understand is data.

“I’m using data here in the broadest possible sense, everything you see is data for you personally, everything you … (taste) and touch and feel, and experience is a kind of data that you’re ingesting through your sensory equipment on your body,” Ghassemi said. “And your first reaction to that, I think it is safe to say, is your gut feeling.”

Ghassemi and his team discovered that human doctors and AI ordered different numbers of tests for patients when given the same clinical data.

Why the difference? He thinks something beyond the clear-cut data was guiding the doctors. How the doctors felt about the patient factored into their decision-making in a way the machines couldn’t replicate with raw clinical data.

The team figured out how doctors felt about patients by looking at their notes, the little bits of subjective impression they recorded while dealing with patients.

“If a physician says the patient’s heart rate was X, Y, Z, that’s kind of redundant, because you already have that in a database somewhere. So it’s irrelevant … if the physician says that in his notes,” Ghassemi said. “If they said that the patient’s heart rate was X, Y, Z, you know, ‘That’s not good.’ The ‘That’s not good’ part is actually what’s relevant there.”

“That’s not good” is a negative sentiment; something like “He’s got a lot of spirit” is a positive sentiment. Ghassemi’s team found that the sentiments doctors had about patients related directly to how many tests they ordered.

“It’s sort of like an upside-down U in its shape. If the physician is very, very positive about the patient, they tended to order less exams, ask for less treatments. If the physicians were very, very negative, they also ordered less exams,” Ghassemi said. “It was when they were in that middle place, on the fence, that you saw much more exams being ordered.”
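
Ghassemi described that relationship only qualitatively. As a purely illustrative sketch, and not his actual model, the “upside-down U” can be captured by a simple quadratic in a sentiment score that peaks when the physician is on the fence; the shape and the peak of ten exams below are assumptions made for illustration.

```python
# Purely illustrative toy version of the "upside-down U" Ghassemi describes:
# sentiment runs from -1 (very negative about the patient) to +1 (very
# positive), and exams ordered peak when the physician is on the fence (0).
# The quadratic shape and the peak of 10 exams are assumptions, not his data.

def expected_exams(sentiment: float, peak: float = 10.0) -> float:
    """Return a toy expected number of exams for a given sentiment score."""
    s = max(-1.0, min(1.0, sentiment))  # clamp to [-1, 1]
    return peak * (1.0 - s * s)

if __name__ == "__main__":
    for s in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(f"sentiment {s:+.1f} -> ~{expected_exams(s):4.1f} exams")
```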

When they had the least amount of data available to them — as Mitra does in the ER with a patient he’s never seen before — the doctors Ghassemi studied relied the most on a kind of gut feeling.

Another word for it is intuition. Scientists have studied this feeling, and quantified it. It’s not magic. As best as science understands it now, it’s the brain using information and logic it has learned over time and data it has acquired unconsciously to make decisions automatically.

Isn’t that exactly what intelligent machines are supposed to do, the machine learning at the heart of AI?

“Because it’s like, what is intuition other than subconscious pattern recognition that’s happened? You know, that’s what experience is,” Mitra said. “That’s why people trust doctors with gray hair. I get better with time, because it’s not about how smart you are, it’s about how many things you’ve seen.”

Why did Mitra get that feeling about the patient with abdominal pain? Maybe because he’s seen that patient before, or some version of him — a patient who bucks the algorithm completely and ends up with the opposite of the diagnosis the textbook would suggest.

“I’ve just learned this by getting burned on this before,” Mitra said.

He’s been fed enough data. Couldn’t you do the same thing to a sufficiently advanced computer?

“A computer that can be fed so much data, one could say would probably have intuition,” Mitra said. “You know, it doesn’t know why this is this way, but it’s seen millions and millions of this presentation and in this way.”

Sentiments and sensibility

Ghassemi said he’s actually developing a way to teach machines how to use something like those gut feelings the human doctors used. He can encode the sentiments they mentioned in their notes into the data streams the machines get, and he thinks the machines will get closer to human decision-making.

“As you collect very, very large sums of information, and if you give AI models access to that and time to process it, yeah, probably their ability to approximate what human beings do will become better,” he said.
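
Ghassemi didn’t spell out the mechanics, but a minimal sketch of the idea, assuming a crude lexicon-based sentiment score rather than whatever model his team actually uses, is to turn the free-text note into a number and feed it to the machine alongside the structured vitals. The lexicon words, feature names, and example note below are invented for illustration.

```python
# A minimal sketch, not Ghassemi's pipeline: score a clinician's free-text
# note with a tiny hand-made sentiment lexicon, then append that score to the
# structured vitals a model would normally receive. Lexicon words, feature
# names, and the example note are invented for illustration.

POSITIVE = {"good", "stable", "improving", "comfortable", "spirit"}
NEGATIVE = {"bad", "worse", "unstable", "concerning", "grimacing"}

def note_sentiment(note: str) -> float:
    """Crude lexicon score in [-1, 1]: (positive hits - negative hits) / hits."""
    words = [w.strip(".,!?'\"").lower() for w in note.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    hits = pos + neg
    return 0.0 if hits == 0 else (pos - neg) / hits

def build_features(vitals: dict, note: str) -> dict:
    """Merge structured vitals with the note-derived sentiment feature."""
    features = dict(vitals)
    features["note_sentiment"] = note_sentiment(note)
    return features

if __name__ == "__main__":
    vitals = {"heart_rate": 112, "systolic_bp": 94, "temp_c": 38.1}
    note = "Patient looks unstable and grimacing, which is concerning."
    print(build_features(vitals, note))
    # {'heart_rate': 112, 'systolic_bp': 94, 'temp_c': 38.1, 'note_sentiment': -1.0}
```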

But the machines will still need humans — even if they replicate all of our gut instincts.

Ghassemi said machines need to learn from us — for example, from our mistakes. Human doctors do things in different ways — in prescribing medications, say, the doses vary — and some of those ways are more optimal than others. Ghassemi calls this normal variation in the way doctors work “noise.”

“If there’s no noise in the system, which human beings, for better or worse, kind of introduce, you can’t learn what’s optimal because you’ve only ever tried one thing,” he said.

Where one person might hear human error in the idea that machines need our mistakes, our imperfection, Mitra heard human innovation. Let’s say a doctor in Ohio tries something new. It could be anything: wrapping a sprain this way instead of that, or reattaching a tendon with two sutures instead of four.

“And so she tries it this way and then tells her colleagues about it, and they start doing it. And then, the next thing you know, like Ohio’s doing this procedure in this way,” Mitra said. “And then they write a paper about it, and it gets picked up in California, and they start doing it this way.”

When a doctor does something other than the most optimal but it works really well, we don’t call it a mistake. We call it a breakthrough.

As Mitra sees it, though, a lot of this AI is still far in the future — and he sees much more basic challenges in health care every day in the real-world present.

“Patients in America aren’t getting the care they need … not because of intelligence, whether it be real or artificial, it’s because of bureaucratic stupidness and financial things and weird incentives that lead to people dying from things that really they don’t need to die from,” he said.

Before hospitals get more intelligent machines, Mitra hopes they get more human, and more humane.
