Can AI in healthcare mimic human biases and improve clinical decisions?
AI models such as GPT-4 and Gemini-1.0-Pro can make expert-level decisions in healthcare, but they also show human-like biases. Recent studies have found that although these models can assist doctors with diagnosis, they do not always improve decision-making. In one study, doctors who had access to GPT-4 did not perform better than those using only conventional tools. This suggests that doctors need training in how to use AI tools effectively.
Researchers also tested AI for biases using different medical cases. In one example, AI was more likely to recommend surgery for lung cancer when outcomes were presented as survival rates rather than death rates. This is known as the 'framing effect.' In another case, AI was more likely to suspect a pulmonary embolism in a person with a cough and blood in their sputum when the symptom was mentioned first in the case description, which shows the 'primacy effect.' In a third case, the AI was given the case of a woman with knee pain along with one of two outcomes, one in which she recovered and one in which she died. It judged her care as appropriate in the first version but not in the second, showing 'hindsight bias.'
The biases shown by the AI were even stronger than those seen in human doctors. So although AI can help doctors, they should question its suggestions and consider other possibilities to avoid mistakes.