Experts raise concerns after ChatGPT Health fails to identify medical emergencies

Study Reveals Flaws in ChatGPT Health’s Medical Advice

A recent study has found that ChatGPT Health often overlooks the need for urgent medical care and frequently fails to identify suicidal thoughts. Experts warn this could lead to unnecessary harm and even fatalities.

OpenAI introduced the “Health” feature of ChatGPT to a limited audience back in January. They market it as a way for users to securely connect their medical records and wellness apps to receive tailored health advice. Reports suggest that over 40 million people turn to ChatGPT daily for health-related inquiries.

The first independent safety evaluation of ChatGPT Health, published in the February issue of Nature Medicine, revealed that it misclassified more than half of the cases presented to it.

Dr. Ashwin Ramaswamy, the study’s lead author, mentioned that they aimed to address a fundamental safety question: if a person experiences a real medical emergency and asks ChatGPT Health what to do, would it advise them to go to the emergency room?

To explore this, Ramaswamy and his team devised 60 realistic patient scenarios, ranging from minor illnesses to emergencies. Three independent physicians assessed each scenario, determining the necessary level of care based on established clinical guidelines.

Next, they sought advice from ChatGPT Health for each case, modifying conditions such as the patient’s gender or adding test results, resulting in nearly 1,000 responses.

They then compared ChatGPT’s recommendations to the doctors’ evaluations. While it performed adequately in straightforward emergencies like strokes or severe allergic reactions, it faltered in other situations. In one asthma case with early signs of respiratory failure, for instance, it suggested waiting rather than seeking immediate help.

In 51.6% of circumstances where hospital visits were warranted, the platform recommended staying home or scheduling a routine appointment. Alex Ruani, a researcher focusing on health misinformation at University College London, labeled this finding as “unbelievably dangerous.”

Ruani expressed concern that users with severe symptoms might take false reassurance from the AI’s advice. If ChatGPT told someone to “just wait” during a diabetic crisis or an asthma attack, for example, that reassurance could be life-threatening.

In one simulation, the AI directed a woman struggling to breathe to a follow-up appointment—an appointment she might not survive. Additionally, data showed that 64.8% of individuals in stable condition were unnecessarily advised to seek immediate care.

The platform also tended to downplay symptoms when the scenario involved a “friend” suggesting the problem wasn’t serious; in those cases it was almost twelve times more likely to do so.

According to Ruani, findings like these are prompting many researchers to call for clearer safety guidelines and independent reviews to mitigate preventable harm.

A spokesperson from OpenAI responded, welcoming independent research assessing AI in healthcare, but also noted that the study didn’t capture typical real-world use of ChatGPT Health. The spokesperson added that the model is constantly refined and updated.

Ruani noted that even though the study relied on simulations, “a plausible risk of harm justifies stronger safeguards and independent oversight.”

Ramaswamy, who teaches urology at the Icahn School of Medicine, expressed particular worry over the platform’s inadequate response to suicidal thoughts.

He explained that when they tested ChatGPT Health with a patient expressing thoughts of overdosing, a crisis intervention prompt appeared readily. However, once researchers added normal lab results to the same scenario, the crisis prompt appeared in zero of 16 attempts. This inconsistency raises significant concerns: a guardrail that switches off when lab values are mentioned is not robust and may even be more hazardous than having none.

Professor Paul Henman, a digital sociologist from the University of Queensland, emphasized the importance of the findings. He warned that if ChatGPT Health were commonly used, it could lead to increased unnecessary medical visits for minor issues while also causing critical cases to go untreated, potentially resulting in harm or death.

Henman noted the legal implications this could bring, especially with ongoing lawsuits against tech companies linked to self-harm and suicide following interactions with AI chatbots.

He also expressed uncertainty regarding OpenAI’s objectives and the specific training and warnings incorporated into ChatGPT Health.

“Since we don’t fully understand how ChatGPT Health was trained, we can’t accurately gauge what might be embedded in its models,” he remarked.
