Artificial intelligence (AI) has made impressive advances in medicine. From detecting skin cancer to interpreting radiological images, AI systems can sometimes match or even rival human experts. However, when it comes to drug therapy advice, the situation is very different.
A recent study from Hannover Medical School (MHH), published in the British Journal of Clinical Pharmacology (2025), shows that human medical expertise clearly outperforms ChatGPT in providing pharmacotherapeutic recommendations. The findings highlight both the potential and the current limitations of AI in clinical pharmacology.
Researchers compared answers generated by ChatGPT (version 3.5) with those written by physicians at the clinical-pharmacological Drug Information Centre (DIC) of Hannover Medical School. The comparison was based on 70 real-world pharmacotherapeutic queries submitted by healthcare professionals between June and October 2023.
Three independent, blinded reviewers (a junior doctor, an experienced resident, and a board-certified clinical pharmacologist) evaluated the responses for:
• Quality of medical information
• Factual correctness
• Overall preference
• Linguistic quality
Across all reviewers, physician-generated responses were rated significantly higher than those produced by ChatGPT. On a 5-point scale:
• Physicians received a median score of 5 (“very good”)
• ChatGPT received a median score of 2 (“poor”)
Importantly, factually incorrect information appeared far more often in ChatGPT’s answers: up to 55.7% of its responses contained errors, compared with 5.7–17.1% of the physicians’ responses.
While ChatGPT often produced longer and well-structured texts, these responses frequently lacked clinical depth, accuracy, and appropriate consideration of patient-specific factors such as age, comorbidities, renal function, and polypharmacy.
Some of the errors identified in ChatGPT’s answers were not just minor inaccuracies — they could have led to serious clinical consequences if applied in practice. Examples included:
• Incorrectly describing Actrapid (human insulin) as an analgesic
• Recommending INR monitoring for direct oral anticoagulants (DOACs), where it is not appropriate
• Misclassifying certain drugs, leading to unfounded concerns about interactions or sedation
Such mistakes could potentially result in inadequate pain control, hypoglycaemia, bleeding, or thromboembolic events. Although ChatGPT has demonstrated strong performance in medical exams and multiple-choice pharmacology questions, real-world pharmacotherapy is far more complex.
ChatGPT also suffers from a well-known limitation: “hallucinations”, i.e. confident, authoritative-sounding statements that are factually wrong or unsupported. At the time of the study, ChatGPT 3.5 could not provide references, making verification difficult.
Where Can AI Still Be Useful?
The authors emphasize that this study evaluated only one chatbot version, which is now outdated. Newer models, including ChatGPT-4, ChatGPT-5, and other AI tools such as Perplexity, Gemini, or LLaMA, may perform better, but this needs rigorous investigation.
However, AI should not replace expert judgment in drug therapy decisions.
Drug Information Centres play a vital role in ensuring the safe and rational use of medicines, especially in complex cases involving interactions, dose adjustments, or vulnerable populations. This study reinforces that their expertise cannot yet be substituted by AI chatbots.
Despite rapid technological progress, ChatGPT is currently not reliable enough for drug therapy advice. Until AI systems can consistently deliver accurate, transparent, and clinically sound pharmacological recommendations, human expertise remains indispensable.
References:
1. Krichevsky B, Engeli S, Bode-Böger SM, Koop F, Westhoff MS, Schröder S, Schumacher C, Pape T, Stichtenoth DO, Heck J. Human vs. artificial intelligence: Physicians outperform ChatGPT in real-world pharmacotherapy counselling. British Journal of Clinical Pharmacology. Published online October 25, 2025.
2. Ayers JW, Poliak A, Dredze M, et al. Comparing physician and artificial intelligence chatbot responses to patient questions posted to a public social media forum. JAMA Internal Medicine. 2023;183(6):589-596.

