| 001 | 285039 | ||
| 005 | 20260209101110.0 | ||
| 024 | 7 | _ | |a 10.21037/qims-2025-1716 |2 doi |
| 024 | 7 | _ | |a 2223-4292 |2 ISSN |
| 024 | 7 | _ | |a 2223-4306 |2 ISSN |
| 037 | _ | _ | |a DZNE-2026-00163 |
| 082 | _ | _ | |a 610 |
| 100 | 1 | _ | |a Ji, Jiang |b 0 |
| 245 | _ | _ | |a Comparison of online radiologists and large language model chatbots in responding to common radiology-related questions in Chinese: a cross-sectional comparative analysis |
| 260 | _ | _ | |a Hong Kong |c 2026 |b AME Publ. |
| 336 | 7 | _ | |a article |2 DRIVER |
| 336 | 7 | _ | |a Output Types/Journal article |2 DataCite |
| 336 | 7 | _ | |a Journal Article |b journal |m journal |0 PUB:(DE-HGF)16 |s 1770628099_24273 |2 PUB:(DE-HGF) |
| 336 | 7 | _ | |a ARTICLE |2 BibTeX |
| 336 | 7 | _ | |a JOURNAL_ARTICLE |2 ORCID |
| 336 | 7 | _ | |a Journal Article |0 0 |2 EndNote |
| 520 | _ | _ | |a Background: Additional avenues for medical counseling are needed to better serve patients. In handling medical counseling, large language model chatbots (LLM-chatbots) have demonstrated near-physician expertise in comprehending enquiries and providing professional advice. However, their performance in addressing patients’ common radiology-related concerns has yet to be evaluated. This study thus aimed to investigate the effectiveness and model performance of LLM-chatbots (DeepSeek-R1 and ChatGPT-4o) in radiology-related medical consultation in the Chinese context through both subjective evaluations and objective metrics. Methods: In this cross-sectional study, common radiology-related questions were collected from the HaoDF online platform, one of the largest Chinese public healthcare service platforms. All questions were posed to the LLM-chatbots from February 24 to February 30, 2025. To facilitate comparison between LLM-chatbots and online radiologists, three senior radiologists from different medical centers were recruited as reviewers, and they blindly scored LLM-generated responses using a 5-point Likert scale across the three subjective dimensions: quality, empathy, and potential harm. Objective metrics including textual features (six metrics across three linguistic dimensions: lexical, syntactic, and semantic), response time, and self-improvement capacity were calculated as additional evaluators for the performance of the two LLM-chatbots. Results: A total of 954 reviews were generated for 318 responses to 106 questions. LLM-chatbots achieved superior scores in quality, empathy, and potential harm as compared to the online radiologists (all P values <0.001). Among the LLM-chatbots, DeepSeek-R1 outperformed ChatGPT-4o in terms of quality scores [DeepSeek-R1: mean 4.40, standard deviation (SD) 0.57; ChatGPT-4o: mean 3.73, SD 0.64; P<0.001]. The response times were significantly shorter for DeepSeek-R1 [median 56.00 s; interquartile range (IQR), 47–67 s] and ChatGPT-4o (median 12.17 s; IQR, 10.91–15.85 s) as compared to online radiologists (median 6,487.90 s; IQR, 3,530.50–29,061.70 s), and the LLM-chatbots generated greater textual complexity (as measured by six metrics across three linguistic dimensions: lexical, syntactic, and semantic) (all P values <0.001). Among the two chatbots, ChatGPT-4o generally produced linguistically simpler responses (all P values <0.001), with comparatively shorter response times (median 12.17 s; IQR, 10.91–15.85 s), than did DeepSeek-R1 (median 56.00 s; IQR, 47–67 s) across various topics (P<0.001). Additionally, both LLM-chatbots demonstrated a degree of self-improvement ability. Conclusions: These findings highlight the potential utility of LLM-chatbots in addressing the common radiology-related inquiries initially posed by patients. However, further optimization and validation are required to establish this emerging technology as a productive and effective pathway in medical counseling. |
| 536 | _ | _ | |a 352 - Disease Mechanisms (POF4-352) |0 G:(DE-HGF)POF4-352 |c POF4-352 |f POF IV |x 0 |
| 588 | _ | _ | |a Dataset connected to CrossRef, Journals: pub.dzne.de |
| 700 | 1 | _ | |a Li, Chenguang |b 1 |
| 700 | 1 | _ | |a Fu, Yibin |b 2 |
| 700 | 1 | _ | |a Zhao, Zihao |b 3 |
| 700 | 1 | _ | |a Wu, Yiyang |0 P:(DE-2719)9002415 |b 4 |u dzne |
| 700 | 1 | _ | |a Liang, Changhua |b 5 |
| 700 | 1 | _ | |a Wu, Yue |b 6 |
| 773 | _ | _ | |a 10.21037/qims-2025-1716 |g Vol. 16, no. 2, p. 129 - 129 |0 PERI:(DE-600)2653586-5 |n 2 |p 129 |t Quantitative imaging in medicine and surgery |v 16 |y 2026 |x 2223-4292 |
| 856 | 4 | _ | |u https://pub.dzne.de/record/285039/files/DZNE-2026-00163.pdf |y Restricted |
| 856 | 4 | _ | |u https://pub.dzne.de/record/285039/files/DZNE-2026-00163.pdf?subformat=pdfa |x pdfa |y Restricted |
| 910 | 1 | _ | |a Deutsches Zentrum für Neurodegenerative Erkrankungen |0 I:(DE-588)1065079516 |k DZNE |b 4 |6 P:(DE-2719)9002415 |
| 913 | 1 | _ | |a DE-HGF |b Gesundheit |l Neurodegenerative Diseases |1 G:(DE-HGF)POF4-350 |0 G:(DE-HGF)POF4-352 |3 G:(DE-HGF)POF4 |2 G:(DE-HGF)POF4-300 |4 G:(DE-HGF)POF |v Disease Mechanisms |x 0 |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0200 |2 StatID |b SCOPUS |d 2024-12-18 |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0300 |2 StatID |b Medline |d 2024-12-18 |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0199 |2 StatID |b Clarivate Analytics Master Journal List |d 2024-12-18 |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0160 |2 StatID |b Essential Science Indicators |d 2024-12-18 |
| 915 | _ | _ | |a WoS |0 StatID:(DE-HGF)0113 |2 StatID |b Science Citation Index Expanded |d 2024-12-18 |
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0150 |2 StatID |b Web of Science Core Collection |d 2024-12-18 |
| 915 | _ | _ | |a JCR |0 StatID:(DE-HGF)0100 |2 StatID |b QUANT IMAG MED SURG : 2022 |d 2024-12-18 |
| 915 | _ | _ | |a IF < 5 |0 StatID:(DE-HGF)9900 |2 StatID |d 2024-12-18 |
| 920 | 1 | _ | |0 I:(DE-2719)1110001 |k AG Herms |l Translational Brain Research |x 0 |
| 980 | _ | _ | |a journal |
| 980 | _ | _ | |a EDITORS |
| 980 | _ | _ | |a VDBINPRINT |
| 980 | _ | _ | |a I:(DE-2719)1110001 |
| 980 | _ | _ | |a UNRESTRICTED |