2025 Proffered Presentations
S143: A COMPARATIVE EVALUATION OF CUSTOM CHATGPT-4O (CHET-GPT) VERSUS EPIC DAX FOR NEUROSURGICAL SOAP NOTES IN SKULL BASE SURGERY
Anthony Guidotti¹; Ethan Cline¹; Shravan Atluri, MHA¹; Ondrej Choutka, MD²; ¹Idaho College of Osteopathic Medicine; ²Saint Alphonsus Regional Medical Center
Objective: This study evaluates the performance of a custom neurosurgery-focused ChatGPT-4o (CHET-GPT) in generating SOAP notes from patient encounters, compared with EPIC’s DAX system. We hypothesize that CHET-GPT will produce more accurate, thorough, and legally sound notes while reducing the time physicians spend on documentation, particularly in skull base surgery. By tracking and documenting all elements discussed during patient encounters, CHET-GPT aims to provide neurosurgeons with enhanced accountability, accuracy, and adaptability in their note-writing process.
Methods: Two parallel approaches were used to compare the performance of CHET-GPT and DAX. First, both systems were used in real time during two skull base patient encounters: (1) a follow-up after a fourth ventricular tumor resection, and (2) a new patient presenting with a cavernous sinus meningioma. The encounters were recorded, and both systems generated SOAP notes from the live interaction.
Second, three additional skull base cases—posterior fossa meningioma, pituitary tumor, and vestibular schwannoma—were retrospectively analyzed. An independent AI system generated scripts simulating the most likely conversation between the physician and patient, based on the original notes. Both CHET-GPT and DAX were tasked with processing these scripts and generating SOAP notes. The neurosurgeon then evaluated the notes from both systems for accuracy, completeness, readability, legal soundness, and time efficiency. Comparative analysis focused on the ability to track critical details and infer missing patient information.
Results: Preliminary findings suggest that CHET-GPT outperforms DAX in several key areas. CHET-GPT consistently produced more thorough and concise notes, particularly through its ability to infer missing information, such as specific medications or subtle details from patient history. This inference capability improved the overall accuracy and legal robustness of the notes, reducing the risk of omissions or discrepancies. CHET-GPT’s ability to adapt to physician preferences over time further contributed to its advantage in usability.
However, one limitation identified was CHET-GPT’s requirement to stop and upload data every 10 minutes, which poses challenges for longer encounters. Despite this limitation, the system was found to be more efficient overall, reducing the time physicians spent on documentation and increasing availability for direct patient care.
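The 10-minute upload constraint can be illustrated with a minimal sketch of how a longer encounter might be divided into consecutive windows that each fit within the limit. This is an illustration only, not the authors' implementation: the names `plan_segments` and `Segment` are hypothetical, and the sketch assumes the only constraint is a fixed 10-minute maximum per upload.

```python
from dataclasses import dataclass
from typing import List

# Assumption: the only constraint is the 10-minute limit reported above.
MAX_SEGMENT_SECONDS = 10 * 60


@dataclass
class Segment:
    """One upload window within a recorded encounter (hypothetical type)."""
    index: int
    start_s: float
    end_s: float


def plan_segments(duration_s: float,
                  max_len_s: float = MAX_SEGMENT_SECONDS) -> List[Segment]:
    """Split an encounter into consecutive windows no longer than max_len_s."""
    if duration_s < 0:
        raise ValueError("duration_s must be non-negative")
    if max_len_s <= 0:
        raise ValueError("max_len_s must be positive")

    segments: List[Segment] = []
    start = 0.0
    while start < duration_s:
        # The final window is shortened to end exactly at the encounter's end.
        end = min(start + max_len_s, duration_s)
        segments.append(Segment(index=len(segments), start_s=start, end_s=end))
        start = end
    return segments


if __name__ == "__main__":
    # A 25-minute encounter needs three uploads under this scheme.
    for seg in plan_segments(25 * 60):
        print(seg.index, seg.start_s, seg.end_s)
```

Under these assumptions, a 25-minute encounter would require three uploads (two full 10-minute windows and one 5-minute window), which gives a sense of the interruptions a longer visit would involve.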
Conclusion: CHET-GPT demonstrates significant potential as an innovative tool for neurosurgical documentation, particularly in skull base surgery. Its ability to accurately document encounters, infer details, and streamline physician workflows positions it as a valuable asset compared to traditional dictation systems like DAX. As further data are collected, CHET-GPT’s role in enhancing legal documentation, reducing errors, and improving physician efficiency is likely to expand. Continued research and optimization will be critical to overcoming current technical limitations, particularly for longer patient interactions.