Leveraging LLMs for Streamlined Data Extraction in Healthcare
The NSQIP Challenge
The National Surgical Quality Improvement Program (NSQIP) aims to gather patient data to enhance post-operative care and improve clinical outcomes. To participate, hospitals must identify qualifying patients based on specific procedures performed within set timeframes.
Traditionally, healthcare staff or data abstractors spend hours manually reviewing patient notes to identify these details, a process that is both time-consuming and prone to error. SQL queries can speed up the search by matching specific CPT codes, but billing records aren’t always up to date, so clinical staff still have to fill the gaps by reviewing notes manually.
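To make the SQL-based approach concrete, here is a minimal sketch of a query that matches targeted CPT codes within a date window. The table name `billed_procedures`, its columns, the patient IDs, and the dates are all hypothetical and chosen for illustration; a real billing schema will look different.

```python
import sqlite3

# Hypothetical schema: one row per billed procedure (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE billed_procedures ("
    "patient_id TEXT, cpt_code TEXT, procedure_date TEXT)"
)
conn.executemany(
    "INSERT INTO billed_procedures VALUES (?, ?, ?)",
    [
        ("P001", "47562", "2024-01-15"),
        ("P002", "44970", "2024-02-03"),
        ("P003", "47562", "2023-11-20"),  # outside the date window
        ("P004", "99213", "2024-01-22"),  # not a targeted code
    ],
)


def find_candidates(conn, cpt_codes, start, end):
    """Return billed procedures matching the target codes within [start, end]."""
    if not cpt_codes:
        return []
    # One "?" placeholder per code keeps the query parameterized.
    placeholders = ",".join("?" for _ in cpt_codes)
    query = (
        "SELECT patient_id, cpt_code, procedure_date "
        "FROM billed_procedures "
        f"WHERE cpt_code IN ({placeholders}) "
        "AND procedure_date BETWEEN ? AND ? "
        "ORDER BY procedure_date"
    )
    return conn.execute(query, [*cpt_codes, start, end]).fetchall()


candidates = find_candidates(conn, ["47562", "44970"], "2024-01-01", "2024-03-31")
```

A query like this only finds patients whose procedures have already been billed. Anyone whose codes have not been entered yet is invisible to it, which is the gap that note review has to cover.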
Introducing AutoChart
At Cognome, we developed AutoChart to tackle this very challenge. Building on the concepts discussed in our earlier articles (where we explored issues like hallucinations, RAG, and domain-specific fine-tuning), AutoChart uses Language Models to:
- Extract Procedure Information: A Language Model combs through patient notes to pinpoint which procedures were performed. This step replaces manual review and ensures that even if billing codes aren’t recorded yet, critical details aren’t missed.
- Assign CPT Codes via RAG: Once the procedure name is identified by the Language Model, AutoChart relies on a Retrieval-Augmented Generation (RAG) approach. We maintain a vector store containing CPT codes, their descriptions and synonyms. The extracted procedure is used to query this knowledge base, returning the most relevant codes. We then pass these shortlisted codes to a fine-tuned Language Model, specialized in matching a procedure to the best CPT code(s). This synergy of extraction and RAG-driven code assignment not only saves time but also boosts accuracy compared to manual workflows.
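The retrieval step described above can be sketched roughly as follows. This is not AutoChart's implementation: the knowledge base holds a few illustrative entries (check any code against an official CPT reference), a bag-of-words cosine similarity stands in for a real embedding model and vector store, and the function names `embed`, `retrieve_candidates`, and `build_selection_prompt` are hypothetical. The call to the fine-tuned model is left out; the sketch stops at the prompt that would be passed to it.

```python
import math
import re
from collections import Counter

# Illustrative entries only: code plus a short description and synonyms.
CPT_KNOWLEDGE_BASE = [
    {"code": "47562", "text": "laparoscopic cholecystectomy gallbladder removal lap chole"},
    {"code": "44970", "text": "laparoscopic appendectomy appendix removal lap appy"},
    {"code": "27447", "text": "total knee arthroplasty knee replacement TKA"},
]


def embed(text):
    """Toy stand-in for an embedding model: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_candidates(procedure, k=2):
    """Return up to k (code, score) pairs most similar to the extracted procedure."""
    query = embed(procedure)
    scored = [(cosine(query, embed(e["text"])), e["code"]) for e in CPT_KNOWLEDGE_BASE]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(code, round(score, 3)) for score, code in scored[:k] if score > 0]


def build_selection_prompt(procedure, candidates):
    """Format the shortlist for the downstream code-selection model."""
    lines = [f"Procedure: {procedure}", "Candidate CPT codes:"]
    lines += [f"- {code}" for code, _ in candidates]
    lines.append("Select the best matching code(s).")
    return "\n".join(lines)


procedure = "Patient underwent laparoscopic removal of the gallbladder"
shortlist = retrieve_candidates(procedure)
prompt = build_selection_prompt(procedure, shortlist)
```

Shortlisting first means the selection model only has to choose among a handful of plausible codes rather than recall the whole code set from memory.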
Why It Matters
- Speed and Efficiency: Nurses and clinicians spend countless hours hunting for the right codes and patient attributes. By automating the identification and coding process, they can focus more on patient care and less on paperwork.
- Improved Consistency: Human error and fatigue often lead to inaccuracies. With the right guardrails, our LLM-powered systems are designed to be consistent, which improves data quality.
- Scalability: As more hospitals adopt NSQIP or similar programs, automating data extraction becomes essential for scaling quality improvement initiatives across multiple departments or facilities.
- Continuous Improvement: Our system operates in tandem with healthcare staff, allowing them to review and correct any extraction errors or mismatched CPT codes as they finalize the cohort. Each correction, along with the rationale behind it, feeds back into the model, improving its accuracy over time. By keeping humans in the loop, we’re not replacing their expertise; we’re amplifying it, making data verification faster and more transparent while continuously refining the model’s performance.
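As a rough illustration of the feedback loop, each reviewer correction could be captured as a structured record and serialized as JSON Lines for a later fine-tuning run. The `Correction` fields, the prompt/completion layout, and the sample values below are assumptions made for this sketch, not a description of AutoChart's actual data format.

```python
import json
from dataclasses import dataclass


@dataclass
class Correction:
    """One reviewer decision on a predicted CPT code (hypothetical fields)."""
    note_id: str
    extracted_procedure: str
    predicted_cpt: str
    corrected_cpt: str
    rationale: str


def to_training_example(c):
    """Turn a correction into a prompt/completion pair for later fine-tuning."""
    return {
        "prompt": f"Procedure: {c.extracted_procedure}\nPredicted CPT: {c.predicted_cpt}",
        "completion": f"{c.corrected_cpt}\nRationale: {c.rationale}",
        "changed": c.corrected_cpt != c.predicted_cpt,
    }


def to_jsonl(corrections):
    """Serialize corrections as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(to_training_example(c)) for c in corrections)


reviewed = [
    Correction(
        note_id="note-001",
        extracted_procedure="appendectomy",
        predicted_cpt="44950",
        corrected_cpt="44970",
        rationale="Operative note describes a laparoscopic approach.",
    ),
]
training_jsonl = to_jsonl(reviewed)
```

Storing the rationale next to the corrected code keeps the reviewer's reasoning available, both for training and for auditing why a code was changed.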
Beyond NSQIP
While NSQIP is a clear example of how Language Models can accelerate data abstraction for quality programs, the underlying framework is highly adaptable. From clinical trial matching to PHI de-identification, LLMs can improve a variety of workflows that rely on accurate data extraction. By integrating guardrails like those we described when tackling hallucinations, these systems can be both reliable and trustworthy.
To learn more about collaborating with Cognome, get in touch with us.