David Carrell, PhD, is an assistant investigator who develops and applies technology for extracting rich information from unstructured clinical text, such as physician progress notes. This work uses state-of-the-art clinical natural language processing (NLP) technologies in single- and multi-site settings.
An example of this work is an NLP system to identify women who have been diagnosed with recurrent breast cancer. Despite being a common and consequential clinical diagnosis, recurrent breast cancer cannot be tracked reliably using standard medical codes found in a person’s chart. Supported by a grant from the National Cancer Institute, Dr. Carrell and his colleagues used information from clinician progress notes, radiology reports, and pathology reports to classify women by breast cancer recurrence status.
Working with teams of researchers inside and outside Kaiser Permanente Washington Health Research Institute, Dr. Carrell has applied similar precision phenotyping methods to identify evidence of carotid artery stenosis, colon polyps, problem use of prescription opioids, and colonoscopy quality.
Dr. Carrell’s current research projects apply NLP and machine learning methods to improve medication safety surveillance (through the Food and Drug Administration Sentinel Initiative) and to evaluate how screening Kaiser Permanente Washington patients for unhealthy cannabis and other drug use affects the diagnosis and treatment of drug use disorders. His ongoing work also includes developing and applying automated algorithms based on electronic health record data to identify patients with particular health conditions (called “patient phenotypes”) for use in genetic and epidemiological research.
Surveillance methods for adverse events associated with medication exposure, including problem use of prescription opioids
Methods for using structured and unstructured electronic health record data to identify patients with (or without) specific clinical conditions or phenotypes for large scale epidemiological and genomic studies
Identifying recurrent breast cancer using EHR text; Colonoscopy quality metrics
Recurrent breast cancer; Colonoscopy quality; Extracting information from clinical text; Automated de-identification of clinical text; Methods for applying NLP in multi-site research
Prevention and treatment
Shang N, Liu C, Rasmussen LV, Ta CN, Carroll RJ, Benoit B, Lingren T, Dikilitas O, Mentch FD, Carrell DS, Wei WQ, Luo Y, Gainer VS, Kullo IJ, Pacheco JA, Hakonarson H, Walunas TL, Denny JC, Wiley K, Murphy SN, Hripcsak G, Weng C. Making work visible for electronic phenotype implementation: lessons learned from the eMERGE network. J Biomed Inform. 2019 Sep 19:103293. doi: 10.1016/j.jbi.2019.103293. [Epub ahead of print]. PubMed
Gordon AS, Rosenthal EA, Carrell DS, Amendola LM, Dorschner MO, Scrol A, Stanaway IB, DeVange S, Ralston JD, Zouk H, Rehm HL, Larson E, Crosslin DR, Leppig KA, Jarvik GP. Rates of actionable genetic findings in individuals with colorectal cancer or polyps ascertained from a community medical setting. Am J Hum Genet. 2019 Sep 5;105(3):526-533. doi: 10.1016/j.ajhg.2019.07.012. Epub 2019 Aug 15. PubMed
Carrell DS, Cronkite DJ, Li MR, Nyemba S, Malin BA, Aberdeen JS, Hirschman L. The machine giveth and the machine taketh away: a parrot attack on clinical text deidentified with hiding in plain sight. J Am Med Inform Assoc. 2019 Dec 1;26(12):1536-1544. doi: 10.1093/jamia/ocz114. PubMed
Namjou B, Lingren T, Huang Y, Parameswaran S, Cobb BL, Stanaway IB, Connolly JJ, Mentch FD, Benoit B, Niu X, Wei WQ, Carroll RJ, Pacheco JA, Harley ITW, Divanovic S, Carrell DS, Larson EB, Carey DJ, Verma S, Ritchie MD, Gharavi AG, Murphy S, Williams MS, Crosslin DR, Jarvik GP, Kullo IJ, Hakonarson H, Li R; eMERGE Network, Xanthakos SA, Harley JB. GWAS and enrichment analyses of non-alcoholic fatty liver disease identify new trait-associated genes and pathways across eMERGE Network. BMC Med. 2019 Jul 17;17(1):135. doi: 10.1186/s12916-019-1364-z. PubMed
Hripcsak G, Shang N, Peissig PL, Rasmussen LV, Liu C, Benoit B, Carroll RJ, Carrell DS, Denny JC, Dikilitas O, Gainer VS, Marie Howell K, Klann JG, Kullo IJ, Lingren T, Mentch FD, Murphy SN, Natarajan K, Pacheco JA, Wei WQ, Wiley K, Weng C. Facilitating phenotype transfer using a common data model. J Biomed Inform. 2019 Jul 17:103253. doi: 10.1016/j.jbi.2019.103253. [Epub ahead of print]. PubMed
Hazlehurst B, Green CA, Perrin NA, Brandes J, Carrell DS, Baer A, DeVeaugh-Geiss A, Coplan PM. Using natural language processing of clinical text to enhance identification of opioid-related overdoses in electronic health records data. Pharmacoepidemiol Drug Saf. 2019 Jun 19. doi: 10.1002/pds.4810. [Epub ahead of print]. PubMed
Green CA, Perrin NA, Hazlehurst B, Janoff SL, DeVeaugh-Geiss A, Carrell DS, Grijalva CG, Liang C, Enger CL, Coplan PM. Identifying and classifying opioid-related overdoses: a validation study. Pharmacoepidemiol Drug Saf. 2019 Apr 24. doi: 10.1002/pds.4772. [Epub ahead of print]. PubMed
Using doctor's notes to learn about drug reactions, dementia, and cannabis use.