ADEPT is a classifier that identifies medically-relevant words in Patient-Authored Text.

The web is full of text related to health and medicine. While some of this text is written by medical experts, most of it is written by ordinary people in the form of news articles, blog posts, and discussions in online health communities like MedHelp and PatientsLikeMe. We call this text Patient-Authored Text (PAT).

It's easy to imagine that PAT might contain useful information -- for both patients and medical experts -- but it's not so easy to extract this information. In particular, given a sentence, we would like to be able to extract words or terms of medical relevance.

ADEPT is a tool that automatically identifies medically-relevant words in PAT. ADEPT was trained by people who are not medical experts, and verified by people who are. So far, it performs better than any other tool we have found. Try it yourself, and tell us what you think.


ADEPT is specifically designed to extract terms from PAT (such as online health community posts), and will perform poorly on non-medical text (e.g., personal email). To try it, think of a question you might ask your doctor, or take a look at the sample sentences below; on the original page, the terms ADEPT identifies are highlighted in red.
  • It says proliferative ductal hyperplasia without atypia and non-proliferative duct ecstasia without carcinoma.
  • Last summer I was at home with my daughter, who is now 2.
  • I had a chest xray done and they said there was something in my lung.
  • I stopped cold turkey.


You can download a serialized version of ADEPT here. This is a good option if you want to process lots of text. Detailed instructions are pending; for now, follow the instructions on the Stanford NLP Group website.
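
As a rough illustration of batch processing, the sketch below assembles the command line that the Stanford NER distribution documents for running a serialized CRF classifier over a text file. It assumes that the ADEPT download is a model that Stanford NER's CRFClassifier can load, which this page does not state explicitly. The file names adept.ser.gz, stanford-ner.jar, and posts.txt are hypothetical placeholders, as is the helper name build_ner_command.

```python
import shlex


def build_ner_command(model_path, text_path, jar_path="stanford-ner.jar", heap="1g"):
    """Assemble a java command that runs a serialized CRF model over one text file.

    Assumption: the ADEPT download loads with Stanford NER's CRFClassifier.
    All paths are hypothetical placeholders, not names taken from this page.
    """
    return [
        "java",
        f"-Xmx{heap}",                            # JVM heap size
        "-cp", jar_path,                          # classpath holding the NER classes
        "edu.stanford.nlp.ie.crf.CRFClassifier",  # Stanford NER entry point
        "-loadClassifier", model_path,            # serialized model to load
        "-textFile", text_path,                   # plain-text input to tag
    ]


if __name__ == "__main__":
    cmd = build_ner_command("adept.ser.gz", "posts.txt")
    # Print the command for inspection instead of executing it.
    print(shlex.join(cmd))
```

If those assumptions hold, the returned list can be passed to subprocess.run to execute the classifier. Check the Stanford NLP Group instructions for the exact classpath your download requires.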


A full research paper about ADEPT was published in JAMIA.


Have some text that ADEPT didn't classify correctly? Other thoughts or comments? Let us know at:

malcdi [ at ] alumni [ dot ] stanford [ dot ] edu