|The Health Record Review
by Jeff Rowe, Editor
Posted on Tue, Nov 06, 2012 - 11:10 am
To a considerable extent, health IT is focused on making communication easier. But what happens when the machine can’t understand what doctors are trying to say?
The folks at MIT may have an answer.
According to this article, they’re in the process of developing a method of “disambiguating the senses of words used in doctors’ clinical notes.” The article begins by noting the tension between “the wealth of medically useful data” potentially found in general physician notes (“hidden correlations between symptoms, treatments and outcomes, for instance, or indications that patients are promising candidates for trials of new drugs”) and the fact that “much of that data . . . is buried in physicians’ freeform notes. One of the difficulties in extracting data from unstructured text is what computer scientists call word-sense disambiguation.”
There are systems that already tackle this problem, but according to the MIT researchers, their new “algorithm identifies correlations not only between words but between words and other textual ‘features’ — such as the words’ syntactic roles. If the word ‘discharge’ is preceded by an adjective, for instance, it’s much more likely to refer to a bodily secretion than to an administrative event.”
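To make the “discharge” example concrete, here is a minimal sketch of that kind of feature-based disambiguation rule. The sense labels, the tiny adjective list, and the function name are invented stand-ins for illustration; the actual MIT system learns such correlations from data rather than hard-coding them.

```python
# Toy illustration of one syntactic feature for word-sense disambiguation:
# is the word preceded by an adjective? (Hypothetical lexicon and labels,
# not the researchers' actual model.)

ADJECTIVES = {"purulent", "clear", "bloody", "nasal"}  # tiny hand-coded stand-in

def disambiguate_discharge(tokens, index):
    """Guess the sense of 'discharge' at tokens[index] from a single
    feature: whether the preceding token is a (known) adjective."""
    if index > 0 and tokens[index - 1].lower() in ADJECTIVES:
        return "bodily-secretion"
    return "administrative-event"

print(disambiguate_discharge("patient noted purulent discharge".split(), 3))
# prints "bodily-secretion"
```

A real system would weigh many such features statistically instead of applying one rule, but the intuition is the same: context beyond the word itself carries the sense.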
Other planned features include incorporating “listings in a huge thesaurus of medical terms, compiled by the National Institutes of Health, called the Unified Medical Language System (UMLS),” and the possibility of “additional syntactic and semantic features that could help with word disambiguation and with word associations established by NIH’s Medical Subject Headings paper-classification scheme.”
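Combining these signals amounts to building a feature vector per word occurrence. The sketch below shows that idea with an invented mini-thesaurus standing in for UMLS lookups; none of the names or entries reflect the real UMLS data or the researchers’ implementation.

```python
# Sketch: bundling several textual features (a syntactic cue plus
# thesaurus senses) into one feature dict for a downstream classifier.
# MINI_THESAURUS is a made-up stand-in for UMLS semantic types.

MINI_THESAURUS = {
    "discharge": ["Body Substance", "Health Care Activity"],
    "cold": ["Disease or Syndrome", "Qualitative Concept"],
}
ADJECTIVES = {"purulent", "clear", "bloody", "common"}

def extract_features(tokens, index):
    """Return a small feature dict for the word at tokens[index]."""
    word = tokens[index].lower()
    return {
        "word": word,
        "prev_is_adjective": index > 0 and tokens[index - 1].lower() in ADJECTIVES,
        "thesaurus_senses": MINI_THESAURUS.get(word, []),
    }

feats = extract_features("patient reports clear discharge".split(), 3)
```

A statistical model would then learn which combinations of such features correlate with which senses.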
“It’s still not perfect, because we haven’t integrated all the linguistic features that we want to,” the lead researcher observed. “But my hunch is that this is the way to go.”
Photo courtesy of rosemilkinabottle via Creative Commons