The goal of this research is to explore supervised and unsupervised
approaches to word sense disambiguation utilizing information from
the Unified Medical Language System (UMLS).
Thesis Proposal: Accurate and Scalable Word Sense Disambiguation in the Biomedical Domain
Software
CuiTools (Coo-e Tools)
is a freely available package of Perl programs for supervised word sense
disambiguation (WSD) experiments. The name CuiTools comes from the Concept
Unique Identifiers (CUIs) found in the Unified Medical Language System
(UMLS). This package allows the user to extract features from the UMLS,
such as CUIs and semantic types, as well as general English features such
as part-of-speech and unigram information. The package allows for
experiments to be conducted with any of the machine learning algorithms
in the WEKA data-mining package.
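As a rough illustration of the kind of data such an experiment works with, the sketch below turns a few sense-tagged contexts into WEKA's ARFF format, using binary unigram and CUI features. It is not CuiTools code (CuiTools is written in Perl) and does not reproduce its interface: the function name `build_arff`, the attribute naming scheme, the sense labels, and the CUI values are all assumptions made up for this example.

```python
def build_arff(instances, relation="wsd_sketch"):
    """Build ARFF text from (tokens, cuis, sense_label) tuples.

    Hypothetical helper for illustration only; it assumes tokens, CUIs,
    and labels contain no spaces, commas, or quotes, so no ARFF
    escaping is attempted.
    """
    # Feature vocabulary: every unigram and CUI seen in any context.
    unigrams = sorted({t.lower() for tokens, _, _ in instances for t in tokens})
    cuis = sorted({c for _, cui_list, _ in instances for c in cui_list})
    senses = sorted({label for _, _, label in instances})

    lines = [f"@relation {relation}", ""]
    for u in unigrams:
        lines.append(f"@attribute unigram_{u} {{0,1}}")
    for c in cuis:
        lines.append(f"@attribute cui_{c} {{0,1}}")
    # The class attribute lists the possible senses of the target word.
    lines.append(f"@attribute sense {{{','.join(senses)}}}")
    lines.append("")
    lines.append("@data")

    # One row per instance: 1 if the feature occurs in the context, else 0.
    for tokens, cui_list, label in instances:
        present_tokens = {t.lower() for t in tokens}
        present_cuis = set(cui_list)
        row = ["1" if u in present_tokens else "0" for u in unigrams]
        row += ["1" if c in present_cuis else "0" for c in cuis]
        row.append(label)
        lines.append(",".join(row))
    return "\n".join(lines) + "\n"


# Placeholder data: the CUIs below are invented, not real UMLS identifiers.
EXAMPLE = [
    (["patient", "has", "cold"], ["C0000001"], "M1"),
    (["cold", "temperature"], ["C0000002"], "M2"),
]

if __name__ == "__main__":
    print(build_arff(EXAMPLE))
```

A file written this way can be loaded into WEKA and used with any of its classifiers, which is the general workflow the package description refers to.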
Presentations
- Thesis Proposal Slides (pdf)
- Representing Meaning in Unsupervised WSD. Bridget T. McInnes. National Library of Medicine's Brown Bag Series. (ppt)
Posters
- Using Domain Specific Information for Word Sense Disambiguation.
Bridget T. McInnes, Ted Pedersen and John Carlis. Grace Hopper
Conference for Women in Computing, October 2007, Orlando, Florida.
(poster presentation) pdf
Publications
- An Unsupervised Vector Approach to Biomedical Term Disambiguation:
Integrating UMLS and Medline. Bridget T. McInnes.
To appear in Proceedings of the Association for Computational
Linguistics Student Research Workshop (ACL-SRW) 2008.
(paper: pdf)
(poster: pdf)
- Using UMLS Concept Unique Identifiers (CUIs) for Word Sense
Disambiguation in the Biomedical Domain. Bridget T. McInnes,
Ted Pedersen, and John Carlis. In Proceedings of the Annual
Symposium of the American Medical Informatics Association (AMIA),
pages 533-37, Nov. 2007, Chicago, IL.
pdf
(slides: pdf, ppt)
Reports
- National Library of Medicine Research Participation Report.
Bridget T. McInnes. 2008.
pdf