Term Identification
The goal of this research is to explore statistical techniques for
identifying terms and collocations in raw text and for determining the
syntactic structure of noun phrase terms.
Software
LogLikelihood for 3-grams
LogLikelihood for 4-grams
LogLikelihood for 5-grams
LogLikelihood Modeling for 3-grams
LogLikelihood Modeling for 4-grams
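As a rough illustration of what a log-likelihood score for a 3-gram involves, here is a minimal Python sketch. It is not the software listed above. It assumes the simplest possible model, in which the three words occur independently of one another, and it builds a 2x2x2 contingency table from a token list. The function name `trigram_log_likelihood` and the counting conventions are hypothetical choices made for this example.

```python
import math
from collections import Counter
from itertools import product


def trigram_log_likelihood(tokens, target):
    """Return the G^2 score of `target` (a 3-tuple of words).

    Assumption for this sketch: the null model is full independence,
    i.e. P(w1, w2, w3) = P(w1) * P(w2) * P(w3).
    """
    trigrams = list(zip(tokens, tokens[1:], tokens[2:]))
    total = len(trigrams)
    if total == 0:
        raise ValueError("need at least three tokens")

    # Observed 2x2x2 table: cell (i, j, k) has i = 1 when the first word
    # of a trigram equals target[0], and likewise for positions 2 and 3.
    observed = Counter(
        tuple(int(word == wanted) for word, wanted in zip(tri, target))
        for tri in trigrams
    )

    # Marginal totals per position: marginals[pos][v] counts the trigrams
    # whose indicator at position `pos` equals v.
    marginals = [
        [sum(n for cell, n in observed.items() if cell[pos] == v) for v in (0, 1)]
        for pos in range(3)
    ]

    g2 = 0.0
    for cell in product((0, 1), repeat=3):
        n = observed.get(cell, 0)
        if n == 0:
            continue  # empty cells contribute nothing to the sum
        # Expected count under independence: product of marginals / N^2.
        expected = (
            marginals[0][cell[0]] * marginals[1][cell[1]] * marginals[2][cell[2]]
        ) / total ** 2
        g2 += n * math.log(n / expected)
    return 2.0 * g2
```

A call such as `trigram_log_likelihood(text.split(), ("new", "york", "city"))` yields a non-negative score, with larger values indicating that the observed counts depart further from what independence would predict. Other null models, for example ones that let two of the three words depend on each other, would change only the expected counts.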
Presentations
Incorporating Ngram Statistics in the Normalization of Clinical Notes
Publications
Determining the Syntactic Structure of Medical Terms
in Clinical Notes. Bridget T. McInnes, Ted Pedersen, and
Serguei V. Pakhomov. In Proceedings of the BioNLP Workshop
at ACL, June 29, 2007, Prague, Czech Republic.
(slides: ppt)
Resolving Structural Ambiguity of Medical Terms with Statistical
Model Fitting. Serguei V. Pakhomov and Bridget T. McInnes.
Linguistic Society of America (LSA) Presentation, 2005.
Extending the Log Likelihood Measure to Improve Collocation Identification.
Bridget Thomson McInnes. Master of Science Thesis. Department of Computer Science,
University of Minnesota, Duluth, December, 2004.