Term Identification
The goal of this research is to explore statistical techniques for
identifying terms and collocations in raw text and for determining the
syntactic structure of noun phrase terms.
Software
LogLikelihood for 3-grams
LogLikelihood for 4-grams
LogLikelihood for 5-grams
LogLikelihood Modeling for 3-grams
LogLikelihood Modeling for 4-grams
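For readers unfamiliar with the measure, the following is a minimal sketch of a log-likelihood ratio (G^2) score for a single 3-gram. It assumes the simplest model, in which the three word positions are fully independent; the software listed above may use different or additional models, and the function name `trigram_log_likelihood` and the toy sentence are hypothetical illustrations, not part of that software.

```python
import math
from collections import Counter
from itertools import product


def trigram_log_likelihood(tokens, target):
    """Return G^2 for one trigram, assuming full independence of positions.

    This is an illustrative sketch, not the implementation behind the
    software listed on this page.
    """
    trigrams = list(zip(tokens, tokens[1:], tokens[2:]))
    total = len(trigrams)

    # Observed 2x2x2 table: each cell records, per position, whether the
    # word in a corpus trigram matches the word in the target trigram.
    observed = Counter(
        tuple(int(tri[i] == target[i]) for i in range(3)) for tri in trigrams
    )

    # Marginal counts for each position (index 0 = no match, 1 = match).
    marginals = [
        [sum(c for cell, c in observed.items() if cell[i] == v) for v in (0, 1)]
        for i in range(3)
    ]

    g2 = 0.0
    for cell in product((0, 1), repeat=3):
        obs = observed.get(cell, 0)
        if obs == 0:
            continue  # a cell with zero observations contributes nothing
        # Expected count under independence: product of marginals / N^2.
        expected = (
            marginals[0][cell[0]] * marginals[1][cell[1]] * marginals[2][cell[2]]
        ) / total ** 2
        g2 += obs * math.log(obs / expected)
    return 2.0 * g2


if __name__ == "__main__":
    text = "new york city is big new york city is old".split()
    print(trigram_log_likelihood(text, ("new", "york", "city")))
```

A higher score means the observed counts depart further from what independence would predict, which is the usual signal that the three words form a collocation.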
Presentations
Incorporating Ngram Statistics in the Normalization of Clinical Notes
Publications
Determining the Syntactic Structure of Medical Terms
in Clinical Notes. Bridget T. McInnes, Ted Pedersen, and
Serguei V. Pakhomov. In Proceedings of the BioNLP Workshop
at ACL, June 29, 2007, Prague, Czech Republic.
(paper: pdf, slides: ppt)
Resolving Structural Ambiguity of Medical Terms with Statistical
Model Fitting. Serguei V. Pakhomov and Bridget T. McInnes,
Linguistic Society of America (LSA) Presentation, 2005.
(abstract: pdf)
Extending the Log Likelihood Measure to Improve Collocation Identification.
Bridget Thomson McInnes, December 2004, University of Minnesota Duluth.
(Masters thesis: ps)