Getiria Onsongo

University of Minnesota–Twin Cities

Research Assistant
Department of Computer Science & Engineering
University of Minnesota - Twin Cities
4-192 EE/CS Building
200 Union Street SE
Minneapolis, MN 55416
onsongo@cs.umn.edu
651-307-9619

“Disciplines are distinguished partly for historical reasons and reasons of administrative convenience... We are not students of some subject matter, but students of problems. And problems may cut right across the borders of any subject matter or discipline.”

Karl Popper

I have since graduated and am currently at the Minnesota Supercomputing Institute. You can still reach me using the email address on this page.

Profile
I am interested in leveraging the power of computers to understand disease development. Specifically, I work to develop tools that will analyze and help make sense of the large amounts of data generated by high-throughput genomics and proteomics techniques.
I believe bioinformatics research is well served not when computer scientists with no knowledge of the intricacies and complexities of biological systems apply computational techniques to biological problems, but when these techniques are developed with an understanding of the complex nature of biological systems. It is unlikely that bioinformaticians will possess the same knowledge of protein structure as biochemists do, but it is important to at least understand the basic principles when developing protein structure prediction algorithms. To this end, I have taken several biological sciences courses, such as Genetics, Biochemistry, Cell Biology and Protein Sequence, to better understand the basic principles of biological systems. Taking these courses has proved invaluable in my thesis work, which is co-advised by Dr. Timothy J. Griffin in the Department of Biochemistry, Molecular Biology and Biophysics and Dr. John V. Carlis in the Department of Computer Science and Engineering.

My scholarly contributions include:

  • Developed a technique for accurate protein quantification of isobaric tagged peptide data from LTQ-type tandem mass spectrometers.
    This technique was implemented in freely available open source software (LTQ-iQuant). The software has several advantages over existing software: 1) it is compatible with centroided LTQ MS/MS data; 2) it accounts for errors introduced by low reporter ion intensities; and 3) it is flexible, giving users the ability to customize the software to individual instruments.
    Publication resulting from this work
    Onsongo, G., Stone, M.D., Van Riper, S.K., Chilton, J., Wu, B., Higgins, L., Lund, T.C., Carlis, J.V., and Griffin, T.J. (2010). LTQ-iQuant: A freely-available software pipeline for automated and accurate protein quantification of isobaric tagged peptide data from LTQ instruments. Proteomics (in press).
  • Designed and implemented relational database operators that prioritize candidate disease biomarkers, identifying those most worthy of follow-up validation studies.
    High-throughput technologies used to identify candidate diagnostic biomarkers for disease progression often yield hundreds of candidates, which must be validated before their specificity can be tested. Because validation techniques are expensive and time consuming, it is not practical to validate every candidate biomarker. These operators help identify the most promising candidate biomarkers for additional analyses.
    Publication resulting from this work
    Onsongo, G., Xie, H., Griffin, T.J., and Carlis, J.V. (2010). Relational operators for prioritizing candidate biomarkers in high-throughput differential expression data. ACM BCB, Aug 2-4, 2010, Niagara Falls, New York.
  • Developed relational operators for the Gene Ontology (GO) database to dynamically generate GO Slim versions of the database.
    The Gene Ontology database consists of terms used to standardize the naming of genes and gene products. These terms form a controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells accumulates and changes. A GO Slim is a variant of the Gene Ontology database that contains a small portion of the database. For some tasks, such as analyzing the results of an experiment, a GO Slim relevant to the experiment is preferred to the complete Gene Ontology database. Prior to this work, users had to rely on GO Slims provided by the Gene Ontology Consortium or use a Perl script to generate a static GO Slim, which had to be regenerated each time the GO database changed. This work makes it possible to create GO Slims dynamically. As a result, users can generate their own custom GO Slims without relying on the generic GO Slims provided by the consortium, and there is no need to update a GO Slim when a new version of the GO database is released.
    Publication resulting from this work
    Onsongo, G., Xie, H., Griffin, T.J., and Carlis, J.V. (2008). Generating GO Slim Using Relational Database Management Systems to Support Proteomics Analysis. IEEE Symposium on Computer-Based Medical Systems 2008: 215-217.
  • I have also collaborated on several other interdisciplinary projects that have led to publications in competitive peer-reviewed journals. Further details of these collaborations, together with contribution summaries, can be found on the publications page.
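
The idea behind dynamically generated GO Slims, described in the list above, can be illustrated with a minimal sketch. This is not the published operators or the real GO schema: the table names (term, is_a), the term identifiers (T1 to T5), and the function names are all hypothetical, chosen only to show how a relational query could map each term to its ancestors in a user-chosen slim set at query time.

```python
import sqlite3


def load_toy_ontology(conn):
    """Create a tiny, made-up ontology. These are NOT real GO identifiers."""
    conn.executescript(
        """
        CREATE TABLE term (id TEXT PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE is_a (child TEXT NOT NULL, parent TEXT NOT NULL);
        """
    )
    conn.executemany(
        "INSERT INTO term (id, name) VALUES (?, ?)",
        [
            ("T1", "process"),
            ("T2", "metabolism"),
            ("T3", "protein metabolism"),
            ("T4", "proteolysis"),
            ("T5", "signaling"),
        ],
    )
    # Each row says: child is_a parent.
    conn.executemany(
        "INSERT INTO is_a (child, parent) VALUES (?, ?)",
        [("T2", "T1"), ("T3", "T2"), ("T4", "T3"), ("T5", "T1")],
    )


def slim_mapping(conn, slim_ids):
    """Return (term_id, slim_id) pairs: every term paired with each of its
    ancestors (or itself) that belongs to the chosen slim set."""
    slim_ids = sorted(slim_ids)
    if not slim_ids:
        return []
    placeholders = ", ".join("?" for _ in slim_ids)
    query = f"""
        WITH RECURSIVE ancestor(term_id, ancestor_id) AS (
            -- every term is treated as its own ancestor
            SELECT id, id FROM term
            UNION
            -- walk one is_a edge further up the hierarchy
            SELECT a.term_id, e.parent
            FROM ancestor AS a
            JOIN is_a AS e ON e.child = a.ancestor_id
        )
        SELECT term_id, ancestor_id
        FROM ancestor
        WHERE ancestor_id IN ({placeholders})
        ORDER BY term_id, ancestor_id
    """
    return conn.execute(query, slim_ids).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_toy_ontology(conn)
    # A custom slim made of two hypothetical terms.
    for term_id, slim_id in slim_mapping(conn, {"T2", "T5"}):
        print(term_id, "->", slim_id)
```

Because the mapping is computed by a query rather than stored, rerunning it after the ontology tables are refreshed reflects the new release without a separate regeneration step, which is the property the GO Slim work above aims for.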

If you have any questions, feel free to e-mail me at onsongo@cs.umn.edu.

The views and opinions expressed in this page are strictly those of the page author.
The contents of this page have not been reviewed or approved by the University of Minnesota.