My research interests span bioinformatics, data mining, and high-performance computing.
Publications
Journal papers:
- Huzefa Rangwala, George Karypis. Profile-based Direct Kernels for Remote Homology Detection and Fold Recognition in Bioinformatics, 21(23):4239-4247 (2005). Supplementary Data Here
- Huzefa Rangwala, George Karypis. Building Multiclass Classifiers for Remote Homology Detection and Fold Recognition in BMC Bioinformatics, 7(455) (2006).
- Huzefa Rangwala, George Karypis. Incremental Window-based Protein Sequence Alignments in Bioinformatics, 23(2):e17-e23 (2007). [Appeared as a conference publication at ECCB 2006]
Conferences/Workshops
3. Huzefa Rangwala, Eric Lantz, Roy Musselman, Kurt Pinnow, Brian Smith, Brian Wallenfelt. Massively Parallel BLAST for the Blue Gene/L. Presented at the High Availability and Performance Computing Workshop held in conjunction with the 6th Los Alamos Computer Science Institute Symposium, Santa Fe, New Mexico (October 2005). Paper Presentation
4. Huzefa Rangwala, George Karypis. Incremental Window-based Protein Sequence Alignments. Presented at the 5th European Conference on Computational Biology, Eilat, Israel (January 2007).
5. Huzefa Rangwala, George Karypis. fRMSDPred: Predicting local structure information using sequence information. To be presented at the Computational Systems Biology Conference, San Diego, CA (August 2007).
6. Huzefa Rangwala, George Karypis. fRMSDAlign: Protein Sequence Alignment using predicted local structure information. Submitted.
Book Chapters
7. Huzefa Rangwala, Kevin DeRonne, George Karypis. Protein Structure Prediction using String Kernels in Knowledge Discovery in Bioinformatics: Techniques, Methods and Applications, to be published by John Wiley and Sons (2006).
Posters:
1. Feature Mining for Prediction of Degree of Liver Fibrosis, Benjamin W Mayer, Huzefa S Rangwala, Rohit Gupta, Jaideep Srivastava, George Karypis, Vipin Kumar & Piet C de Groen, AMIA 2005.
Projects
Below are my current and past projects.
1. Protein Fold Classification (Spring 2004 - Present)
[University of Minnesota, Twin Cities]
- This work uses machine learning tools such as support vector machines and neural networks to classify proteins into fold families. We developed sequence similarity metrics, based on multiple sequence alignment profiles, that capture three-dimensional structural information.
2. Mining Structured and Unstructured Life Sciences Data (Fall 2004 - Present)
[Mayo Clinic, Rochester, Minnesota, USA]
- This work involves structured data in the form of lab tests and unstructured data in the form of physician notes for patients with liver cirrhosis. The challenge is handling irregular, temporal data to detect significant patterns in a patient's history for early detection of disease.
3. Life Science Application Analysis and Performance on the Blue Gene (Summer 2005)
[IBM Corporation, Rochester, Minnesota, USA]
- As part of a summer internship at IBM, I worked with a team that analyzed the performance of the Blue Gene supercomputer on life science applications. Applications from bioinformatics and other areas of the life sciences were ported, tested, optimized, and benchmarked on the Blue Gene.
4. AstroMiner: Data Mining of Astronomical Databases (Fall 2002 & Spring 2003)
[Inter University Center for Astronomy and Astrophysics, Pune, India]
- The aim of the project was to detect geometrical clusters in astronomical catalogues. An R-tree index was used to improve the efficiency of data access in two dimensions. The clustering algorithm implemented was WaveCluster, and a connected-region labeling algorithm was used to separate the clusters.
5. SMARTBOT: Autonomous Parallel Parking Robot (Fall 2003)
[University of Minnesota, Twin Cities]
- SMARTBOT is an autonomous parallel parking robot, implemented on a Pioneer I using the ARIA API. The aim is to park the robot in a given spot of any dimension while optimizing the distance covered, the length of spot needed, the time taken, and the number of sudden movements.
Talks/Presentations
- Class Lecture (Fall 2005) for Csci 5481 discussing local sequence alignments and gap models, 09/15/2005. Slides are by Prof. Karypis.
- Machine Learning for solving protein fold classification and structure prediction. Part of MPGI Noon Brown Bag Series Seminar on 09/14/2005. Slides here
- Mismatch String Kernels for protein classification. Part of BIJC (Spring 2005). Slides here
- Class Lecture (Fall 2004) for Csci 5481 discussing protein fold classification. Slides here
- Protein Classification: SVM vs NN (Spring 2004) as part of AI-2 classwork project. Slides here