- Semi-Supervised Learning
- Manifold Learning
- Graphical Models
- Machine Learning
- Data Mining
- Convex Optimization
Ph.D. work in Computer Science:
Semi-supervised learning on multi-aspect data.
Master's work in Mathematics:
Application of manifold embedding methods to anomaly detection.
Predicting affiliations in the political Blogosphere:
We extracted 1213 political blogs from Technorati. Using RSS feeds
and a MySQL database, we collect posted articles on a daily basis. Since
January 2007 the database has grown to about 300,000 articles. In addition
to the posts we also have a social network represented by hyperlinks. Using
a newly developed label propagation method, Iterative Label Propagation (ILP),
we predict political affiliations of bloggers based on their posts and their
network. Our data set also has accurate date information, which makes it suitable
for temporal analysis.
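To make the general idea of label propagation concrete, here is a minimal sketch in Python. It is not ILP itself, whose details are not given here. It shows only the standard clamped scheme, in which a few nodes with known labels keep their values and every other node repeatedly takes the average score of its neighbors. The function name, the edge-list input, and the +1/-1 encoding of two affiliations are all assumptions made for illustration, and the sketch uses the hyperlink network only, not the post content.

```python
def propagate_labels(edges, seeds, n_iter=100):
    """Clamped label propagation on an undirected graph (illustrative only).

    edges: iterable of (u, v) pairs, e.g. hypothetical blog-to-blog hyperlinks
    seeds: dict mapping a node to +1.0 or -1.0 (assumed known affiliations)
    Returns a dict mapping every node to a score in [-1, 1].
    """
    # Build an undirected adjacency structure.
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    for node in seeds:
        neighbors.setdefault(node, set())

    # Seeds start at their label, everything else at 0 (undecided).
    scores = {node: seeds.get(node, 0.0) for node in neighbors}

    for _ in range(n_iter):
        updated = {}
        for node, nbrs in neighbors.items():
            if node in seeds:
                updated[node] = seeds[node]  # clamp the known labels
            elif nbrs:
                updated[node] = sum(scores[n] for n in nbrs) / len(nbrs)
            else:
                updated[node] = 0.0  # isolated node stays undecided
        scores = updated
    return scores


# Toy chain of six blogs with one labeled blog at each end.
links = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f")]
scores = propagate_labels(links, {"a": 1.0, "f": -1.0})
predicted = {blog: (1 if s > 0 else -1) for blog, s in scores.items()}
```

The sign of each score gives the predicted affiliation, and its magnitude can be read as a rough confidence.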
With dozens of cheap flight websites out there, it can be difficult for consumers
to find the flight options which are of interest. One might find hundreds of
flights in a similar price range on several websites, which then become
time-consuming to process. Additionally, the consumer might have other preferences
which are typically not reflected on currently available websites. The objective
of this project is to find "similar" schedules. We are using graphical models to
accomplish this. The problem is treated in a semi-supervised fashion by taking
consumer preferences into account.
Machine Learning Toolbox (MALT):
Motivated by some painful and time-consuming experiences while running experiments,
I have developed an object-oriented machine learning toolbox in MATLAB.
Most people are not even aware that MATLAB supports object-oriented programming.
This toolbox is not just a collection of algorithms, but rather a powerful
framework designed to make it easy and efficient to use and evaluate
algorithms. MALT will be released to the community sometime this spring.
Some of the features of MALT include:
- Various state-of-the-art models and methods implemented
- Uniform way to pass arguments to all algorithms
- Uniform way to invoke algorithms
- Generic cross validation which runs on supervised and semi-supervised models
- Ability to add or rerun algorithms on existing cross validation results
- Ability to store and load result sets and data sets in a primitive local database
- Ability to produce paper-ready plots from result sets
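The idea behind a uniform invocation interface combined with generic cross validation can be sketched as follows. This is an illustration in Python, not MALT code: MALT is written in MATLAB and its actual API is not described here. The class names, the fit/predict convention, and the cross_validate helper are all hypothetical.

```python
import random


class MajorityClassifier:
    """Toy model: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class NearestNeighborClassifier:
    """Toy model: predicts the label of the closest training point."""

    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        def closest(x):
            dists = [sum((a - b) ** 2 for a, b in zip(x, t)) for t in self.X_]
            return self.y_[dists.index(min(dists))]

        return [closest(x) for x in X]


def cross_validate(model_factory, X, y, k=5, seed=0):
    """Mean k-fold accuracy for any model exposing fit(X, y) and predict(X)."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint index sets
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = model_factory().fit([X[i] for i in train], [y[i] for i in train])
        preds = model.predict([X[i] for i in held_out])
        correct = sum(p == y[i] for p, i in zip(preds, held_out))
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / len(accuracies)


# The same call evaluates either model because both share one interface.
X = [[0.0], [0.1], [0.2], [0.3], [0.4], [5.0], [5.1], [5.2], [5.3], [5.4]]
y = ["a"] * 5 + ["b"] * 5
results = {
    cls.__name__: cross_validate(cls, X, y, k=5)
    for cls in (MajorityClassifier, NearestNeighborClassifier)
}
```

Because the evaluation code depends only on the shared interface, adding a new algorithm to an existing comparison means adding one class and rerunning the same call.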
Anomaly Detection in Transportation Corridors:
Secure transportation corridors are networks of roads equipped with weigh stations through which trucks are required to
pass regularly. Each time a truck passes, a number of different measurements are taken. A setup like this has been created
as part of the SensorNet project. Our work was a collaboration with the Oak Ridge
National Laboratory. The goal was to aid officers in determining which trucks are suspicious and should be examined further. We
used manifold embedding methods for feature preprocessing, prior to doing anomaly detection. We
illustrated on both real and artificial data that embedding methods can indeed help reveal the structure of anomalies.
Furthermore, in some cases embedding the data led to simplified anomaly detection.
Trading Agent Competition - Supply Chain Management (TAC-SCM):
I worked on this project with Dr. Maria Gini. Our group had been involved
in the competition since 2003. In 2005 and 2006 our agent advanced to the finals and earned fifth place. My work
ranged from implementing the agent, analyzing games, and researching strategies to building the procurement component
of the agent. The 2006 competition was my last one.