
Conclusions

In this project, two schemes were proposed for a graph-based clustering problem. Both schemes are essentially based on an input parameter that defines the minimum connectivity required of the items within each cluster.

The effectiveness of the schemes was tested through extensive experiments on data sets falling mainly into two categories: Web documents data and S&P 500 stock market data. All results are presented in two contexts: a graph-theoretical context, and a data mining context in which the labels of the items being clustered are known a priori.

A brief analysis of the results was presented, together with an entropy-based comparison against the results reported in [22].
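As an illustration of how an entropy-based comparison can work when item labels are known a priori, the sketch below computes a size-weighted cluster entropy. The exact entropy measure used in the report and in [22] is not restated in this section, so the formula here (Shannon entropy per cluster, averaged by cluster size) is an assumption, and the function names and sample labels are hypothetical.

```python
import math
from collections import Counter


def cluster_entropy(labels):
    """Shannon entropy (base 2) of the known class labels inside one cluster."""
    total = len(labels)
    counts = Counter(labels)
    # A pure cluster (one label only) has entropy 0.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def weighted_entropy(clusters):
    """Size-weighted average of per-cluster entropies (assumed measure).

    `clusters` is a list of lists, each holding the known labels of the
    items assigned to that cluster. Lower values mean purer clusters.
    """
    n = sum(len(c) for c in clusters)
    return sum(len(c) / n * cluster_entropy(c) for c in clusters if c)


# Hypothetical labels, for illustration only.
clusters = [
    ["tech", "tech", "tech", "finance"],  # mixed cluster
    ["finance", "finance"],               # pure cluster
]
score = weighted_entropy(clusters)
print(round(score, 4))
```

Under this measure, two clusterings of the same labeled data can be compared directly: the one with the lower weighted entropy groups items of the same label together more consistently.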

The work done in this report can be improved in several ways. I would suggest the following:

  1. A better scheme for breaking ties in all the phases. At present, ties are broken using the edge weights, with preference given to the higher edge weights.
  2. A more rigorous definition of a cluster. At present, the clusters are defined by the connectivity constraints specified by the minimum-connectivity input parameter.
  3. Better objective functions. At present, tex2html_wrap_inline3001 is maximized for tex2html_wrap_inline2999 and tex2html_wrap_inline3037 is maximized for tex2html_wrap_inline3005. More sophisticated objective functions could be defined.
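The current tie-breaking rule from item 1 can be sketched as a two-level comparison: candidates are ranked by a primary score first, and equal scores are resolved in favor of the higher edge weight. The primary score, the tuple layout, and the function name below are hypothetical, since this section does not say what the candidates are compared on in each phase.

```python
def pick_candidate(candidates):
    """Pick the best candidate, breaking ties by higher edge weight.

    `candidates` is a list of (name, primary_score, edge_weight) tuples.
    This layout is an assumption made for illustration.
    """
    # Tuples compare element by element, so edge_weight only matters
    # when two candidates share the same primary_score.
    return max(candidates, key=lambda c: (c[1], c[2]))


# Hypothetical candidates: "a" and "b" tie on the primary score.
candidates = [("a", 3, 0.2), ("b", 3, 0.9), ("c", 2, 1.0)]
best = pick_candidate(candidates)
print(best[0])
```

A better scheme, as suggested above, would replace or extend the second element of the sort key with a more informative criterion.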



Sushrut S Karanjkar
Tue Apr 21 17:00:32 CDT 1998