Next: Hierarchical Clustering Up: Bottom Up Approach Previous: Bottom Up Approach

Agglomerative Clustering

A clustering scheme is called as agglomerative if new clusters are formed one at a time from the existing clusters. An agglomerative clustering algorithm begins with a n-way clustering solution viz. tex2html_wrap_inline5049 = tex2html_wrap_inline5051 and iteratively constructs solution tex2html_wrap_inline5053 from tex2html_wrap_inline5055 as follows :

  1. Choose two clusters tex2html_wrap_inline5057 and tex2html_wrap_inline5059 from tex2html_wrap_inline5055 .
  2. Construct tex2html_wrap_inline5053 from tex2html_wrap_inline5065 by removing tex2html_wrap_inline5057 and tex2html_wrap_inline5059 and adding merged cluster tex2html_wrap_inline5071 .

The selection of these two clusters viz. tex2html_wrap_inline5057 and tex2html_wrap_inline5059 is usually done based on improving some objective function. The criterion for choosing the clusters tex2html_wrap_inline5057 and tex2html_wrap_inline5059 is what distinguishes among the agglomerative variants. The various agglomerative schemes that were studied are described below :

  1. Scheme 1

    This scheme is due to Steven Johnson [9]. As described in the paper, the scheme starts with the initial clustering solution tex2html_wrap_inline5081 where each vertex belongs to its own cluster. The scheme is iterative in nature. At any iteration k, two clusters are namely tex2html_wrap_inline5057 and tex2html_wrap_inline5059 are chosen from the clustering solution tex2html_wrap_inline5053 at that level and are merged together.The selection criterion for clusters tex2html_wrap_inline5057 and tex2html_wrap_inline5059 for this scheme is

    Minimize tex2html_wrap_inline5095 = Diameter(C) where tex2html_wrap_inline5071 .

    Diameter(C) is defined as follows. tex2html_wrap_inline5103

    where P(x,y) represents a path from vertex x to vertex y.

    Thus those clusters are selected for merging which yield the minimum diameter of resultant cluster. After this selection is done, the clusters tex2html_wrap_inline5057 and tex2html_wrap_inline5059 are removed from the clustering solution and a new cluster named tex2html_wrap_inline5071 is added to generate the next clustering solution tex2html_wrap_inline5055 . The process is repeated till the desired number of clusters are obtained or till the clusters become "strong" enough.

       figure2023
    Figure: A Scenario For Clustering

    Comments:

    1. Here the objective function is the minimization of the diameter of the resultant cluster as defined above. This however does not seem to be a very good objective function. Consider the scenario in Fig gif. Here minimization of the diameter would yield no preference between both the situations. However clearly, clustering nodes in Fig.8.a is more desirable than to cluster the nodes in Fig.8.b. Thus this objective function fails to capture the degree by means of which the nodes are connected within a cluster. It merely captures the fact that they are connected which does not seem particularly useful.
    2. The clustering objective clearly does not seem particularly useful in the context of multilevel formulation. As in Fig gif if the tie is broken in a random fashion at very top level, then at the next levels the scheme may lose important connectivity information, thereby giving rise to pretty bad clusters.
    3. However it is quite clear that the diameter is an attractive choice as one of the parameters for the objective function. Clearly an objective which is a function of the size of the cluster, the number of edges inside the cluster and the diameter could be a very good choice for the sake of being considered as an objective function. i.e. tex2html_wrap_inline5121

      Where |C| = size of cluster (C), and
      tex2html_wrap_inline5125 = size of the set of edges completely contained within C.

      could be a very good function for the sake of clustering. As an example tex2html_wrap_inline5129

      could serve as a good objective function. One needs to experiment with these parameters to find out a good objective function. Such a function would be very useful for multilevel formulation as it would capture all the necessary information about the connectivity and the degree of connectivity as well in the whole graph, as well as graph at each level.

  2. Scheme 2

    This scheme is due to Karger[10]. The scheme mainly talks about a graph bisection problem with the min-cut formulation.The scheme begins with claiming all the vertices as isolated clusters i.e, the starting solution is tex2html_wrap_inline5081 . Iteratively a random edge is chosen and its incident clusters are contracted into a single cluster. The process is repeated till two "meta" vertices are obtained which is the final solution for the graph bisection problem. Authors of [3] claim that this scheme can be utilized for the clustering problem. First of all the approach can be extended to hypergraphs by picking up a hyperedge at random and then contracting two random incident clusters. As an alternative, all the clusters incident to the hyperedge can also be collapsed into a single vertex. The process is repeated till the desired number of clusters is found or till the clusters become "strong" enough. An alternative greedy approach would be to merge two clusters with highest connectivity i.e., for the sake of choosing tex2html_wrap_inline5057 and tex2html_wrap_inline5059 at an iteration k from the current solution tex2html_wrap_inline5053 , the objective function would be

    Maximize tex2html_wrap_inline5141 .

    Here tex2html_wrap_inline5143 is the set of hyperedges cut by the cluster tex2html_wrap_inline5145 and is given by,

      equation2046

    Comments :

    1. Here the objective function is maximization of tex2html_wrap_inline5147 as defined above. This however does not seem to be a good objective function. Consider the scenario in Fig gif. Clearly the vertices of Fig.9.b are more closely connected than the vertices of Fig.9.a. So clearly merging the two vertices in Fig.9.b would be a better choice. However the tex2html_wrap_inline5147 parameter for the clusters in Fig.9.a is more than that for the clusters in Fig.9.b. Hence the scheme will prefer the clusters in Fig.9.a over those in Fig.9.b for the sake of merging, which is clearly undesirable.

         figure2056
      Figure: A Scenario For Clustering

    2. The main drawback of this scheme is that it does not take into consideration the sizes of the clusters that are being considered for merging. Hence the scheme seems to prefer large clusters for the sake of merging as the tex2html_wrap_inline5147 for a cluster will usually be higher if the cluster is large. Hence this objective formulation will not be so useful for a multilevel scheme as it tends to prefer larger clusters for merging.
    3. This objective clearly does not take into account the sizes of the clusters which it is considering for merging. A good choice for the objective function would be the one which takes into account the sizes of the cluster being merged along with the number of edges that go between them. Hence the objective function should be a function as shown: Maximize tex2html_wrap_inline5153

      An example objective function could be Maximize tex2html_wrap_inline5155

      One needs to experiment with these parameters to find out a good objective function. Such a function would be very useful for multilevel formulation as it would capture all the necessary information about the connectivity and the degree of connectivity as well in the whole graph, as well as graph at each level.

  3. Scheme 3

    This scheme is due to Schuler and Ulrich [11]. This work is based on an observation regarding the previous scheme. The previous scheme ignores the number of edges cut by the merged cluster. It also fails to construct a balanced clustering. It seems to be giving preference to larger and larger clusters to collapse in spite of the fact that there could be situations where it is not a wise choice to do so. Thus taking these shortcomings of the previous scheme into consideration the authors have come up with a better objective function. The authors try to select tex2html_wrap_inline5057 and tex2html_wrap_inline5059 so as to maximize the objective of the form :

    tex2html_wrap_inline5161

    where f is a function of the cluster weights. A simple example of f could be tex2html_wrap_inline5167 . In this case the formation of larger clusters is penalized by the weight of the cluster in the denominator. The two large clusters wont be merged together unless the affinity between them is strong enough.

    Comments:

    1. The objective here is to maximize the F as defined above. The maximization function essentially tries to capture the connectivity between the clusters that are considered for the sake of merging. The terms tex2html_wrap_inline5171 and tex2html_wrap_inline5173 can be made to capture the sizes of the clusters as well. Thus the scheme overcomes the discrepancies of the previous scheme due to Karger[10].
    2. The objective function seems to be a very good choice for the sake of multilevel formulation as it does capture all the necessary information about the connectivity in the graph at each level.
    3. The objective function in essence tries to maximize the connectivity within the clusters thereby minimizing the inter-cluster similarity. Thus this seems to be similar to the schemes proposed in this report. The later tries to maximize the connectivity within the clusters subject to a particular constraint about the connectivity specified by an input parameter tex2html_wrap_inline3003 .
    4. The objective is pretty generalized to capture the essential information. One needs to experiment with the terms tex2html_wrap_inline5177 and tex2html_wrap_inline5179 for the sake of obtaining a good clustering solution.
  4. Scheme 4

    This particular scheme is due to Shin and Kim[12]. The authors have presented a hierarchical scheme for obtaining a good partitioning. It works as follows :
    The scheme consists of two phases. In the first phase initial clusters are constructed from the original graph and then partitioned to find a good initial solution. In the second phase, the clusters are flattened and then partitioned using a Gradual Constraint Enforcing Partitioning (GCEP) method. In the first phase, the clusters formed are partitioned using GCEP a lot of times and the best result is saved. This result is used as a starting point for the second phase which flattens the clusters first and then apply GCEP on it to obtain the final solution. The basic idea is to allow the closely coupled vertices to settle first and then allow freer vertices to move closer to an optimum bipartitioning in the later passes. The GCEP is actually a slight variant of the F&M algorithm. It consists of making multiple passes over the data where the cluster size constraints become tighter and tighter within each pass. The authors name their scheme as HGCEP ( Hierarchical Gradual Constraint Enforcing Partitioner), and it seems to be less dependent on the initial solution than the F&M method.
    For the first phase of the scheme, a good clustering of the initial graph is needed. The clustering scheme followed is same as the standard agglomerative scheme with the clusters selection criterion for merging as follows :
    Given tex2html_wrap_inline5055 , choose tex2html_wrap_inline5057 and tex2html_wrap_inline5059 which Maximize tex2html_wrap_inline5187

    The first term captures the connectivity between the clusters, with the denominator favoring the selection of a cluster which has connections to only one other cluster. The second term implies a penalty for making a cluster too large. If the merged cluster has weight greater than the average cluster weight, then it is penalized by some factor. Thus the authors have developed a clustering approach for use with two phase FM. The scheme makes many runs on the clusters found to get an initial good partition. Then the solution is flattened and a partition is obtained by a F & M variant scheme named GCEP. The value of tex2html_wrap_inline5189 reported by the authors is 0.5/200. Fig gif shows why the authors have chosen the function tex2html_wrap_inline5193 instead of choosing tex2html_wrap_inline5195 in the denominator of the first term.

       figure2093
    Figure: A Scenario For Clustering

    So let tex2html_wrap_inline5197 and

    let tex2html_wrap_inline5199

    So as shown in the figure,
    tex2html_wrap_inline5201
    tex2html_wrap_inline5203
    tex2html_wrap_inline5205
    tex2html_wrap_inline5207

    Hence tex2html_wrap_inline5209 prefers merging tex2html_wrap_inline5211 and tex2html_wrap_inline5213 together to merging tex2html_wrap_inline5215 and tex2html_wrap_inline5217 while tex2html_wrap_inline5219 does not show any preference. As claimed by the authors, a human designed would choose to merge tex2html_wrap_inline5211 and tex2html_wrap_inline5213 together first, since tex2html_wrap_inline5211 should be merged to tex2html_wrap_inline5213 if it ever merges with a cell. The authors have reported better results with tex2html_wrap_inline5209 over tex2html_wrap_inline5219 .

    Comments :

    1. The objective function used by the authors seems to be a good objective function. The first term tries to maximize the connectivity within the clusters. This is in essence the theme of the schemes developed in this report.
    2. The scheme imposes a penalty for making clusters too big. Thus it prefers the formation of balanced clusters. The size of the clusters formed is approximately the same. This was not the aim of any of the schemes developed in this report. In this report, the problem was formulated in a way to maximize the connectivity within the clusters. The size of the cluster was never a concern. Hence that particular parameter was neglected while developing the schemes.
    3. This particular objective function also seems to capture all the essential information at all the levels. Hence it is suitable for a multilevel formulation.
  5. Scheme 5

    These scheme is due to Ng et.al [13]. According to [14], there is a power-law relationship between the number of edges incident to a cluster and the size of the cluster, which is given by tex2html_wrap_inline5233 where, tex2html_wrap_inline5235 .

    Here d(C) represents the average degree of the modules in the cluster C. Based on this rule, Ng[13] have proposed an agglomerative method. Here the candidates for merging viz., tex2html_wrap_inline5057 and tex2html_wrap_inline5059 are chosen so as to minimize the Rent Parameter of the merged cluster C, which is given by,

    Minimize tex2html_wrap_inline5243 .

    Here the merging process continues until the prescribed number k of clusters is reached or until r exceeds a prescribed constant.

    Comments :

    1. Here the objective function is derived from an average case phenomenon in ``good'' circuit placements, which is given by the Rent's rule as described above. The objective function seems to be strong enough to capture the important enough at each stage of the clustering, making it a good choice for the multilevel formulation.
    2. The scheme in essence seems exactly similar to the schemes developed in this report. The schemes developed in this report characterize the clusters by means of an input parameter names tex2html_wrap_inline3003 . In the above mentioned scheme, the cluster is characterized by means of the rent parameter. Essentially, the desired clustering can be obtained by having the desired rent parameter as an input to the code viz. tex2html_wrap_inline5251 . So all the clusters have to have the tex2html_wrap_inline5253 .

Next: Hierarchical Clustering Up: Bottom Up Approach Previous: Bottom Up Approach

Sushrut S Karanjkar
Tue Apr 21 17:00:32 CDT 1998

The views and opinions expressed in this page are strictly those of the page author.
The contents of this page have not been reviewed or approved by the University of Minnesota.