Johnson's hierarchical clustering


Cities                 Boston  Chicago  Denver  Los Angeles  New York  San Francisco  Seattle  Washington
Boston, Mass.               -      851    1769         2596       188           2699     2493         393
Chicago, Ill.             851        -     920         1745       713           1858     1737         597
Denver, Colo.            1769      920       -          831      1631            949     1021        1494
Los Angeles, Calif.      2596     1745     831            -      2451            347      959        2300
New York, N.Y.            188      713    1631         2451         -           2571     2408         205
San Francisco, Calif.    2699     1858     949          347      2571              -      678        2442
Seattle, Wash.           2493     1737    1021          959      2408            678        -        2329
Washington, D.C.          393      597    1494         2300       205           2442     2329           -

  1. Start by assigning each item to its own cluster, so that if you have N items, you now have N clusters, each containing just one item. Let the distances (similarities) between the clusters equal the distances (similarities) between the items they contain.
  2. Find the closest (most similar) pair of clusters and merge them into a single cluster, so that you now have one fewer cluster.
  3. Compute distances (similarities) between the new cluster and each of the old clusters.
  4. Repeat steps 2 and 3 until all items are clustered into a single cluster of size N.
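
The four steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `agglomerate`, the `frozenset`-keyed distance dictionary, and the `linkage` parameter (a stand-in for however step 3 is carried out) are all assumptions made for the example. The three cities and their distances are taken from the nine-city matrix on this page.

```python
from itertools import combinations


def agglomerate(items, dist, linkage=min):
    """Merge clusters until one remains and return the list of merges.

    items   -- iterable of hashable item labels
    dist    -- dict mapping frozenset({a, b}) to the distance between a and b
    linkage -- callable that reduces a list of item-to-item distances to one
               number (this is step 3; min is only a placeholder default)
    """
    # Step 1: every item starts in its own cluster.
    clusters = [frozenset([item]) for item in items]
    merges = []

    def cluster_distance(c1, c2):
        return linkage([dist[frozenset((a, b))] for a in c1 for b in c2])

    # Step 4: repeat until a single cluster holds all items.
    while len(clusters) > 1:
        # Step 2: find the closest pair of clusters and merge them.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(*pair))
        level = cluster_distance(c1, c2)
        # Step 3 is implicit here: distances to the merged cluster are
        # recomputed from its members on the next pass through the loop.
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        merges.append((sorted(c1 | c2), level))
    return merges


dist = {
    frozenset(("BOS", "NY")): 206,
    frozenset(("BOS", "DC")): 429,
    frozenset(("NY", "DC")): 233,
}
merges = agglomerate(["BOS", "NY", "DC"], dist)
print(merges)  # [(['BOS', 'NY'], 206), (['BOS', 'DC', 'NY'], 233)]
```

Recomputing every cluster distance from the item-level distances keeps the sketch short, at the cost of repeated work; updating a stored matrix after each merge avoids that.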

Step 3 can be done in different ways, which is what distinguishes single-link from complete-link and average-link clustering.

  • single-link clustering (also called the connectedness or minimum method): the distance between two clusters is the shortest distance from any member of one cluster to any member of the other cluster.
  • complete-link clustering (also called the diameter or maximum method): the distance between two clusters is the longest distance from any member of one cluster to any member of the other cluster.
  • average-link clustering: the distance between two clusters is the average of the distances from each member of one cluster to each member of the other cluster.
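
The three definitions above differ only in how they reduce the member-to-member distances to a single number. As a small sketch (the function names are illustrative, not from any library), here they are applied to the clusters {BOS, NY} and {DC}, using the BOS-DC and NY-DC distances from the nine-city matrix on this page:

```python
def single_link(distances):
    """Shortest member-to-member distance."""
    return min(distances)


def complete_link(distances):
    """Longest member-to-member distance."""
    return max(distances)


def average_link(distances):
    """Mean of all member-to-member distances."""
    return sum(distances) / len(distances)


# Distances between each member of {BOS, NY} and the single member of {DC}.
pairwise = [429, 233]

print(single_link(pairwise))    # 233
print(complete_link(pairwise))  # 429
print(average_link(pairwise))   # 331.0
```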

Example: single-link clustering of nine cities. The input distance matrix:

       BOS    NY    DC   MIA   CHI   SEA    SF    LA   DEN
BOS      0   206   429  1504   963  2976  3095  2979  1949
NY     206     0   233  1308   802  2815  2934  2786  1771
DC     429   233     0  1075   671  2684  2799  2631  1616
MIA   1504  1308  1075     0  1329  3273  3053  2687  2037
CHI    963   802   671  1329     0  2013  2142  2054   996
SEA   2976  2815  2684  3273  2013     0   808  1131  1307
SF    3095  2934  2799  3053  2142   808     0   379  1235
LA    2979  2786  2631  2687  2054  1131   379     0  1059
DEN   1949  1771  1616  2037   996  1307  1235  1059     0
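
The closest pair is BOS and NY, at distance 206. After merging them into BOS/NY, each distance to the new cluster is the smaller of the BOS and NY distances (for DC, the smaller of 429 and 233):

        BOS/NY      DC     MIA     CHI     SEA      SF      LA     DEN
BOS/NY       0     233    1308     802    2815    2934    2786    1771
DC         233       0    1075     671    2684    2799    2631    1616
MIA       1308    1075       0    1329    3273    3053    2687    2037
CHI        802     671    1329       0    2013    2142    2054     996
SEA       2815    2684    3273    2013       0     808    1131    1307
SF        2934    2799    3053    2142     808       0     379    1235
LA        2786    2631    2687    2054    1131     379       0    1059
DEN       1771    1616    2037     996    1307    1235    1059       0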

The closest pair is now BOS/NY and DC. After merging them into BOS/NY/DC:

           BOS/NY/DC   MIA   CHI   SEA    SF    LA   DEN
BOS/NY/DC          0  1075   671  2684  2799  2631  1616
MIA             1075     0  1329  3273  3053  2687  2037
CHI              671  1329     0  2013  2142  2054   996
SEA             2684  3273  2013     0   808  1131  1307
SF              2799  3053  2142   808     0   379  1235
LA              2631  2687  2054  1131   379     0  1059
DEN             1616  2037   996  1307  1235  1059     0

The closest pair is SF and LA, at distance 379. After merging them into SF/LA:

           BOS/NY/DC   MIA   CHI   SEA  SF/LA   DEN
BOS/NY/DC          0  1075   671  2684   2631  1616
MIA             1075     0  1329  3273   2687  2037
CHI              671  1329     0  2013   2054   996
SEA             2684  3273  2013     0    808  1307
SF/LA           2631  2687  2054   808      0  1059
DEN             1616  2037   996  1307   1059     0

The closest pair is BOS/NY/DC and CHI, at distance 671. After merging them into BOS/NY/DC/CHI:

               BOS/NY/DC/CHI   MIA   SEA  SF/LA   DEN
BOS/NY/DC/CHI              0  1075  2013   2054   996
MIA                     1075     0  3273   2687  2037
SEA                     2013  3273     0    808  1307
SF/LA                   2054  2687   808      0  1059
DEN                      996  2037  1307   1059     0
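
The closest pair is SF/LA and SEA, at distance 808. After merging them into SF/LA/SEA, the distance to BOS/NY/DC/CHI is the smaller of 2054 and 2013:

               BOS/NY/DC/CHI   MIA  SF/LA/SEA   DEN
BOS/NY/DC/CHI              0  1075       2013   996
MIA                     1075     0       2687  2037
SF/LA/SEA               2013  2687          0  1059
DEN                      996  2037       1059     0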
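
The closest pair is BOS/NY/DC/CHI and DEN, at distance 996. After merging them into BOS/NY/DC/CHI/DEN:

                   BOS/NY/DC/CHI/DEN   MIA  SF/LA/SEA
BOS/NY/DC/CHI/DEN                  0  1075       1059
MIA                             1075     0       2687
SF/LA/SEA                       1059  2687          0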
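
The closest pair is BOS/NY/DC/CHI/DEN and SF/LA/SEA, at distance 1059. After merging them, only MIA remains outside the cluster, and it joins last at distance 1075:

                             BOS/NY/DC/CHI/DEN/SF/LA/SEA   MIA
BOS/NY/DC/CHI/DEN/SF/LA/SEA                            0  1075
MIA                                                 1075     0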
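
The sequence of matrices above can be replayed in code. The following is a minimal sketch, assuming single-link updates (each distance to a merged cluster is the minimum of the two distances it replaces). The function name `single_link_merges` and the `/`-joined labels are illustrative choices, and the merged labels may list members in a different order than the tables do. The data is the nine-city input matrix from this page.

```python
LABELS = ["BOS", "NY", "DC", "MIA", "CHI", "SEA", "SF", "LA", "DEN"]
ROWS = [
    [0, 206, 429, 1504, 963, 2976, 3095, 2979, 1949],
    [206, 0, 233, 1308, 802, 2815, 2934, 2786, 1771],
    [429, 233, 0, 1075, 671, 2684, 2799, 2631, 1616],
    [1504, 1308, 1075, 0, 1329, 3273, 3053, 2687, 2037],
    [963, 802, 671, 1329, 0, 2013, 2142, 2054, 996],
    [2976, 2815, 2684, 3273, 2013, 0, 808, 1131, 1307],
    [3095, 2934, 2799, 3053, 2142, 808, 0, 379, 1235],
    [2979, 2786, 2631, 2687, 2054, 1131, 379, 0, 1059],
    [1949, 1771, 1616, 2037, 996, 1307, 1235, 1059, 0],
]


def single_link_merges(labels, rows):
    """Replay single-link merging on a symmetric distance matrix."""
    # Store the upper triangle as a dict keyed by unordered label pairs.
    d = {frozenset((a, b)): rows[i][j]
         for i, a in enumerate(labels)
         for j, b in enumerate(labels) if i < j}
    active = list(labels)
    history = []
    while len(active) > 1:
        # Find the closest pair among the clusters that are still active.
        candidates = [frozenset((a, b))
                      for i, a in enumerate(active) for b in active[i + 1:]]
        pair = min(candidates, key=d.get)
        a, b = sorted(pair, key=active.index)
        merged = a + "/" + b
        history.append((merged, d[pair]))
        # Single-link update: keep the smaller of the two old distances.
        for c in active:
            if c not in pair:
                d[frozenset((merged, c))] = min(d[frozenset((a, c))],
                                                d[frozenset((b, c))])
        # The merged cluster takes the place of a; b is dropped.
        active = [merged if c == a else c for c in active if c != b]
    return history


history = single_link_merges(LABELS, ROWS)
levels = [level for _, level in history]
print(levels)  # [206, 233, 379, 671, 808, 996, 1059, 1075]
```

The printed levels are the distances at which each successive merge happens, which is the information a dendrogram of this example would display.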
johnson_s_hierarchical_clustering.1479685887.txt.gz · Last modified: 2016/11/21 08:21 by hkimscil
