|  | BOS | NY | DC | MIA | CHI | SEA | SF | LA | DEN |
| BOS | 0 | 206 | 429 | 1504 | 963 | 2976 | 3095 | 2979 | 1949 |
| NY | 206 | 0 | 233 | 1308 | 802 | 2815 | 2934 | 2786 | 1771 |
| DC | 429 | 233 | 0 | 1075 | 671 | 2684 | 2799 | 2631 | 1616 |
| MIA | 1504 | 1308 | 1075 | 0 | 1329 | 3273 | 3053 | 2687 | 2037 |
| CHI | 963 | 802 | 671 | 1329 | 0 | 2013 | 2142 | 2054 | 996 |
| SEA | 2976 | 2815 | 2684 | 3273 | 2013 | 0 | 808 | 1131 | 1307 |
| SF | 3095 | 2934 | 2799 | 3053 | 2142 | 808 | 0 | 379 | 1235 |
| LA | 2979 | 2786 | 2631 | 2687 | 2054 | 1131 | 379 | 0 | 1059 |
| DEN | 1949 | 1771 | 1616 | 2037 | 996 | 1307 | 1235 | 1059 | 0 |

  - Start by assigning each item to its own cluster, so that if you have N items, you now have N clusters, each containing just one item. Let the distances (similarities) between the clusters equal the distances (similarities) between the items they contain.
  - Find the closest (most similar) pair of clusters and merge them into a single cluster, so that now you have one less cluster.
  - Compute distances (similarities) between the new cluster and each of the old clusters.
  - Repeat steps 2 and 3 until all items are clustered into a single cluster of size N.

Step 3 can be done in different ways, which is what distinguishes single-link from complete-link and average-link clustering.

  * **single-link clustering** (also called the connectedness or minimum method) = the shortest distance from any member of one cluster to any member of the other cluster.
  * **complete-link clustering** (also called the diameter or maximum method) = the longest distance from any member of one cluster to any member of the other cluster.
  * **average-link clustering** = the average distance from any member of one cluster to any member of the other cluster.
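The three rules for step 3 can be compared on a small piece of the table above. The following is a minimal R sketch, not a library routine: `cluster_dist` is a hypothetical helper introduced here for illustration, and the 3 x 3 matrix simply copies the BOS, NY and DC entries from the table.

```r
# Distances among BOS, NY and DC, copied from the city table.
d <- matrix(c(  0, 206, 429,
              206,   0, 233,
              429, 233,   0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("BOS", "NY", "DC"), c("BOS", "NY", "DC")))

# Hypothetical helper (an assumption for illustration): distance between
# cluster `a` and cluster `b` under one of the three linkage rules.
cluster_dist <- function(d, a, b,
                         linkage = c("single", "complete", "average")) {
  linkage <- match.arg(linkage)
  pairwise <- d[a, b, drop = FALSE]  # every member-to-member distance
  switch(linkage,
         single   = min(pairwise),   # shortest pairwise distance
         complete = max(pairwise),   # longest pairwise distance
         average  = mean(pairwise))  # mean of all pairwise distances
}

# Distance from the merged cluster {BOS, NY} to DC under each rule.
single_d   <- cluster_dist(d, c("BOS", "NY"), "DC", "single")    # 233
complete_d <- cluster_dist(d, c("BOS", "NY"), "DC", "complete")  # 429
average_d  <- cluster_dist(d, c("BOS", "NY"), "DC", "average")   # 331
print(c(single = single_d, complete = complete_d, average = average_d))
```

The only member-to-member distances involved are BOS-DC (429) and NY-DC (233), so the three rules return the minimum, the maximum and the mean of those two numbers.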
|  | BOS | NY | DC | MIA | CHI | SEA | SF | LA | DEN |
| BOS | 0 | **206** | 429 | 1504 | 963 | 2976 | 3095 | 2979 | 1949 |
| NY | 206 | 0 | 233 | 1308 | 802 | 2815 | 2934 | 2786 | 1771 |
| DC | 429 | 233 | 0 | 1075 | 671 | 2684 | 2799 | 2631 | 1616 |
| MIA | 1504 | 1308 | 1075 | 0 | 1329 | 3273 | 3053 | 2687 | 2037 |
| CHI | 963 | 802 | 671 | 1329 | 0 | 2013 | 2142 | 2054 | 996 |
| SEA | 2976 | 2815 | 2684 | 3273 | 2013 | 0 | 808 | 1131 | 1307 |
| SF | 3095 | 2934 | 2799 | 3053 | 2142 | 808 | 0 | 379 | 1235 |
| LA | 2979 | 2786 | 2631 | 2687 | 2054 | 1131 | 379 | 0 | 1059 |
| DEN | 1949 | 1771 | 1616 | 2037 | 996 | 1307 | 1235 | 1059 | 0 |

  - The closest pair of cities is BOS and NY, at 206.
  - Merge the two cities into BOS/NY, then recompute the distances between this cluster and the remaining cities.
  - With the single-link method, the distance between BOS/NY and DC is 233, because single link takes the shortest member-to-member distance as the distance to the cluster (NY-DC is 233, BOS-DC is 429). Likewise, the distance to DEN is 1771.

|  | BOS/NY | DC | MIA | CHI | SEA | SF | LA | DEN |
| BOS/NY | 0 | 233 | 1308 | 802 | 2815 | 2934 | 2786 | 1771 |
| DC | 233 | 0 | 1075 | 671 | 2684 | 2799 | 2631 | 1616 |
| MIA | 1308 | 1075 | 0 | 1329 | 3273 | 3053 | 2687 | 2037 |
| CHI | 802 | 671 | 1329 | 0 | 2013 | 2142 | 2054 | 996 |
| SEA | 2815 | 2684 | 3273 | 2013 | 0 | 808 | 1131 | 1307 |
| SF | 2934 | 2799 | 3053 | 2142 | 808 | 0 | 379 | 1235 |
| LA | 2786 | 2631 | 2687 | 2054 | 1131 | 379 | 0 | 1059 |
| DEN | 1771 | 1616 | 2037 | 996 | 1307 | 1235 | 1059 | 0 |

  - The smallest distance is now 233, between BOS/NY and DC.
  - Merge them into BOS/NY/DC and recompute the distances between this cluster and the other cities.

|  | BOS/NY/DC | MIA | CHI | SEA | SF | LA | DEN |
| BOS/NY/DC | 0 | 1075 | 671 | 2684 | 2799 | 2631 | 1616 |
| MIA | 1075 | 0 | 1329 | 3273 | 3053 | 2687 | 2037 |
| CHI | 671 | 1329 | 0 | 2013 | 2142 | 2054 | 996 |
| SEA | 2684 | 3273 | 2013 | 0 | 808 | 1131 | 1307 |
| SF | 2799 | 3053 | 2142 | 808 | 0 | 379 | 1235 |
| LA | 2631 | 2687 | 2054 | 1131 | 379 | 0 | 1059 |
| DEN | 1616 | 2037 | 996 | 1307 | 1235 | 1059 | 0 |

  - The smallest distance in this table is 379, between SF and LA.
  - Merge them into SF/LA and recompute the matrix.

|  | BOS/NY/DC | MIA | CHI | SEA | SF/LA | DEN |
| BOS/NY/DC | 0 | 1075 | 671 | 2684 | 2631 | 1616 |
| MIA | 1075 | 0 | 1329 | 3273 | 2687 | 2037 |
| CHI | 671 | 1329 | 0 | 2013 | 2054 | 996 |
| SEA | 2684 | 3273 | 2013 | 0 | 808 | 1307 |
| SF/LA | 2631 | 2687 | 2054 | 808 | 0 | 1059 |
| DEN | 1616 | 2037 | 996 | 1307 | 1059 | 0 |

  - CHI is now closest to BOS/NY/DC (671).
  - Merge them into BOS/NY/DC/CHI.

|  | BOS/NY/ \\ DC/CHI | MIA | SEA | SF/LA | DEN |
| BOS/NY/ \\ DC/CHI | 0 | 1075 | 2013 | 2054 | 996 |
| MIA | 1075 | 0 | 3273 | 2687 | 2037 |
| SEA | 2013 | 3273 | 0 | 808 | 1307 |
| SF/LA | 2054 | 2687 | 808 | 0 | 1059 |
| DEN | 996 | 2037 | 1307 | 1059 | 0 |

  - In the same way, merge SEA into SF/LA (808), giving SF/LA/SEA.

|  | BOS/NY/ \\ DC/CHI | MIA | SF/LA/ \\ SEA | DEN |
| BOS/NY/ \\ DC/CHI | 0 | 1075 | 2013 | 996 |
| MIA | 1075 | 0 | 2687 | 2037 |
| SF/LA/ \\ SEA | 2013 | 2687 | 0 | 1059 |
| DEN | 996 | 2037 | 1059 | 0 |

  - DEN is now closest to BOS/NY/DC/CHI (996); merge them into BOS/NY/DC/CHI/DEN.

|  | BOS/NY/DC/ \\ CHI/DEN | MIA | SF/LA/SEA |
| BOS/NY/DC/ \\ CHI/DEN | 0 | 1075 | 1059 |
| MIA | 1075 | 0 | 2687 |
| SF/LA/SEA | 1059 | 2687 | 0 |

  - SF/LA/SEA is closest to BOS/NY/DC/CHI/DEN (1059); merge them. Only MIA remains, and it joins at 1075.

|  | BOS/NY/DC/CHI/ \\ DEN/SF/LA/SEA | MIA |
| BOS/NY/DC/CHI/ \\ DEN/SF/LA/SEA | 0 | 1075 |
| MIA | 1075 | 0 |

{{:hiclus1.gif}}

<code>
JOHNSON'S HIERARCHICAL CLUSTERING
--------------------------------------------------------------------------------

Method:          SINGLE_LINK (minimum distance)
Type of Data:    Dissimilarities
Input dataset:   cities (D:\Users\Hyo\Documents\UCINET data\Cities\cities)

HIERARCHICAL CLUSTERING

         M S     B     C D
         I E S L O N D H E
         A A F A S Y C I N

Level    4 6 7 8 1 2 3 5 9
-----    - - - - - - - - -
  206    . . . . XXX . . .
  233    . . . . XXXXX . .
  379    . . XXX XXXXX . .
  671    . . XXX XXXXXXX .
  808    . XXXXX XXXXXXX .
  996    . XXXXX XXXXXXXXX
 1059    . XXXXXXXXXXXXXXX
 1075    XXXXXXXXXXXXXXXXX

Measures of cluster adequacy

                   1      2      3      4      5      6      7
              ------ ------ ------ ------ ------ ------ ------
  1     Eta   -0.284 -0.480 -0.554 -0.657 -0.711 -0.687 -0.151
  2       Q   -0.133 -0.163 -0.188 -0.203 -0.240 -0.214 -0.033
  3 Q-prime   -0.152 -0.190 -0.226 -0.254 -0.320 -0.322 -0.065
  4     E-I    0.994  0.973  0.961  0.884  0.824  0.625 -0.490

Size of each cluster, expressed as a proportion of the total population clustered

              1     2     3     4     5     6     7     8
          ----- ----- ----- ----- ----- ----- ----- -----
  1 CL1   0.222 0.333 0.333 0.111 0.111 0.111 0.111 1.000
  2 CL2   0.111 0.111 0.111 0.444 0.444 0.333 0.889
  3 CL3   0.111 0.111 0.111 0.111 0.333 0.556
  4 CL4   0.111 0.111 0.111 0.222 0.111
  5 CL5   0.111 0.111 0.222 0.111
  6 CL6   0.111 0.111 0.111
  7 CL7   0.111 0.111
  8 CL8   0.111

Actor-by-Partition indicator matrix saved as dataset Part

----------------------------------------
Running time:  00:00:01
Output generated:  21 11 16 09:10:06
UCINET 6.614 Copyright (c) 1992-2016 Analytic Technologies
</code>

{{hiclus2.gif}}

{{hiclus4.gif}}

====== E.g. 1 ======

<code>
   0  206  429 1504  963 2976 3095 2979 1949
 206    0  233 1308  802 2815 2934 2786 1771
 429  233    0 1075  671 2684 2799 2631 1616
1504 1308 1075    0 1329 3273 3053 2687 2037
 963  802  671 1329    0 2013 2142 2054  996
2976 2815 2684 3273 2013    0  808 1131 1307
3095 2934 2799 3053 2142  808    0  379 1235
2979 2786 2631 2687 2054 1131  379    0 1059
1949 1771 1616 2037  996 1307 1235 1059    0
</code>

<code>
# Prepare Data
setwd("d:/rdata")
mydata <- read.csv("cities.csv")
mydata <- na.omit(mydata)  # listwise deletion of missing values
mydata <- scale(mydata)    # standardize variables
</code>
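As a cross-check on the hand calculation and the UCINET output, the same single-link clustering can be run in R. The sketch below rests on stated assumptions: the distance matrix is typed in directly rather than read from `cities.csv` (whose layout is not shown here), and it is passed to `as.dist()` as is instead of being standardized, since the numbers are already distances rather than raw variables. The object names `m`, `d` and `fit` are chosen for illustration only.

```r
# City-to-city distances, typed in from the matrix above
# (an assumption: cities.csv itself is not read here).
cities <- c("BOS", "NY", "DC", "MIA", "CHI", "SEA", "SF", "LA", "DEN")
m <- matrix(c(
     0,  206,  429, 1504,  963, 2976, 3095, 2979, 1949,
   206,    0,  233, 1308,  802, 2815, 2934, 2786, 1771,
   429,  233,    0, 1075,  671, 2684, 2799, 2631, 1616,
  1504, 1308, 1075,    0, 1329, 3273, 3053, 2687, 2037,
   963,  802,  671, 1329,    0, 2013, 2142, 2054,  996,
  2976, 2815, 2684, 3273, 2013,    0,  808, 1131, 1307,
  3095, 2934, 2799, 3053, 2142,  808,    0,  379, 1235,
  2979, 2786, 2631, 2687, 2054, 1131,  379,    0, 1059,
  1949, 1771, 1616, 2037,  996, 1307, 1235, 1059,    0),
  nrow = 9, byrow = TRUE, dimnames = list(cities, cities))

# Treat the matrix as a ready-made dissimilarity object.
d <- as.dist(m)

# Single-link (minimum distance) agglomerative clustering.
fit <- hclust(d, method = "single")

# Merge levels: 206 233 379 671 808 996 1059 1075
print(fit$height)

# Membership when the tree is cut into three clusters.
groups <- cutree(fit, k = 3)
print(groups)

# Dendrogram, comparable to the UCINET diagram.
plot(fit, main = "Single-link clustering of cities")
```

The merge levels in `fit$height` should match the Level column of the UCINET output. Changing `method` to `"complete"` or `"average"` applies the other two linkage rules to the same distances.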