
R Clustering Tree Plot


Let's first have a look at our data file, clustering.csv:

elements	S1	S2	S3	S4	S5	S6	S7	S8
R1	-0.0027	0.1057	0.1976	0.0209	0	0.0089	0.0082	0.0209
R2	0	-0.1204	0.2627	0	0	0.283	0.2076	-0.0158
R3	0	-0.1204	0.2627	0	0	0.283	0.2076	-0.0158
R4	0.0142	0	-0.454	0.0101	-0.0213	-0.0084	-0.0121	0.0083
R5	0	0	-0.2334	0.007	0.4151	0	0.0987	0.021
R6	0.0381	0.0644	0.2302	0	0	-0.0476	0.2432	-0.0069
R7	0.0381	0.0644	0.2302	0	0	-0.0476	0.2432	-0.0069
R8	0.0381	0.0644	0.2302	0	0	-0.0476	0.2432	-0.0069
R9	0.0891	-0.1022	-0.4466	-0.4877	-0.0175	-0.0523	-0.4792	-0.0547
R10	0.0046	-0.1539	-0.4645	0	-0.0282	0	-0.0217	0.017
R11	0.0706	0.028	0.3626	0	0.0196	-0.0094	0.3086	0
R12	0.0311	0.0759	0.2119	0	-0.0022	0	0	0.0117
R13	0.0013	0.0702	-0.3176	0.0152	0.0095	-0.0224	0.2069	0.005
R14	0.0491	0.0525	-0.4329	0.0237	-0.0038	-0.0224	0.2065	0.005
R15	0.0256	0.0579	0.1846	0.0024	0.0029	-0.0165	0.4781	-0.0123
R16	-0.0061	-0.1554	-0.0635	0.0121	-0.0282	0	-0.016	0.017
R17	-0.0061	-0.1554	-0.0635	0.0121	-0.0282	0	-0.016	0.017

A simple unsupervised hierarchical clustering of the samples (columns S1 to S8), using Euclidean distance and complete linkage:
>x <- read.csv("clustering.csv", header=TRUE, dec=".", sep=",")
>data.hclust <- hclust(dist(t(x[,2:ncol(x)])), method="complete")
>plot(data.hclust)
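The one-liner above combines three steps: dropping the label column, transposing so that the samples S1 to S8 become rows, and computing distances between them. A minimal sketch that spells the steps out, assuming a small inline data frame holding the first four rows of the table in place of clustering.csv (the names m, d and hc are illustrative only):

```r
# First four rows of the table above, typed inline so the sketch runs without the file.
x <- data.frame(
  elements = c("R1", "R2", "R3", "R4"),
  S1 = c(-0.0027, 0, 0, 0.0142),
  S2 = c(0.1057, -0.1204, -0.1204, 0),
  S3 = c(0.1976, 0.2627, 0.2627, -0.454),
  S4 = c(0.0209, 0, 0, 0.0101),
  S5 = c(0, 0, 0, -0.0213),
  S6 = c(0.0089, 0.283, 0.283, -0.0084),
  S7 = c(0.0082, 0.2076, 0.2076, -0.0121),
  S8 = c(0.0209, -0.0158, -0.0158, 0.0083)
)

m  <- t(x[, -1])                       # drop the label column; samples become rows
d  <- dist(m)                          # Euclidean distance is the default
hc <- hclust(d, method = "complete")   # complete linkage, as in the call above

print(hc$labels)                       # the sample names carried over from the columns
```

Without the t() call, dist() would compare the rows R1, R2, ... instead, and the tree would cluster elements rather than samples.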


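Let's add some annotations: prefix every other sample label with "control_", add a title and axis label, and box four clusters: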
>label <- data.hclust$labels
>for (i in seq_along(label)){
+    if (i %% 2 == 1) {label[i] <- paste("control_",label[i],sep="")}
+}
>data.hclust$labels <- label
>plot(data.hclust,main="Hierarchical Clustering",xlab="Samples")
>rect.hclust(data.hclust,k=4,border="blue")   # box the 4 clusters on the plot
>groups <- cutree(data.hclust,k=4)            # cluster number (1 to 4) for each sample
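
cutree() returns a named integer vector giving the cluster number of each sample. A small sketch of how the result might be inspected, assuming a matrix built inline from rows R1, R5 and R9 of the table with samples as rows (the names m and hc are illustrative, and the grouping it produces will differ from that of the full data set):

```r
# Rows R1, R5 and R9 of the table, entered as columns so that samples are rows.
m <- cbind(
  R1 = c(-0.0027, 0.1057, 0.1976, 0.0209, 0, 0.0089, 0.0082, 0.0209),
  R5 = c(0, 0, -0.2334, 0.007, 0.4151, 0, 0.0987, 0.021),
  R9 = c(0.0891, -0.1022, -0.4466, -0.4877, -0.0175, -0.0523, -0.4792, -0.0547)
)
rownames(m) <- paste0("S", 1:8)

hc     <- hclust(dist(m), method = "complete")
groups <- cutree(hc, k = 4)            # cut the tree into 4 clusters

print(groups)                          # cluster number per sample
print(table(groups))                   # how many samples fall in each cluster
print(split(names(groups), groups))    # sample names listed by cluster
```

The same three print() calls can be applied to the groups object created from data.hclust above.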