8 Clustering

8.1 Sample-Gene heatmap

Method

Unsupervised hierarchical clustering was carried out at the gene level on \(log_2(FPKM+1)\) across all samples. A total of 4984 genes were used for clustering; they were selected by requiring \(1<maximum\{log_2(FPKM+1)\}<20\) and a standard deviation \(sd\{log_2(FPKM+1)\} > 0.3\).
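
The gene filter described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the function name `select_genes`, the toy data, and the use of the sample standard deviation (n - 1 denominator) are assumptions, since the report does not state which software or estimator was used.

```python
import math
import statistics


def select_genes(fpkm, max_low=1.0, max_high=20.0, sd_min=0.3):
    """Return the gene IDs that pass the filter, in input order.

    fpkm maps a gene ID to its list of FPKM values, one value per sample.
    A gene is kept when max_low < max(log2(FPKM+1)) < max_high and
    sd(log2(FPKM+1)) > sd_min.
    """
    selected = []
    for gene, values in fpkm.items():
        logged = [math.log2(v + 1.0) for v in values]
        # Assumption: sample standard deviation (n - 1 denominator).
        spread = statistics.stdev(logged)
        if max_low < max(logged) < max_high and spread > sd_min:
            selected.append(gene)
    return selected


# Hypothetical toy data: three genes measured in three samples.
toy = {
    "geneA": [0.0, 3.0, 15.0],  # log2 values 0, 2, 4: passes both criteria
    "geneB": [0.2, 0.3, 0.1],   # maximum log2 value is below 1
    "geneC": [7.0, 7.0, 7.0],   # constant across samples, sd = 0
}
kept = select_genes(toy)
print(kept)
```

Only genes that are both expressed in at least one sample and variable across samples survive, which keeps the heatmap focused on informative rows.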

Figure 8.1: Unsupervised clustering (samples-genes)

PDF file: Heatmap.jpg.

To read the detailed gene names in the heatmap, please refer to the larger view (PDF, PNG).
To check gene names within sub-groups, the hierarchical clustering was cut into 12 groups; see the 12-cut view (PDF, PNG).
- The sub-group information can be found in Heatmap_cutree12_GroupInfo.xls.
- The genes in each sub-group are listed in Heatmap_cutree12.xls.
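
Cutting a hierarchical clustering into a fixed number of groups can be illustrated with the sketch below. It assumes average linkage, which the report does not specify, and the function name `cut_into_groups` and the toy distances are hypothetical. Stopping the agglomeration once k clusters remain gives the same partition as cutting the finished dendrogram into k groups. The nested loops are far too slow for thousands of genes; the code is meant only to show the idea.

```python
def cut_into_groups(dist, k):
    """Average-linkage agglomerative clustering stopped at k clusters.

    dist is a symmetric square distance matrix (list of lists).
    Returns one group label per item, numbered 1..k in order of
    first appearance.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between the two clusters.
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        # Merge the closest pair; the earlier cluster keeps its position.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(dist)
    for group, members in enumerate(clusters, start=1):
        for i in members:
            labels[i] = group
    return labels


# Hypothetical toy example: five points on a line, cut into 3 groups
# (the report itself uses 12 groups).
points = [0, 1, 10, 11, 50]
toy_dist = [[abs(p - q) for q in points] for p in points]
groups = cut_into_groups(toy_dist, 3)
print(groups)
```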

8.2 Sample-Sample heatmap

Method

Unsupervised hierarchical clustering was performed at the sample level, with the distance between two samples defined as \((1-r^2)\), where \(r\) is the correlation coefficient of their \(log_2(FPKM+1)\) values over the genes selected above.
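
The sample-to-sample distance can be sketched as below. The report does not say which correlation coefficient was used; this illustration assumes Pearson's \(r\), and the function names and toy values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists.

    Raises ZeroDivisionError if either list is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sample_distances(fpkm_by_sample):
    """Return (sample names, matrix of 1 - r^2 distances).

    fpkm_by_sample maps a sample name to its FPKM values, listed over
    the same selected genes in the same order for every sample.
    """
    names = list(fpkm_by_sample)
    logged = {s: [math.log2(v + 1.0) for v in fpkm_by_sample[s]] for s in names}
    matrix = [
        [1.0 - pearson(logged[a], logged[b]) ** 2 for b in names]
        for a in names
    ]
    return names, matrix


# Hypothetical toy data: three samples over four genes.
toy = {
    "s1": [0.0, 1.0, 3.0, 7.0],  # log2 values 0, 1, 2, 3
    "s2": [7.0, 3.0, 1.0, 0.0],  # log2 values 3, 2, 1, 0 (r = -1 with s1)
    "s3": [3.0, 0.0, 0.0, 3.0],  # log2 values 2, 0, 0, 2 (r = 0 with s1)
}
names, dist = sample_distances(toy)
print(names, dist)
```

Because the distance depends on \(r^2\), two perfectly anti-correlated samples are treated as being as close as two perfectly correlated ones.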