3 Feature-barcode matrices

Method
Raw 10x single-cell transcriptome data were processed with Cell Ranger (v3.1.0) from 10x Genomics. Cell Ranger is a set of analysis pipelines for Chromium single cell 3′ RNA-seq data. The pipelines take raw sequencing output, perform read alignment, filtering, and UMI counting, generate the feature-barcode (gene-cell) matrices, and can run downstream analyses such as clustering and differential gene expression analysis.

P Cell Ranger Result: P-web_summary.html

T Cell Ranger Result: T-web_summary.html

Cell Ranger Result Analysis:

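Estimated number of cells: the number of cells detected in this sample

Mean reads per cell: the average number of reads per cell

Median genes per cell: the median number of genes detected per cell, typically greater than 1000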

Sequencing:

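Number of reads: the total number of reads sequenced

Valid barcodes: the fraction of reads whose cell barcode matches the barcode whitelist after barcode correction, typically above 90%. Important!

Sequencing saturation: the fraction of reads that are duplicates of an already observed barcode/UMI/gene combination; 60-80% is generally considered appropriate

Q30 bases in barcode: the fraction of barcode bases with a quality score of at least 30, typically above 85%

Q30 bases in RNA read: the fraction of RNA read bases with a quality score of at least 30

Q30 bases in UMI: the fraction of UMI bases with a quality score of at least 30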
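The sequencing saturation metric above can be sketched as follows. This is a minimal illustration, assuming saturation is computed as 1 minus the ratio of unique (barcode, UMI, gene) combinations to total reads, over confidently mapped reads with a valid barcode and UMI. The function name and the tuple layout are hypothetical and are not part of Cell Ranger.

```python
def sequencing_saturation(reads):
    """Estimate saturation from (barcode, umi, gene) tuples, one tuple per read.

    Assumption: every read passed in is confidently mapped and carries a
    valid barcode and a valid UMI.
    """
    reads = list(reads)
    if not reads:
        raise ValueError("at least one read is required")
    n_reads = len(reads)
    # Reads sharing barcode, UMI and gene are copies of the same molecule.
    n_unique = len(set(reads))
    return 1.0 - n_unique / n_reads


# Made-up example: 4 reads, but only 2 distinct molecules.
example_reads = [
    ("AAAC", "UMI1", "GeneA"),
    ("AAAC", "UMI1", "GeneA"),
    ("AAAC", "UMI1", "GeneA"),
    ("AAAC", "UMI2", "GeneA"),
]
saturation = sequencing_saturation(example_reads)
```

A value near 0 means most reads are new molecules and deeper sequencing would still recover more; a value near 1 means most additional reads would be duplicates.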

Mapping:

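Reads mapped to genome: the fraction of reads that mapped to the selected genome

Reads mapped confidently to genome: the fraction of reads that mapped uniquely to the genome. If a read maps to an exonic locus and also to non-exonic loci, it is counted as uniquely mapped to the exonic locus

Reads mapped confidently to intergenic regions: the fraction of reads that mapped uniquely to intergenic regions of the genome

Reads mapped confidently to intronic regions: the fraction of reads that mapped uniquely to intronic regions

Reads mapped confidently to exonic regions: the fraction of reads that mapped uniquely to exonic regions

Reads mapped confidently to transcriptome: the fraction of reads that mapped to the transcriptome; these reads are the ones used for UMI counting

Reads mapped antisense to gene: the fraction of reads that mapped to the antisense strand of a gene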
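To illustrate how reads mapped confidently to the transcriptome turn into UMI counts, the sketch below collapses reads to distinct UMIs per (barcode, gene) pair. It is a simplified illustration, not Cell Ranger's implementation: the function name and input layout are hypothetical, and any barcode or UMI error correction is omitted.

```python
from collections import defaultdict


def count_umis(reads):
    """Return {(barcode, gene): number of distinct UMIs}.

    reads: iterable of (barcode, umi, gene) tuples, one per read that
    mapped confidently to the transcriptome (assumed already filtered).
    """
    umis_seen = defaultdict(set)
    for barcode, umi, gene in reads:
        # Duplicate reads of the same molecule add the same UMI only once.
        umis_seen[(barcode, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}


# Made-up example reads.
umi_counts = count_umis([
    ("AAAC", "UMI1", "GeneA"),
    ("AAAC", "UMI1", "GeneA"),  # PCR duplicate, not counted again
    ("AAAC", "UMI2", "GeneA"),
    ("TTTG", "UMI1", "GeneB"),
])
```

Each entry of the resulting dictionary corresponds to one nonzero cell of the feature-barcode matrix.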

Cells:

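Estimated number of cells: the number of cells detected in this sample

Fraction reads in cells: the fraction of valid-barcode reads confidently mapped to the transcriptome whose barcode is associated with a cell. It should generally be 70% or higher; lower values indicate poor data quality. Important!

Mean reads per cell: the average number of reads per cell

Median genes per cell: the median number of genes detected per cell

Total genes detected: the total number of genes detected, that is, genes with at least one UMI count

Median UMI counts per cell: the median number of UMI counts per cell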
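The rules of thumb given above can be checked automatically against the metrics table. The sketch below assumes the metrics are available as a one-row CSV (such as the metrics_summary.csv that Cell Ranger writes next to web_summary.html), with column names matching the summary labels and values formatted like "1,850" or "97.5%". Those column names, the helper names, and the example values are assumptions for illustration; the thresholds are the ones stated in this report, not official cutoffs.

```python
import csv
import io

# Minimum values taken from the rules of thumb in this report.
THRESHOLDS = {
    "Valid Barcodes": 90.0,           # percent
    "Q30 Bases in Barcode": 85.0,     # percent
    "Fraction Reads in Cells": 70.0,  # percent
    "Median Genes per Cell": 1000.0,  # genes
}


def parse_value(text):
    """Convert strings such as '1,850' or '97.5%' to a float."""
    return float(text.replace(",", "").rstrip("%"))


def check_metrics(csv_text):
    """Return {metric: (value, passed)} for every threshold column present."""
    row = next(csv.DictReader(io.StringIO(csv_text)))
    report = {}
    for name, minimum in THRESHOLDS.items():
        if name in row:
            value = parse_value(row[name])
            report[name] = (value, value >= minimum)
    return report


# Made-up example row, not data from samples P or T.
example_csv = (
    '"Estimated Number of Cells","Median Genes per Cell",'
    '"Valid Barcodes","Fraction Reads in Cells"\n'
    '"5,012","1,850","97.5%","65.2%"\n'
)
qc_report = check_metrics(example_csv)
```

In this made-up row, the fraction of reads in cells falls below 70% and would be flagged, while the other two metrics pass.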

Analysis:

t-SNE plot of cell expression: UMIs tag individual transcript molecules, so the UMI count of a gene is taken as its expression level. The t-SNE dimensionality reduction algorithm is used to visualize expression across cells; each point represents one cell.

Cell subtypes: cells are clustered by expression level to identify cell subtypes. Two clustering algorithms are provided, graph-based and k-means.

Differential gene expression analysis: for each cluster, the cells are split into two groups (that cluster versus all other clusters), and genes are tested for differential expression between the two groups.

Saturation assessment: reads are subsampled to observe (1) the proportion of all detected transcripts that is recovered at each sampling depth, and (2) how the number of detected genes changes with the amount of sequencing data.
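The saturation assessment by subsampling can be sketched as below. This is a simplified illustration under stated assumptions, not Cell Ranger's procedure: each read is represented only by the gene it was assigned to, and the function name, seed parameter, and example data are hypothetical.

```python
import random


def genes_detected_curve(read_genes, fractions, seed=0):
    """Return [(fraction, genes_detected, share_of_all_genes), ...].

    read_genes: list with one gene name per read.
    fractions:  sampling depths between 0 and 1.
    """
    shuffled = list(read_genes)
    if not shuffled:
        raise ValueError("at least one read is required")
    # One shuffle, then prefixes: smaller samples are nested in larger ones.
    random.Random(seed).shuffle(shuffled)
    total_genes = len(set(shuffled))
    curve = []
    for fraction in fractions:
        n_reads = round(fraction * len(shuffled))
        detected = len(set(shuffled[:n_reads]))
        curve.append((fraction, detected, detected / total_genes))
    return curve


# Made-up example: four genes with very different read depth.
example_read_genes = ["G1"] * 50 + ["G2"] * 30 + ["G3"] * 15 + ["G4"] * 5
curve = genes_detected_curve(example_read_genes, [0.1, 0.5, 1.0])
```

When the curve flattens well before the full depth, additional sequencing is unlikely to reveal many new genes.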


P Download matrices: P-filtered_feature_bc_matrix.zip

T Download matrices: T-filtered_feature_bc_matrix.zip
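Once a matrix archive is unpacked, per-cell summaries such as the median UMI counts and median genes per cell can be recomputed from it. The sketch below assumes the archive contains a gzipped Matrix Market file (matrix.mtx.gz) in coordinate format with integer counts, features as rows and barcodes as columns. The file path in the comment and the function names are hypothetical, and only the Python standard library is used.

```python
import gzip
from statistics import median


def per_cell_stats(mtx_lines):
    """Return (median UMI counts per cell, median genes per cell).

    mtx_lines: iterable of text lines in Matrix Market coordinate format,
    assumed to have rows = features (genes) and columns = barcodes (cells).
    """
    # Skip '%' comment lines and blank lines.
    data = (ln for ln in mtx_lines if ln.strip() and not ln.startswith("%"))
    _n_features, n_barcodes, _n_entries = map(int, next(data).split())
    umis = [0] * n_barcodes
    genes = [0] * n_barcodes
    for line in data:
        _row, col, value = line.split()
        count = int(value)
        if count > 0:
            umis[int(col) - 1] += count  # indices in the file are 1-based
            genes[int(col) - 1] += 1
    return median(umis), median(genes)


def per_cell_stats_from_file(path):
    """Same as per_cell_stats, reading a gzipped file such as
    'filtered_feature_bc_matrix/matrix.mtx.gz' (hypothetical path)."""
    with gzip.open(path, "rt") as handle:
        return per_cell_stats(handle)


# Made-up 3-gene x 2-cell matrix for illustration.
example_mtx = [
    "%%MatrixMarket matrix coordinate integer general",
    "3 2 4",
    "1 1 5",
    "2 1 1",
    "1 2 2",
    "3 2 7",
]
median_umis, median_genes = per_cell_stats(example_mtx)
```

The values obtained this way can be compared with the "Median UMI counts per cell" and "Median genes per cell" entries of the corresponding web summary.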