No bulk transcriptome data of your own? You can always look for public data

Original 生信技能树单细胞天地 2022-08-10

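Yesterday we published a short literature note on 《单细胞天地》 (see: "Single-cell transcriptome studies can also include conventional bulk transcriptomics"). It highlighted one analysis that a single-cell project can adopt: take the up- and down-regulated genes from a bulk transcriptome differential analysis and break them down across the different cell subpopulations.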
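The decomposition described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method of any paper mentioned here: the function name `assign_bulk_degs`, the input layout (a per-cell-type dictionary of log2 fold changes), and the placeholder gene names are all hypothetical. A bulk gene is attributed to a cell type when it changes in the same direction there.

```python
def assign_bulk_degs(bulk_up, bulk_down, celltype_degs):
    """Split bulk DEGs across cell types by matching direction of change.

    bulk_up, bulk_down: sets of gene symbols from the bulk comparison.
    celltype_degs: {cell_type: {gene: log2FC}} from a per-cell-type
        single-cell differential analysis (hypothetical layout).
    Returns (per_type, unassigned), where per_type maps each cell type to
    its matching "up" and "down" bulk genes, and unassigned lists bulk
    DEGs that no cell type reproduces.
    """
    per_type = {}
    assigned = set()
    for cell_type, degs in celltype_degs.items():
        # A bulk gene counts for this cell type only if the sign agrees.
        up = sorted(g for g in bulk_up if degs.get(g, 0) > 0)
        down = sorted(g for g in bulk_down if degs.get(g, 0) < 0)
        per_type[cell_type] = {"up": up, "down": down}
        assigned.update(up, down)
    unassigned = sorted((set(bulk_up) | set(bulk_down)) - assigned)
    return per_type, unassigned


# Toy input with placeholder gene names (not real results).
toy_bulk_up = {"GENE_A", "GENE_B"}
toy_bulk_down = {"GENE_C", "GENE_D"}
toy_celltype_degs = {
    "astrocytes": {"GENE_A": 1.2, "GENE_C": -0.8},
    "microglia": {"GENE_B": 0.5, "GENE_A": -0.3},
}
per_type, unassigned = assign_bulk_degs(toy_bulk_up, toy_bulk_down, toy_celltype_degs)
print(per_type)
print(unassigned)
```

A gene can land in more than one cell type, which is the point: the bulk signal is the sum of several cell-type-specific changes.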

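This makes very good use of the strengths of single-cell technology. A reader left a comment, however, saying that they had designed their single-cell transcriptome project early on, before they had seen this paper: on January 4, 2021, Lan Zhu (Chinese Academy of Medical Sciences & Peking Union Medical College) and Yungui Yang (Beijing Institute of Genomics, Chinese Academy of Sciences), as co-corresponding authors, published "Single-cell transcriptome profiling of the vaginal wall in women with severe anterior vaginal prolapse" online in Nature Communications. The reader really likes this kind of analysis and figure, so what should they do?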

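There is really no need to worry: plenty of public databases are waiting to be used. That paper is a special case because the disease it studies is fairly niche; to be honest, I only learned that anterior vaginal wall prolapse existed by reading the paper. If your research is on a common cancer, such as one of the TCGA cancer types: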

Study_Abbreviation Study_Name
ACC Adrenocortical_carcinoma
BLCA Bladder_Urothelial_Carcinoma
BRCA Breast_invasive_carcinoma
CESC Cervical_squamous_cell_carcinoma_and_endocervical_adenocarcinoma
CHOL Cholangiocarcinoma
COAD Colon_adenocarcinoma
DLBC Lymphoid_Neoplasm_Diffuse_Large_B-cell_Lymphoma
ESCA Esophageal_carcinoma
GBM Glioblastoma_multiforme
HNSC Head_and_Neck_squamous_cell_carcinoma
KICH Kidney_Chromophobe
KIRC Kidney_renal_clear_cell_carcinoma
KIRP Kidney_renal_papillary_cell_carcinoma
LAML Acute_Myeloid_Leukemia
LGG Brain_Lower_Grade_Glioma
LIHC Liver_hepatocellular_carcinoma
LUAD Lung_adenocarcinoma
LUSC Lung_squamous_cell_carcinoma
MESO Mesothelioma
OV Ovarian_serous_cystadenocarcinoma
PAAD Pancreatic_adenocarcinoma
PCPG Pheochromocytoma_and_Paraganglioma
PRAD Prostate_adenocarcinoma
READ Rectum_adenocarcinoma
SARC Sarcoma
SKCM Skin_Cutaneous_Melanoma
STAD Stomach_adenocarcinoma
TGCT Testicular_Germ_Cell_Tumors
THCA Thyroid_carcinoma
THYM Thymoma
UCEC Uterine_Corpus_Endometrial_Carcinoma
UCS Uterine_Carcinosarcoma
UVM Uveal_Melanoma

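or on another popular non-tumor disease area, such as neurodegenerative, immune-related, or cardiovascular disease, public bulk data are easy to find. To answer the reader's question, I searched for an example and found a paper published in PNAS in October 2020: "Single-nucleus transcriptome analysis reveals dysregulation of angiogenic endothelial cells and neuroprotective glia in Alzheimer’s disease", link: https://www.pnas.org/content/117/41/25800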

First, the 10X samples in this study come from two groups, AD patients and healthy normal controls (NC): "21 prefrontal cortex tissue samples from patients with AD (n = 12) and NC subjects (n = 9)". As for cell numbers: "We sampled 169,496 nuclei: 90,713 and 78,783 nuclei from AD and NC brain samples, respectively."

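Routine quality control, dimensionality reduction, clustering, and biological annotation of the cell subpopulations are fairly simple; see our earlier example, "Single-cell clustering and annotation that anyone can learn". The main cell subpopulations are as follows: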

astrocytes (AQP4+, 11.9 ± 1.4% of total nuclei),
endothelial cells (CLDN5+, 2.3 ± 0.5%),
excitatory neurons (CAMK2A+, 45.2 ± 1.7%),
inhibitory neurons (GAD1+, 14.1 ± 0.9%),
microglia (C3+, 4.7 ± 0.6%),
oligodendrocytes (MBP+, 21.8 ± 2.5%)

At this point, the researchers ran a differential analysis between the AD (n = 12) and NC (n = 9) groups within each cell subpopulation.
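To make the per-subpopulation comparison concrete, here is a minimal pseudo-bulk sketch. It is not the authors' method, which is not described in this post: it simply sums counts per sample within each cell type, normalizes to counts per million, and reports a log2 fold change of AD versus NC. The function names, the input layout, and the toy counts are hypothetical, and a real analysis would add a proper statistical test.

```python
import math
from collections import defaultdict


def _mean(gene, profiles):
    """Mean CPM of one gene across sample profiles (missing gene = 0)."""
    return sum(p.get(gene, 0.0) for p in profiles) / len(profiles)


def pseudobulk_log2fc(cells, group_of_sample, case="AD", control="NC",
                      pseudocount=1.0):
    """cells: iterable of (sample_id, cell_type, {gene: count}).
    group_of_sample: {sample_id: group label}.
    Returns {cell_type: {gene: log2FC of case vs control}}."""
    # Step 1: sum counts per (cell type, sample) to get pseudo-bulk profiles.
    sums = defaultdict(lambda: defaultdict(float))
    for sample, cell_type, counts in cells:
        for gene, n in counts.items():
            sums[(cell_type, sample)][gene] += n

    # Step 2: normalize each profile to counts per million, grouped by label.
    profiles = defaultdict(lambda: defaultdict(list))
    for (cell_type, sample), profile in sums.items():
        total = sum(profile.values())
        if total == 0:
            continue
        cpm = {g: 1e6 * n / total for g, n in profile.items()}
        profiles[cell_type][group_of_sample[sample]].append(cpm)

    # Step 3: compare group means; skip cell types missing either group.
    result = {}
    for cell_type, by_group in profiles.items():
        case_p = by_group.get(case, [])
        ctrl_p = by_group.get(control, [])
        if not case_p or not ctrl_p:
            continue
        genes = {g for p in case_p + ctrl_p for g in p}
        result[cell_type] = {
            g: math.log2((_mean(g, case_p) + pseudocount)
                         / (_mean(g, ctrl_p) + pseudocount))
            for g in genes
        }
    return result


# Toy input with placeholder sample IDs and genes (not real data).
toy_cells = [
    ("S1", "astrocytes", {"G1": 3, "G2": 1}),
    ("S2", "astrocytes", {"G1": 1, "G2": 3}),
]
print(pseudobulk_log2fc(toy_cells, {"S1": "AD", "S2": "NC"}))
```

Aggregating per sample first keeps the 12 AD and 9 NC donors, rather than the individual nuclei, as the unit of comparison.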

Using public data

Here the authors did not generate bulk transcriptome data of their own to check how the cell-subpopulation-level differences look at the bulk level. Instead, they used the dataset at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE33000, from a study "who performed bulk transcriptome microarray analysis of prefrontal cortical tissues in a large cohort (AD: n = 310; NC: n = 157)",

as well as a "microarray dataset of AD temporal cortical samples from Webster et al. (GEO accession no. GSE15222)".

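Interestingly, neither of these public datasets uses a straightforward microarray platform: the latter is the Sentrix HumanRef-8 Expression BeadChip and the former is the Rosetta/Merck Human 44k 1.1 microarray, so analyzing them is somewhat difficult for most people.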

The single-cell data analysis is actually the easier part

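You can download GSE157827_RAW, a 1.3 GB file that contains the 10X single-cell transcriptome results for the AD (n = 12) and NC (n = 9) subjects, run it through the Seurat workflow, and see how many of the paper's figures you can reproduce!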
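Before running Seurat, the downloaded files usually need to be grouped per sample, because a 10X matrix consists of three files (barcodes, features or genes, and matrix). The helper below is a small sketch of that bookkeeping step in Python. The exact file names inside GSE157827_RAW are not given in this post, so the names in the example are hypothetical placeholders, and `group_10x_files` is an illustrative function rather than part of any library.

```python
import re

# One 10X matrix is a triplet of files. "genes.tsv" is the older
# Cell Ranger name for what newer versions call "features.tsv".
_TENX_FILE = re.compile(
    r"^(?P<prefix>.*?)[_.-]?"
    r"(?P<kind>barcodes|features|genes|matrix)\.(?:tsv|mtx)(?:\.gz)?$"
)


def group_10x_files(filenames):
    """Group file names by shared prefix and keep only complete triplets.

    Returns {prefix: {"barcodes": name, "features": name, "matrix": name}}.
    """
    samples = {}
    for name in filenames:
        match = _TENX_FILE.match(name)
        if match is None:
            continue  # not a 10X matrix file, e.g. a README
        kind = match.group("kind")
        if kind == "genes":
            kind = "features"
        samples.setdefault(match.group("prefix"), {})[kind] = name
    required = {"barcodes", "features", "matrix"}
    return {p: files for p, files in samples.items() if required <= files.keys()}


# Hypothetical file names, for illustration only.
example_names = [
    "GSM0000001_AD1_barcodes.tsv.gz",
    "GSM0000001_AD1_features.tsv.gz",
    "GSM0000001_AD1_matrix.mtx.gz",
    "GSM0000002_NC1_barcodes.tsv.gz",  # incomplete triplet, dropped
    "README.txt",
]
for prefix, files in group_10x_files(example_names).items():
    print(prefix, sorted(files))
```

Each complete triplet can then be copied into its own directory and renamed to the plain `barcodes.tsv.gz`, `features.tsv.gz`, and `matrix.mtx.gz` names that 10X readers generally expect.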
