About 50 people joined, and those who stuck with it have accumulated more than 200 literature reading notes.



This update, "BRCA subtyping: PAM50", is the share for week 10 of 2019.

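    Breast cancer is a highly heterogeneous disease: patients with the same clinical stage and pathological grade can differ greatly in treatment response and prognosis. Even so, adjuvant therapy (chemotherapy, endocrine therapy, anti-HER2 therapy, and so on) is still chosen according to clinicopathological features such as HER2 expression, estrogen receptor status, tumor size, grade, and lymph node metastasis.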


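    The most instructive examples are the two widely used commercial multigene assays, the Oncotype DX 21-gene assay and the MammaPrint 70-gene assay (the latter is FDA-cleared). Other attempts by academic researchers are also worth reviewing: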

  • three variants of the Single Sample Predictor (SSP) (SSP2003 [10], SSP2006 [11] and PAM50 [12])

  • Subtype Classification Model (SCM) (SCMOD1 [7] and SCMOD2 [8]), and the simple three-gene model (SCMGENE [9])


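    At present the best-known classifier is PAM50, published in J Clin Oncol. 2009 Mar 10; 27(8): 1160–1167, which has been cited nearly 3,000 times. The study used microarray expression data from 189 prototype samples to build a 50-gene subtype predictor. The resulting classes are the gene expression-based "intrinsic" subtypes luminal A, luminal B, HER2-enriched, and basal-like.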


Test sets from 761 patients (no systemic therapy) were evaluated for prognosis, and 133 patients were evaluated for prediction of pathologic complete response (pCR) to a taxane and anthracycline regimen.

    The expression arrays used are fairly niche: Agilent human 1Av2 microarrays or custom-designed Agilent human 22k arrays. The dataset was deposited in GEO as GSE10886.


    Following our GEO tutorial (videos + code: https://github.com/jmzeng1314/GEO ), you can process the GSE10886 dataset.

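library(GEOquery)
f <- 'GSE10886_eSet.Rdata'   ## local cache file
if (!file.exists(f)) {
  gset <- getGEO('GSE10886', destdir = ".",
                 AnnotGPL = FALSE,   ## do not fetch the annotation (GPL) file
                 getGPL = FALSE)     ## do not fetch the platform file
  save(gset, file = f)   ## save to disk
}
load(f)       ## load the cached data
class(gset)   ## check the data type
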
The centroids, gene lists, and R code to produce the classification are all available along with the clinical information for the training set on this page: https://genome.unc.edu/pubsup/breastGEO/

Specifically, the R code and supporting data files are here: https://genome.unc.edu/pubsup/breastGEO/PAM50.zip

And the centroids alone are here: https://genome.unc.edu/pubsup/breastGEO/pam50_centroids.txt

In addition, this document provides additional information regarding classification of the PAM50 plus Claudin-low calls https://genome.unc.edu/pubsup/breastGEO/Guide%20to%20Intrinsic%20Subtyping%209-6-10.pdf

Anyone running PAM50 (or any classifier based on relative measurements such as expression) should understand the concepts in this paper: http://www.breast-cancer-research.com/content/pdf/s13058-015-0520-4.pdf

You can download the PAM50, Sorlie500, and Hu306 gene sets from the supplementary data of this paper: Breast cancer molecular profiling with single sample predictors: a retrospective analysis. http://www.ncbi.nlm.nih.gov/pubmed/20181526 Or use the genefu package from Bioconductor: http://www.bioconductor.org/packages/2.12/bioc/manuals/genefu/man/genefu.pdf Hope this helps.
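To make the centroid idea concrete, here is a minimal R sketch of nearest-centroid subtyping on simulated data. Everything in it is made up for illustration: the gene names (`gene1` to `gene50`), the random centroids, and the toy cohort. A real run would read the centroids from the pam50_centroids.txt file linked above and use the 50 real genes. The sketch assumes the usual recipe of median-centering each gene across the cohort and then assigning the subtype whose centroid has the highest Spearman correlation; it is not the UNC script itself.

```r
set.seed(42)
subtypes <- c("LumA", "LumB", "Her2", "Basal", "Normal")
genes    <- paste0("gene", 1:50)   # placeholders, NOT the real PAM50 genes

# Toy centroids (genes x subtypes); real ones come from pam50_centroids.txt
centroids <- matrix(rnorm(50 * 5), nrow = 50,
                    dimnames = list(genes, subtypes))

# Toy cohort: 8 samples per subtype, each a noisy copy of its centroid
truth <- rep(subtypes, each = 8)
expr  <- centroids[, truth] + matrix(rnorm(50 * 40, sd = 0.5), nrow = 50)
expr  <- expr + rnorm(50, mean = 8)          # gene-specific baseline (added per row)
colnames(expr) <- paste0("sample", seq_along(truth))

# Step 1: median-center each gene across the cohort
expr_centered <- sweep(expr, 1, apply(expr, 1, median))

# Step 2: Spearman correlation of every sample with every centroid
cors <- cor(expr_centered, centroids, method = "spearman")  # samples x subtypes

# Step 3: call the subtype with the highest correlation
calls <- colnames(cors)[apply(cors, 1, which.max)]

table(called = calls, truth = truth)
```

Because step 1 uses the whole cohort, the calls depend on which samples are in it. That is why the answer above stresses that anyone using a classifier based on relative measurements should understand how the centering works.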


First: "A Comparison of PAM50 Intrinsic Subtyping with Immunohistochemistry and Clinical Prognostic Factors in Tamoxifen-Treated Estrogen Receptor–Positive Breast Cancer". It looks only at estrogen receptor (ER)-positive breast cancers and compares classification by clinical factors, immunohistochemistry (IHC), and PAM50.

For example, a 2012 paper in BMC Medical Genomics ( https://doi.org/10.1186/1755-8794-5-44 ) compared the concordance between PAM50 and IHC results. Its title is: PAM50 Breast Cancer Subtyping by RT-qPCR and Concordance with Standard Clinical Molecular Markers

Also: It has recently been proposed that a three-gene model (SCMGENE) that measures ESR1, ERBB2, and AURKA identifies the major breast cancer intrinsic subtypes and provides robust discrimination for clinical use in a manner very similar to a 50-gene subtype predictor.


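    Usually, once we have transcriptome expression data, we can use the PAM50 classifier to call the breast cancer subtype. But if we also have other measurements for the same patients, such as age and ER, PR, HER2, and Ki67 status, we can train a machine learning model that maps those variables to the PAM50 call.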

    Then, for new patients who have no transcriptome data but do have age, ER, PR, HER2, and Ki67 status recorded (as most patients do), the trained model can predict their PAM50 subtype.
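    A minimal R sketch of this idea, on simulated data. The cohort, the column names (`age`, `ER`, `PR`, `HER2`, `Ki67`), the probabilities used to simulate them, and the 300/100 train/test split are all assumptions made for illustration. Multinomial logistic regression (`multinom()` from the `nnet` package) is just one possible model choice, not the method of any paper mentioned here. In a real analysis the `pam50` column would come from running the PAM50 classifier on expression data.

```r
library(nnet)   # multinom(): multinomial logistic regression

set.seed(1)
n        <- 400
subtypes <- c("LumA", "LumB", "Her2", "Basal")

# Simulated PAM50 labels (in practice: calls from the expression-based classifier)
pam50   <- factor(sample(subtypes, n, replace = TRUE), levels = subtypes)
luminal <- pam50 %in% c("LumA", "LumB")

# Simulated clinical variables, loosely tied to the label
clin <- data.frame(
  pam50 = pam50,
  age   = round(rnorm(n, mean = 55, sd = 10)),
  ER    = rbinom(n, 1, ifelse(luminal, 0.95, 0.10)),
  PR    = rbinom(n, 1, ifelse(luminal, 0.80, 0.10)),
  HER2  = rbinom(n, 1, ifelse(pam50 == "Her2", 0.85, 0.10)),
  Ki67  = pmin(pmax(rnorm(n, ifelse(pam50 == "LumA", 10, 35), 8), 0), 100)
)

train <- clin[1:300, ]     # patients with both expression and clinical data
test  <- clin[301:400, ]   # stand-in for "new" patients with clinical data only

fit  <- multinom(pam50 ~ age + ER + PR + HER2 + Ki67, data = train, trace = FALSE)
pred <- predict(fit, newdata = test)   # predicted class labels

accuracy <- mean(as.character(pred) == as.character(test$pam50))
table(predicted = pred, actual = test$pam50)
```

    On real data the agreement will be limited by how well these clinical markers track the expression-based subtypes, so it is worth checking the confusion table and not only the overall accuracy.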


  • Determining breast cancer histological grade from RNA-sequencing data. Breast Cancer Res. 2016

  • Assessment of breast cancer risk factors reveals subtype heterogeneity Cancer Res. 2017


    The paper was published in 2010, and its title is rather long: A comparison of PAM50 intrinsic subtyping with immunohistochemistry and clinical prognostic factors in tamoxifen-treated estrogen receptor-positive breast cancer.

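    The authors used conventional quantitative real-time reverse transcription-PCR (qRT-PCR) to measure the expression of just the 50 genes in 786 selected patients, which is enough to classify them with PAM50. The supplementary material also gives an 11-gene proliferation signature and an 8-gene luminal signature.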



There are five main intrinsic or molecular subtypes of breast cancer that are based on the genes a cancer expresses:

  • Luminal A breast cancer is hormone-receptor positive (estrogen-receptor and/or progesterone-receptor positive), HER2 negative, and has low levels of the protein Ki-67, which helps control how fast cancer cells grow. Luminal A cancers are low-grade, tend to grow slowly and have the best prognosis.

  • Luminal B breast cancer is hormone-receptor positive (estrogen-receptor and/or progesterone-receptor positive), and either HER2 positive or HER2 negative with high levels of Ki-67. Luminal B cancers generally grow slightly faster than luminal A cancers and their prognosis is slightly worse.

  • Triple-negative/basal-like breast cancer is hormone-receptor negative (estrogen-receptor and progesterone-receptor negative) and HER2 negative. This type of cancer is more common in women with BRCA1 gene mutations. Researchers aren’t sure why, but this type of cancer also is more common among younger and African-American women.

  • HER2-enriched breast cancer is hormone-receptor negative (estrogen-receptor and progesterone-receptor negative) and HER2 positive. HER2-enriched cancers tend to grow faster than luminal cancers and can have a worse prognosis, but they are often successfully treated with targeted therapies aimed at the HER2 protein, such as Herceptin (chemical name: trastuzumab), Perjeta (chemical name: pertuzumab), Tykerb (chemical name: lapatinib), and Kadcyla (chemical name: T-DM1 or ado-trastuzumab emtansine).

  • Normal-like breast cancer is similar to luminal A disease: hormone-receptor positive (estrogen-receptor and/or progesterone-receptor positive), HER2 negative, and has low levels of the protein Ki-67, which helps control how fast cancer cells grow. Still, while normal-like breast cancer has a good prognosis, its prognosis is slightly worse than luminal A cancer’s prognosis.
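The marker-based descriptions in the list above can be written as a small lookup function. This is only a sketch: the function name `ihc_surrogate` and its argument names are hypothetical, and the Ki-67 cutoff of 14% is an assumption (a commonly used threshold that the text above does not state). Normal-like tumors share the luminal A marker profile in this list, so markers alone cannot separate the two and the function reports them together.

```r
# Rule-based surrogate subtype from ER, PR, HER2 (logical) and Ki-67 (percent)
ihc_surrogate <- function(er, pr, her2, ki67, ki67_cutoff = 14) {
  hr_pos    <- er | pr                 # hormone-receptor positive: ER and/or PR positive
  high_ki67 <- ki67 >= ki67_cutoff     # assumed cutoff, see note above
  ifelse(hr_pos & !her2 & !high_ki67, "Luminal A (or normal-like)",
  ifelse(hr_pos,                      "Luminal B",
  ifelse(her2,                        "HER2-enriched",
                                      "Triple-negative/basal-like")))
}

# Five made-up patients, one per rule branch (luminal B is reached two ways)
patients <- data.frame(
  er   = c(TRUE,  TRUE,  TRUE,  FALSE, FALSE),
  pr   = c(TRUE,  FALSE, TRUE,  FALSE, FALSE),
  her2 = c(FALSE, FALSE, TRUE,  TRUE,  FALSE),
  ki67 = c(5,     30,    10,    40,    60)
)
patients$surrogate <- with(patients, ihc_surrogate(er, pr, her2, ki67))
patients
```

These marker-based labels are surrogates for the expression-based intrinsic subtypes, not a replacement for them, which is why the concordance studies mentioned earlier compare IHC calls against PAM50.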




