Disease heterogeneity and personalized prognosis in myeloproliferative neoplasms
Abstract
Background
Myeloproliferative neoplasms (MPN), comprising polycythemia vera, essential thrombocythemia and myelofibrosis, are chronic hematological malignancies with variable progression rates. Genomic characterization of MPN patients offers the potential for personalised diagnosis, risk stratification and management.
Methods
We sequenced coding exons from 69 myeloid cancer genes in 2035 MPN patients, comprehensively annotating driver mutations and copy number changes. We developed a genomic classification for MPNs and multistage prognostic models for predicting individual patient outcomes. Classification and prognostic models were validated on an external cohort.
Results
33 genes carried driver mutations in >4 patients, with JAK2, CALR or MPL mutations being the sole abnormality in 45% of patients. The number of driver mutations increased with age and advanced disease. Driver mutations, germline polymorphisms and demographic variables independently predicted whether patients were diagnosed with essential thrombocythemia versus polycythemia vera, and chronic phase disease versus myelofibrosis. We defined 8 genomic subgroups, exhibiting distinct clinical phenotypes, including diagnostic blood counts, risk of leukemic transformation and event-free survival. Integrating 63 clinical and genomic variables, we created prognostic models capable of generating personally-tailored predictions of clinical outcomes in chronic phase MPN or myelofibrosis. Predicted and observed outcomes correlated well using internal cross-validation and an independent external cohort. Even within individual categories of existing prognostic schemas, our models substantially improved predictive accuracy.
Conclusions
Comprehensive genomic characterization identifies distinct genetic subgroups and provides an MPN classification based on causal biological mechanisms. Integration of genomic data with clinical parameters enables personalised predictions of patient outcome and will support management of MPN patients.
Introduction
The myeloproliferative neoplasms (MPNs) are clonal hematopoietic disorders comprising polycythemia vera (PV), characterized by red blood cell over-production; essential thrombocythemia (ET), with elevated platelet counts; and myelofibrosis (MF), defined by bone marrow fibrosis1. PV and ET are chronic phase MPNs, while MF represents advanced disease, diagnosed either de novo or following ET or PV. Current classification schemes distinguish between MPN subtypes using clinical and laboratory features2–5, but there is uncertainty and controversy over where and how to draw dividing lines between them6,7. This debate is not easily resolved since MPNs exist on a phenotypic continuum, with overlapping distributions of hemoglobin levels, platelet counts and extent of marrow fibrosis.
Biologically, MPNs are driven by cardinal driver mutations in JAK2, CALR or MPL. Many patients have additional drivers spanning a wide range of cancer genes, with patient-to-patient variability in the genetic and clonal landscape8,9. Driver mutations correlate with phenotype and prognosis10–12, and mutation order can also influence disease phenotype13,14. This complex genetic landscape likely contributes to heterogeneity in diagnostic features and outcomes in MPNs.
In blood cancers, there has been a progressive shift away from clinical and morphological classification schemes to those based on genomics15, because such categorization relies on causative disease biology. Driver mutations are increasingly important in predicting clinical outcomes, but large, well-characterized cohorts are necessary for accurate prognostic models16. Recent studies have indicated that this promise extends to MPNs10,17, but larger cohorts and comprehensive gene sequencing are required to provide definitive answers. We report on a cohort of 2035 patients with long-term follow-up data, sequenced for coding mutations in known myeloid cancer genes, copy number changes and germline polymorphisms.
Methods
Study samples
Patient samples were obtained following written informed consent and ethics approval. Cohort, disease classification, and diagnostic review details are provided in the supplementary appendix. Tumor DNA was derived from blood granulocytes, bone marrow mononuclear cells or whole blood. The majority of patients did not have matched germline samples sequenced. The external validation cohort comprised 515 patients. We use the term ‘myelofibrosis’ to encompass both primary MF and post-ET/PV MF.
Sequencing and analyses
Custom RNA bait hybridisation capture for the full coding sequence of 69 genes, genome-wide single nucleotide polymorphisms (SNPs) for copy-number profiling, and germline loci associated with MPN or red cell variation18–20 (Tables S1-S2) was undertaken in 1887 patients. 148 patients underwent whole-exome sequencing, as reported previously8. Further details are provided in the supplementary appendix.
Clinical variables
Baseline laboratory and clinical data from diagnosis were incorporated into prognostic models as detailed in the supplementary appendix. The median duration between diagnosis and sample acquisition was 49 days. Median follow-up was 93.8 months (range 0.03-523) from diagnosis and 72.0 months (range 0.03-360) from time of DNA sampling.
Statistics
Timing of mutation acquisition was inferred using Bradley-Terry modelling of pairwise comparisons of clonal fractions in individual patients13. Bayesian network analysis and Dirichlet processes were used to identify genetic associations and subgroups. Random-effects Cox proportional hazards multistate modelling was used for outcome prediction, as detailed in the supplementary appendix.
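To illustrate the kind of pairwise-comparison model referred to above, the following is a minimal sketch of a Bradley-Terry fit using the standard minorisation-maximisation iteration. It is not the authors' implementation: the function name, the input format (counts of patients in whom one gene was judged to precede another) and the example counts are all hypothetical, and the rule for deciding which mutation came first from clonal fractions is assumed to have been applied upstream.

```python
from collections import defaultdict


def fit_bradley_terry(earlier_counts, n_iter=500, tol=1e-10):
    """Estimate Bradley-Terry strengths from pairwise 'earlier than' counts.

    earlier_counts[(a, b)] is the (hypothetical) number of patients in whom
    gene a was judged to have been acquired before gene b.
    Returns a dict gene -> strength, normalised to sum to 1; a larger value
    means the gene tends to be acquired earlier.
    Assumes every gene is judged earlier at least once.
    """
    genes = sorted({g for pair in earlier_counts for g in pair})
    wins = defaultdict(float)        # times each gene was the earlier event
    pair_total = defaultdict(float)  # comparisons per unordered gene pair
    for (first, second), n in earlier_counts.items():
        wins[first] += n
        pair_total[frozenset((first, second))] += n

    strength = {g: 1.0 / len(genes) for g in genes}
    for _ in range(n_iter):
        updated = {}
        for g in genes:
            denom = sum(
                pair_total[frozenset((g, h))] / (strength[g] + strength[h])
                for h in genes
                if h != g and frozenset((g, h)) in pair_total
            )
            updated[g] = wins[g] / denom if denom > 0 else strength[g]
        norm = sum(updated.values())
        updated = {g: v / norm for g, v in updated.items()}
        change = max(abs(updated[g] - strength[g]) for g in genes)
        strength = updated
        if change < tol:
            break
    return strength


# Hypothetical counts, for illustration only (not data from this study).
example_counts = {
    ("CALR", "TP53"): 8, ("TP53", "CALR"): 2,
    ("CALR", "NRAS"): 7, ("NRAS", "CALR"): 3,
    ("NRAS", "TP53"): 6, ("TP53", "NRAS"): 4,
}
strengths = fit_bradley_terry(example_counts)
ranking = sorted(strengths, key=strengths.get, reverse=True)
```

Sorting genes by fitted strength gives a relative ordering of acquisition; a full analysis would also need uncertainty estimates, which this sketch omits.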
Study conduct
JG and JN gathered and analysed data in collaboration with coauthors, and together with ARG and PJC designed the study and wrote the paper including the first draft. All authors vouch for the data, analyses and publication.
Results
Spectrum of genomic changes in MPNs
The cohort of 2035 patients comprised 1321, 356 and 309 patients with ET, PV and MF respectively and 49 patients with other MPN diagnoses (Table S3). 33 genes carried driver mutations in ≥5 patients (Fig.1A; Tables S4-S5). JAK2, MPL and CALR accounted for 1831 driver mutations, compared to 1075 across other genes. Loss of heterozygosity (LOH) was frequent for JAK2V617F, especially in PV, but was infrequent for CALR and MPL (Fig.S1).
We identified 45 truncating mutations in the terminal exon of PPM1D in 38 patients (1.9%, Fig.1B), making PPM1D the 8th most commonly mutated gene in MPNs. These mutations have also been detected in solid tumors, and blood from both healthy individuals and patients with breast/ovarian tumors, often after chemotherapy21,22. In our cohort, 10 patients had PPM1D mutations emerge during treatment with hydroxycarbamide, having not been present in an earlier sample. However, PPM1D mutations were also detected at, or within a month of, diagnosis in 20 cases. Analysis of single-cell derived hematopoietic colonies identified mutated-PPM1D in a ‘triple-negative’ (unmutated-JAK2, -CALR or -MPL) ET patient, but also subclonal to JAK2V617F in a PV patient (Fig.1C). These data confirm that PPM1D mutations can occur within the MPN clone and be present at diagnosis, not always indicating age-related clonal hematopoiesis or therapy-related disease evolution.
Mutations in MLL3 (Fig.1A, Table S4) were detected in 20 patients (1%), and were predominantly nonsense or frameshift as reported in AML23. Interestingly, seven had triple-negative MPN, suggesting that MLL3 could be an important tumor suppressor gene in these patients.
There has been interest in whether mutations in JAK2 and MPL outside of known hotspots could be relevant to MPNs24,25. We identified non-canonical variants in JAK2 and MPL in 16 patients with triple-negative ET and 1 patient with triple-negative MF (Fig.1D). Of these, three groups of variants are likely relevant to disease pathogenesis: (i) JAK2R683G and JAK2E627A, reported in acute lymphoblastic leukemia where they result in constitutive JAK2 activation26–28, were identified in two ET patients, one of whom presented in childhood; (ii) JAK2R867 was mutated in 2 ET patients and is associated with familial thrombocytosis29; (iii) MPLS505N and MPLS204P were identified in 4 and 5 ET patients respectively24. MPLS204P co-occurred with 1p-LOH, suggesting a clonal advantage to acquired homozygosity for this variant.
Factors influencing classification into ET, PV or MF
Currently, patients with MPNs are classified as ET, PV or MF based on clinical and laboratory criteria2–5, but the biology underlying these distinctions is incompletely understood. The number of driver mutations per patient was higher in MF than PV or ET (Fig.2A), as previously reported8, and increased with age of the patient (Fig.2B).
The distinction between JAK2V617F-mutated ET and PV rests on whether red cell mass or hematocrit is elevated. We found that acquired driver mutations correlated with hematological parameters (Fig.S2) and were the strongest determinants of a JAK2-mutated chronic phase patient being labeled as ET or PV, although germline genetic background and demographic factors were also relevant (Fig.2C,S2). 9p-LOH, causing JAK2V617F homozygosity, or a high JAK2V617F allele burden predicted a PV phenotype, as did mutated NFE2, a transcription factor critical to erythroid differentiation. Germline polymorphisms associated with red cell variables in the general population were distributed unevenly between ET and PV, with alleles associated with lower hemoglobin and higher platelets enriched in ET (Fig.2C). Furthermore, the JAK2 46/1 haplotype, associated with increased predisposition to MPNs18, predicted for PV (OR 2.3; CI95% 1.7-3.3; p<0.001), partly through increased odds of JAK2V617F homozygosity via 9p-LOH (OR 2.7; CI95% 2.0-3.9; p<0.001). Older age and male sex also increased the odds of PV. These data show that the location of any chronic phase patient on the hemoglobin/red cell mass continuum is influenced by many factors, and that the use of any arbitrary threshold to label patients as ET or PV will fail to discriminate between patients with different underlying biological mechanisms.
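For readers less familiar with the odds ratios quoted above, the sketch below shows how an unadjusted odds ratio and its 95% confidence interval can be obtained from a 2x2 table using the log (Woolf) method. This is a simplification: the odds ratios reported here come from models that account for other predictors, and the function name and the counts in the example are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval on the log scale.

    a: carriers with the phenotype      b: carriers without the phenotype
    c: non-carriers with the phenotype  d: non-carriers without the phenotype
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not data from this study).
or_value, ci_low, ci_high = odds_ratio_ci(20, 10, 10, 20)
```

An interval that excludes 1 corresponds to an association that is significant at roughly the 5% level in this unadjusted setting.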
Mutations in spliceosome components, epigenetic regulators and the RAS pathway were the strongest predictors of accelerated phase (MF) versus chronic phase (ET or PV) disease, as were male sex, older age and germline loci associated with platelet count and red cell parameters within the normal population (Fig.2D).
The order in which mutations are acquired in MPNs has previously been shown to influence disease phenotype13,14. CALR and MPL mutations were more commonly early events, while mutations including NRAS, TP53, PPM1D and NFE2 were acquired significantly later in disease (Fig.2E,S3). Some of the earlier-occurring mutations, in genes such as SF3B1 and DNMT3A, are also associated with age-related clonal hematopoiesis30,31, suggesting that some MPNs could arise from an antecedent asymptomatic clone. In patients with multiple mutations, JAK2V617F was more commonly a secondary event in patients with ET, and an earlier event in those with PV or MF (Fig.S4,S5), confirming and generalizing observations previously shown for JAK2 relative to TET2 or DNMT3A13,14.
Genomic subgroups in MPN
Hematological malignancies may be subclassified using driver mutations that distinguish subgroups of patients32,33, by observing which pairs of genes are either mutually exclusive or co-mutated more frequently than expected. In our cohort, driver mutations showed complex patterns of assortment (Fig.S6). We used Bayesian modelling to identify genomic subgroups of MPNs with maximum within-group similarity and maximum between-group discrimination.
We identified 8 genomic subgroups in MPNs defined by simple rules, with high reproducibility and low ambiguity in classification of individual patients (Fig.3,S7). TP53 mutations, co-occurring with 17p aberrations and del(5q), identified the first subgroup. TP53 mutations often occur later in disease (Fig.2E), but dominate the genomic and clinical features of these patients regardless of the initial MPN driver. Mirroring other blood cancers with TP53 mutations32,34, these patients have a dismal prognosis with a high risk of AML transformation (Hazard Ratio (HR) 15.5, CI95% 7.5-31.4, p<0.001; HRs expressed relative to JAK2-heterozygous subgroup) and early death (HR 2.4, CI95% 1.6-3.6, p<0.001, Fig.3).
The second subgroup was defined by the presence of one or more mutations in 16 myeloid cancer genes, especially chromatin and spliceosome regulators, chr4q-LOH and 7/7q aberrations. This subgroup was enriched for patients with MF (OR 6.52, CI95% 4.9-8.7, p<0.001) and MPN/MDS overlap (including all 7 CMML/atypical CML cases), but also included 8.4% of ET and 11.5% of PV. Patients showed increased risk of MF transformation (HR 5.4, CI95% 2.7-11.0, p<0.001) and inferior event-free survival (EFS), regardless of MPN subtype or MPN phenotypic driver mutation (HR 2.6, CI95% 2.1-3.2, p<0.001). Patients with co-operating mutations in epigenome and splicing regulators have also been identified in MDS35 and AML32, suggesting that these genes identify groups of patients spanning traditional myeloid disease categories.
Patients not identified in the above 2 subgroups were classified by their dominant MPN phenotypic driver mutation. Patients with CALR mutations, significantly associated with 19p-LOH and del(20q), or those with MPL mutations, universally presented with ET or MF. Those with MPL-mutated MF showed an elevated rate of AML transformation (HR 8.6, CI95% 1.4-49.1, p=0.02), but otherwise these two subgroups showed similar clinical course to the JAK2 subgroups. Those with JAK2V617F heterozygosity comprised most of the JAK2-mutated ET patients, but also some PV and MF patients, and had generally favorable outcomes. The JAK2V617F homozygosity subgroup was enriched for NFE2 mutations and patients with PV. MF transformations occurred more frequently in this subgroup (HR 3.0, CI95% 1.3-6.6, p=0.007).
A seventh subgroup (36 patients; 1.8%) had identifiable driver mutations, but not one of the class-defining drivers identified above. These included patients with mutations in genes such as TET2 and DNMT3A, that are not disease-specific, and those with mutations associated with other myeloid malignancies (such as KIT in systemic mastocytosis). The eighth subgroup (192 patients; 9.4%) had no detectable driver mutations and may include patients with either MPNs carrying unidentified drivers or reactive thrombocytosis. Patients were typically young and female, with a diagnosis of ET. This subgroup had a particularly benign outcome, with only 1 case of MF transformation (0.5%) and 2 of AML transformation (1%) during median follow-up of 8.0 years (HR for EFS: 0.56, CI95% 0.38-0.78, p=0.005).
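The hierarchical rules described in this section can be sketched as a short decision function. This is an illustrative reading of the text rather than the published flow-chart: the gene and lesion sets for the second subgroup are small placeholders (the text states that 16 genes are involved but does not list them here), the relative order of the CALR, MPL and JAK2 checks is an assumption, and the function name and labels are hypothetical.

```python
# Placeholder sets: NOT the study's list of 16 genes or its exact lesion definitions.
ILLUSTRATIVE_GROUP2_GENES = frozenset({"ASXL1", "EZH2", "SRSF2", "SF3B1", "U2AF1"})
ILLUSTRATIVE_GROUP2_LESIONS = frozenset({"4q-LOH", "-7/7q-"})


def assign_subgroup(drivers, lesions=(), jak2_homozygous=False):
    """Assign one of 8 genomic subgroups by checking rules in priority order.

    drivers: iterable of gene symbols carrying driver mutations
    lesions: iterable of chromosomal lesion labels (hypothetical naming)
    jak2_homozygous: True if JAK2V617F is homozygous (e.g. via 9p-LOH)
    """
    drivers = set(drivers)
    lesions = set(lesions)
    if "TP53" in drivers:
        return "TP53"
    if drivers & ILLUSTRATIVE_GROUP2_GENES or lesions & ILLUSTRATIVE_GROUP2_LESIONS:
        return "chromatin/spliceosome"
    # Order of the next three checks is assumed, not taken from the text.
    if "CALR" in drivers:
        return "CALR"
    if "MPL" in drivers:
        return "MPL"
    if "JAK2" in drivers:
        return "JAK2 homozygous" if jak2_homozygous else "JAK2 heterozygous"
    if drivers:
        return "other driver"
    return "no known driver"


# TP53 takes precedence over the initial MPN driver, as described in the text.
example = assign_subgroup({"JAK2", "TP53"}, jak2_homozygous=True)
```

Checking TP53 first reflects the observation that these mutations dominate the genomic and clinical features regardless of the initial MPN driver.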
We applied our proposed classification scheme to an external cohort of 270 MPN patients (137 ET, 14 PV and 119 MF) that had sufficient genomic characterization to apply our flow-chart. Similar subgroup proportions were observed in the two cohorts (Fig.S7).
Factors influencing disease progression in MPNs
A key determinant of the management of MPN patients is predicted prognosis. Patients expected to have a benign future clinical course should have treatments aimed at minimizing thrombotic risk; those expected to progress to leukemia or myelofibrotic bone marrow failure may be candidates for intensive therapy or clinical trials of novel agents. To explore which variables predict disease progression, we developed a multivariate statistical model that estimates a patient’s probability of transition between stages of disease, namely chronic phase (ET or PV), accelerated phase (MF), AML and death.
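The idea of transitions between disease stages can be illustrated with a much simpler model than the one used in this study. The sketch below propagates a patient through a discrete-time Markov chain over the four stages named above; it is not a random-effects Cox multistate model, the set of allowed transitions is an assumption, and every probability in the example is hypothetical rather than estimated from the cohort.

```python
STATES = ("chronic", "MF", "AML", "dead")


def occupancy(transition, start, n_steps):
    """Probability of being in each state after n_steps time steps.

    transition[src][dst] is the per-step probability of moving src -> dst;
    each row must sum to 1 (staying in the same state counts as a move).
    """
    for src, row in transition.items():
        if abs(sum(row.values()) - 1.0) > 1e-9:
            raise ValueError(f"probabilities for {src!r} do not sum to 1")
    dist = {s: 0.0 for s in STATES}
    dist[start] = 1.0
    for _ in range(n_steps):
        nxt = {s: 0.0 for s in STATES}
        for src, mass in dist.items():
            for dst, prob in transition[src].items():
                nxt[dst] += mass * prob
        dist = nxt
    return dist


# Hypothetical yearly transition probabilities, for illustration only.
EXAMPLE_TRANSITIONS = {
    "chronic": {"chronic": 0.96, "MF": 0.02, "AML": 0.005, "dead": 0.015},
    "MF": {"MF": 0.85, "AML": 0.05, "dead": 0.10},
    "AML": {"AML": 0.40, "dead": 0.60},
    "dead": {"dead": 1.0},
}
ten_year = occupancy(EXAMPLE_TRANSITIONS, "chronic", 10)
```

In a patient-specific model, the transition rates would depend on that patient's clinical and genomic covariates instead of being fixed constants as here.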
We determined the fraction of explained variability for each outcome attributable to different prognostic factors (Fig.4A). Death in chronic phase was predominantly influenced by age, with genomic features having little predictive power, suggesting that once cytoreduction has achieved adequate control of blood counts, causes of death are dominated by those that would also occur in the general population36. These would therefore not be well predicted by the specific genomic features of the MPN.
By contrast, genomic features played a substantial role in predicting progression from chronic phase to MF, and AML transformation (Fig.4A). CALR mutations were independently associated with increased risk of myelofibrotic transformation, as previously reported37. Mutations in epigenetic regulators, splicing factors and RAS-signaling were all predictive of myelofibrotic and leukemic transformation; some, but not all, of these associations have been identified previously10–12. Whether mutations were clonal or subclonal had little impact on prognosis (Supplementary Appendix). Clinical features of the disease, such as anemia, splenomegaly or thrombocytosis, still retained independent predictive power for transformation events, suggesting that these variables reflect important features of disease state not captured in the genomic landscape. Outcomes in MF did not significantly differ whether the MF was primary or followed antecedent ET or PV.
Personally tailored prognosis in MPN patients
Current prognostic models for MPNs, focused on MF, use simple scoring systems, grouping patients into broad prognostic categories. As shown above, many factors influence clinical outcomes, with a wide range of effect sizes, meaning that current schemes discard information that is relevant to prognosis. We therefore explored whether our multivariate, multistate prognostic models could be used for individual patient predictions.
The utility of personally tailored predictions can be assessed by two questions: do they usefully discriminate between patients, and are the predictions more informative than conventional schemas? Regarding the first question, our model generates a wide range of specific predictions (long-term survival, death in chronic phase, and myelofibrotic and leukemic transformation), and these correlate well with observed outcomes (Fig.4B, 5, S8; Tables S6-S7), both on internal cross-validation and for an externally characterized cohort of 515 MPN patients (137 ET, 188 PV and 190 MF). Internal cross-validation demonstrated concordances of 75%-84% for overall survival (OS), event-free survival (EFS, Fig.4B) and AML transformation, and good performance on absolute predictive accuracy (Tables S6-S7). Concordances were similar for the external validation cohort, despite the external cohort being diagnosed at another center, evaluated by different pathologists using different diagnostic criteria, and sequenced in a different facility using a different gene panel from the training cohort (Fig.4B). Thus, the model provides considerable discriminatory power that accurately generalizes to other real-world cohorts. Given that diagnostic criteria differ, it is notable that the model is not heavily reliant on the exact classification label of the patient. Indeed, removing the distinction between PV and ET, but simply retaining MF versus chronic phase disease, did not reduce the predictive accuracy of the model (Fig.S9).
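Concordance, as used above, measures how often the model ranks a pair of patients in the same order as their observed outcomes. The following is a minimal sketch of a Harrell-type concordance index for right-censored data; the exact variant used in this study is described in the supplementary appendix and may differ, and the function name and example values are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-type concordance for right-censored survival data.

    times: follow-up time for each patient
    events: 1 if the event was observed at that time, 0 if censored
    risk_scores: model output, where a higher score means higher predicted risk
    A pair (i, j) is comparable when patient i had an observed event strictly
    before patient j's follow-up time; tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up data, for illustration only.
example_times = [2, 4, 6, 8]
example_events = [1, 1, 0, 1]
perfect = concordance_index(example_times, example_events, [0.9, 0.7, 0.5, 0.1])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the 75%-84% range reported above indicates substantial discrimination.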
Our model demonstrated superior performance compared to current major prognostic schemas in clinical use: IPSS38, DIPSS39 and High Molecular Risk10 for MF, and the IPSET score for ET40 (Fig.S9, Tables S6,S7). Furthermore, we identified substantial heterogeneity in disease outcomes within individual prognostic categories of current prognostic schemas (shown for DIPSS, Fig.S10); this was especially prominent for ‘intermediate risk’ patients, allowing for more informative predictions in a group with otherwise uncertain outcomes. This means that relatively few patients need to be screened before some emerge as having increased risk of poor outcomes (“numbers needed to test” across different scenarios in Table S8). Inclusion of mutations and chromosomal changes beyond JAK2/CALR/MPL improved the predictive power of our prognostic models by up to 12% as measured by Brier scores and model concordance.
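The Brier score mentioned above summarises absolute predictive accuracy as the mean squared difference between predicted probabilities and observed outcomes. The sketch below computes it at a single time horizon under the simplifying assumption that every patient's status at that horizon is known; a survival analysis would normally weight for censoring, which is omitted here. The function name and example values are hypothetical.

```python
def brier_score(predicted_probs, observed):
    """Mean squared error between predicted event probabilities and outcomes.

    predicted_probs: predicted probability of the event by the horizon
    observed: 1 if the event occurred by the horizon, otherwise 0
    Lower is better; 0 is perfect, and always predicting 0.5 gives 0.25.
    """
    if len(predicted_probs) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - y) ** 2 for p, y in zip(predicted_probs, observed)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical predictions and outcomes, for illustration only.
uninformative = brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 0, 1])
sharper = brier_score([0.75, 0.25, 0.25, 0.75], [1, 0, 0, 1])
```

Unlike concordance, which depends only on ranking, the Brier score also penalises predicted probabilities that are poorly calibrated.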
We have implemented a user-friendly calculator of individualized patient outcome online (https://jg738.shinyapps.io/mpn_app/) enabling exploration of patients in our cohort, and the generation of new patient predictions using available clinical, laboratory and genomic features. Further validation of our model using additional MPN cohorts will be important, given the bias towards ET patients in this study.
Discussion
A major challenge is how we use our emerging understanding of the pathogenic complexity of MPNs to identify groups of patients with shared disease biology, such that existing and novel therapies can be better targeted to the most appropriate individuals. Current classification of MPNs suffers from disease heterogeneity within, and clinical overlap between, subtypes. A genomic classification has the virtue of identifying patients with shared causative disease biology, is stable over time, and does not rely on blood count thresholds for the assignment of particular disease labels.
Of 8 MPN subgroups identified, the TP53-mutated group were genomically unstable and had poor outcomes – this same subgroup, with similar clinical implications, has been identified in AML and other hematological malignancies32,34. Likewise, the subgroup of MPNs with mutations in genes regulating chromatin and RNA splicing is mirrored in both MDS35 and AML32. In MPNs, these patients typically have myelofibrosis, although some have ET or PV, and have a relatively poor prognosis. Similar poor outcomes for chromatin/spliceosome subgroups are seen in MDS and AML. This raises the intriguing possibility that these driver mutations define a myeloid cancer of older patients that transcends traditional diagnostic categories.
Our model accurately identifies a minority of chronic phase MPN patients for whom there is substantial risk of disease progression. These patients should be the cohort targeted in clinical trials of novel therapeutic agents since they are the most likely to benefit and the trials will be more efficient if higher-risk patients are preferentially enrolled. Our model can also accurately identify the majority of chronic phase MPN patients who seemingly have a benign outlook at diagnosis. For these patients, experimental therapy would be unnecessary, and a conservative management strategy based on cytoreduction and reduction of vascular risk will suffice to give long-term, event-free survival. MPNs do continue to evolve, however, and it would be an interesting extension of this study to evaluate the opportunities offered by serial genomic profiling to update treatment choices if high-risk genomic changes emerge or if therapy drives further evolution.
Comprehensive gene sequencing of patients with blood cancers is becoming increasingly accessible and routine. Integration of clinical data with diagnostic genome profiling will provide prognostic predictions personally tailored to individual patients. In MPNs, this will empower the clinician and support complex decisions around the choice and intensity of therapy, recruitment into clinical trials and long-term clinical outlook.
Acknowledgements
Supported by the Leukemia and Lymphoma Society of America, Cancer Research UK (including a fellowship to J.N), Bloodwise (including a fellowship to J.G), the Wellcome Trust (including a fellowship to P.J.C), the Kay Kendall Leukaemia Fund (including a fellowship to J.G), the European Haematology Association (research grant to J.N), the Li Ka Shing foundation (D.C.W), and the Medical Research Council, UK. A.M.V. and P.G. were supported by a grant from Associazione Italiana per la Ricerca sul Cancro (AIRC; Milan, Italy), to AIRC-Gruppo Italiano Malattie Mieloproliferative- AGIMM (project #1005). P.G. was supported also by a Progetto Ministero della Salute GR-2011-02352109. Samples were provided by the Cambridge Blood and Stem Cell Biobank, which is supported by the NIHR Cambridge Biomedical Research Centre, Wellcome - MRC Stem Cell Institute and the Cancer Research UK - Cambridge Cancer Centre, UK. We thank members of the Cambridge Blood and Stem Cell Bank (Cambridge) and the Cancer Genome Project laboratory (Hinxton) for technical assistance. We thank clinicians and centres who participated in the PT1 studies and Vorinostat trials (details listed in the supplementary appendix). We thank all patients who participated in this study.