Model-based Adaptive Grouping of Subclones

MAGOS decomposes the subclonal structure of tumors from bulk-sequencing data. It reduces the sequencing depth required for accurate and reproducible subclone identification from 300x to 30x.

  • Incorporate a robust error model to account for depth-variance and frequency-variance dependencies in bulk-sequencing data.
  • Suitable for samples sequenced at depth as low as 30x.
  • Cluster both SNVs & CNVs.
  • Use single or multiple samples from the same tumor.
  • Report confidence scores for each cluster and each SNV assignment.
  • R implementation of the MAGOS algorithm is available at
The manuscript is currently under review at Bioinformatics.
Preprint is available at BioRxiv (

ABSTRACT: Understanding intratumor heterogeneity is critical to designing personalized treatments and improving clinical outcomes of cancers. Such investigations require accurate delineation of the subclonal composition of a tumor, which to date can only be reliably inferred from deep-sequencing data (>300x depth). To enable accurate subclonal discovery in tumors sequenced at standard depths (30-50x), we develop a novel computational method that incorporates an adaptive error model into statistical decomposition of mixed populations to unravel biological variances from technical variances. Tested on extensive computer simulations and real-world data, this new method, named model-based adaptive grouping of subclones (MAGOS), consistently outperforms existing methods on minimum sequencing depth, decomposition accuracy and computation efficiency. MAGOS supports subclone analysis using single nucleotide variants and copy number variants from one or more samples of an individual tumor. Applications of MAGOS to whole-exome sequencing data of 477 melanoma samples discovered a significant association between subclonal diversity and patient overall survival.
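
To illustrate why the variance of an observed variant allele frequency (VAF) depends on both sequencing depth and the frequency itself, the sketch below uses a plain binomial sampling approximation. This is an assumption made for illustration only and is not the MAGOS error model; the function name `vaf_standard_error` is hypothetical and is not part of the MAGOS R package.

```python
import math


def vaf_standard_error(vaf: float, depth: int) -> float:
    """Standard error of an observed VAF under simple binomial sampling.

    Illustrative assumption: variant read counts follow Binomial(depth, vaf),
    so Var(observed VAF) = vaf * (1 - vaf) / depth.
    This is NOT the error model implemented in MAGOS.
    """
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must be between 0 and 1")
    if depth <= 0:
        raise ValueError("depth must be a positive integer")
    return math.sqrt(vaf * (1.0 - vaf) / depth)


if __name__ == "__main__":
    # Compare a standard depth (30x) with a deep-sequencing depth (300x)
    # at a low and a mid-range true frequency.
    for depth in (30, 300):
        for vaf in (0.1, 0.5):
            se = vaf_standard_error(vaf, depth)
            print(f"depth={depth:>3}x  vaf={vaf:.1f}  standard error={se:.4f}")
```

Under this approximation the standard error shrinks with the square root of the depth and is largest near a VAF of 0.5, which is why clusters that separate cleanly at 300x can overlap at 30x unless the error model accounts for both dependencies.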