The complexity in cellular populations that exist within a tumor specimen is routinely summarized by the single qualitative measure of tumor purity. All-FIT (Allele-Frequency-based Imputation of Tumor Purity) is a method to estimate specimen purity based on the allele frequencies of variants detected in high-depth, targeted, clinical sequencing data when matched-normal control data is not available.
All-FIT computes Akaike Information Criterion weights and cancer cell fractions for a range of somatic and germline mutational models. Through an iterative process, it estimates purity by minimizing a weighted least-squares function with respect to purity, and it provides statistical confidence intervals for its estimates.
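As a rough illustration of the weighting step, the sketch below converts a set of AIC scores into normalized Akaike weights using the standard formula w_i = exp(-Δ_i / 2) / Σ_j exp(-Δ_j / 2), where Δ_i is the difference between model i's AIC and the lowest AIC. The function name and the example scores are hypothetical and are not taken from the All-FIT code.

```python
import math


def akaike_weights(aic_values):
    """Convert AIC scores into normalized Akaike weights (hypothetical helper)."""
    best = min(aic_values)
    # Relative likelihood of each model compared with the best-scoring model.
    relative = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


# Made-up AIC scores for three candidate mutational models.
weights = akaike_weights([100.0, 102.0, 110.0])
```

The weights sum to one, so they can serve directly as per-model weights in a least-squares objective; how All-FIT combines them with cancer cell fractions is not described here.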
Backtrack is a robust computational method to discern low-abundance mutations from background error in ultra-deep sequencing data. We have shown that a beta-binomial distribution or aggregate negative binomial (NB) distributions describe PCR amplification error depths.
Backtrack uses a statistical multi-sample approach that goes beyond estimating fixed detection thresholds and allows the discovery of variants with high confidence after false discovery correction.
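The text above does not say which false discovery correction Backtrack applies. As a generic illustration only, the following sketch implements the widely used Benjamini-Hochberg procedure on a list of p-values, which are assumed to come from some background error model. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans marking which hypotheses are rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for idx in order[:max_rank]:
        rejected[idx] = True
    return rejected


# Made-up p-values for five candidate variants.
calls = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.6], alpha=0.05)
```

With these inputs only the first two candidates pass: the third and fourth have p-values below 0.05 but above their rank-dependent thresholds of 0.03 and 0.04.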
Genomics Oncology Platform
Patient-matched control DNA is often lacking in clinical settings and systematic quantitative analyses are required for a detailed genomic characterization of a patient’s tumor, including presence of pathogenic germline variants and evidence for loss of heterozygosity (LOH).
Genomics Oncology Platform is a Python-based GUI for the extraction of relevant information and the application of All-FIT and LOHGIC directly on tumor-only sequencing results. This user-friendly, interactive application can infer germline-versus-somatic status of variants in individual tumors by analyzing commonly available sequencing data from commercial or academic assays.
LOH-Germline Inference Calculator (LOHGIC) determines the mutational status of variants identified in deep-sequencing assays. It also predicts loss of heterozygosity (LOH) and provides additional information on the number of mutated alleles in tumor cells based on the specimen’s purity and a variant’s allele frequency, sequencing depth, and ploidy.
Statistical uncertainties inherent to these parameters are also considered.
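To make the relationship between purity, ploidy, and allele frequency concrete, here is a minimal sketch of the expected variant allele frequency in a mixture of tumor and normal cells. It assumes that normal cells are diploid, that a germline variant is heterozygous in normal cells, and that a somatic variant is absent from them. The function and parameter names are hypothetical and are not LOHGIC's interface.

```python
def expected_vaf(purity, ploidy, mutated_copies, germline):
    """Expected allele frequency of a variant in a tumor/normal mixture.

    purity: fraction of tumor cells in the specimen (0 to 1).
    ploidy: total copies of the locus in each tumor cell.
    mutated_copies: copies carrying the variant in each tumor cell.
    germline: True if normal cells carry one mutated copy (heterozygous).
    """
    normal_mutated = 1 if germline else 0
    # Mutated copies contributed by tumor cells and by normal cells.
    mutated = purity * mutated_copies + (1 - purity) * normal_mutated
    # Total copies of the locus, assuming diploid normal cells.
    total = purity * ploidy + (1 - purity) * 2
    return mutated / total


somatic_het = expected_vaf(0.5, 2, 1, germline=False)   # 0.25
germline_het = expected_vaf(0.5, 2, 1, germline=True)   # 0.5
germline_loh = expected_vaf(0.5, 2, 2, germline=True)   # 0.75
```

At 50% purity, the three scenarios give clearly separated expected frequencies, which is why purity and ploidy are needed to interpret an observed frequency. A real classifier would also have to account for sampling noise at the observed sequencing depth.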
Mutation Error Rate Identification Toolkit (MERIT) is a comprehensive pipeline designed for in-depth quantification of erroneous substitutions and small indels in high-throughput sequencing data, specifically for ultra-deep applications.
MERIT considers the genomic context of the errors, including the nucleotides immediately 5’ and 3’ of each error, and establishes error rates for 96 possible substitutions as well as four single-base and 16 double-base indels.
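The counts above can be reproduced by simple enumeration. The sketch below assumes one common convention, in which six strand-collapsed substitution classes (reference base C or T) are combined with the four possible 5’ and four possible 3’ neighbors; whether MERIT partitions its 96 categories in exactly this way is not stated here. All variable names are illustrative.

```python
from itertools import product

BASES = "ACGT"

# Assumed convention: six substitution classes with a pyrimidine reference base.
SUBSTITUTIONS = [(ref, alt) for ref in "CT" for alt in BASES if alt != ref]

# Each class combined with every 5' and 3' neighbor, e.g. "A[C>T]G".
context_substitutions = [
    f"{five}[{ref}>{alt}]{three}"
    for (ref, alt), five, three in product(SUBSTITUTIONS, BASES, BASES)
]

# Inserted or deleted sequences of length one and two.
single_base_indels = list(BASES)
double_base_indels = ["".join(pair) for pair in product(BASES, repeat=2)]

print(len(context_substitutions), len(single_base_indels), len(double_base_indels))
```

Under this convention the enumeration yields 6 x 4 x 4 = 96 substitution contexts, 4 single-base indels, and 4 x 4 = 16 double-base indels.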
Tunable Biclustering Algorithm (TuBA) is a novel graph-based unsupervised biclustering algorithm, customized to identify alterations in tumors based on the hypothesis that gene pairs relevant to a clinical process share a statistically significant number of samples with extreme expression.
For each gene, TuBA collects the samples in a pre-determined upper or lower percentile set, then compares these sets pairwise to identify gene pairs that share a statistically significant number of samples. Each significant gene pair is represented graphically as a pair of nodes connected by an edge, which stands for the samples shared by their percentile sets.
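The graph construction described above can be sketched as follows. This is not TuBA's implementation: the overlap significance is assumed to be a one-sided hypergeometric tail probability, the percentile and cutoff values are arbitrary, and all function names are hypothetical.

```python
from itertools import combinations
from math import comb


def top_percentile_samples(values, percentile):
    """Indices of the samples in the top `percentile` percent of values."""
    n_top = max(1, round(len(values) * percentile / 100))
    ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return set(ranked[:n_top])


def overlap_p_value(n_samples, size_a, size_b, overlap):
    """Hypergeometric probability of sharing at least `overlap` samples by chance."""
    tail = sum(
        comb(size_a, k) * comb(n_samples - size_a, size_b - k)
        for k in range(overlap, min(size_a, size_b) + 1)
    )
    return tail / comb(n_samples, size_b)


def build_graph(expression, percentile=25, p_cutoff=0.05):
    """Return {(gene_a, gene_b): shared_samples} for significant gene pairs."""
    n_samples = len(next(iter(expression.values())))
    sets = {g: top_percentile_samples(v, percentile) for g, v in expression.items()}
    edges = {}
    for a, b in combinations(sorted(sets), 2):
        shared = sets[a] & sets[b]
        p = overlap_p_value(n_samples, len(sets[a]), len(sets[b]), len(shared))
        if shared and p <= p_cutoff:
            edges[(a, b)] = shared  # the edge carries the shared samples
    return edges


# Made-up expression values for three genes across eight samples.
expression = {
    "g1": [9, 8, 1, 2, 3, 1, 2, 0],
    "g2": [7, 9, 0, 1, 2, 3, 1, 2],
    "g3": [0, 1, 2, 9, 8, 1, 0, 2],
}
edges = build_graph(expression)
```

In this toy input, g1 and g2 share both of their top-quartile samples, so they are joined by an edge, while g3 has its extreme values in different samples and stays unconnected. Connected groups of such edges are the natural candidates for biclusters.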