Representation of All-FIT’s results for detected variants. Loh et al. 2019.
The complexity in cellular populations that exist within a tumor specimen is routinely summarized by the single qualitative measure of tumor purity. All-FIT (Allele-Frequency-based Imputation of Tumor Purity) is a method to estimate specimen purity based on the allele frequencies of variants detected in high-depth, targeted, clinical sequencing data when matched-normal control data is not available.
All-FIT computes Akaike Information Criterion (AIC) weights and cancer cell fractions for a range of somatic and germline mutational models. Through an iterative process, it estimates purity by minimizing a weighted least-squares function with respect to purity, and it provides statistical confidence intervals for its estimates.
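As a point of reference, the textbook definition of Akaike weights can be sketched as follows. This is a minimal illustration, not All-FIT's implementation; the function name `akaike_weights` and the input list are assumptions made for the example, and how All-FIT derives the AIC value of each mutational model is not shown.

```python
import math

def akaike_weights(aic_values):
    """Convert AIC scores (one per candidate model) into relative weights.

    Standard definition: w_i = exp(-delta_i / 2) / sum_j exp(-delta_j / 2),
    where delta_i is the AIC difference from the best (lowest-AIC) model.
    """
    best = min(aic_values)
    relative = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(relative)
    return [r / total for r in relative]

# Hypothetical AIC scores for three candidate models.
weights = akaike_weights([100.0, 102.0, 110.0])
```

The weights sum to one, so each can be read as the relative support for its model among the candidates considered.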


Error depth distribution in ultra-deep sequencing of a TP53 locus at 100,000x for transitions (left) and transversions (right). There is a strong concordance between estimates from the beta-binomial model, its NB approximation, and ultra-deep sequencing data. Rabadan et al. 2018.
Backtrack is a robust computational method to discern low-abundance mutations from background error in ultra-deep sequencing data. We have shown that a beta-binomial distribution or aggregate negative binomial (NB) distributions describe PCR amplification error depths.
Backtrack uses a statistical multi-sample approach that goes beyond estimating fixed detection thresholds, allowing the discovery of variants with high confidence after false discovery correction.
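To illustrate how a beta-binomial error model can separate a candidate variant from background error, the sketch below computes the probability that error alone produces at least the observed number of variant reads. It is not Backtrack's code: the function names and the shape parameters `alpha` and `beta` are hypothetical, and the multi-sample estimation and false discovery correction are left out.

```python
import math

def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_log_pmf(k, n, alpha, beta):
    """log P(X = k) for X ~ BetaBinomial(n, alpha, beta)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)

def error_tail_probability(alt_reads, depth, alpha, beta):
    """P(X >= alt_reads): chance that error alone yields this many variant reads."""
    tail = sum(
        math.exp(beta_binomial_log_pmf(k, depth, alpha, beta))
        for k in range(alt_reads, depth + 1)
    )
    return min(1.0, tail)

# Hypothetical example: 12 variant reads at depth 5,000 under an assumed error model.
p_value = error_tail_probability(12, 5000, alpha=2.0, beta=4000.0)
```

A small tail probability suggests the observed variant depth is unlikely to be error; in a multi-sample setting, such probabilities would still need correction for multiple testing.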

Genomics Oncology Platform

The platform's workflow for analyzing tumor-only sequencing data. Jalloul et al. 2021.
Patient-matched control DNA is often lacking in clinical settings, and systematic quantitative analyses are required for a detailed genomic characterization of a patient’s tumor, including the presence of pathogenic germline variants and evidence for loss of heterozygosity (LOH).
Genomics Oncology Platform is a Python-based GUI for the extraction of relevant information and the application of All-FIT and LOHGIC directly on tumor-only sequencing results. This user-friendly, interactive application can infer germline-versus-somatic status of variants in individual tumors by analyzing commonly available sequencing data from commercial or academic assays.


LOHGIC infers mutational status using AIC weights (W). Khiabanian et al. 2018.
LOH-Germline Inference Calculator (LOHGIC) determines the mutational status of variants identified in deep-sequencing assays. It also predicts loss of heterozygosity (LOH) and provides additional information on the number of mutated alleles in tumor cells based on the specimen’s purity and a variant’s allele frequency, sequencing depth, and ploidy.
Statistical uncertainties inherent to these parameters are also considered.
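The relationship between purity, ploidy, and allele frequency can be made concrete with a small sketch. It assumes diploid normal cells, a heterozygous germline variant in those cells, and a simple binomial read model; the model list and every function name are illustrative assumptions, not LOHGIC's actual interface, and parameter uncertainties are ignored here.

```python
import math

def expected_vaf(purity, ploidy, mutated_copies, germline):
    """Expected variant allele frequency in a tumor/normal admixture.

    Assumes normal cells are diploid and carry one mutated copy if the
    variant is germline (heterozygous) and none if it is somatic.
    """
    normal_mutated = 1 if germline else 0
    mutated = purity * mutated_copies + (1 - purity) * normal_mutated
    total = purity * ploidy + (1 - purity) * 2
    return mutated / total

def binomial_log_likelihood(alt_reads, depth, vaf):
    """log P(alt_reads | depth, vaf) under a binomial read model."""
    vaf = min(max(vaf, 1e-12), 1 - 1e-12)  # keep log() finite at 0 and 1
    log_choose = (math.lgamma(depth + 1) - math.lgamma(alt_reads + 1)
                  - math.lgamma(depth - alt_reads + 1))
    return log_choose + alt_reads * math.log(vaf) + (depth - alt_reads) * math.log(1 - vaf)

# Illustrative models: name -> (tumor ploidy, mutated copies in tumor, germline?)
MODELS = {
    "somatic, 1 of 2 copies": (2, 1, False),
    "somatic, 2 of 2 copies (LOH)": (2, 2, False),
    "germline, 1 of 2 copies": (2, 1, True),
    "germline, 2 of 2 copies (LOH)": (2, 2, True),
}

def rank_models(alt_reads, depth, purity):
    """Model names ordered from best to worst log-likelihood."""
    scored = []
    for name, (ploidy, copies, germline) in MODELS.items():
        vaf = expected_vaf(purity, ploidy, copies, germline)
        scored.append((binomial_log_likelihood(alt_reads, depth, vaf), name))
    return [name for _, name in sorted(scored, reverse=True)]

# 400 variant reads out of 500 at 60% purity.
best = rank_models(400, 500, purity=0.6)[0]
```

At 60% purity the four models predict allele frequencies of 0.3, 0.6, 0.5, and 0.8, so an observed frequency of 0.8 is best explained by a germline variant with copy-neutral LOH. Because these models have the same number of free parameters, ranking by log-likelihood and ranking by AIC agree.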


MERIT precisely quantifies ultra-deep sequencing error. Hadigol and Khiabanian 2018.
Mutation Error Rare Identification Toolkit (MERIT) is a comprehensive pipeline designed for in-depth quantification of erroneous substitutions and small indels in high-throughput sequencing data, specifically for ultra-deep applications.
MERIT considers the genomic context of the errors, including the nucleotides immediately 5′ and 3′ of each error, and establishes error rates for 96 possible substitutions as well as four single-base and 16 double-base indels.
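One common way to arrive at 96 context-aware substitution classes is to collapse complementary strands onto a pyrimidine reference base (6 substitution types) and combine each with the 16 possible pairs of flanking nucleotides. The sketch below enumerates the categories under that assumption; whether MERIT uses exactly this strand convention is not stated here, and the label format is purely illustrative.

```python
from itertools import product

BASES = "ACGT"

# Assumption: substitutions are strand-collapsed to a pyrimidine (C or T) reference.
SUBSTITUTIONS = [(ref, alt) for ref in "CT" for alt in BASES if alt != ref]

def substitution_contexts():
    """All (5' base, substitution, 3' base) categories as labels like 'A[C>T]G'."""
    return [
        f"{five}[{ref}>{alt}]{three}"
        for ref, alt in SUBSTITUTIONS
        for five, three in product(BASES, repeat=2)
    ]

# Indel categories keyed by the inserted or deleted sequence.
single_base_indels = list(BASES)                                     # A, C, G, T
double_base_indels = ["".join(p) for p in product(BASES, repeat=2)]  # AA, AC, ..., TT

contexts = substitution_contexts()
```

The enumeration yields 6 × 16 = 96 substitution categories, 4 single-base indels, and 16 double-base indels, matching the counts given above.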


Subtype enrichment of TuBA's biclusters in the METABRIC cohort of 1,970 breast tumors. Biclusters of proximally located genes with copy number gains, color-coded according to their chromosomes (left), and the rest arranged according to their serial numbers (right). Singh et al. 2019.
Tunable Biclustering Algorithm (TuBA) is a novel graph-based unsupervised biclustering algorithm, customized to identify alterations in tumors based on the hypothesis that gene pairs relevant to a clinical process share a statistically significant number of samples with extreme expression.
For each gene, TuBA identifies the samples that fall in a pre-determined upper or lower percentile set, and then compares these sets pairwise to find gene pairs that share a statistically significant number of samples. Each significant gene pair is represented in a graph as a pair of nodes connected by an edge that represents the samples shared between their percentile sets.
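The pairwise comparison of percentile sets can be sketched as below. This is a simplified illustration under stated assumptions, not TuBA's implementation: significance is assessed with a one-sided hypergeometric test, only the upper percentile set is used, no multiple-testing correction is applied, and all function and parameter names are hypothetical.

```python
import math
from itertools import combinations

def top_percentile_set(values, fraction):
    """Indices of the samples with the highest values for one gene."""
    size = max(1, int(len(values) * fraction))
    ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return set(ranked[:size])

def overlap_p_value(n_samples, size_a, size_b, shared):
    """Hypergeometric P(overlap >= shared) for two randomly drawn sample sets."""
    total = math.comb(n_samples, size_b)
    tail = sum(
        math.comb(size_a, k) * math.comb(n_samples - size_a, size_b - k)
        for k in range(shared, min(size_a, size_b) + 1)
    )
    return tail / total

def significant_edges(expression, fraction, alpha):
    """Gene pairs whose percentile sets overlap more than expected by chance.

    `expression` maps gene name -> list of expression values, one per sample.
    Each returned edge carries the sorted indices of the shared samples.
    """
    sets = {gene: top_percentile_set(v, fraction) for gene, v in expression.items()}
    n_samples = len(next(iter(expression.values())))
    edges = []
    for a, b in combinations(sorted(sets), 2):
        shared = sets[a] & sets[b]
        p = overlap_p_value(n_samples, len(sets[a]), len(sets[b]), len(shared))
        if p <= alpha:
            edges.append((a, b, sorted(shared)))
    return edges

# Toy data: genes A and B peak in the same samples, gene C in the opposite ones.
toy = {"A": list(range(20)), "B": list(range(20)), "C": list(range(19, -1, -1))}
edges = significant_edges(toy, fraction=0.25, alpha=0.01)
```

In the toy data only the A-B pair is connected, and its edge carries the five shared samples; connected groups of such edges are the starting point for biclusters.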

© Khiabanian Lab 2015

Rutgers University
Rutgers Robert Wood Johnson Medical School
Department of Pathology and Laboratory Medicine
