An allele is a variant of a gene at a particular genomic location. Allele frequencies are an important aspect of genetic variation. In population genetics, the allele frequency describes how common an allele is in a given population. One example is the frequency of the ABO alleles that determine the different blood types. The allele frequency is calculated as the number of times the allele of interest is observed in a population divided by the population size. In our example in figure 1, the population consists of 50 individuals, of which 6 carry the allele of interest. This results in an allele frequency of 6/50 = 12%. If the allele frequency of a variant is above a certain threshold, often 1%, the variant is assumed to be a common polymorphism, such as a single nucleotide polymorphism (SNP), and not a pathognomonic variant. This means that it is not assumed to be characteristic of a particular disease. Information about allele frequencies in certain populations can be found in dedicated databases, such as the Single Nucleotide Polymorphism Database (dbSNP).
The allele frequency in molecular pathology, however, needs to be clearly distinguished from the allele frequency in population genetics. In molecular pathology, the allele frequency refers to next-generation sequencing (NGS) analyses and describes the proportion of detected molecules that carry a mutation. This allele frequency is also called variant allele frequency (VAF), variant allele fraction (VAF), or mutant allele frequency (MAF). The VAF is calculated as the number of mutated molecules divided by the total number of molecules (mutated plus wild-type) at a specific genomic location. In our example in figure 1, 6 of the 50 molecules carry the mutation of interest, resulting in a VAF of 6/50 = 12%.
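The calculation can be written down as a minimal Python sketch. It divides the mutated molecules by all molecules covering the position, mutated and wild-type together, which is the denominator that reproduces the 12% of the figure 1 example. The function name and its arguments are illustrative assumptions and do not belong to any particular NGS tool.

```python
def variant_allele_frequency(variant_reads: int, total_reads: int) -> float:
    """Return the VAF as a fraction of all molecules covering the position.

    total_reads counts mutated AND wild-type molecules together.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= variant_reads <= total_reads:
        raise ValueError("variant_reads must lie between 0 and total_reads")
    return variant_reads / total_reads


# Example from figure 1: 6 of 50 molecules carry the mutation.
vaf = variant_allele_frequency(6, 50)
print(f"VAF = {vaf:.0%}")  # prints "VAF = 12%"
```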
VAFs can be used to assess a variant’s origin. For germline variants, VAFs of about 50% are expected for heterozygous variants and about 100% for homozygous variants. VAFs ranging from below 1% up to 50% can point towards mosaic mutations. In mosaicism, not all cells of an individual possess the same genetic makeup, because the changes occur after fertilization as postzygotic mutations.
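These rules of thumb can be sketched as a small Python helper. The function name, the returned labels, and the tolerance of 10 percentage points around the expected values are assumptions chosen for illustration only. In practice, measured VAFs scatter around 50% and 100% depending on coverage and sample context, so the cut-offs would have to be validated for each assay.

```python
def suggest_germline_origin(vaf: float, tolerance: float = 0.10) -> str:
    """Map a VAF (as a fraction, 0..1) to a rough interpretation.

    The tolerance is a hypothetical value, not a validated threshold.
    """
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must lie between 0 and 1")
    if vaf == 0.0:
        return "no variant observed"
    if vaf >= 1.0 - tolerance:
        return "consistent with homozygous germline variant"
    if abs(vaf - 0.5) <= tolerance:
        return "consistent with heterozygous germline variant"
    if vaf < 0.5 - tolerance:
        return "possible mosaic variant"
    return "ambiguous"


for example_vaf in (0.99, 0.48, 0.12, 0.75):
    print(f"{example_vaf:.0%}: {suggest_germline_origin(example_vaf)}")
```

The "ambiguous" label is kept on purpose: values between the heterozygous and homozygous ranges do not fit any of the expectations named above and should not be forced into one of them.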
In tumor analyses where no matched normal sample is available, the VAF can be used to infer whether a variant is somatic or inherited (germline). This is particularly interesting for liquid biopsy analyses. During physiological events, such as apoptosis, necrosis, or secretion, cell-free DNA (cfDNA) from normal tissue and circulating tumor DNA (ctDNA) are released into the bloodstream. This DNA can be captured, sequenced, and analyzed in liquid biopsy approaches, and the VAF can help assess the origin of a variant. As cfDNA is released during these physiological events, its analysis gives a more global view of the heterogeneity of the tumor and its potential metastases than the analysis of a tumor tissue biopsy. Additionally, cfDNA has a relatively short half-life of approximately two hours. Therefore, liquid biopsy analyses can also be seen as a “real-time” reflection of the tumor’s genetic status.
However, the amount of cfDNA fragments in peripheral blood is limited, and the fraction of tumor-derived molecules within it is even smaller. This fraction also varies depending on the type of cancer, its stage, and its location. As a result, VAFs can be as low as 0.1%. At such low VAFs, it is difficult to tell whether a variant is tumor-derived or an artifact of the intrinsic error rate of standard NGS procedures, such as library preparation, cluster formation, or sequencing. Each approach therefore has a limit of detection: only VAFs at or above this limit can be determined and distinguished from intrinsic errors. The liquid biopsy protocol and the subsequent analysis thus need to be chosen carefully to reliably determine a tumor’s VAF.
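One simple way to illustrate why low VAFs are hard to separate from noise is a one-sided binomial test: how likely is it that the background error rate alone produces at least the observed number of variant reads? The Python sketch below uses only the standard library. The sequencing depth, the read count, and both error rates are hypothetical values, and the model assumes that errors occur independently with one fixed rate per position, which is a strong simplification of real NGS error behavior.

```python
import math


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    if k == 0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    terms = []
    for i in range(k, n + 1):
        # log of C(n, i) * p^i * (1 - p)^(n - i)
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * log_p + (n - i) * log_q)
        terms.append(math.exp(log_pmf))
    return min(1.0, math.fsum(terms))


# Hypothetical example: 10 variant reads at a depth of 10,000 (VAF = 0.1%).
depth = 10_000
variant_reads = 10
for error_rate in (1e-3, 1e-4):  # assumed per-position error rates
    p_value = binomial_tail(variant_reads, depth, error_rate)
    print(f"error rate {error_rate:g}: P(>= {variant_reads} error reads) = {p_value:.3g}")
```

Under these assumptions, an error rate equal to the VAF produces this many variant reads by chance in a large share of cases, so the variant cannot be called reliably. With a ten times lower error rate the same observation becomes very unlikely to be noise. This is the intuition behind a limit of detection that depends on the chosen protocol.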