Next-generation sequencing (NGS) has enabled the rapid sequencing of hundreds of thousands of human genomes. From the resulting NGS data, genetic variants can be called to better understand germline variations, inherited conditions, and diseases like cancer. The accurate calling of these variants is a critical step on which all downstream analyses rely. Thus, it is crucial to thoroughly evaluate the variant calling software in use. The evaluation is performed under specific, predefined conditions, such as a defined library preparation and sequencing output. During the evaluation, several metrics are calculated to assess the software’s variant calling performance, such as sensitivity and specificity.
The sensitivity is also known as the true positive rate (TPR) or recall. In terms of variant calling, the sensitivity is the probability that a variant is called, given that the individual carries the variant. It is calculated as
\[ \text{sensitivity} = \frac{TP}{TP + FN} \]
where TP is the number of true positives and FN is the number of false negatives. Thus, the sensitivity is the ratio between the number of true positives and the total number of individuals with the variant of interest.
The specificity is also known as the true negative rate (TNR). It should not be confused with precision, which is the fraction of called variants that are real. In terms of variant calling, the specificity is the probability that a variant is not called, given that the individual does not carry the variant. It is calculated as
\[ \text{specificity} = \frac{TN}{TN + FP} \]
where TN is the number of true negatives and FP is the number of false positives. Thus, the specificity is the ratio between the number of true negatives and the total number of individuals without the variant of interest.
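As a small illustration of the two definitions, the following Python sketch computes sensitivity as TP / (TP + FN) and specificity as TN / (TN + FP) from the four confusion matrix counts. The function names and the counts are hypothetical and chosen only for illustration; they are not results from any real benchmark.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: TP / (TP + FN)."""
    if tp + fn == 0:
        raise ValueError("sensitivity is undefined when there are no true variants")
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negative rate: TN / (TN + FP)."""
    if tn + fp == 0:
        raise ValueError("specificity is undefined when there are no true non-variants")
    return tn / (tn + fp)


# Hypothetical counts, for illustration only.
tp, fn, tn, fp = 90, 10, 980, 20

print(sensitivity(tp, fn))  # 0.9
print(specificity(tn, fp))  # 0.98
```

Both functions raise an error instead of returning a value when their denominator is zero, because the corresponding rate is not defined in that case.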
The confusion matrix in Figure 1 shows how the values for true positives, true negatives, false positives, and false negatives relate to each other. In this matrix, the rows represent whether an individual has a certain variant or not, while the columns represent whether the variant calling algorithm called the variant in the individual or not.
To fill the confusion matrix correctly, we need to know whether a called variant is an actual variant in an individual or not. For this purpose, we use the well-characterized Genome in a Bottle (GIAB) sample HG001 as a reference. For this sample, so-called truth sets for high-confidence calls of single nucleotide variants (SNVs) and small insertions and deletions (indels) are available. These truth sets can be used to benchmark the variant calls from the selected variant calling software. By comparing the predicted variant calls from the chosen software with the variant truth set from HG001, we can determine whether a called variant is an actual variant (true positive) or a false positive call, and which variants in the truth set were missed (false negatives). Thus, with the help of the HG001 reference sample, we can fill the confusion matrix and subsequently calculate the values for the sensitivity and specificity of our variant calling algorithms.
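The comparison described above can be sketched in Python as a set comparison between called variants and truth set variants. This is a simplified sketch under stated assumptions, not the benchmarking procedure itself: variants are assumed to be already normalized and restricted to the high-confidence regions, each variant is represented by a hypothetical (chromosome, position, reference allele, alternate allele) tuple, and the parameter `n_candidates`, the total number of candidate sites evaluated, is an assumption introduced here so that true negatives can be counted. The toy variants are made up for illustration.

```python
from typing import NamedTuple, Set, Tuple

# A variant is identified here by (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


class ConfusionCounts(NamedTuple):
    tp: int  # called and present in the truth set
    fp: int  # called but absent from the truth set
    fn: int  # present in the truth set but not called
    tn: int  # neither called nor present in the truth set


def fill_confusion_matrix(
    called: Set[Variant], truth: Set[Variant], n_candidates: int
) -> ConfusionCounts:
    """Count the four confusion matrix cells from two variant sets.

    n_candidates is an assumed total number of candidate sites; it is
    needed because true negatives cannot be derived from the sets alone.
    """
    tp = len(called & truth)
    fp = len(called - truth)
    fn = len(truth - called)
    tn = n_candidates - tp - fp - fn
    if tn < 0:
        raise ValueError("n_candidates is smaller than the number of observed variants")
    return ConfusionCounts(tp, fp, fn, tn)


# Made-up toy data, for illustration only.
truth = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "GA")}
called = {("chr1", 100, "A", "G"), ("chr2", 50, "G", "GA"), ("chr2", 75, "T", "C")}

counts = fill_confusion_matrix(called, truth, n_candidates=10)
print(counts)  # ConfusionCounts(tp=2, fp=1, fn=1, tn=6)
```

Exact tuple matching is the simplest possible comparison. In practice, the same variant can be written in more than one way, especially for indels, so a real comparison needs consistent variant representation before the sets are intersected.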
Figure 1 | Confusion matrix. The rows represent whether an individual has a certain variant or not, while the columns represent whether the variant calling algorithm called the variant in the individual or not.