Covid19 rapid tests and the probability of disease
Test manufacturers provide sensitivity and specificity, but it’s the PPV that matters to most of the tests’ users.
Disclaimer: None of the points made in this notebook should be interpreted as professional health advice.
The rapid, antigen-based tests for Covid19 have been reported to have sensitivity and specificity values above 90%. However, in most cases a different value matters more: the probability of being infected with Covid19 given a positive result on a rapid test, known as the positive predictive value (PPV). It has to be computed from the sensitivity and specificity by applying Bayes’ rule, and the result depends heavily on the prevalence of the disease within the reference population (e.g. the population of a city).
Intuitively, the PPV is the probability that a person is infected, given that their test is positive. Sensitivity is different: it’s the probability that someone known to be infected gets a positive result. Specificity is the probability that someone known not to be infected gets a negative result. Sensitivity and specificity can be estimated through studies by test manufacturers. The PPV has to be computed, and it also accounts for the share of infected people in the reference population (i.e. the prevalence).
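To make the three definitions concrete, here is a minimal sketch that computes them from the four cells of a confusion matrix. The counts are made up for illustration and do not come from any real study, and the function name `rates_from_counts` is hypothetical.

```python
def rates_from_counts(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute sensitivity, specificity and PPV from confusion-matrix counts.

    tp/fn: infected people with a positive/negative test
    tn/fp: non-infected people with a negative/positive test
    """
    return {
        "sensitivity": tp / (tp + fn),  # positives among the infected
        "specificity": tn / (tn + fp),  # negatives among the non-infected
        "ppv": tp / (tp + fp),          # infected among the positives
    }


# Hypothetical study: 200 infected and 1000 non-infected participants
rates = rates_from_counts(tp=190, fn=10, tn=980, fp=20)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

In this made-up study the PPV comes out at about 90%, but that figure only holds for a group in which 200 of 1,200 participants are infected. In a population with a different share of infected people the PPV changes, while the sensitivity and specificity stay the same.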
In this article I show how to compute the PPV in an example context and I explore how sensitive the PPV is to the prevalence.
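The PPV can be expressed in terms of the prevalence of the disease in the reference population and the sensitivity and specificity of the test:

$$\mathrm{PPV} = \frac{\text{sensitivity} \cdot \text{prevalence}}{\text{sensitivity} \cdot \text{prevalence} + (1 - \text{specificity}) \cdot (1 - \text{prevalence})}$$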

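For readers who want to see where this expression comes from, the following is a sketch of the derivation from Bayes’ rule. The notation is my own: D stands for "infected", ¬D for "not infected", and + for a positive test result. The second step expands P(+) with the law of total probability, and the last step substitutes P(+ | D) = sensitivity, P(+ | ¬D) = 1 − specificity and P(D) = prevalence.

```latex
\begin{aligned}
\mathrm{PPV} = P(D \mid +)
  &= \frac{P(+ \mid D)\,P(D)}{P(+)} \\
  &= \frac{P(+ \mid D)\,P(D)}{P(+ \mid D)\,P(D) + P(+ \mid \neg D)\,P(\neg D)} \\
  &= \frac{\text{sensitivity} \cdot \text{prevalence}}
          {\text{sensitivity} \cdot \text{prevalence} + (1 - \text{specificity}) \cdot (1 - \text{prevalence})}
\end{aligned}
```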
For the prevalence, we’ll use an estimate for Bucharest on the 23rd of January, 2022: 23181 active cases in a population of 1.8 million.
For the test, we’ll use the data from a commercially available one: sensitivity 98.51% and specificity 99.91%.
Note: The sensitivity and specificity values have been estimated in studies based on symptomatic patients. Both the sensitivity and specificity are lower for asymptomatic patients.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

sensitivity = 0.9851
specificity = 0.9991
prevalence_example = 0.012878  # 23181 / 1.8e6

def ppv(prevalence: np.ndarray, sens: float = sensitivity, spec: float = specificity) -> np.ndarray:
    # Bayes' rule: true positives divided by all positives (true + false)
    return prevalence * sens / (prevalence * sens + (1 - prevalence) * (1 - spec))

print(f"PPV computed in the example scenario: {ppv(prevalence_example)*100:.1f}%")
The resulting PPV is 93.5%. In the example scenario, this means that the probability of being infected given a positive rapid test is 93.5%, which is noticeably below the sensitivity (98.51%). Let’s see how the PPV changes with the prevalence of the disease in general.
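One way to build intuition for this number is to think in natural frequencies. The sketch below imagines 100,000 people being tested under the example scenario. The cohort size is an arbitrary assumption chosen for readability, and the counts are expected values, not observed data.

```python
sensitivity = 0.9851
specificity = 0.9991
prevalence = 23181 / 1.8e6  # example scenario from the text

cohort = 100_000  # arbitrary, illustrative cohort size

infected = cohort * prevalence
not_infected = cohort - infected

# Expected number of positive results in each group
true_positives = infected * sensitivity
false_positives = not_infected * (1 - specificity)

# PPV: share of positive results that belong to infected people
ppv_from_counts = true_positives / (true_positives + false_positives)

print(f"Infected: {infected:.0f}, not infected: {not_infected:.0f}")
print(f"True positives: {true_positives:.0f}, false positives: {false_positives:.0f}")
print(f"PPV: {ppv_from_counts * 100:.1f}%")
```

Although the false positive rate is only 0.09%, it applies to the much larger group of people who are not infected, so roughly 89 false positives are expected alongside about 1,269 true positives. That is the source of the gap between the PPV and the sensitivity.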
prevalences = np.arange(0, 0.05, 0.0001)
ppvs = ppv(prevalences)
data = pd.DataFrame(data={'Prevalence': prevalences, 'PPV': ppvs})
sns.relplot(data=data, x='Prevalence', y='PPV', kind='line')
plt.axvline(0.005, color='r')

The plot shows how the PPV decreases abruptly once the prevalence goes below 0.5% (which is indicated by the red line). It is therefore important to take into account the prevalence of the disease according to the context in which the PPV needs to be estimated.
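The same relationship can be read in the other direction: given a PPV you want to reach, what prevalence does it require? The sketch below inverts the PPV formula algebraically. The function name `prevalence_for_ppv` is hypothetical, the default values are those of the example test, and the target PPVs are arbitrary illustrative choices.

```python
def prevalence_for_ppv(target_ppv: float, sens: float = 0.9851, spec: float = 0.9991) -> float:
    """Return the prevalence at which the PPV equals target_ppv.

    Obtained by solving PPV = p*sens / (p*sens + (1-p)*(1-spec)) for p.
    """
    false_positive_rate = 1 - spec
    return (target_ppv * false_positive_rate) / (
        sens * (1 - target_ppv) + target_ppv * false_positive_rate
    )


for target in (0.5, 0.8, 0.9, 0.95):
    p = prevalence_for_ppv(target)
    print(f"PPV of {target:.0%} is reached at a prevalence of about {p:.3%}")
```

Under these assumptions, a positive result is only as informative as a coin flip when the prevalence is a little below 0.1%.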
Test manufacturers can only report the estimated sensitivity and specificity from their studies. Nevertheless, the PPV is of utmost interest to most users of the tests, and computing it requires an estimate of the prevalence, which depends strongly on the user’s context.