Evaluation of Diagnostic Tests

Topics:

Book and resources:

Diagnostic tests

A diagnostic test screens people for a disease, such as the PCR or antigen tests for COVID-19.

  • Each person taking the test either have, or not have the disease.
  • The test result can be positive: classifying the person as having the disease,
  • or negative: classifying the person as not having the disease.

However, the test result may, or may not match the person’s actual status.

Confusion matrix

Denote the subjects with the condition as \(P\), and subjects without the condition as \(N\).

Total population: \(P+N\)

Prevalence (of the condition) is defined as: \(\frac{P}{P+N}\)

Predicted Positive Predicted Negative Total
With condition True Positive TP False Negative FN \(P = TP + FN\)
Without condition False Positive FP True Negative TN \(N=FP + TN\)
Total \(P_{\text{predicted}}=TP+FP\) \(N_{\text{predicted}}=FN + TN\)

Sensitivity of a diagnostic test is the probability of revealing that a person has the condition. It is also known as true positive rate (TPR), as it is the proportion of true positives.

\[\text{Sensitivity}= \frac{TP}{P} = \frac{TP}{TP+FN}\]

Specificity is the probability of revealing that a person does not have the condition (i.e. healthy). It is also known as true negative rate (TNR), as it is the proportion of true negatives.

\[\text{Specificity} = \frac{TN}{N} = \frac{TN}{TN + FP}\]

Positive predictive value (PPV): probability that the person with the condition was given a positive test result

\[PPV = \frac{TP}{P_{predicted}} = \frac{TP}{TP + FP}\]

Negative predictive value (NPV): probability that the person without the condition (i.e. healthy) was give a negative test result.

\[NPV = \frac{TN}{N_{predicted}} = \frac{TN}{TN + FN}\]

(From the Norwegian Medical Journal, 1990) 372 women with a lump in the breast initially classified as malign or benign were referred to a surgical clinic for a final diagnosis.

Mammography malign Mammography benign
Final diagnosis malign 22 3
Final diagnosis benign 16 331

We identify the positive test result (mammography malign), and the positive condition (final diagnosis malign). Then we can compute the four following probabilities from the table:

Sensitivity: \(22/(22+3) = 0.88\)

Specificity: \(331/(331+16) = 0.95\)

Positive predictive value: \(22/(22+16) = 0.58\)

Negative predictive value: \(331/(331+3) = 0.99\)

Diagnostic tests and prevalence

The concepts in diagnostic testing can be formulated in the form of conditional probabilities:

  • Sensitivity: \(P(\text{pos}|\text{ill})\)
  • Specificity: \(P(\text{neg}|\text{healthy})\)
  • Positive predictive value: \(P(\text{ill}|\text{pos})\)
  • Negative predictive value: \(P(\text{healthy}|\text{neg})\)

Bayes’ theorem can be applied to compute PPV and NPV from sensitivity, specificity and prevalence:

\[PPV = \frac{\text{sens} \cdot \text{prev}}{\text{sens} \cdot \text{prev} + (1-\text{spec}) \cdot (1-\text{prev})}\]

\[NPV = \frac{\text{spec} \cdot (1-\text{prev}) }{(1-\text{sens}) \cdot \text{prev} + \text{spec} \cdot (1-\text{prev})}\]

\[ \begin{aligned}PPV = P(\text{ill}|\text{pos}) & = \frac{P(\text{pos}|\text{ill}) \cdot P(\text{ill})}{P(\text{pos})}\\ & = \frac{P(\text{pos}|\text{ill}) \cdot P(\text{ill})}{P(\text{pos}|\text{ill}) \cdot P(\text{ill}) + P(\text{pos}|\text{healthy}) \cdot P(\text{healthy})}\\ & =\frac{\text{sens} \cdot \text{prev}}{\text{sens} \cdot \text{prev} + (1-\text{spec}) \cdot (1-\text{prev})}\end{aligned} \]

Exercise: Try to derive the formula for \(NPV\) in a similar way.

In a test for the HIV virus, the result can be:

  • Positive: the test shows antibodies.
  • Negative: the test does not show antibodies.

But the test result may be wrong:

  • A false positive might come from antibodies from related virus, but not HIV.

  • A false negative might be due to the fact that antibodies are not yet produced in sufficient quantity, hence are not detected by the test.

We assume that the sensitivity of the HIV test is 98%. We know that the specificity of the HIV test is 99.8%. We assume that the prevalence of HIV in a given population is 0.1%.

What is the probability of a person having HIV, if he got a positive test result?

\(PPV = \frac{\text{sens} \cdot \text{prev}}{\text{sens} \cdot \text{prev} + (1-\text{spec}) \cdot (1-\text{prev})} = \frac{0.98 \times 0.001}{0.98 \times 0.001 + 0.002 \times 0.999} = 0.329\)

What is the probability of a person not having HIV, if he got a negative test result?

\(NPV = \frac{\text{spec} \cdot (1-\text{prev}) }{(1-\text{sens}) \cdot \text{prev} + \text{spec} \cdot (1-\text{prev})} = \frac{0.998 \times 0.999}{0.98 \times 0.001 + 0.998 \times 0.999} = 0.999\)

Let us see from 100 000 persons, what are the theoretical results of the test?

  • Number of HIV infected (positive condition): \(100000 \times 0.001 = 100\)

  • True positives: \(100 \times 0.98 = 98\)

  • False negatives: 2

  • Number of not HIV infected (negative condition): \(100000-100 = 99900\)

  • True negatives: \(99900 \times 0.998 = 99700\)

  • False positives: \(200\)

Note that from 298 positive tests, only 98 persons have HIV. How would these numbers change if the prevalence would be lower? and if it would be higher?