Probability and diagnostic testing

This page collects some examples from the lecture. You are not required to do your exercises with R, but R can be used as a calculator.

In addition, when you know the prevalence of the disease, the positive and negative predictive values (PPV and NPV) are

\[PPV = \frac{sens \times prev}{sens \times prev + (1-spec) \times (1 - prev)}\]

\[NPV = \frac{spec \times (1-prev)}{(1-sens) \times prev + spec \times (1-prev)}\]
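
As a minimal sketch, the two formulas above can be wrapped in small R functions. The function names `ppv` and `npv`, their argument names, and the input values in the last two lines are illustrative assumptions, not something defined in the lecture.

```r
# Illustrative helpers (names are assumptions, not from the lecture).
# sens, spec and prev are proportions between 0 and 1.
ppv <- function(sens, spec, prev) {
  (sens * prev) / (sens * prev + (1 - spec) * (1 - prev))
}

npv <- function(sens, spec, prev) {
  (spec * (1 - prev)) / ((1 - sens) * prev + spec * (1 - prev))
}

# Made-up inputs: 90% sensitivity, 95% specificity, 5% prevalence
ppv(sens = 0.90, spec = 0.95, prev = 0.05)
npv(sens = 0.90, spec = 0.95, prev = 0.05)
```

Because each function mirrors its formula term by term, you can check a hand calculation by plugging in the same three numbers.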

Coin toss example

You can replicate the coin toss example in the probability section. Modify \(n\) (number of tosses) and \(p\) (probability of having 1 as outcome) to see what happens.
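
One possible way to simulate the tosses is sketched below. It assumes each toss is an independent trial that gives 1 with probability \(p\) and 0 otherwise; the values of `n` and `p`, the seed, and the name `tosses` are placeholders chosen for illustration, and the lecture's own code may differ.

```r
# Sketch of a coin toss simulation; set.seed() only makes the run repeatable.
set.seed(1)

n <- 100   # number of tosses
p <- 0.5   # probability of having 1 as outcome

# Each toss is a single Bernoulli trial: 1 with probability p, otherwise 0
tosses <- rbinom(n, size = 1, prob = p)

# Counts of 0s and 1s, and the observed proportion of 1s
table(tosses)
mean(tosses)
```

With a small `n` the observed proportion can be far from `p`; increasing `n` should bring it closer.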


Mammography example

In the mammography example, we are given the counts of patients that fall into one of the four categories:

  • 22 tested positive for cancer and really have cancer
  • 331 tested negative for cancer and really do not have cancer
  • 16 tested positive for cancer, but do not have cancer
  • 3 tested negative for cancer, but really have cancer

According to the terminology introduced in class, we can create four variables to store the data.

tp <- 22   # true positives
tn <- 331  # true negatives
fp <- 16   # false positives
fn <- 3    # false negatives

We can find how many patients really have cancer and how many do not. Here positive and negative refer to the actual disease status, not the test result.

# positive means positive condition - real disease status
positives <- tp + fn
positives
[1] 25
negatives <- tn + fp
negatives
[1] 347

Now find the sensitivity, specificity, and positive predictive value.

# sensitivity: tp / positives
tp/positives
[1] 0.88
# specificity: tn / negatives
tn/negatives
[1] 0.9538905
# positive predictive value ppv
# tp / positive test
tp / (tp+fp)
[1] 0.5789474
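
The negative predictive value can be read off the counts in the same way, and the count-based PPV can be checked against the prevalence formula. The sketch below assumes the sample prevalence (patients with cancer divided by all patients) is used as the prevalence; the names `npv_counts`, `sens`, `spec`, `prev` and `ppv_formula` are introduced here for illustration only.

```r
tp <- 22; tn <- 331; fp <- 16; fn <- 3

# negative predictive value: tn / all negative tests
npv_counts <- tn / (tn + fn)
npv_counts

# Cross-check the PPV with the prevalence formula,
# using the prevalence observed in this sample
sens <- tp / (tp + fn)
spec <- tn / (tn + fp)
prev <- (tp + fn) / (tp + tn + fp + fn)
ppv_formula <- (sens * prev) / (sens * prev + (1 - spec) * (1 - prev))

# Compare with the count-based value, allowing for rounding error
all.equal(ppv_formula, tp / (tp + fp))
```

The two PPV values agree because the formula reduces to tp / (tp + fp) when the prevalence is taken from the same table of counts.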

HIV example

The HIV example from the class demonstrates how prevalence affects the metrics of diagnostic tests.

To start, we set prevalence to 0.001.

# prevalence 0.1%
prevalence <- 0.001

We were given some terms that are not sensitivity and specificity, but it is straightforward to translate them into terms we know.

# false positive rate 0.2%
# fpr is 1-specificity
specificity <- 1-0.002
specificity
[1] 0.998
# false negative rate 2%
# fnr is 1-sensitivity
sensitivity <- 1-0.02
sensitivity
[1] 0.98

Find the positive predictive value from the formula.

a <- sensitivity * prevalence
b <- sensitivity * prevalence + (1-specificity) * (1-prevalence)
a/b
[1] 0.3290799
# you can skip the step where you name a and b
(sensitivity * prevalence) / (sensitivity * prevalence + (1-specificity) * (1-prevalence))
[1] 0.3290799
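
To see why the value is so low, it can help to redo the calculation with counts instead of probabilities. The sketch below assumes a hypothetical population of one million people tested; the population size and the variable names are illustrative choices, not part of the class example.

```r
prevalence  <- 0.001
sensitivity <- 0.98
specificity <- 0.998

population <- 1e6  # hypothetical number of people tested

diseased <- population * prevalence         # 1000 people with the disease
healthy  <- population - diseased           # 999000 people without it

true_pos  <- diseased * sensitivity         # about 980 correct positive tests
false_pos <- healthy * (1 - specificity)    # about 1998 wrong positive tests

# share of positive tests that are correct: same value as the formula
true_pos / (true_pos + false_pos)
```

Although the test rarely errs, the people without the disease are so numerous that their false positives outnumber the true positives by about two to one.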

Now test out different values of prevalence: replace the value of prevalence, then rerun the positive predictive value code above.

You can try the following prevalence values: 0.01%, 0.1%, 1%, 10%.
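
Instead of changing the value one at a time, all four prevalence values can be evaluated in a single pass. This is a sketch that keeps the sensitivity and specificity from the HIV example fixed; the name `ppv_values` and the rounding to three decimals are choices made here for readability.

```r
sensitivity <- 0.98
specificity <- 0.998

# 0.01%, 0.1%, 1%, 10% written as proportions
prevalence <- c(0.0001, 0.001, 0.01, 0.1)

# R arithmetic is vectorised, so one expression gives one PPV per prevalence
ppv_values <- (sensitivity * prevalence) /
  (sensitivity * prevalence + (1 - specificity) * (1 - prevalence))

data.frame(prevalence = prevalence, ppv = round(ppv_values, 3))
```

With these inputs the PPV rises from roughly 5% at a prevalence of 0.01% to roughly 98% at a prevalence of 10%, even though the test itself is unchanged.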