A recent article from Dr. Hsieh talks about the importance of sensitivity and specificity in imaging.
What Does It Mean To Say A Medical Test Is ‘Sensitive’ Or ‘Specific’?
Paul Hsieh Forbes Contributor
May 29, 2023
Q: What do American radar operators in World War II have in common with modern-day patients trying to decide if a whole body MRI scan is worth the money?
A: Both have to be concerned if a blurry dot on a screen might indicate something serious that needs to be dealt with.
In the case of the wartime radar operators, a critical issue was deciding if a “blip” on the screen represented an enemy aircraft, a friendly plane, or just noise. Ideally, a radar system (which includes both the electronic hardware and the human operator) would correctly identify incoming enemy aircraft, while avoiding the twin errors of a false alarm (mistakenly reporting an enemy plane when there wasn’t one) or a miss (failing to detect an enemy aircraft when there actually was one).
A similar issue arises with any form of medical testing — for example, an MRI image, a home Covid-19 test to detect active infection or a screening mammogram to detect an early breast cancer.
Ideally, a medical test would correctly detect the presence of real disease (a “true positive”) while avoiding a false alarm (“false positive”) or a miss (a “false negative”).
Of course, no medical test is perfect. Physicians and statisticians have ways of describing the accuracy of these tests. In particular, they use two important measures known as “sensitivity” and “specificity.”
Sensitivity refers to the likelihood that a patient who has the disease will also register as positive on the test. So if we have a group of 100 patients with Covid-19, and the test registers positive for 90 of them, we would say the test is 90% sensitive.
A highly sensitive test will correctly pick up most or all of the people with the disease. A poorly sensitive test will report a lot of misses: patients who have the disease but who incorrectly register negative (false negatives).
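As a quick illustration, the sensitivity arithmetic from the Covid-19 example can be written out in a few lines of Python. The function and variable names are illustrative choices, not something taken from the original article.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of people with the disease whom the test flags as positive."""
    diseased = true_positives + false_negatives
    if diseased == 0:
        raise ValueError("need at least one person with the disease")
    return true_positives / diseased


# The article's example: 100 patients with Covid-19, 90 of them test positive,
# so the remaining 10 are misses (false negatives).
covid_sensitivity = sensitivity(true_positives=90, false_negatives=10)
print(f"sensitivity = {covid_sensitivity:.0%}")  # sensitivity = 90%
```

Note that only people who actually have the disease enter this calculation; healthy people play no role in sensitivity.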
Specificity is the other side of the coin. Specificity refers to the likelihood that a patient without the disease will also register as negative on the test. So if we have 100 healthy people, none of whom have Covid-19, and the test registers negative for 95 of them, we would say the test is 95% specific.
A highly specific test will correctly detect most or all people who do not have the disease. A test with poor specificity will generate a lot of false alarms — people who are actually healthy but who mistakenly register as ill (false positives).
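The same kind of sketch works for specificity. The first part below reproduces the 95-out-of-100 example; the second part assumes, purely for illustration, that 10,000 healthy people take the same test, to show how quickly false alarms add up. The names and the 10,000 figure are assumptions, not data from the article.

```python
def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of people without the disease whom the test clears as negative."""
    healthy = true_negatives + false_positives
    if healthy == 0:
        raise ValueError("need at least one person without the disease")
    return true_negatives / healthy


# The article's example: 100 healthy people, 95 of them test negative,
# so the remaining 5 are false alarms (false positives).
covid_specificity = specificity(true_negatives=95, false_positives=5)
print(f"specificity = {covid_specificity:.0%}")  # specificity = 95%

# Assumption for illustration only: 10,000 healthy people take the same test.
healthy_people_tested = 10_000
expected_false_alarms = round(healthy_people_tested * (1 - covid_specificity))
print(f"expected false positives: {expected_false_alarms}")  # expected false positives: 500
```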
Depending on the clinical scenario, we might prefer to err on the side of false positives or false negatives.
For example, suppose a blood test was 100% sensitive for detecting early breast cancer, but not very specific. Such a test would reliably catch everyone with cancer but also include some false positives. In a screening scenario, we might prefer high sensitivity to not miss any patients with actual disease, and then rely on further tests to distinguish the true positives from the false alarms.
This can understandably cause some anxiety for patients. For example, many women who have an initially suspicious screening mammogram feel alarm when they are asked to come back to the radiology office for additional scans (such as an ultrasound or MRI) which ultimately turn out to be negative. But most cancer physicians consider the anxiety of these initial false positives to be better than a less sensitive test that misses a lot of cancers (false negatives). Furthermore, with a highly sensitive screening test, a negative result allows a concerned patient to breathe easy and feel confident they are truly clear of the disease.
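To make the screening trade-off concrete, here is a minimal sketch that counts the four possible outcomes for a 100% sensitive but imperfectly specific test. The population size, the 1% prevalence, and the 90% specificity are hypothetical values chosen for illustration; they are not real figures for mammography or any blood test.

```python
def screen(population: int, prevalence: float,
           sensitivity: float, specificity: float) -> dict:
    """Expected counts for each of the four possible test outcomes."""
    diseased = round(population * prevalence)
    healthy = population - diseased
    true_pos = round(diseased * sensitivity)
    true_neg = round(healthy * specificity)
    return {
        "true_positives": true_pos,
        "false_negatives": diseased - true_pos,
        "true_negatives": true_neg,
        "false_positives": healthy - true_neg,
    }


# Hypothetical screening test: catches every cancer, 90% specific.
counts = screen(population=10_000, prevalence=0.01,
                sensitivity=1.0, specificity=0.90)

positives = counts["true_positives"] + counts["false_positives"]
share_of_positives_that_are_real = counts["true_positives"] / positives

print(f"missed cancers: {counts['false_negatives']}")       # missed cancers: 0
print(f"false alarms:   {counts['false_positives']}")       # false alarms:   990
print(f"positives that are real: {share_of_positives_that_are_real:.1%}")  # 9.2%
```

Under these assumed numbers nobody with cancer is missed, so a negative result is fully reassuring, but most positive results are false alarms that follow-up testing has to sort out.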
Conversely, suppose a blood test was 100% specific but not very sensitive. Such a test would correctly identify all healthy patients as “negative” but might still mistakenly categorize some actual cancer patients as falsely negative. This might not make the test suitable for screening purposes.
However, it does mean that if a patient had a positive test result, they could be sure they had the disease. With a highly specific test, a positive result might help give a hesitant patient the cognitive and emotional certainty that they need to proceed with treatment without reservation or doubt.
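The reasoning for a perfectly specific test can be checked with the standard formulas for what are usually called positive and negative predictive value. This is a sketch under stated assumptions: the 1% prevalence and 60% sensitivity are made-up numbers, and the function names are illustrative.

```python
def positive_predictive_value(prevalence: float, sensitivity: float,
                              specificity: float) -> float:
    """Probability that a person with a positive result truly has the disease."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def negative_predictive_value(prevalence: float, sensitivity: float,
                              specificity: float) -> float:
    """Probability that a person with a negative result is truly disease-free."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


# Hypothetical test: perfectly specific, but it catches only 60% of cancers.
ppv = positive_predictive_value(prevalence=0.01, sensitivity=0.60, specificity=1.0)
npv = negative_predictive_value(prevalence=0.01, sensitivity=0.60, specificity=1.0)

print(f"PPV = {ppv:.1%}")  # PPV = 100.0%
print(f"NPV = {npv:.1%}")  # NPV = 99.6%
```

With no false positives, every positive result is a true one. The negative predictive value still looks high here only because the assumed disease is rare; 40% of the actual cancers are missed, which is why such a test suits confirmation better than screening.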
For a more in-depth discussion of these issues, as well as the closely related concepts of “positive predictive value” and “negative predictive value,” I highly recommend this excellent graphic novel (PDF version) by Dr. Stefan Tigges, Professor of Radiology at Emory University School of Medicine, published in the Journal of the American College of Radiology.
(Bonus: Dr. Tigges provides one of the clearest explanations I’ve ever read of so-called “Receiver Operating Characteristic” curves, another concept which harkens back to the old days of radar operators in the 1940s.) Interested readers can find another nice overview of these topics in this article by L. Daniel Maxim and colleagues.
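For readers who want to see the idea in miniature: a Receiver Operating Characteristic curve plots sensitivity against the false positive rate (1 minus specificity) as the cutoff for calling a result "positive" is moved. The sketch below builds those points by hand. The test scores are invented for illustration, and the function names are not from any of the cited sources.

```python
def roc_points(scores_diseased, scores_healthy):
    """Return (false_positive_rate, sensitivity) pairs, one per cutoff."""
    cutoffs = sorted(set(scores_diseased) | set(scores_healthy), reverse=True)
    points = [(0.0, 0.0)]  # cutoff above every score: nothing is called positive
    for cutoff in cutoffs:
        # A result counts as "positive" when its score is at or above the cutoff.
        tpr = sum(s >= cutoff for s in scores_diseased) / len(scores_diseased)
        fpr = sum(s >= cutoff for s in scores_healthy) / len(scores_healthy)
        points.append((fpr, tpr))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC points (1.0 is perfect, 0.5 is chance)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Invented scores: higher means "looks more like disease" to the test.
diseased = [0.9, 0.8, 0.7, 0.4]
healthy = [0.6, 0.3, 0.2, 0.1]

curve = roc_points(diseased, healthy)
auc = area_under_curve(curve)
print(curve)
print(f"area under the curve = {auc}")  # area under the curve = 0.9375
```

Lowering the cutoff moves along the curve toward higher sensitivity at the price of more false alarms, which is the same trade-off the radar operators faced.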
And what does this have to do with high-tech screening tests such as whole-body MRI? These scans are commonly marketed as helping detect early asymptomatic cancers — which can be true.
But according to the Boston Globe, physicians note that “such scans have the potential for false-positive findings that can result in unnecessary testing and procedures with additional risks, such as exposure to radiation from follow-up testing, not to mention additional costs.”
Whole-body MRI screening tests are not typically covered by insurance; instead patients usually must pay out of pocket. Thus, I strongly recommend that anyone considering undergoing a whole-body MRI first discuss the pros and cons with their personal physician to see if the benefits outweigh the disadvantages.
In particular, patients should inquire about the risks of false positives leading to further tests or invasive procedures (such as biopsies) for supposed abnormalities that ultimately turn out to be nothing. The risk-to-benefit ratio may vary greatly from patient to patient based on factors such as family history of cancer or prior environmental exposures. A pricey screening test that makes sense for one patient might not make sense for another.
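One way to see why the balance shifts from patient to patient is to estimate how many false alarms to expect for each real finding, given how likely the patient was to have disease before the scan. Every number below is hypothetical: the 90% sensitivity and specificity are NOT measured whole-body MRI figures, and the two pre-test probabilities are invented stand-ins for an average-risk and a high-risk patient.

```python
def false_alarms_per_true_finding(pretest_probability: float,
                                  sensitivity: float,
                                  specificity: float) -> float:
    """Expected number of false positives for every true positive."""
    true_pos = sensitivity * pretest_probability
    false_pos = (1 - specificity) * (1 - pretest_probability)
    return false_pos / true_pos


# Hypothetical test performance, chosen only to illustrate the arithmetic.
SENSITIVITY = 0.90
SPECIFICITY = 0.90

# Hypothetical pre-test probabilities of disease for two kinds of patient.
patients = {"average-risk patient": 0.005, "high-risk patient": 0.10}

for label, pretest in patients.items():
    ratio = false_alarms_per_true_finding(pretest, SENSITIVITY, SPECIFICITY)
    print(f"{label}: about {ratio:.1f} false alarms per real finding")
```

Under these assumptions the identical test produces roughly 22 false alarms per real finding for the average-risk patient and about one for the high-risk patient, which is the kind of difference worth raising with a physician.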
Mark Twain popularized the saying, “There are three kinds of lies: lies, damned lies, and statistics.” Statistical ideas can be confusing at times. But if patients and physicians learn some key concepts, they will be better able to understand the strengths and weaknesses of various medical tests, and the right times to use them.