Recently, the L.A. Times and other outlets wrote about a study by Dr. Walter Willett of Harvard et al. regarding the impact of red meat on one’s mortality. They found that eating as little as one extra serving of red meat a day was associated with a 13% (unprocessed) or 20% (processed) increased risk of death. More specifically, they found that
“After multivariate adjustment for major lifestyle and dietary risk factors, the pooled hazard ratio (HR) (95% CI) of total mortality for a 1-serving-per-day increase was 1.13 (1.07-1.20) for unprocessed red meat and 1.20 (1.15-1.24) for processed red meat.”
As with many studies about diet, lifestyle, and death, this one has sparked discussion. Carl Bialik, the Wall Street Journal’s Numbers Guy, wrote two articles: one on the study itself and one on the difference between absolute and relative risk, a distinction that often creates confusion and annoyance. That led me to the always excellent Understanding Uncertainty blog, where Dr. David Spiegelhalter gives a fuller treatment of what a 13% increased risk of death actually means (dying about a year younger, in case you are wondering), along with the correlation/causation caveats and the practical application of the numbers.
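To make that relative-risk number a little more concrete, here is a rough back-of-the-envelope sketch (not Spiegelhalter’s actual calculation): it applies a constant hazard ratio of 1.13 to an illustrative Gompertz mortality curve and compares remaining life expectancy at age 40. The Gompertz parameters are invented for the illustration, not fitted to real life tables, but they land in the same “about a year” ballpark.

```python
import numpy as np

# Illustrative Gompertz hazard for an adult: h(t) = a * exp(b * t),
# where t is years past age 40. These parameters are made up for this
# sketch; they are NOT fitted to real life tables or taken from the study.
a, b = 0.0009, 0.09

def remaining_life_expectancy(hazard_ratio=1.0, horizon=80.0, dt=0.01):
    """Integrate the survival curve S(t) = exp(-HR * cumulative hazard)."""
    t = np.arange(0.0, horizon, dt)
    cumulative_hazard = hazard_ratio * (a / b) * (np.exp(b * t) - 1.0)
    survival = np.exp(-cumulative_hazard)
    return survival.sum() * dt  # simple rectangle-rule integration

baseline = remaining_life_expectancy(1.0)
elevated = remaining_life_expectancy(1.13)  # HR for unprocessed red meat
print(f"baseline: {baseline:.1f} years, HR 1.13: {elevated:.1f} years, "
      f"difference: {baseline - elevated:.1f} years")
```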
All this discussion is interesting and should be useful for any IT risk professional interested in quantitative treatments of risk. But those details are not why I am writing this. As I reviewed the information, it struck me just how difficult this kind of study is to conduct in the physical world. This quote from Dr. Willett in the L.A. Times article really highlights the problem:
“In principle, the ideal study would take 100,000 people and randomly assign some to eating several servings of red meat a day and randomize the others to not consume red meat and then follow them for several decades. But that study, even with any amount of money, in many instances is simply not possible to do.”
What struck me was not only how hard this is, but also how rigorous the results are in the face of the obstacles described. And, even more importantly, how much easier this would be for IT risk professionals in the virtual world.
In the virtual world, we actually could design and conduct a study that controls for almost every variable in order to quantify risk. We could, for example, deploy 10,000 or 100,000 virtual machine clients around the Internet, all configured exactly alike except for one specified difference: patched vs. unpatched, different anti-malware solutions and/or signature updates, open vs. closed ports, other configuration changes, and so on. Probably the hardest part would be deciding how and where to deploy the VMs and coming up with a “honeymonkey” algorithm to mimic user activity.
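To show how simple the design could be, here is a minimal sketch of the assignment step, assuming a hypothetical provisioning pipeline; the image name, group definitions, and VMConfig structure are all made up for illustration.

```python
import random
from dataclasses import dataclass, field

# Hypothetical experiment design: every VM gets the same base image, and
# only the one variable under study (here, patch level) differs per group.
BASE_IMAGE = "win7-sp1-baseline"   # illustrative image name
GROUPS = {"patched": {"apply_patches": True},
          "unpatched": {"apply_patches": False}}

@dataclass
class VMConfig:
    vm_id: int
    group: str
    overrides: dict = field(default_factory=dict)
    image: str = BASE_IMAGE        # identical base image for every VM

def design_experiment(n_vms=10_000, seed=42):
    """Randomly assign n_vms otherwise identical clients to the groups."""
    rng = random.Random(seed)
    configs = []
    for vm_id in range(n_vms):
        group = rng.choice(list(GROUPS))     # simple 50/50 randomization
        configs.append(VMConfig(vm_id, group, GROUPS[group]))
    return configs

configs = design_experiment()
# Each VMConfig would then be handed to a (hypothetical) provisioning and
# "honeymonkey" driver that browses, opens mail, etc., and logs the time
# until the VM is compromised.
```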
Perhaps the biggest challenge would be recognizing and characterizing the intelligent adversary’s contribution to the variance in the numbers: the popularity of particular vulnerabilities, exploit techniques, 0days, and so on. Then again, that variance would also be the good stuff.
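And if such a deployment ran long enough, the analysis could borrow the same hazard-ratio framing as the diet study. Here is a deliberately crude sketch, assuming constant (exponential) hazards and entirely made-up observation records of the form (hours observed, compromised?); a real analysis would use proper survival methods such as Kaplan-Meier curves or Cox regression.

```python
# Made-up compromise logs: (hours the VM was observed, was it compromised?)
unpatched = [(12.0, True), (30.5, True), (72.0, False), (8.0, True), (48.0, True)]
patched   = [(72.0, False), (60.0, True), (72.0, False), (72.0, False), (40.0, True)]

def exponential_hazard(records):
    """Compromises per VM-hour: events divided by total observed exposure."""
    events = sum(1 for _, compromised in records if compromised)
    exposure = sum(hours for hours, _ in records)
    return events / exposure

hr = exponential_hazard(unpatched) / exponential_hazard(patched)
print(f"Estimated hazard ratio, unpatched vs. patched: {hr:.2f}")
```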
Conducting an experiment like this seems so easy to me that I wonder if somebody is already doing it. I am pretty sure some group (ISC?) used to publish a “time-to-compromise” metric for unpatched systems, and I suspect there may be others. Does anyone know of similar experiments or studies being done? If so, I’d love to hear about them. If not, why not?