On Vulnerability Rediscovery

One of my biggest objections to vulnerability discovery and disclosure is simply that there is no reason to believe that we will find the same vulnerabilities that the bad guys find. This is a nearly unique property of the set that contains the entire world's codebase (since independent individuals can pick and choose whatever target software they want to review). A QA department doesn't have this problem, because its much smaller codebase can actually be assessed in full.

If I were a statistician, I would calculate the (random) likelihood as a variation of the birthday problem, which demonstrates that a room of 23 individuals has about a 50% chance of containing two people with the same birthday. Except here the problem involves one or more individuals finding one or more vulnerabilities out of the set of all existing vulnerabilities.
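To make that concrete, here is a minimal sketch of the calculation under the purely random assumption: each researcher draws uniformly from a fixed pool of findable vulnerabilities. The pool sizes and per-researcher find counts are invented for illustration, not measured.

    # Birthday-problem-style overlap under purely uniform random search.
    # Pool sizes and finds-per-researcher below are invented parameters.

    def p_all_distinct(pool, draws):
        """Probability that `draws` uniform picks from `pool` are all different."""
        p = 1.0
        for i in range(draws):
            p *= (pool - i) / pool
        return p

    # Classic birthday problem: 23 people, 365 possible birthdays.
    print(1 - p_all_distinct(365, 23))  # ~0.507

    def p_shared_find(n, k):
        """Probability that two researchers, each finding k distinct bugs
        uniformly at random out of n findable bugs, share at least one."""
        p_disjoint = 1.0
        for i in range(k):
            p_disjoint *= (n - k - i) / (n - i)
        return 1 - p_disjoint

    print(p_shared_find(10_000, 100))     # ~0.63 with a small pool
    print(p_shared_find(1_000_000, 100))  # ~0.01 when the pool is huge

The second pair of numbers is the point: as the pool grows toward something like the world's codebase, the purely random collision probability collapses.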

In a recent comment on my full disclosure post, Michael Bennett writes:

Vulnerabilities are not discrete entities, they almost always fall into a class of vulnerability that is known and some of these classes are easier to test for or exploit. It is likely that two people will come up with the same or substantially similar vulnerabilities because easiest will be tried first.

In addition, pjhenry1216 had this to say in comments on Schneier's recent Wired.com article on full disclosure:

There's a statistically greater than random chance that the good guys and bad guys will uncover the same vulnerabilities. Researchers don't find random vulnerabilities. It's like electricity in a circuit: path of least resistance. Researchers look for the weak spots and go from there. This is true for both the good guys and bad guys.

To which I replied:

The "greater than random" comment is a reasonable assertion (any
thoughts on how to model that? sort of like the "hot hand" in sports, I
think) but I am not convinced it is true in the aggregate – since this
is public bugfinding, the world's codebase is the target and I suspect
it would still be at least close to random. In any case, the follow-on
question is, how high would the overlap need to be in order to be
useful? (There is a study by Ozment on rediscovery that found about 8%
rediscovery rate).
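Picking up my own modeling question above: one way to model "greater than random" is to make the draws non-uniform, so that both parties favor the easiest bugs, in the spirit of a birthday problem with unevenly distributed birthdays. The sketch below is a Monte Carlo comparison under that assumption; the pool size, find counts, and Zipf-like skew are all invented parameters.

    import random
    from itertools import accumulate

    # Monte Carlo sketch: rediscovery probability when both searchers weight
    # their draws toward the easiest bugs, versus a uniform baseline.
    # All parameters are invented for illustration.

    def find_bugs(n, k, cum_weights, rng):
        """One researcher finds k distinct bugs out of n, drawn with the
        given cumulative weights (redrawing on duplicates)."""
        found = set()
        while len(found) < k:
            found.update(rng.choices(range(n), cum_weights=cum_weights,
                                     k=k - len(found)))
        return found

    def rediscovery_rate(n, k, weights, trials=2000, seed=0):
        """Fraction of trials in which two independent researchers share
        at least one bug."""
        rng = random.Random(seed)
        cum = list(accumulate(weights))
        hits = 0
        for _ in range(trials):
            if find_bugs(n, k, cum, rng) & find_bugs(n, k, cum, rng):
                hits += 1
        return hits / trials

    n, k = 10_000, 50
    uniform = [1.0] * n
    skewed = [1.0 / (i + 1) for i in range(n)]  # bug 0 is "easiest" (Zipf-like)

    print(rediscovery_rate(n, k, uniform))  # ~0.22: the random baseline
    print(rediscovery_rate(n, k, skewed))   # ~1.0: searches pile onto easy bugs

Under even a mild skew the overlap jumps well above the uniform baseline, which is Bennett's and pjhenry1216's point; whether the real-world skew is actually that strong is exactly the open question.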

The study I was referring to above, by Andy Ozment, is the only study I am aware of that directly attempts to measure vulnerability rediscovery rates: The Likelihood of Vulnerability Rediscovery and the Social Utility of Vulnerability Hunting. It looked at rediscovery rates for Microsoft vulnerabilities based on the credits in Microsoft's notification bulletins, and determined that 7-8% of Microsoft vulnerabilities are rediscovered.

It seems reasonable to suggest that Microsoft is probably a much larger target than other software companies when it comes to bugfinding (though I suspect that gap has narrowed significantly from previous years), and it makes sense that the easiest vulnerabilities are the most likely to be rediscovered.

I guess the real question is whether an 8% success rate is enough.

More here: The Long-Term Impact of Vulnerability Research: Public Welfare