RSnake’s Diminutive XSS Worm Contest Revisited

I addressed the XSS worm contest recently. In case you hadn’t noticed, the post was intended to be tongue-in-cheek. I have been meaning to circle back around on it and provide more specific thoughts:

  1. People are going to do these sorts of things regardless of whether we want them to or not. In this case, I don’t feel as strongly about them doing it, because it doesn’t seem to me that a diminutive XSS worm is particularly useful with today’s protection techniques (I will continue to look into this).
  2. All in all, it was probably a net decrease in risk, since finding existing XSS problems in websites today creates even more risk than this PoC exercise. I am guessing that it is essentially the same community doing both, so the distraction was worth it. (Though I am normally pretty much on the same page as kurt wismer on these sorts of things, I don’t feel nearly as strongly about this particular contest.)
  3. The contest itself was both interesting at a programming level and really pretty boring from a security perspective. There were about 20 pages of comments (I probably read 5-7 pages' worth) and most were along the lines of which browser would accept which characters.
  4. Now we have something like 200 specific strings we can look for (or probably more like 10 strings if we used regular expressions) in the wild. I doubt we’ll ever see any of them, even in form.
  5. I think RSnake was grasping for results when he suggested people learned a lot about security. His main result in his paper actually plays off the Samy worm which was much more instructive. I am actually sort of surprised he didn’t come up with some signatures to look for, or perhaps a way to dynamically name posted pages to limit propagation. Or something… anything, really.
  6. I am still having a hard time figuring out how the worm would actually propagate – i.e. what the known inputs are and how many pages/websites would be infected. (Feel free to provide guidance; I will be researching this as well.) It seems to me the "diminutive worm" is so diminutive that it would rely on some outside mechanism or website design (or user interaction) to actually propagate to multiple people or multiple webpages.

5 comments for “RSnake’s Diminutive XSS Worm Contest Revisited”

  1. January 20, 2008 at 11:36 am

    Thanks for the writeup. I have (many) comments, of course.

    Firstly, let me clarify. A signature would be utterly useless, because site specific obfuscation blows that out of the water. If you look at the XSS cheat sheet and the half dozen tools built off that same technology, you’d see that signatures almost without exception fail to stop XSS worms. There are literally hundreds of obfuscation variants out there that would make a signature useless. It may be useful in picking out a single strain, but beyond that? Useless.

    With that in mind, our results were far better than some signature based filter. If you read and understood the paper you’d see that we’ve actually mitigated ANY variant, not just the ones you could build a signature for. That’s a significantly better improvement in security, and with very little development cost, unlike what complex regex, Boyer-Moore algorithms, etc. can require. Read http://ha.ckers.org/blog/20060602/xss-annihilation/ for more details. You build a regex, and I’ll tell you a) it can be defeated or b) it will stop valid, non malicious input. My solution passes both those tests.

    Lastly, worms propagate because you (the victim) see the worm on the attacker’s social networking profile. Your browser then executes it. Because of the worm code your browser posts itself to another page (your own profile). Now the next person comes along and sees your profile, which now has the worm code on it. It then propagates, and so on. Yes, this is similar to how the MySpace worm works, because that’s how all worms of this nature (non exponential persistent XSS based worms) work. Yamanner is another good example, as is the Orkut worm, and a half dozen others. I wouldn’t get fixated on the SAMY worm. It really wasn’t that unique other than the fact that it was first. All these worms work in similar fashion. The only reason I mentioned SAMY was because he used a technique that demonstrated one of the two variants (in the wild). Although an XMLHttpRequest worm didn’t win, it couldn’t have worked at least half the time without blind CSRF in SAMY’s case (because he was on the wrong domain half the time and therefore outside the reach of XMLHttpRequest).

    The only XSS worm that I’ve seen that was wildly different was the Nduja worm, because it was the first and only known exponential XSS worm, but ultimately it would be thwarted by the same technique in the paper because it uses blind CSRF for propagation (which we built a defense against in the paper).

    If you didn’t thoroughly read the paper, I’d recommend a re-read of it, as it does cover complete worm mitigation (not just signature based rules, which I must stress will NOT work without issues in the wild): http://ha.ckers.org/xss-worms/
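The two failure modes named in this comment (a signature can be defeated, or it stops valid input) can be illustrated with a small sketch. The signature and the sample strings below are hypothetical, chosen only for illustration; they are not taken from the contest entries or from the paper.

```python
import re

# Hypothetical signature: flag any input containing an opening <script tag.
SIGNATURE = re.compile(r"<script", re.IGNORECASE)


def flagged(user_input: str) -> bool:
    """Return True when the naive signature matches the submitted text."""
    return SIGNATURE.search(user_input) is not None


# Illustrative inputs (assumptions, not real worm code).
samples = {
    "plain script tag": "<script>alert(1)</script>",
    "event handler, no script tag": '<img src="x" onerror="alert(1)">',
    "entity-encoded scheme": '<a href="jav&#x61;script:alert(1)">link</a>',
    "harmless forum question": "How do I close a <script> tag in HTML?",
}

for label, text in samples.items():
    print(f"{label}: flagged={flagged(text)}")
```

The second and third samples slip past the signature (case a), while the harmless question is flagged (case b). A broader pattern would shift the balance between the two cases; this sketch does not show that any pattern removes both.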

  2. Pete
    January 21, 2008 at 10:54 am

    @RSnake -

    Thanks for the comments. Here are some observations:

    1) re: Signatures – I agree that signatures are not that helpful, but I don’t think that makes them “utterly useless.” After all, at some point, the code must be de-obfuscated in order to work, and that might provide a good place to put a signature. In any case, I was stretching for good things to say about the contest and curious to see whether any of this code ends up in the wild (I suspect it won’t for obvious reasons, but you never know.)

    2) re: The solution in your paper. I guess what I am wondering here is what is unique about the solution and why did you need to run this contest to figure it out? Leveraging same domain restrictions for security is the reason those restrictions exist. Samy’s approach showed a way around it. If you could point to the “a-ha moment” during the contest where this came up, I would enjoy reading about it.

    3) re: propagation. Thanks for the clarification, it is just what I thought. In that case, the propagation described is one step removed from true propagation, right? I think it might be more useful to think of this exercise as “replication” then rather than true worm-like behavior. I also think it is worth evaluating the variants in the wild to ascertain whether they truly are “all the same”. I think there may be linear/exponential propagation differences there.

    Btw, a large portion of your paper seemed targeted toward writing better worms than defending against them.

  3. January 21, 2008 at 7:28 pm

    Obfuscation is not something that is “undone” by the server. It’s browser DOM land. If you were to say, build an application that de-constructed the page in real time, looked at the DOM (in all variants of all browsers) and watched for anything malicious, sure. Regex? No. So for stopping worms it actually is useless unless your regex is intended to stop all HTML and JavaScript period. If that’s the case you have different problems. The cases I am talking about are those where companies must allow some or all HTML for their customers to be happy, so a regex would negatively impact their customer base – making it useless.

    Regarding your second question, it’s unique because it’s never been proposed before to my knowledge. If you can point to another reference to using the same origin policy + nonces in JS space + anti framing to stop XSS (both XHR and CSRF based) worms, I’d be happy to read it, but I think you’ll be hard pressed to find such a thing. To answer the other part of that question, it actually became clear that that would be the outcome within the first day of the contest. The other versions of worms we had were too complex to do good analysis of. Said another way, it’s easier to see what’s going on without all the other garbage of worms.

    If there was an a-ha moment it was somewhere in the third day I believe when I was in bed and I realized that I could stop the worms by forcing the same origin policy – only to make it also work for the CSRF based worms, you would have to use nonces in JS space (which is something I haven’t heard people talk about) to prevent hovering iframe techniques.

    In regards to your last question, no, it’s actually really propagating. I think the part you are missing is that more than one person can visit any one page that has been infected and it stays put on the page as well as copies itself. That makes it exponential growth, instead of just moving from page to page.

    The reason my paper talked about writing better worms was because I had to take into account all the factors of propagation. This is a very complicated topic, and there are lots of moving parts. If you don’t know how it works you can never properly defend against it (which is why I didn’t know how to solve the problem until after I had a bunch of examples staring me in the face).
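The nonce part of the defense described in this comment can be sketched from the server's side. This is a minimal illustration under stated assumptions, not the paper's implementation: `NonceStore`, `issue`, and `accept_post` are hypothetical names, sessions are plain strings, and the delivery of the nonce to the page's own script and the anti-framing piece are outside the sketch. The idea it relies on is that a page on another origin cannot read the nonce, so a blind cross-site POST arrives without it.

```python
import hmac
import secrets
from typing import Dict, Optional


class NonceStore:
    """Hypothetical in-memory store mapping a session to its current nonce."""

    def __init__(self) -> None:
        self._nonces: Dict[str, str] = {}

    def issue(self, session_id: str) -> str:
        """Create a fresh random nonce for the session and remember it."""
        nonce = secrets.token_urlsafe(32)
        self._nonces[session_id] = nonce
        return nonce

    def accept_post(self, session_id: str, submitted: Optional[str]) -> bool:
        """Accept a state-changing request only if it carries the right nonce."""
        expected = self._nonces.get(session_id)
        if expected is None or submitted is None:
            return False
        # Constant-time comparison on bytes, so non-ASCII input cannot raise.
        return hmac.compare_digest(expected.encode(), submitted.encode())


store = NonceStore()
page_nonce = store.issue("alice-session")

print(store.accept_post("alice-session", page_nonce))  # same-origin script echoes it
print(store.accept_post("alice-session", None))        # blind cross-site POST
```

A request that cannot present the nonce is rejected, which is the property the comment depends on for CSRF-based worms.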

  4. Pete
    January 23, 2008 at 11:43 pm

    @RSnake -

    1) I don’t see why a social network site couldn’t de-obfuscate code submitted by a user – MySpace did w/ Samy, right? You have a good writeup on how to do this well and signatures might be useful here. Also, there is no need to conflate detection and response; there are plenty of architectures for detection and monitoring that don’t create the problem you suggest.

    2) It seems to me that a guy with your talent could have come up with your solution just using the Samy worm and the Google worm that you used as examples.

    3) It might be worth a look at a paper like “A Taxonomy of Computer Worms” to get a better understanding of traditional techniques on how worms propagate. If all the humans left their input devices after the initial execution of the code, how many more pages would be infected?
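The first point in this comment, de-obfuscating submitted markup and then using a signature for detection rather than blocking, can be sketched as follows. It is a rough illustration under assumptions: the `javascript:` signature and the sample links are hypothetical, and the normalization covers only character references and embedded whitespace, which are static encodings.

```python
import html
import re

# Hypothetical signature applied only after normalization.
SIGNATURE = re.compile(r"javascript:", re.IGNORECASE)


def normalize(markup: str) -> str:
    """Decode HTML character references, then drop whitespace and control characters."""
    decoded = html.unescape(markup)
    return re.sub(r"[\x00-\x20]+", "", decoded)


def detect(markup: str) -> bool:
    """Report (without blocking) whether the normalized markup matches."""
    return SIGNATURE.search(normalize(markup)) is not None


# Illustrative submissions (assumptions, not real worm code).
submissions = [
    '<a href="jav&#x61;script:alert(1)">x</a>',   # entity-encoded letter
    '<a href="java&#x09;script:alert(1)">x</a>',  # encoded tab inside the scheme
    '<a href="https://example.com/">x</a>',       # ordinary link
]

for markup in submissions:
    raw_hit = SIGNATURE.search(markup) is not None
    print(f"raw={raw_hit} normalized={detect(markup)} :: {markup}")
```

Because `detect` only returns a flag, it can feed monitoring without rejecting the post. It says nothing about content whose behavior changes after submission.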

  5. January 26, 2008 at 7:53 pm

    Social networking sites are no different from any site anywhere in terms of their ability to render content. De-obfuscation assumes they know everything about every browser everywhere. This is a complex topic, well beyond what a simple regex signature could ever do. If a website could de-obfuscate everything (which is not possible due to the fact that the code can change in place) then this wouldn’t be an interesting conversation. The reality is that it’s not possible, and I can give you probably a half dozen specific real world examples of this. The easiest one to understand and explain is when a bad guy includes off site content, like an image and bases a decision on that off site content, which changes after initial injection. De-obfuscation sees only that it applies an algorithm to something that essentially does nothing. Yes you can block that (assuming you know it could theoretically turn bad, eventually), but it’s non-malicious, and similar techniques are often used by JavaScript authors to prevent theft of their code. So you end up blocking a lot of valid content. That might sound okay to block some good guys, given the benefits of blocking a bad guy, but there are lots of situations where that’s unacceptable or even contractually not possible to do so. There are many complex situations that go beyond the traditional social networking platform, which isn’t academic – it’s just the nature of how complex the web is. Also, filters end up blocking a lot more good guys than bad guys anyway when you try things like this, at least from all the examples I’m aware of. I’ve seen dozens of examples of this used badly in real life because people don’t actually understand the ramification of the filters.

    In regards to your second statement – I technically did have enough information to come up with the concept given the code that was available to me prior to the coding contest. The problem was I simply didn’t come up with the idea until I spent several days on end thinking about it. I often solve tough problems weeks or months after first looking at them. It’s just how my brain works. I have lots of complex problems that I work on, and I rarely can spend more than a few hours thinking about any of them before I have to move on. The more accurate statement would have been that “Anyone could have come up with what you (RSnake) came up with, given Samy’s code.” The entire webappsec world had access to the worm code and no one came up with my solution before I did. So that leads me to believe that the format that the Samy worm exists in was not sufficient for solving the problem (it certainly wasn’t for me). It took slicing apart multiple variants and hacking apart each incorrect solve before I came up with the solution.

    In regards to your last question – XSS worms work nothing like traditional worms. So that paper is not entirely relevant. The only way for an XSS worm to propagate “by itself” via “self-replication” or “self-propagation” is if you had a series of JS rendering robots that infected themselves. XSS worms work by human propagation and interaction, unlike your normal worm. You’re correct that if all humans fell off the planet the XSS worm would just sit there, dormant. If you want to be academic about it you should simply say it doesn’t “self-propagate” but rather “propagates” or “human propagates.” You can chalk this up to another poor security naming convention.

    Anyway, this should help demonstrate yet another reason why the web security world is vastly different than the traditional security world, and applying rules and theories from one to another is often wrong.
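The propagation argued over in this thread can be made concrete with a toy simulation. It is only a sketch under stated assumptions: profiles are numbered, visits are uniformly random, every visit to an infected profile infects the visitor's own profile, and infected profiles stay infected. The function name and parameters are hypothetical, and no real site behaves this simply.

```python
import random
from typing import List


def simulate(num_profiles: int, visits_per_round: int, rounds: int,
             seed: int = 0) -> List[int]:
    """Return the number of infected profiles before and after each round."""
    rng = random.Random(seed)
    infected = {0}  # profile 0 carries the worm at the start
    history = [len(infected)]
    for _ in range(rounds):
        newly_infected = set()
        for visitor in range(num_profiles):
            for _ in range(visits_per_round):
                target = rng.randrange(num_profiles)
                # Viewing an infected profile copies the worm to the visitor's
                # own profile; the viewed profile stays infected as well.
                if target in infected:
                    newly_infected.add(visitor)
        infected |= newly_infected  # applied after the round, so one hop per round
        history.append(len(infected))
    return history


print("with human visits:", simulate(200, 3, 10))
print("with no visits:   ", simulate(200, 0, 10))
```

With `visits_per_round=0` the count never leaves 1, which matches the point that the worm sits dormant without human interaction. With visits, each infected page keeps exposing new visitors, so the count can only grow.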

Comments are closed.