Battling XSS Today …and Tomorrow (Part 2)
by Joe | May 2nd, 2008
Last week (well, last post, since it’s been a bit of time) we looked at a very common Cross Site Scripting (XSS) scenario and at a quick but powerful countermeasure, HTTP-only cookies. This week we will delve a little deeper into the fundamental vulnerabilities that make XSS such a continuing problem.
Let’s start with this recipe for how to deal with XSS vulnerabilities in general:
- Validate & constrain all user input against a whitelist (don’t just rely on sanitizing it against a blacklist)
- Encode all output that can contain user input
- When metacharacters must be permitted in user input that can become output, make specific exceptions for this in both the input filter and the output encoding. You can do this with the following pattern: make an exception in the general constraint rules on the input; html encode all the excepted input; and specifically reconvert safe/desired html elements (e.g., a limited set of formatting tags) back to unencoded html.
There is a lot of good advice here. For instance, serious security experts generally advise against reliance on blacklists to filter user input, because trying to match raw input against the signatures of known “unsafe” inputs (such as <script>, or the < and > characters) is bound to leave exploitable holes. See Dinis Cruz’s post on this thread, and Jeremiah Grossman’s discussion here, for some alarming examples.
Of course, it’s quite a bit more trouble to follow the best practice recommended here and build a whitelist of allowable inputs, based on the “known good” validation pattern, but it is really the only way to be sure your input filters are not being bypassed.
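To make the contrast concrete, here is a minimal sketch in Python of the two filtering styles. The field (a username), the allowed pattern, and the blacklist entries are all assumptions chosen for illustration; a real application would define its own “known good” pattern for each input.

```python
import re

# Hypothetical whitelist: a username is 3-20 ASCII letters, digits, or underscores.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,20}")

# Hypothetical blacklist of "known bad" signatures, shown only for contrast.
BLACKLIST = ("<script", "javascript:")


def passes_blacklist(value: str) -> bool:
    """Return True when no known-bad signature appears in the value."""
    lowered = value.lower()
    return not any(signature in lowered for signature in BLACKLIST)


def passes_whitelist(value: str) -> bool:
    """Return True only when the whole value matches the known-good pattern."""
    return USERNAME_PATTERN.fullmatch(value) is not None


payload = "<img src=x onerror=alert(1)>"
print(passes_blacklist(payload))     # True: the payload slips past the blacklist
print(passes_whitelist(payload))     # False: the whitelist rejects it
print(passes_whitelist("joe_2008"))  # True: ordinary input is still accepted
```

The blacklist fails here simply because nobody thought to list this particular payload, while the whitelist never needs to know what the attack looks like.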
It’s also worth pointing out that the advice to “encode all output that can contain user input” needs to be applied rigorously — not just in free text areas but even where user input is used to construct attributes inside dynamically-created html. As long as you first encode all output that can contain user input, and then make specific exceptions for the metacharacters you want to allow, you are, in effect, “strongly typing” the dynamically-created html output (whether it comes straight from the user, as in a rich text input, or is in the form of dynamically-built tags with attributes that echo user input).
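The “encode everything, then make specific exceptions” pattern can be sketched roughly as follows. This is an illustration in Python using only the standard library, not the API of any of the toolkits mentioned in this post; the function names and the set of allowed tags are hypothetical.

```python
import html
import re

# Hypothetical set of formatting tags that are allowed back through unencoded.
ALLOWED_TAGS = ("b", "i", "em", "strong")

# Matches an encoded opening or closing tag with no attributes,
# e.g. &lt;b&gt; or &lt;/b&gt;.
_ENCODED_TAG = re.compile(r"&lt;(/?)(%s)&gt;" % "|".join(ALLOWED_TAGS))


def render_rich_text(user_input: str) -> str:
    """Encode everything, then restore only bare whitelisted formatting tags."""
    encoded = html.escape(user_input, quote=True)
    return _ENCODED_TAG.sub(r"<\1\2>", encoded)


def render_link_title(user_input: str) -> str:
    """Encode user input placed inside a double-quoted attribute value."""
    return '<a href="/profile" title="%s">profile</a>' % html.escape(user_input, quote=True)


print(render_rich_text("<b>hi</b><script>alert(1)</script>"))
# <b>hi</b>&lt;script&gt;alert(1)&lt;/script&gt;

print(render_link_title('" onmouseover="alert(1)'))
# <a href="/profile" title="&quot; onmouseover=&quot;alert(1)">profile</a>
```

Because the exceptions are applied to the already-encoded text, a tag carrying attributes (say, a b tag with an onclick handler) never matches the pattern and stays encoded. The second function shows the attribute case: the quote character is encoded, so the input cannot break out of the title value.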
There are plenty of tools to help out with this kind of work. For .NET environments, you can use a combination of general Request validation and the Validation Server Controls to filter input, and the powerful Anti-Cross Site Scripting library to encode output. For PHP, the Open Web Application Security Project (OWASP) has the toolkit for you: PHP Filters on the input side, and either the Encoding Project’s Reform Library or the forthcoming PHP Anti-XSS Library on the output side.
These countermeasures, however, as good as they are, do nothing for the next frontier of XSS exploits: DOM-based XSS. These are “XSS attacks which do not rely on the payload embedded by the server in some response page”. This variety of XSS does not rely on access to the HTTP Entity, which becomes (via the DOM) the body object in the victim’s browser. Instead, the hole is in the fact that the DOM includes object space outside of the body, object space which is therefore populated by parts of the HTTP message that lie outside the Entity Body (for instance, in the URL and the HTTP headers).
Worse yet, the use of the number sign to insert a URL fragment into a document object’s parse tree can potentially prevent the XSS payload from even passing through the server side at all. There are, once again, many subtle variations on the following basic case:
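One such basic case can be sketched as follows. This is a minimal illustration in Python, assuming a hypothetical page at www.example.com whose client-side script writes part of its own URL into the document. The code only shows which part of the URL reaches the server and which part stays in the browser.

```python
from urllib.parse import urlsplit

# Hypothetical URL: the payload sits after the number sign, in the fragment.
url = "http://www.example.com/welcome.html#name=<script>alert(document.cookie)</script>"

parts = urlsplit(url)

# What a browser puts on the HTTP request line: path plus query, never the fragment.
request_target = parts.path + ("?" + parts.query if parts.query else "")

print(request_target)   # /welcome.html
print(parts.fragment)   # name=<script>alert(document.cookie)</script>
```

The server sees only a request for /welcome.html, so server-side input filters and output encoding never get a chance to inspect the payload. If the page’s own script then copies the fragment into the document without encoding it, the payload runs entirely on the client.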
Countermeasures? Well, the first recommendation (from the Web Application Security Consortium) for avoiding this new class of XSS vulnerabilities reminds us of the bleakness of the XSS FAQ’s advice to users: “Avoiding client side document rewriting, redirection, or other sensitive actions, using client side data. Most of these effects can be achieved by using dynamic pages (server side).” That’s fine, except that it sounds like a prescription to avoid Ajax altogether, which is most likely not a viable option going forward. Can anything more practical be done? Well, let’s put it this way: as soon as we hear of a viable way of combating this XSS vector, we’ll let you know.