HTTP parser for intrusion detection and web application firewalls
For a couple of months now I've been working on a new HTTP parser library, which I am designing for use in intrusion detection systems and web application firewalls. I suppose that "HTTP parser" is not really an adequate name for this library, because it sounds narrow in scope. In truth, the library will cover all the protocols and encodings used in web applications.
The first user of the parser will be the Open Information Security Foundation (OISF), which is currently building a new IDS from scratch (first release expected on December 31st). The parser itself is going to be released under an open source licence and supported long term.
The biggest challenge with a parser like that is the desire to support an entirely passive mode. Whereas normal parsers are free to interpret the input stream in any way they please, as long as they appear to get the job done, a passive parser must be able to decipher traffic intended for multiple web servers, and thus also needs to be aware of the quirks in their processing. Also, without the ability to terminate traffic, opportunities for evasion are rife. The really interesting part of this project is figuring out all the possible ways to evade the parser. I think this is the first time that I will have the time to think like an attacker for as long as I need to do the job properly. I am currently experimenting with the idea of parser personalities, whereby the user is allowed to tweak exactly how the parser behaves on a per-connection basis. This approach makes it possible to use one set of rules for an Apache web server, and another for an IIS web server.
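To make the personality idea a little more concrete, here is a minimal sketch in Python. It is only an illustration and not the library's API: the names `Personality`, `PERSONALITY_BY_SERVER`, `personality_for` and `normalize_path` are hypothetical, the server addresses are made up, and the two quirks shown (treating a backslash as a path separator and ignoring case in paths) are simplified assumptions about how an IIS-style server may differ from an Apache-style one.

```python
from dataclasses import dataclass
from urllib.parse import unquote


@dataclass(frozen=True)
class Personality:
    """Hypothetical bundle of parsing quirks for one kind of web server."""
    name: str
    backslash_is_separator: bool  # treat '\' in the path like '/'
    case_insensitive_path: bool   # fold the path to lower case


# Illustrative settings only; real servers have many more quirks than these.
APACHE = Personality("apache", backslash_is_separator=False,
                     case_insensitive_path=False)
IIS = Personality("iis", backslash_is_separator=True,
                  case_insensitive_path=True)

# Assumed deployment map: which personality applies to which protected server.
PERSONALITY_BY_SERVER = {
    "10.0.0.5": APACHE,
    "10.0.0.6": IIS,
}


def personality_for(server_ip: str) -> Personality:
    """Pick the personality for a connection from its destination address."""
    # Falling back to APACHE is an arbitrary choice made for this sketch.
    return PERSONALITY_BY_SERVER.get(server_ip, APACHE)


def normalize_path(raw_path: str, personality: Personality) -> str:
    """Approximate the path as the destination server would interpret it."""
    path = unquote(raw_path)  # decode %XX escapes a single time
    if personality.backslash_is_separator:
        path = path.replace("\\", "/")
    if personality.case_insensitive_path:
        path = path.lower()
    return path


if __name__ == "__main__":
    request_path = "/Admin%5Cconfig.ASP"
    for ip in ("10.0.0.5", "10.0.0.6"):
        p = personality_for(ip)
        print(ip, p.name, normalize_path(request_path, p))
```

The point of the sketch is that the same bytes on the wire can mean different things depending on which server receives them, so a passive parser has to pick its interpretation per connection rather than globally.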
For the first release of the parser the goal is to be able to parse HTTP streams reliably. In subsequent versions I will work on the parser's security properties (such as the ability to see through evasion attacks).
A couple of weeks ago, at DeepSec in Vienna, I gave a lightning talk about my work. Matt Jonkman kindly allowed me to use some of the time of his own talk. I am attaching the slides here: