July 09, 2009

Examples of the information collected from SSL handshakes

I've received an email or two asking me about the information I collected using mod_sslhaf, so I decided to make it available for everyone. Here it is:

The file contains a list of unique user agents seen on SSL Labs, each with information on the handshake they used and the protocols and cipher suites they offered to use. For example:

Mozilla/5.0 (iPhone; U; CPU iPhone OS 2_2_1 like Mac OS X; en-us) AppleWebKit/525.18.1 \
(KHTML, like Gecko) Version/3.1.1 Mobile/5H11 Safari/525.20
Handshake: h3
Protocol: 03.01

TLS_RSA_WITH_AES_128_CBC_SHA (0x2f)
TLS_RSA_WITH_RC4_128_SHA (0x05)
TLS_RSA_WITH_RC4_128_MD5 (0x04)
TLS_RSA_WITH_AES_256_CBC_SHA (0x35)
TLS_RSA_WITH_3DES_EDE_CBC_SHA (0x0a)
TLS_RSA_WITH_DES_CBC_SHA (0x09)
TLS_RSA_EXPORT_WITH_RC4_40_MD5 (0x03)
TLS_RSA_EXPORT_WITH_DES40_CBC_SHA (0x08)
TLS_RSA_WITH_NULL_MD5 (0x01)
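
Records in this layout are easy to process mechanically. The following is a minimal sketch in Python, assuming every record follows the layout of the example above (a User-Agent line that may be wrapped with a trailing backslash, the Handshake and Protocol fields, then one cipher suite per line with its code in parentheses). The helper name `parse_record` is hypothetical and is not part of mod_sslhaf.

```python
import re

# One cipher suite per line: a name followed by its hex code in parentheses.
SUITE_RE = re.compile(r"^(\w+)\s+\((0x[0-9a-fA-F]+)\)$")


def parse_record(text):
    """Parse one record (hypothetical helper; layout assumed from the example)."""
    record = {"user_agent": "", "handshake": None, "protocol": None, "suites": []}
    ua_parts = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Handshake:"):
            record["handshake"] = line.split(":", 1)[1].strip()
        elif line.startswith("Protocol:"):
            record["protocol"] = line.split(":", 1)[1].strip()
        elif record["handshake"] is None:
            # Everything before the Handshake field is the User-Agent,
            # possibly wrapped with a trailing backslash.
            ua_parts.append(line.rstrip("\\").strip())
        else:
            match = SUITE_RE.match(line)
            if match:
                record["suites"].append((match.group(1), int(match.group(2), 16)))
    record["user_agent"] = " ".join(ua_parts)
    return record


SAMPLE = r"""
Mozilla/5.0 (iPhone; U; CPU iPhone OS 2_2_1 like Mac OS X; en-us) AppleWebKit/525.18.1 \
(KHTML, like Gecko) Version/3.1.1 Mobile/5H11 Safari/525.20
Handshake: h3
Protocol: 03.01

TLS_RSA_WITH_AES_128_CBC_SHA (0x2f)
TLS_RSA_WITH_RC4_128_SHA (0x05)
"""

record = parse_record(SAMPLE)
print(record["handshake"], record["protocol"], [hex(code) for _, code in record["suites"]])
```

Keeping the suites in the order offered matters, because the order itself is part of what distinguishes one client from another.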

The information gives insight into how SSL is used in real life, but it's not reliable enough to support any conclusions about individual clients. There are several problems I need to solve:

  1. Parse User-Agent fields to group related clients.
  2. Record request IP addresses in order to be able to verify the search engine clients are who they say they are.
  3. Record request IP addresses to use them as a mechanism to determine forged User-Agent fields.
  4. Deploy mod_sslhaf to multiple high-traffic sensors, in order to further minimise the possibility of using forged User-Agent fields.
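
As a rough illustration of the first item, related clients could be grouped by matching User-Agent strings against a small list of family patterns. This is only a sketch: the pattern list, the family names and the helpers `family_of` and `group_by_family` are all hypothetical, and a real classifier would need far more patterns.

```python
import re
from collections import defaultdict

# Hypothetical family patterns, checked in order; the first match wins.
# A capture group, when present, is appended to the family name.
FAMILY_PATTERNS = [
    ("Googlebot", re.compile(r"Googlebot|Feedfetcher-Google")),
    ("iPhone Safari", re.compile(r"\(iPhone;.*Safari/")),
    ("Firefox", re.compile(r"Firefox/(\d+)")),
]


def family_of(user_agent):
    """Return a coarse family label for a User-Agent string."""
    for name, pattern in FAMILY_PATTERNS:
        match = pattern.search(user_agent)
        if match:
            return name + (" " + match.group(1) if match.groups() else "")
    return "Other"


def group_by_family(user_agents):
    """Map each family label to the User-Agent strings that fall under it."""
    groups = defaultdict(list)
    for user_agent in user_agents:
        groups[family_of(user_agent)].append(user_agent)
    return dict(groups)
```

Grouping of this kind trusts the User-Agent field, so it only becomes useful once forged fields can be filtered out, which is what the remaining items are about.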

July 02, 2009

The analysis of Googlebot's frugal cipher suite list

Two weeks ago, I announced SSL Labs and my technique for passive SSL cipher suite analysis. It won’t surprise you to learn that I've been carefully observing the cipher suites used in the requests that came to the web site since. (In fact, I announced the site slightly earlier than I had planned because I wanted to get my hands on some real-life data.) One client’s SSL fingerprint immediately caught my attention, because it supported only 4 cipher suites. It was Googlebot.

There were 115 visits from Googlebot in the two-week period, using 5 different User-Agent strings (although Googlebot will sometimes send a request without User-Agent set):

  • Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
  • SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
  • DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
  • Googlebot-Image/1.0
  • Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; feed-id=9430846974815548184)

The first one is by far the most common, although the other ones appear on a regular basis. I used reverse DNS to verify that the IP addresses belong to Google, with the exception of one Feedfetcher request, for which I had to use ARIN.
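
A reverse DNS check of this kind can be automated. The sketch below is not the procedure I followed by hand; it adds a forward lookup to confirm the reverse result, and it assumes that genuine crawler hostnames end in googlebot.com or google.com. The function name `is_google_address` is hypothetical, and the two resolver parameters exist only so the logic can be exercised without touching the network.

```python
import socket

# Assumption: hostnames of genuine Google crawlers end in one of these suffixes.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_google_address(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check for a single IP address."""
    try:
        # gethostbyaddr returns (hostname, aliases, addresses).
        hostname = reverse(ip)[0]
    except OSError:
        return False  # no PTR record, or the lookup failed
    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False
    try:
        # The hostname must resolve back to the address we started with;
        # otherwise anyone controlling their own PTR record could pass.
        addresses = forward(hostname)[2]
    except OSError:
        return False
    return ip in addresses
```

Addresses without a usable PTR record, like the Feedfetcher request mentioned above, fail this check and still need a manual lookup.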

As I’ve already mentioned, Googlebot's SSL fingerprint is quite short:

h2,03.01,010080,04,05,0a

The first token indicates the version of the SSL handshake used. In this case it’s h2, which is a code for the SSL v2 handshake. The second token indicates the highest SSL version a client is willing to support. Googlebot’s choice, 03.01, indicates that it is willing to go as far as TLS v1.0. Modern browsers do not support SSL v2.0, so it's generally rare to see a browser use an SSL v2 handshake. Search engines don’t care about security but they do care about accessing as many servers as possible: they’ll compromise and support the weaker protocols.

What follows is the most interesting part: the codes for only 4 cipher suites. They are:

  • SSL_CK_RC4_128_WITH_MD5 (0x010080)
  • SSL_RSA_WITH_RC4_128_MD5 (0x04)
  • SSL_RSA_WITH_RC4_128_SHA (0x05)
  • SSL_RSA_WITH_3DES_EDE_CBC_SHA (0x0a)
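
Splitting a fingerprint string into its parts takes only a few lines. This is a minimal sketch, assuming the comma-separated layout shown above and assuming the two protocol fields are hexadecimal bytes; `parse_fingerprint` is a hypothetical helper, and the lookup table covers only the four suites listed here.

```python
# Suite codes taken from the list above; anything else is reported as unknown.
SUITE_NAMES = {
    "010080": "SSL_CK_RC4_128_WITH_MD5",
    "04": "SSL_RSA_WITH_RC4_128_MD5",
    "05": "SSL_RSA_WITH_RC4_128_SHA",
    "0a": "SSL_RSA_WITH_3DES_EDE_CBC_SHA",
}

# Protocol versions as (major, minor); 03.01 is TLS v1.0, as noted in the text.
PROTOCOL_NAMES = {(3, 0): "SSL v3.0", (3, 1): "TLS v1.0"}


def parse_fingerprint(fingerprint):
    """Split 'handshake,protocol,suite,suite,...' into named parts."""
    handshake, protocol, *codes = fingerprint.strip().split(",")
    # Assumption: the two protocol fields are hexadecimal bytes.
    version = tuple(int(part, 16) for part in protocol.split("."))
    return {
        "handshake": handshake,
        "protocol": version,
        "protocol_name": PROTOCOL_NAMES.get(version, "unknown"),
        "suites": [SUITE_NAMES.get(code.lower(), "unknown (0x%s)" % code) for code in codes],
    }


googlebot = parse_fingerprint("h2,03.01,010080,04,05,0a")
print(googlebot["handshake"], googlebot["protocol_name"], googlebot["suites"])
```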

The first suite is only valid with SSL v2.0, while the three remaining ones work in SSL v3.0 or TLS v1.0. It's obvious that, unlike with most other SSL clients, the cipher suites on this list were hand-picked. If I had to guess, I would say that the motivation was to save on bandwidth. It’s likely that all SSL v2.0 servers support the one SSL v2.0 cipher suite, while 3 suites are needed to support the rest of the Internet.

Assuming the reason for such a short list of cipher suites is frugality, I am surprised it doesn’t contain suites with weaker ciphers. A search bot doesn’t really care about security so it could afford to negotiate a weaker cipher and perhaps save some CPU cycles. Similarly, 3DES is significantly slower (than, for example, RC4) so it would be my first candidate for removal if I were concerned with performance. Thus, I am guessing it’s there for interoperability.

It would be interesting to get someone from Google to comment.

Interestingly, my net caught one search engine imposter, who claimed he was Googlebot, but wasn't. While I could have also used a reverse DNS lookup to determine what the imposter wasn’t, in this case I was also able to identify what it was: someone browsing the Internet using a Firefox 2.x browser with an altered User-Agent field. Nice!

July 01, 2009

Improved handling of SSL warnings in Firefox 3.5

Slightly over a year ago I discussed the SSL certificate error handling in Firefox. Where Firefox 2.x allows users to simply click through a warning about an invalid SSL connection, Firefox 3.0.x improves the handling and makes it difficult to access the invalid web site.

My blog post turned out to be quite popular, sparking a lively discussion, which spilled onto Mozilla's Bugzilla when I filed two bug reports for Firefox:

  1. Exceptions for invalid SSL certificates are too easy to add
  2. Handling of invalid SSL certificates lacks in usability

The first bug report was rejected after a short discussion (still, I was happy to have been heard), but the second lingered on and, one year later, resulted in a change in how Firefox handles invalid SSL certificates. In Firefox 3.5, when you encounter an invalid SSL web site, you get a screen similar to this one:

Notice the improved language. The message now says "[...] we can't confirm that your connection is secure", instead of "[a site] uses an invalid security certificate" (followed by technical mumbo-jumbo). Clicking the two headings at the bottom uncovers the hidden areas, which contain more information and the button to create an exception:

June 17, 2009

HTTP client fingerprinting using SSL handshake analysis

My first SSL Labs project is about HTTP client fingerprinting when SSL is used. A cipher suite, in SSL, is a collection of cryptographic techniques that defines a secure communication channel. There are hundreds of cipher suites, and they are all built out of a dozen or so basic building blocks: key exchange, encryption and integrity validation algorithms. Different programs often use different cipher suites. By observing the list of supported cipher suites one can determine the maximal communication strength, and often even guess the make of the SSL client on the other side.

Possible uses:

  • System administrators can make informed decisions about which cipher suites to enable in the SSL servers they maintain. The general goal is to disable as many of the weak(er) cipher suites as possible, while leaving enough to still make it possible for users to connect.
  • Cross-checking the supported cipher suites with the HTTP client identity offered in the User-Agent header may help uncover some automated attack tools that masquerade as browsers. Although the cipher suites used by an SSL client are completely under its control, such evasive actions require a new layer of SSL expertise. (Whereas it is trivial to spoof the contents of the User-Agent header.)
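
The cross-check in the second point could be sketched as follows. This is an illustration under stated assumptions rather than what the Apache module does: the table `KNOWN_FINGERPRINTS` and the helpers `claimed_family` and `looks_spoofed` are hypothetical, fingerprints are represented as compact comma-separated strings, and the only entry is the Googlebot fingerprint reported elsewhere on this page.

```python
# Hypothetical table of fingerprints observed for each client family.
# The single entry is the Googlebot fingerprint quoted elsewhere on this page.
KNOWN_FINGERPRINTS = {
    "Googlebot": {"h2,03.01,010080,04,05,0a"},
}


def claimed_family(user_agent):
    """Return the known family named in the User-Agent field, if any."""
    lowered = user_agent.lower()
    for family in KNOWN_FINGERPRINTS:
        if family.lower() in lowered:
            return family
    return None


def looks_spoofed(user_agent, fingerprint):
    """True if the claimed family never uses this fingerprint,
    False if the two are consistent, None if the family is unknown."""
    family = claimed_family(user_agent)
    if family is None:
        return None
    return fingerprint not in KNOWN_FINGERPRINTS[family]
```

A mismatch is only a hint. A client's fingerprint can change between versions, so the table has to be built from verified observations and kept up to date.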

The proof of concept is an Apache module, which monitors SSL handshakes to extract the supported cipher suites. This approach is quite handy, because you can add the fingerprinting facility to your existing installation and suffer negligible overhead.

June 16, 2009

Security researchers ask Google to enable SSL encryption by default

A group of 38 researchers and privacy advocates sent a letter to Google asking it to enable SSL encryption by default in all its applications. Google has had the always-use-SSL option for a while but, since the feature is disabled by default, only a small number of users is taking advantage of it. Google's response was somewhere along the lines of "we'll give the users security... eventually... if we must".

Although the performance overhead of SSL is negligible for most web sites, the price of security is likely to be significant in Google's case, considering the size of its user base.

June 15, 2009

SSL Labs launches

Recently I've found myself spending more and more time not only thinking about SSL, but also working on several SSL-related projects. SSL is a remarkable technology that is not given nearly as much credit as it deserves. I think this is at least partially because SSL is so commonplace today that people take it for granted. Compared to other security technologies, it is also reasonably easy to configure. But therein lies the danger: SSL is so easy to use that most people have stopped thinking about SSL.

I think there's a large gap between how SSL is used today and how it should be used. Actually, I think we first need to make an effort to understand how SSL is actually used today in order to build on that knowledge. With that in mind, I decided to start a new web site and use it as a launching point for my SSL-related projects. Fast-forward several months; today, I am happy to announce SSL Labs, which has just launched.

My initial work on HTTP client fingerprinting using SSL handshake analysis is already there (I will write more about it in subsequent posts), but I have several other projects, which I will publish in the following weeks.

May 15, 2009

The death of dual-licensing as a commercial open source strategy

We are witnessing an interesting development: an alliance is being formed to execute a hostile take-over of a successful open source project. Yes, I am talking about MySQL. From the press release:

"Our goal with the Open Database Alliance is to provide a central clearinghouse for MySQL development, to encourage a true open development environment with community participation, and to ensure that MySQL code remains extremely high quality," noted Monty. "Participating members at this stage in the 'Alliance' will have a strong voice in how the organization is structured, and we look forward to collaborating with anyone in the industry that provides or depends on MySQL."

Dual-licensing has been a favourite commercial open source strategy for many, but what we are seeing now may be signalling the end of its popularity. The conventional wisdom, until now, was that people would not fork an open source project for as long as its (commercial) owner did a decent job at maintaining it. Now we see that, once a project reaches a certain level of popularity, and the right mix of commercial and personal interests exists, the fork happens anyway. The community takes over, abandoning the project's commercial "host" and moving the code into a new phase of development.

One must not forget, on the other hand, that this is not the first time MySQL has been forked. You did not hear about those other forks simply because they were small in scope and generally uninteresting to a wider audience. They did not endanger the project. This time, with one of MySQL's founders participating in the forking effort, there is a real possibility that the fork starts to be perceived as the main development branch.

How did MySQL become so successful?

We often talk about business models, technology, open source and other similar topics that are unavoidable for anyone interested in starting a business today, but we sometimes forget the real reason why products become wildly successful. It's actually rather simple:

  1. You have to have something that people really need.
  2. You have to essentially be the only choice on the market.

Get those two things right, and everything else will follow. MySQL was there, in the right place and at the right time, to fill a critical gap that existed back in the early days of the Web: everyone needed a lightweight database engine that could be used to power Web sites. The fact that MySQL was not open source [1] did not matter, and neither did the fact that MySQL lacked many of the features needed to be a proper database [2]. The features MySQL did have were right for the job and so people used it.


Footnotes:

  1. MySQL was initially free for end-users, but you had to pay for redistribution; ISPs were in a grey zone.
  2. The message from MySQL was, rather amusingly, that only wimps needed transactions (paraphrasing).

March 30, 2009

Security is difficult; open source security sometimes even more so

I have prepared a presentation on Open Source security for the Open Source Specialist Group of British Computer Society (BCS OSSG):

The main aim of the presentation is to give an overview of the current state of security in open source projects. I discuss why security is difficult (hint: it's because few people care), and why security in open source is sometimes even more difficult. At the end, I give a simple 3-point strategy for quick evaluation of the security posture of open source projects.

March 27, 2009

ModSecurity training at OWASP AppSec Europe 2009

Looking at the OWASP AppSec Europe 2009 schedule, I was happy to notice the attendees will be offered a ModSecurity training course. I taught the ModSecurity training course at AppSec Europe last year, but that was essentially a vendor-given session. This year, however, we have a member of the community, Christian Folini (also known as the author of REMO), teaching the course, and I think that's a significant step forward!

ABOUT ME

Ivan Ristić is an open source advocate, entrepreneur, writer, programmer and web security specialist. He is the principal author of ModSecurity, the open source web application firewall, and the author of Apache Security, a concise yet comprehensive web security guide for the Apache web server.
