Great Firewall Filtering Revealed

Researchers at the University of Cambridge have analyzed how the PRC’s Great Firewall (GFW) “blocks” or interrupts web page loading midstream when it detects sensitive keywords related to the day after June 3 and certain religious groups. What they discovered is quite surprising: the mechanism is simple and clever, but at the same time quite straightforward to circumvent. Read on for a layman’s explanation of the technical paper.

For the non-techie, the simple explanation is that the GFW sends a “TCP reset” packet to both the web server supplying the suspicious page and the client (i.e. your computer) loading it. It’s the equivalent of an “emergency stop” packet, normally reserved for situations where something has gone wrong with a connection, so that both sides know to disconnect abruptly.
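To make the “TCP reset” a little more concrete, here is a minimal Python sketch of what such a packet looks like at the byte level. A reset is an ordinary TCP header with one flag bit (RST) switched on. The helper names `build_tcp_header` and `is_reset` are my own, the checksum is left at zero, and nothing here is taken from the researchers’ paper; it only illustrates the standard header layout.

```python
import struct

# TCP flag bits as laid out in the standard TCP header.
FIN, SYN, RST, PSH, ACK = 0x01, 0x02, 0x04, 0x08, 0x10


def build_tcp_header(src_port, dst_port, seq, ack, flags, window=0):
    """Pack a minimal 20-byte TCP header (hypothetical helper).

    The checksum and urgent pointer are left at zero, so this is for
    illustration only and is not a packet a real stack would accept.
    """
    data_offset = 5 << 4  # header length: five 32-bit words, no options
    return struct.pack("!HHIIBBHHH", src_port, dst_port, seq, ack,
                       data_offset, flags, window, 0, 0)


def is_reset(header):
    """Return True when the RST bit is set in a raw TCP header."""
    flags = header[13]  # byte 13 of the header holds the flag bits
    return bool(flags & RST)


reset = build_tcp_header(80, 51000, seq=1000, ack=0, flags=RST)
data = build_tcp_header(80, 51000, seq=1000, ack=2000, flags=ACK | PSH)
print(is_reset(reset), is_reset(data))
```

An endpoint that receives a header like `reset` for a live connection normally tears the connection down at once, which is exactly the behavior the GFW relies on.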

It appears the GFW uses this technique so that it can stymie the loading of pages without having to actively decide which subsequent packets to drop by correlating them with previous ones. In techie terms, the stored history of what has been sent and received is called “state information,” as in the technical state of affairs the router must keep track of. (This is not to be confused with State information as with “state secrets” or “enemies of the state”!)

I say it is clever because this means you need far fewer computers, and far less processing power and memory, to implement effective blocking. In fact, GFW operators could use off-the-shelf Cisco (or whatever) routers with no modified firmware whatsoever, and just have a set of machines sit on the side detecting keywords and sending out “TCP resets.” Simple, effective, and with a low impact on network engineering.
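The “machine on the side” idea can be sketched in a few lines of Python. This is a toy model, not the GFW’s actual code: the `Packet` class, the `inspect` function, the placeholder keyword, and the choice of one reset per direction are all assumptions of mine for illustration. The point it shows is that each packet is examined on its own, with no stored history, and that the original packet is never dropped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Toy stand-in for a TCP segment (hypothetical, heavily simplified)."""
    src: str
    dst: str
    seq: int
    ack: int
    payload: bytes = b""
    rst: bool = False


# Placeholder keyword list; the real list is not public.
BLOCKED_KEYWORDS = [b"forbidden-term"]


def inspect(packet):
    """Examine one packet in isolation and return forged resets to inject.

    No per-connection history is kept, so the inspector is stateless.
    The caller forwards `packet` unchanged whatever this returns.
    """
    if not any(word in packet.payload for word in BLOCKED_KEYWORDS):
        return []
    next_seq = packet.seq + len(packet.payload)
    return [
        # Sent to the receiver, pretending to come from the sender.
        Packet(packet.src, packet.dst, seq=next_seq, ack=packet.ack, rst=True),
        # Sent to the sender, pretending to come from the receiver.
        Packet(packet.dst, packet.src, seq=packet.ack, ack=next_seq, rst=True),
    ]


request = Packet("10.0.0.1", "203.0.113.5", seq=100, ack=500,
                 payload=b"GET /forbidden-term HTTP/1.1")
for forged in inspect(request):
    print(forged.src, "->", forged.dst, "RST")
```

Because `inspect` only reads a copy of the traffic and emits extra packets, the main router never has to make a drop decision, which is why ordinary unmodified routers would suffice.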

Well, the researchers realized that because this “TCP reset” was the sole mechanism for cutting off the content, the page data (sensitive information and all) was still being sent all the way through to your client computer in the PRC! But because of the “TCP reset,” the client was simply shutting down reception of those packets, so the Web browser never got the content. That is, the packets were actually travelling down the cable (or over Wi-Fi) to your location in the PRC, but the computer was ignoring them.

So in their tests, they asked: what if we simply instructed the computer to ignore the “TCP reset” and keep loading? Would it work? The answer is: yes. From their blog:

…the keyword detection is not actually being done in large routers on the borders of the Chinese networks, but in nearby subsidiary machines. When these machines detect the keyword, they do not actually prevent the packet containing the keyword from passing through the main router (this would be horribly complicated to achieve and still allow the router to run at the necessary speed). Instead, these subsidiary machines generate a series of TCP reset packets, which are sent to each end of the connection. When the resets arrive, the end-points assume they are genuine requests from the other end to close the connection – and obey. Hence the censorship occurs.

However, because the original packets are passed through the firewall unscathed, if both of the endpoints were to completely ignore the firewall’s reset packets, then the connection will proceed unhindered! We’ve done some real experiments on this — and it works just fine!! Think of it as the Harry Potter approach to the Great Firewall — just shut your eyes and walk onto Platform 9¾.
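The effect the researchers describe can be mimicked with a small Python simulation. This is only a sketch under my own assumptions: the `Endpoint` class and `simulate` function are invented for illustration, a single forged reset stands in for the “series” mentioned in the quote, and only the client side is modelled (in reality the server must ignore its resets too).

```python
class Endpoint:
    """Toy model of one end of a TCP connection (hypothetical)."""

    def __init__(self, ignore_resets=False):
        self.ignore_resets = ignore_resets
        self.open = True
        self.received = []

    def deliver(self, payload=b"", rst=False):
        if not self.open:
            return  # connection already closed: later data is discarded
        if rst:
            if not self.ignore_resets:
                self.open = False  # normal behavior: obey the reset
            return  # a reset carries no page data either way
        self.received.append(payload)

    def page(self):
        return b"".join(self.received)


def simulate(ignore_resets):
    """Feed the same arrival sequence to a client and return what it kept."""
    client = Endpoint(ignore_resets)
    arrivals = [
        (b"<html>", False),
        (b"sensitive text", False),  # the keyword packet passes through
        (b"", True),                 # forged reset arrives just behind it
        (b" rest of page</html>", False),
    ]
    for payload, rst in arrivals:
        client.deliver(payload, rst)
    return client.page()


print(simulate(ignore_resets=False))
print(simulate(ignore_resets=True))
```

With the default behavior the client keeps only what arrived before the reset; with `ignore_resets=True` it ends up with the whole page, because every data packet was delivered all along.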

Cool results. One problem – you need both the Web server and the client to ignore “TCP reset” packets to make this workaround effective. The researchers have suggested that making this change to the “TCP/IP stack” (the networking code in routers and operating systems) is desirable anyway, and they’re probably right. But it’s quite a tall order to get Microsoft, Apple, Palm, Symbian, and all the other folks with IP networking in their OSes to change. (Interestingly, with open source software like Linux, a patch and recompile of the kernel to do this is quite simple.)

Nevertheless, this does provide some insight into how the GFW manages to be effective in keyword blocking given how much traffic the PRC Internet chokepoints have to handle. It’s the network filtering equivalent of Occam’s Razor – the simplest and most straightforward (and low impact) implementation is the most likely.

Researcher Richard Clayton was hopeful about the impact of this discovery:

…the key point is that changing the TCP/IP stacks to ignore the firewall is almost a no-brainer for the vendor. There are excellent technical reasons for discarding the firewall’s resets as a matter of course. If stack builders did this as standard, then an entire Great Firewall of China mechanism entirely fails to work. That can only, in my view, be a good result.

[Hat tip to: Bruce Schneier]

18 thoughts on “Great Firewall Filtering Revealed”

  7. Using this technology to prevent viruses and attacks on our computer systems is great as long as this technology is not used to limit our freedoms.

  8. So is this kind of like a php redirect that looks for a certain keyword phrase? If so, that would be pretty neat and clever at the same time. I have used PHP redirects before but it’s usually in reference to a static URL.

    Shane
    site owner at Concrete Driveway Cost

  9. My spouse and I stumbled over here from a different page and thought we might check things out. I like what I see so now I’m following you. Look forward to checking out your web page for a second time.

  10. I have been browsing online more than 3 hours today, yet I never found any interesting article like yours. It’s pretty worth enough for me. In my opinion, if all webmasters and bloggers made good content as you did, the web will be a lot more useful than ever before.
