First of all, remember that this is quite a broad area, and the goals and motivation of the organisation trying to control access to the internet will usually decide which mechanism is used. When I say organisation, this can of course be anything from the Sudanese Government or the Chinese internet police to the company you work for. All have different motivations for filtering your access to the internet, and their goals, and indeed their available resources, determine which mechanisms they use.
Sometimes this filtering will be quite open, sometimes it may be much more covert - in many situations users don't even know that they are being filtered.
Filtering TCP/IP Headers
An IP packet consists of two main parts - the header followed by the actual data in the packet. The header contains, among other fields, the destination IP address, i.e. where the packet is heading. One method that can be used to block access to web sites is to create a blacklist of IP addresses that should be inaccessible. So you'd load up all the IP addresses of the web sites you want to block and instruct your routers to drop all packets destined for these addresses.
It sounds as though this would work well, but in practice it is not a great solution. For a start, manually maintaining the list is difficult: many popular web sites have hundreds of IP addresses which change frequently, and keeping track of them is hard work. Conversely, a single IP address is often shared by many web sites, so blocking one site can end up blocking all of the others on that address as well.
It can also affect services other than web sites. For example, you can cause problems with email delivery if the IP addresses of mail servers are inadvertently blocked. This can be fixed, though, by blocking only specific ports (such as the usual web port, 80 for HTTP) rather than all traffic to the address.
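As a rough illustration of the blacklist-plus-port idea described above, here is a minimal Python sketch of the decision a filtering router makes for each packet. The addresses come from the reserved documentation ranges and the names BLOCKED_NETWORKS, BLOCKED_PORTS and should_drop are invented for this example; a real router would do this in its own rule language, not in Python.

```python
import ipaddress

# Hypothetical blacklist, using reserved documentation address ranges
# rather than the addresses of any real web site.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]

# Only web traffic is blocked, so mail (port 25) to the same address still works.
BLOCKED_PORTS = {80}


def should_drop(dst_ip: str, dst_port: int) -> bool:
    """Return True if a packet to dst_ip:dst_port should be dropped."""
    if dst_port not in BLOCKED_PORTS:
        return False
    addr = ipaddress.ip_address(dst_ip)
    return any(addr in network for network in BLOCKED_NETWORKS)


if __name__ == "__main__":
    print(should_drop("203.0.113.10", 80))  # blacklisted address, web port
    print(should_drop("203.0.113.10", 25))  # same address, mail port
    print(should_drop("192.0.2.1", 80))     # address not on the list
```

Note that the check only ever looks at the destination address and port, which is exactly why it cannot tell apart several web sites sharing one IP address.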
Filtering TCP/IP Content
Rather than blocking communication on the basis of where our TCP/IP packets are heading, content filtering looks at the data in the packet, not the header. An organisation may decide that blocking on the basis of keywords or specific requests is more desirable. This method can be more finely controlled, but there is a problem with implementation: new hardware is usually required, because a router would not normally look at the data in the packet, merely the destination in the header.
But even this is not straightforward. The full communication will be split over many, many packets, and as such your keywords may be spread across multiple data packets. To use this method properly, the whole communication stream must be reassembled, which adds complexity and has a big impact on speed.
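The problem of a keyword being split across packets can be shown with a small Python sketch. Everything here is made up for illustration: the banned word, the two segments, and the simplified idea of a segment as an (offset, payload) pair. Real TCP reassembly also has to cope with retransmissions, overlaps and missing segments, which this sketch ignores.

```python
# Hypothetical banned keyword list for the example.
BANNED_WORDS = [b"forbidden"]


def payload_matches(payload: bytes) -> bool:
    """Naive keyword check on a single chunk of bytes."""
    return any(word in payload for word in BANNED_WORDS)


def reassemble(segments) -> bytes:
    """Join (offset, payload) segments in offset order.

    Simplifying assumption: the segments are contiguous and do not overlap.
    """
    return b"".join(payload for _, payload in sorted(segments))


# One request split across two segments that arrive out of order.
segments = [
    (9, b"idden-page HTTP/1.1\r\n"),
    (0, b"GET /forb"),
]

# Checking each packet on its own misses the keyword...
per_packet_hit = any(payload_matches(payload) for _, payload in segments)

# ...while checking the reassembled stream finds it.
stream_hit = payload_matches(reassemble(segments))

if __name__ == "__main__":
    print("per-packet match:", per_packet_hit)
    print("reassembled match:", stream_hit)
```

The filter has to hold on to the segments until it can put them back together, and that buffering is where the extra complexity and the speed penalty come from.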
Internet Filtering with an HTTP Proxy
A method that is very commonly used is to ensure users cannot connect directly to the internet. It's fairly straightforward: you just route all access to the internet through a server known as a proxy server. This is what usually happens in most commercial and educational environments, where users' web browsers are configured to access web sites through the proxy server. Your browser sends the request for a web page to the proxy server, which then fetches it for you. There are advantages to this method beyond filtering; for instance, the proxy can cache commonly accessed web pages and speed up access for multiple users. But the benefit for internet filtering is that all the requests for web pages come through a single point, the proxy server, so it's much easier to block content by filtering access to specific web sites and even individual pages.
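Because the proxy sees the full URL of each request, its filtering decision can be sketched in a few lines of Python. The block lists and the function name proxy_allows are hypothetical, and example.com is just a placeholder domain; real proxy products have their own configuration formats.

```python
from urllib.parse import urlsplit

# Hypothetical block lists for the example.
BLOCKED_HOSTS = {"www.facebook.com"}
BLOCKED_PATHS = [("example.com", "/games/")]  # (host, path prefix) pairs


def proxy_allows(url: str) -> bool:
    """Decide whether the proxy should fetch this URL for the user."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()

    # Block whole web sites by host name.
    if host in BLOCKED_HOSTS:
        return False

    # Block individual pages or sections of an otherwise allowed site.
    for blocked_host, prefix in BLOCKED_PATHS:
        if host == blocked_host and parts.path.startswith(prefix):
            return False

    return True


if __name__ == "__main__":
    for url in (
        "http://www.facebook.com/home",
        "http://example.com/games/farm",
        "http://example.com/news",
    ):
        print(url, "->", "allowed" if proxy_allows(url) else "blocked")
```

Unlike the IP blacklist, this works on host names and paths, so it does not matter how many IP addresses a site has or which other sites share them.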
The Proxy Wars
One of the reasons that internet filtering using an HTTP proxy is not completely effective is, ironically, the other use of proxy servers. If your work HTTP proxy is happily blocking access to www.facebook.com, and your social life is suffering or your farm needs tending, then you can access these web sites by using an outside HTTP proxy. So instead of sending a request for the Facebook page directly, you ask the external proxy to fetch it, thereby bypassing the work proxy's ban on Facebook.
There are in fact some other methods of filtering internet content, such as using DNS, which I've covered in many other posts here, but perhaps the most effective method you'll find now combines HTTP proxy filtering with TCP/IP content filtering. This dual method, used by companies such as Websense, combines a web proxy with hardware devices which sniff the network traffic for banned URLs and words.
Increasingly, just using a simple proxy to beat internet filtering isn't enough, as most organisations are adopting some form of content filtering which looks inside the packets of data. Just directing your browsing through a proxy will have no effect whatsoever on this technology.
So how do you beat Internet Filtering?
So what do you do if you're faced with a sophisticated content filtering system like Websense or Cleanfeed? Well, there is a single simple technology that can be used to bypass this filtering, and that's encryption. You see, if your entire data stream is encrypted, a content filter can't read any of it - no keywords, no URLs, nothing. Combine this encryption with a proxy network and you can beat internet filtering just about anywhere.
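To see why encryption defeats a keyword filter, here is a toy Python sketch. The cipher is a home-made XOR stream built from SHA-256 purely for demonstration; it is an assumption of this example, it is not secure, and it is not what any real product uses (real tools rely on established protocols such as TLS). The key, the banned word list and the function names are all invented.

```python
import hashlib

# Hypothetical banned keyword list used by the content filter.
BANNED_WORDS = [b"facebook"]


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 of key plus a counter. Demonstration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def filter_blocks(payload: bytes) -> bool:
    """The content filter: look for banned words in the raw bytes."""
    return any(word in payload for word in BANNED_WORDS)


key = b"shared-secret-with-the-proxy"  # made-up key for the example
request = b"GET http://www.facebook.com/ HTTP/1.1"
encrypted = xor_cipher(key, request)

if __name__ == "__main__":
    print("plain request blocked:", filter_blocks(request))
    print("encrypted request blocked:", filter_blocks(encrypted))
    print("proxy recovers request:", xor_cipher(key, encrypted) == request)
```

The filter in the middle only ever sees the scrambled bytes, while the proxy at the far end, which holds the key, can recover the original request and fetch the page.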
If you want to know a product that does this, then check out Identity Cloaker. It has the technology to encrypt and cloak your connection, plus access to a large network of proxies all across the world - everywhere from the US and UK to a German proxy. You can use it for complete privacy, to bypass internet censorship, or even just to watch Hulu, BBC iPlayer or any media site you like, as the proxy will get you past the country restrictions. You can even run it from your USB key and take it wherever you go.