Transparent Web Proxying with Cisco, Squid, and WCCP

Kerry Thompson, July 2010

Contents

      Introduction
      A Basic Network and Web Proxy
      Cisco Configuration
      Squid Configuration
      Linux Network Configuration
      Testing
      Closing Notes
      References

Introduction

There are a number of good reasons for companies to deploy proxies for user access to the Internet. Amongst these are:
  • Monitoring of web sites and traffic volumes
  • Restricting web access - by user, web sites, time of day, etc.
  • Using caching to reduce traffic volumes
  • Managing bandwidth

There are also a number of challenges faced when implementing proxies. Probably the top one is the job of configuring all of the web browsers to use the proxy, and then comes the problem of what to do if the proxy fails.

This article presents a web proxying solution which is transparent to the end-user - it requires no browser configuration. It is also resilient to failure, in that if the proxy server fails then web access continues to be provided without disruption.

A Basic Network and Web Proxy

In the network drawing below I show a basic network with access to the Internet. This is a very common configuration for small business networks.



Figure 1: A basic network and proxy

In larger security-conscious organisations it is necessary to protect the proxy server against attack and misuse. This is usually performed by connecting the proxy into a DMZ network as shown in the next drawing.



Figure 2: A basic network with DMZ-protected proxy

A common solution for transparent proxying is to pass all outbound traffic through a server which detects web access and redirects each request to an internal proxy. This has a number of problems, not least that it can't support multiple proxies and that when the server fails, all web access fails along with it.

WCCP Overview

Most Cisco routers support a protocol called Web Cache Communication Protocol, or WCCP. This protocol is used by a proxy server, such as a Linux server running the Squid proxy, to tell the router that it is alive and ready to process web access requests. WCCP uses the UDP protocol on port 2048. The exchange is driven by the proxy: it announces itself to the router at regular intervals, and the router acknowledges.



Figure 3: WCCP between the proxy and router

WCCP has a number of advantages when used between a proxy and the gateway router.
  • You can have multiple proxy servers. In fact, you can have almost any number if your router is big enough to handle them. This means that in a large organisation the load is spread amongst them, improving performance.
  • Access is resilient to failure. If a proxy fails, the router stops receiving its WCCP messages and starts using another (if you've got more than one configured); otherwise it stops using proxies and forwards requests directly to the Internet. The router can also be configured to block Internet web access if there are no running proxies available.
  • Consistent distribution of requests. When you have more than one proxy, the router hashes each request (by default on the destination IP address) to choose a proxy. Once a user has requested a web page and it has been cached by a proxy, the next request from any user for the same site is sent to the same proxy, which holds the cached copy.
One caveat to note here: WCCP is patented by Cisco, and is generally only available on Cisco routers and some high-end Cisco switches. A few other vendors such as BlueCoat also support WCCP, but not many.
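
The way the router shares requests between proxies can be pictured as a bucket table: WCCP version 2 divides traffic into 256 hash buckets and gives each bucket to one proxy. The following is a minimal sketch of that idea only. The XOR of the four address octets is a stand-in for the router's real hash function, the function names are made up for illustration, and the two proxy addresses are the ones used later in this article.

```shell
#!/bin/sh
# Illustration only: the XOR fold below is a stand-in, not the hash
# the router really uses. It shows why one destination always lands
# on the same proxy.

# Map a destination IPv4 address to one of 256 buckets (0-255).
bucket_for_ip() (
    IFS=.
    set -- $1            # split the address into its four octets
    echo $(( ($1 ^ $2 ^ $3 ^ $4) & 255 ))
)

# Share the buckets between two proxies: even buckets go to the
# first proxy, odd buckets to the second.
cache_for_ip() (
    b=$(bucket_for_ip "$1")
    if [ $(( b % 2 )) -eq 0 ]; then
        echo 192.168.1.252
    else
        echo 192.168.1.253
    fi
)

# Usage: cache_for_ip 93.184.216.34
```

Because the bucket depends only on the destination address, every client asking for the same web server is sent to the same proxy.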

WCCP proxy traffic flows are a little bit unusual, and can be very confusing to begin with. The following drawing shows the main flows for a WCCP proxy:

Figure 4: WCCP traffic flows

There are some interesting things to note about the traffic flows here.

  • The Squid proxy sends a WCCP packet to the router every 10 seconds to tell the router that the proxy is alive and ready to receive web requests. You can now see here that it is easy to have multiple proxy servers that can work with the router.
  • When a client makes a request for an Internet web page, it sends it directly to the Internet via the router, as shown in (1) above.
  • The router captures the request, encapsulates it in a GRE packet, and forwards it to the proxy as shown in (2) above.
  • The Linux system decapsulates the GRE packet and sends the request to the Squid proxy by performing a Destination NAT operation on the packet - note that Squid now receives the original packet with its original source and destination IP addresses.
  • The Squid proxy now fetches the web page from the Internet server in the normal fashion shown in (3) above - it uses its own IP address as the source and the original destination IP address for the destination. Note that the router does not intercept and attempt to proxy this request.
  • Once Squid has downloaded the page, it saves the data in its own cache, then replies directly back to the client on the internal network. And this is the tricky thing right here - when Squid replies it uses the IP address of the Internet server as the source in the packet, and the client IP address as the destination. This is shown in (4) above.
So, while the client thinks it is interacting with the remote web server via the Internet router, in actual fact it is interacting with the Squid proxy which is caching pages behind the scenes. If another user on another client makes a request for the same page they go through the same flow, but because the page is cached there is no need for Squid to fetch the page from the Internet server again.

In the remainder of this paper I will briefly show the Cisco, Linux, and Squid configurations required to get this working.

Cisco Configuration

In this example, I will have 2 proxies configured on the internal network (192.168.1.0/24) with IP addresses of 192.168.1.252 and 192.168.1.253. The first step is to define an access list containing the addresses of the proxies, and assign this as the list of WCCP proxies:
access-list 10 permit 192.168.1.252
access-list 10 permit 192.168.1.253
ip wccp web-cache group-list 10
Next we define another access-list to decide which traffic gets direct Internet access and which is redirected by WCCP. The proxies on 192.168.1.252 and 192.168.1.253 are denied, all other hosts on 192.168.1.0/24 are redirected when going to port 80, and everything else is denied. A deny here means the traffic is not redirected and goes directly to the remote web server.
access-list 120 remark ACL for WCCP proxy access
access-list 120 remark Squid proxies bypass WCCP
access-list 120 deny ip host 192.168.1.253 any
access-list 120 deny ip host 192.168.1.252 any
access-list 120 remark LAN clients proxy port 80 only
access-list 120 permit tcp 192.168.1.0 0.0.0.255 any eq 80
access-list 120 remark all others bypass WCCP
access-list 120 deny ip any any
!
! Assign ACL to WCCP
ip wccp web-cache redirect-list 120
Now set WCCP version 2:
ip wccp version 2
Verify the configuration - it should be active on version 2 with no caches connected until the Squid proxy is configured.
Router#sh ip wccp           
Global WCCP information:
    Router information:
        Router Identifier:                   -not yet determined-
        Protocol Version:                    2.0

    Service Identifier: web-cache
        Number of Service Group Clients:     0
        Number of Service Group Routers:     0
        Total Packets s/w Redirected:        0
          Process:                           0
          Fast:                              0
          CEF:                               0
        Redirect access-list:                120
        Total Packets Denied Redirect:       0
        Total Packets Unassigned:            0
        Group access-list:                   -none-
        Total Messages Denied to Group:      0
        Total Authentication failures:       0
        Total Bypassed Packets Received:     0

Router#
At this point, client browsers which are not configured to use the Squid proxy explicitly may not be able to reach Internet web sites once the Squid proxy has registered with the router. If this is an issue for the users, the best way to disable and enable WCCP proxying is on the interface (FastEthernet0, entered as f0, in this case). To disable it:
int f0
!
no ip wccp web-cache redirect in
and to enable it:
int f0
!
ip wccp web-cache redirect in

Squid Configuration

Now we need to configure a Squid proxy on a Linux server. I won't cover the basic installation, just the configuration, so I assume you know a little about configuring Squid. To start with, check that Squid is installed and working as a normal proxy by setting it up in your browser and fetching a few web pages through it.

Next, check that your Squid has been built ready for WCCP proxying. Run squid -v and verify that the following options are included:
--enable-linux-netfilter
--enable-wccpv2
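
This check can be scripted. The following is a small sketch, with a made-up function name, that reads the output of squid -v on standard input and reports whether each of the two options is present:

```shell
#!/bin/sh
# Reads `squid -v` output on stdin and reports whether the two build
# options this article depends on are present.
# Returns 0 if both are found, 1 if either is missing.
check_squid_build() (
    out=$(cat)
    status=0
    for opt in --enable-linux-netfilter --enable-wccpv2; do
        case "$out" in
            *"$opt"*) echo "ok: $opt" ;;
            *)        echo "MISSING: $opt"; status=1 ;;
        esac
    done
    exit "$status"
)

# Usage: squid -v | check_squid_build
```
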
If those options aren't there then you'll have to download the Squid source code and build it from scratch with these options included in the ./configure build command.

Now to configure WCCP for your Squid proxy. In this example I add a new listening port (port 3127) to Squid for transparent proxying, leaving the default port of 3128 available for normal proxying. Add the following lines to /etc/squid/squid.conf:

# additional port for transparent proxy
http_port 3127 transparent

# WCCP Router IP
wccp2_router 192.168.1.254

# forwarding 1=gre 2=l2
wccp2_forwarding_method 1

# GRE return method gre|l2
wccp2_return_method 1

# Assignment method hash|mask
wccp2_assignment_method hash

# standard web cache, no auth
wccp2_service standard 0
Restart the Squid proxy once the changes have been made, and verify the following:
  • Squid is listening on port 3128 & serving normal proxy requests
  • Squid is listening on 3127
  • There are no errors in the Squid logs
You can now go back to your Cisco router and check that the Squid proxy has registered with WCCP, with the show ip wccp command.
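
If you would rather check registration from a script, the client count can be pulled out of that output. This is a sketch only: it assumes the output has the same layout as the listing in the Cisco Configuration section, the function name is made up, and the usage line assumes SSH access to the router with a hypothetical account name.

```shell
#!/bin/sh
# Reads `show ip wccp` output on stdin and prints the number of
# proxies currently registered for the service group.
wccp_client_count() {
    awk -F: '/Number of Service Group Clients/ {
        gsub(/[ \t\r]/, "", $2)   # strip padding and any carriage return
        print $2
        exit
    }'
}

# Usage (the account name "admin" is hypothetical):
#   ssh admin@192.168.1.254 'show ip wccp' | wccp_client_count
```

A result of 0 means no proxy has registered yet; with both proxies in this example running it should be 2.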

Linux Network Configuration

Now that Squid is working, we need to get requests redirected from the Cisco router to the proxy. The router does this by encapsulating the request packet within a GRE packet, then forwarding it to the IP address of the Squid proxy. On the router side, this is automatic, but we need to configure the Linux system to receive these GRE-encapsulated packets, decapsulate them, and forward them to the listening proxy.

I'm using a RedHat Linux system here, so the configuration files are those used by RedHat. To create a new GRE interface, gre0, create the file /etc/sysconfig/network-scripts/ifcfg-gre0 with the following contents:
DEVICE=gre0
TYPE=GRE
BOOTPROTO=none
MY_INNER_IPADDR=172.16.1.1
PEER_OUTER_IPADDR=192.168.1.254
PEER_INNER_IPADDR=172.16.1.2
NETMASK=255.255.255.252
ONBOOT=yes
IPV6INIT=no
USERCTL=no
Run "ifdown gre0" and "ifup gre0" to test it, then run "ifconfig gre0" and verify the IP addressing. Next, enable IP forwarding, disable the reverse-path (rp_filter) packet filters, and configure DNAT in iptables by running the following commands:
# bring up GRE interface
ifup gre0

# enable IP forwarding, disable route packet filters
# between interfaces
echo 1 > /proc/sys/net/ipv4/ip_forward
echo 0 > /proc/sys/net/ipv4/conf/default/rp_filter
echo 0 > /proc/sys/net/ipv4/conf/all/rp_filter
echo 0 > /proc/sys/net/ipv4/conf/eth0/rp_filter
echo 0 > /proc/sys/net/ipv4/conf/lo/rp_filter
echo 0 > /proc/sys/net/ipv4/conf/gre0/rp_filter

# The following lines flush the NAT table, then redirect all HTTP
# packets arriving on gre0 to port 3127 on the local Squid server.
iptables -F -t nat
iptables -t nat -A PREROUTING -i gre0 -p tcp -m tcp --dport 80 \
    -j DNAT --to-destination 192.168.1.253:3127
You'll need to run these at system boot time; one way is to add the commands to the start section of the /etc/init.d/squid script.
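
The commands above hard-code the address of one proxy (192.168.1.253). Since this example has two proxies, one option is to generate the commands from parameters so that the same script serves both. The following is a sketch with a made-up function name: it only prints the commands, so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Prints (does not run) the network setup commands from this section.
# Arguments: proxy IP address, Squid transparent port (default 3127).
wccp_net_setup() (
    proxy_ip=$1
    port=${2:-3127}
    echo "ifup gre0"
    echo "echo 1 > /proc/sys/net/ipv4/ip_forward"
    for i in default all eth0 lo gre0; do
        echo "echo 0 > /proc/sys/net/ipv4/conf/$i/rp_filter"
    done
    echo "iptables -t nat -F"
    echo "iptables -t nat -A PREROUTING -i gre0 -p tcp -m tcp --dport 80 -j DNAT --to-destination $proxy_ip:$port"
)

# Usage: review the output first, then as root:
#   wccp_net_setup 192.168.1.252 | sh
```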

Testing

tcpdump is your friend when testing this configuration. Check the flows in the order shown in Figure 4 above and verify each one in turn. Remember that the Squid proxy will use the IP address of the Internet web server when replying back to the client. If your proxy is behind a firewall you will probably have to disable anti-spoofing mechanisms to allow the proxy to spoof the web server's IP address.

Most problems seem to occur in the Linux GRE & NAT configuration. And don't forget to check the Squid logs for errors.
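
As a starting point, the captures below cover each stage of Figure 4 when run on the proxy. This is a sketch: the helper name is made up, eth0 and gre0 are the interface names used in this article, and the helper only prints each tcpdump command so that you can run it yourself as root.

```shell
#!/bin/sh
# Prints the tcpdump command for one stage of the WCCP traffic flow.
wccp_capture_cmd() {
    case "$1" in
        # WCCP messages between the proxy and the router
        heartbeat) echo "tcpdump -n -i eth0 udp port 2048" ;;
        # GRE-encapsulated requests arriving from the router, flow (2)
        gre)       echo "tcpdump -n -i eth0 ip proto 47" ;;
        # Client requests after GRE decapsulation
        inner)     echo "tcpdump -n -i gre0 tcp port 80" ;;
        # Squid fetching pages and replying to clients, flows (3) and (4)
        web)       echo "tcpdump -n -i eth0 tcp port 80" ;;
        *)         echo "usage: wccp_capture_cmd heartbeat|gre|inner|web" >&2
                   return 1 ;;
    esac
}

# Usage: wccp_capture_cmd gre
```

If the heartbeat capture shows traffic but the gre capture stays silent, look at the router configuration; if GRE packets arrive but nothing appears on gre0, look at the Linux tunnel configuration.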

Closing Notes

In this paper I've described a method of transparently caching web requests using a Squid proxy and WCCP-enabled Cisco router. As described in the introduction this solution can be used to implement security controls and bandwidth management without having to reconfigure client systems to explicitly use a proxy server.

References

Configuring Transparent Interception with Fedora Core Linux and WCCPv2 (Squid project)
Configuring Web Cache Services Using WCCP (Cisco.com)
Configure WCCP on your Cisco IOS router (TechRepublic)
WCCP (Wikipedia)
WCCP Enhancements (Cisco.com)