DansGuardian, flow of events
How DansGuardian works
DansGuardian sits between the client browser and the proxy, intercepting and modifying their communication. Squid listens on port 3128 and DansGuardian listens on port 8080. When a request comes in on port 8080, DansGuardian filters it and passes it on to Squid on port 3128.
The interception can be done by simply pointing the browsers at port 8080, or by enabling transparent proxying in Squid and redirecting outgoing port 80 traffic on the firewall to port 8080 on localhost (presuming DansGuardian is on the firewall). Clearly port 443 will also need redirecting, and other common proxy ports will need to be blocked to prevent bypassing.
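The simpler of the two options, pointing a client at port 8080, can be sketched in Python as follows. This is only an illustration: the host address 127.0.0.1 and the helper name build_filtered_opener are assumptions, and only the two port numbers come from the text above.

```python
import urllib.request

# Assumption: DansGuardian runs on this machine. Only the port numbers
# (8080 for DansGuardian, 3128 for Squid) are taken from the text.
FILTER_HOST = "127.0.0.1"
DANSGUARDIAN_PORT = 8080
SQUID_PORT = 3128  # used by DansGuardian, never contacted by the client


def build_filtered_opener(host=FILTER_HOST, port=DANSGUARDIAN_PORT):
    """Return (opener, proxy_url); the opener sends HTTP requests via DansGuardian.

    build_filtered_opener is a hypothetical helper name, not part of any
    DansGuardian or Squid tooling.
    """
    proxy_url = f"http://{host}:{port}"
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler), proxy_url


opener, proxy_url = build_filtered_opener()
print("HTTP requests will be sent to", proxy_url)
# opener.open("http://example.com/") would now go through the filter,
# provided DansGuardian and Squid are actually running.
```

The client never talks to port 3128 itself. If it could, it would bypass the filter, which is why the other common proxy ports need to be blocked.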
The flow of events is as follows:
1. The client makes a request to DansGuardian. DansGuardian checks the request header for the following: username, malformed URL, source IP, URL, and POST status. The appropriate filters are applied (banned user, exception user, banned URL, exception URL, banned IP, exception IP). The size of the POST upload, if there is one, is checked. If all is OK, the request is passed on to Squid, which then fetches the file from the Internet.
2. Squid passes just the header of the response back to DansGuardian, which then checks it for MIME type and content disposition (file name). DansGuardian then applies filters for banned MIME types and file extensions. If all is OK, it goes on to step 3.
3. Squid passes the document body back to DansGuardian, which decompresses it (if the originating web server sent it compressed with gzip or deflate) and produces two copies: the original, and one with HTML and whitespace removed. The original is searched for PICS labelling, then both are searched for phrases. Filtering is done on the PICS rating and on the phrases found in the page. If the page is non-text, PICS and phrase searching are not done. The banned phrase, exception phrase and weighted phrase lists, as well as the pics file, come into play during this step.
4. If all is well, DansGuardian passes back the header and body to the client browser.
At any stage, if the page or file should be blocked, the process is aborted and the user is shown an Access Denied message.
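The staged checks with early abort can be sketched in Python as follows. This is a simplified illustration, not DansGuardian's actual code: the names Lists, Blocked and filter_exchange are hypothetical, the lists are plain sets, and user, IP, exception, PICS and weighted phrase handling are left out. The sketch also takes an already fetched response, whereas in reality step 1 runs before Squid fetches anything. Collapsing whitespace to single spaces (rather than removing it entirely) is likewise an assumption.

```python
import gzip
import re
import zlib
from dataclasses import dataclass, field


class Blocked(Exception):
    """Raised at any stage to abort processing (hypothetical name)."""


@dataclass
class Lists:
    # Simplified stand-ins for the banned URL, MIME type and phrase lists.
    banned_urls: set = field(default_factory=set)
    banned_mime_types: set = field(default_factory=set)
    banned_phrases: set = field(default_factory=set)


def check_request(url, lists):
    """Step 1: filter on the request before it is passed to Squid."""
    if url in lists.banned_urls:
        raise Blocked(f"banned URL: {url}")


def check_response_header(mime_type, lists):
    """Step 2: filter on the response header only."""
    if mime_type in lists.banned_mime_types:
        raise Blocked(f"banned MIME type: {mime_type}")


def decompress(body, encoding):
    """Undo gzip or deflate compression; pass other bodies through."""
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        return zlib.decompress(body)
    return body


def strip_html_and_whitespace(text):
    """Second copy: tags removed, runs of whitespace collapsed."""
    no_tags = re.sub(r"<[^>]*>", " ", text)
    return re.sub(r"\s+", " ", no_tags).strip()


def check_body(mime_type, encoding, body, lists):
    """Step 3: phrase search on both copies; skipped for non-text pages."""
    if not mime_type.startswith("text/"):
        return
    original = decompress(body, encoding).decode("utf-8", errors="replace")
    copies = (original.lower(), strip_html_and_whitespace(original).lower())
    for phrase in lists.banned_phrases:
        if any(phrase.lower() in copy for copy in copies):
            raise Blocked(f"banned phrase: {phrase}")


def filter_exchange(url, mime_type, encoding, body, lists):
    """Run steps 1 to 3 in order; any Blocked aborts with Access Denied."""
    try:
        check_request(url, lists)
        check_response_header(mime_type, lists)
        check_body(mime_type, encoding, body, lists)
    except Blocked as reason:
        return f"Access Denied: {reason}"
    return "OK"  # step 4: header and body go back to the browser


demo_lists = Lists(banned_phrases={"forbidden phrase"})
demo_body = gzip.compress(b"<p>forbidden   <b>phrase</b></p>")
print(filter_exchange("http://ok.example/", "text/html", "gzip", demo_body, demo_lists))
```

In the demo, the phrase is split by markup and extra spaces, so it is found only in the stripped copy. That illustrates why searching both copies can catch phrases that a search of the raw HTML alone would miss.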
The order in which the different lists are checked is roughly* as follows:
* This was how it worked in version 2.4. Things have become more complex since then, but it is easiest to pretend it still works like this.
- blanket block**
- blanket IP block**
** At these points the greyurllist and greysitelist are also checked.
A more recent version can be found here: http://sf.net/docman/display_doc.php?docid=30290&group_id=131757
Page last modified: 23 October 2005 20:08:32