Recently, I wanted to collect data on the Web pages being downloaded by users on one of our Linux machines. Standard Web access logs only record retrievals from a machine when it is acting as a Web server; I wanted details of the downloads made to the machine by its resident Web browsers. I decided to write some packet monitoring software for the task, and to filter its output down to the Web-related packets.
In the full article, I re-enact the stages of my coding, beginning with how I used the tcpdump monitoring utility, combined with tcpshow to display the results in a more readable form. I also wrote a simple filter, web_collect.c, which reduces the output of tcpshow still further.
An example of their use:
    $ tcpdump -lenx -s 100 'port 80 and src catsix' | tcpshow -cooked -data | web_collect | more
    tcpdump: listening on eth0
    203.154.146.10.1747 -> 128.250.37.130.www over TCP
    GET /~swloke HTTP/1.0.
    -----------------------------------
    203.154.146.10.1749 -> 128.250.37.130.www over TCP
    GET /~swloke/ HTTP/1.0.
    -----------------------------------
    203.154.146.10.1751 -> 128.250.37.130.www over TCP
    GET / HTTP/1.0.
    -----------------------------------
    203.154.146.10.1753 -> 128.250.37.130.www over TCP
    GET /research/index.html HTTP/1.0.
    -----------------------------------
    :
    :
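The source of web_collect.c is not reproduced here, so the following is only a minimal sketch of what such a filter might look like. It assumes that the connection summary lines contain " -> " and " over TCP", and that request lines start with an HTTP method and contain " HTTP/", as in the sample output above. The function names (is_connection_line, is_request_line, filter_stream) are hypothetical and are not taken from the original program.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if the line looks like a connection summary such as
   "203.154.146.10.1747 -> 128.250.37.130.www over TCP", else 0.
   The two markers are assumptions based on the sample output. */
int is_connection_line(const char *line)
{
    assert(line != NULL);
    return strstr(line, " -> ") != NULL && strstr(line, " over TCP") != NULL;
}

/* Return 1 if the line starts with a common HTTP method and also
   contains a protocol version, e.g. "GET /index.html HTTP/1.0". */
int is_request_line(const char *line)
{
    static const char *methods[] = { "GET ", "POST ", "HEAD " };
    size_t i;

    assert(line != NULL);
    if (strstr(line, " HTTP/") == NULL)
        return 0;
    for (i = 0; i < sizeof methods / sizeof methods[0]; i++) {
        if (strncmp(line, methods[i], strlen(methods[i])) == 0)
            return 1;
    }
    return 0;
}

/* Copy only the interesting lines from in to out and print a
   separator after each request line. Returns the number of lines kept.
   Lines longer than the buffer are processed in pieces, which is
   acceptable for a sketch but not for production use. */
int filter_stream(FILE *in, FILE *out)
{
    char line[1024];
    int kept = 0;

    assert(in != NULL && out != NULL);
    while (fgets(line, sizeof line, in) != NULL) {
        if (is_connection_line(line)) {
            fputs(line, out);
            kept++;
        } else if (is_request_line(line)) {
            fputs(line, out);
            fputs("-----------------------------------\n", out);
            kept++;
        }
    }
    return kept;
}
```

A main function that simply calls filter_stream(stdin, stdout) would let a filter like this sit at the end of the pipeline shown above.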
Next I describe my own monitoring program, tcpmon.c, which is tuned for Linux and for retrieving TCP packets (HTTP, the Web protocol, runs over TCP).
An example of its use:
    $ tcpmon -p 80 | more
    Monitoring port 80
    Promiscuous mode switched on
    ---------------------------------------
    catsix.coe.psu.ac.th [1755] -> munkora.cs.mu.OZ.AU [80]
    ....
    ---------------------------------------
    munkora.cs.mu.OZ.AU [80] -> catsix.coe.psu.ac.th [1755]
    ....
    ---------------------------------------
    catsix.coe.psu.ac.th [1755] -> munkora.cs.mu.OZ.AU [80]
    ---------------------------------------
    catsix.coe.psu.ac.th [1755] -> munkora.cs.mu.OZ.AU [80]
    GET /~swloke HTTP/1.0
    .
    .Accept: model/vrml
    .
    .Accept: x-world/x-vrml
    :
    :
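The output above pairs each packet with its source and destination ports. As a rough illustration of how a monitor can pull those ports out of a captured packet, here is a minimal sketch that parses the IPv4 and TCP headers by hand. It is not the code of tcpmon.c: the names (struct tcp_endpoints, parse_ipv4_tcp, involves_port) are hypothetical, and it assumes the buffer starts at the IPv4 header, so any link-layer header (typically 14 bytes for plain Ethernet) has already been skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ports and payload position extracted from one IPv4/TCP packet. */
struct tcp_endpoints {
    uint16_t src_port;
    uint16_t dst_port;
    size_t payload_offset;   /* offset of the TCP data from the start of pkt */
};

/* Read a 16-bit big-endian (network order) value. */
static uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Parse an IPv4 packet carrying TCP. Returns 0 on success, or -1 if
   the buffer is too short, is not IPv4, or does not carry TCP. */
int parse_ipv4_tcp(const unsigned char *pkt, size_t len,
                   struct tcp_endpoints *out)
{
    const unsigned char *tcp;
    size_t ihl, doff;

    assert(pkt != NULL && out != NULL);
    if (len < 20 || (pkt[0] >> 4) != 4)
        return -1;                           /* too short or not IPv4 */
    ihl = (size_t)(pkt[0] & 0x0F) * 4;       /* IP header length in bytes */
    if (ihl < 20 || len < ihl + 20)
        return -1;
    if (pkt[9] != 6)
        return -1;                           /* IP protocol 6 is TCP */

    tcp = pkt + ihl;
    doff = (size_t)(tcp[12] >> 4) * 4;       /* TCP header length in bytes */
    if (doff < 20 || len < ihl + doff)
        return -1;

    out->src_port = read_be16(tcp);
    out->dst_port = read_be16(tcp + 2);
    out->payload_offset = ihl + doff;
    return 0;
}

/* Return 1 if either end of the connection uses the given port. */
int involves_port(const struct tcp_endpoints *ep, uint16_t port)
{
    assert(ep != NULL);
    return ep->src_port == port || ep->dst_port == port;
}
```

A monitor restricted to Web traffic, in the spirit of the -p 80 option above, could call parse_ipv4_tcp on every packet it reads and print only those for which involves_port(&ep, 80) is true.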
In the second version, cleverly called tcpmon2.c, I utilise the excellent libpcap library for extracting packets.
An example of its use:
    $ tcpmon2 "port 80 and src fivedots" | more
    listening to eth0
    filter: "port 80 and src fivedots"
    ---------------------------------------
    coe.psu.ac.th [13940] -> munkora.cs.mu.OZ.AU [80]
    ....
    ---------------------------------------
    coe.psu.ac.th [13940] -> munkora.cs.mu.OZ.AU [80]
    ---------------------------------------
    coe.psu.ac.th [13940] -> munkora.cs.mu.OZ.AU [80]
    GET /~swloke HTTP/1.0
    .
    .Host: www.cs.mu.oz.au
    .
    .Accept: model/vrml, x-world/x-vrml, audio/x-pn-realaudio, application/
    applefile, application/x-metamail-patch, sun-deskset-message, mail-file
    , default, postscript-file, audio-file, x-sun-attachment, text/enriched
    :
    :
I also briefly talk about promisc.c, a little program for detecting promiscuity (at least the Linux kind).
An example of its use:
    $ promisc
    lo: Promiscuous mode is OFF
    eth0: Promiscuous mode is ON
    $ promisc -off
    turnoff flag set
    lo: Promiscuous mode is OFF
    eth0: Promiscuous mode is ON... switching off
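For readers wondering how a program can report the state shown above, here is a minimal Linux-only sketch that reads an interface's flags with the SIOCGIFFLAGS ioctl and tests the IFF_PROMISC bit. It is an assumption about the general approach, not the source of promisc.c; the function names flags_promisc and promisc_state are hypothetical.

```c
#define _DEFAULT_SOURCE   /* expose struct ifreq under strict -std= modes */
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>

/* Return 1 if the interface flag word has IFF_PROMISC set, else 0. */
int flags_promisc(short flags)
{
    return (flags & IFF_PROMISC) != 0;
}

/* Query one interface by name, e.g. "eth0" or "lo".
   Returns 1 if promiscuous, 0 if not, and -1 on any error
   (no such interface, socket creation failed, and so on). */
int promisc_state(const char *ifname)
{
    struct ifreq ifr;
    int fd, result;

    assert(ifname != NULL);
    fd = socket(AF_INET, SOCK_DGRAM, 0);   /* any socket will do for the ioctl */
    if (fd < 0)
        return -1;

    memset(&ifr, 0, sizeof ifr);
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);   /* stays NUL-terminated */

    if (ioctl(fd, SIOCGIFFLAGS, &ifr) < 0)
        result = -1;
    else
        result = flags_promisc(ifr.ifr_flags);

    close(fd);
    return result;
}
```

Reading the flags does not normally need special privileges. Switching the mode off, as the -off option above does, would mean clearing IFF_PROMISC in ifr_flags and writing it back with SIOCSIFFLAGS, which requires root.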