Posted Nov 20, 2009 | By: Richard Bejtlich

Put user-agent strings to work as a network monitoring tool

A user-agent string is an element of an HTTP header sent by HTTP clients such as Web browsers. The following HTTP request includes a user-agent string from a Windows XP SP3 system running Firefox, talking to a Squid proxy server.

GET http://searchnetworkingchannel.techtarget.com/tips/index/0,289482,sid100_tax311687,00.html HTTP/1.1
Host: searchnetworkingchannel.techtarget.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3 (.NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Proxy-Connection: keep-alive
Referer: http://searchnetworkingchannel.techtarget.com/tip/0,289483,sid100_gci1369534,00.html
Cookie: BIGipServerlive=2768357386.41503.0000

The user-agent string can identify the operating system and the application that made the request. Here it reveals Windows NT 5.1 (Windows XP), a U.S. English locale, Firefox 3.5.3 on the Gecko rendering engine, and an installed .NET CLR 3.5.30729.
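As a rough illustration of how those pieces can be pulled apart, the following sketch splits a user-agent string into its product/version tokens and its first parenthesized comment. The function names `ua_products` and `ua_comment` are hypothetical helpers, and the patterns assume a browser-style string like the one above; many clients send strings that do not follow this layout.

```shell
#!/bin/sh
# Sketch only: ua_products and ua_comment are hypothetical helper names.

# Print each product/version token (for example Firefox/3.5.3), one per line.
ua_products() {
    printf '%s\n' "$1" | grep -oE '[A-Za-z][A-Za-z0-9._-]*/[0-9][A-Za-z0-9._-]*'
}

# Print the contents of the first parenthesized comment, which in
# browser-style strings usually carries the platform details.
ua_comment() {
    printf '%s\n' "$1" | sed -n 's/^[^(]*(\([^)]*\)).*/\1/p'
}

ua='Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3 (.NET CLR 3.5.30729)'
ua_products "$ua"
ua_comment "$ua"
```

For the string above, the first call lists the Mozilla, Gecko and Firefox tokens, and the second prints the platform comment containing "Windows NT 5.1".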

Collecting user-agent strings
Network administrators can collect user-agent strings in two ways. The first is to extract them from proxy logs. For example, a Squid proxy log might contain an entry like the following:

1256175164.757 ::: 38 ::: 192.168.2.107 ::: TCP_MISS/302 ::: 748 ::: GET ::: http://media.techtarget.com/searchNetworkingChannel/images/spacer.gif ::: - ::: DIRECT/206.19.49.139 ::: text/html ::: "http://searchnetworkingchannel.techtarget.com/tips/index/0,289482,sid100_tax311687,00.html" ::: "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3 (.NET CLR 3.5.30729)"

The triple colons ( ::: ) were added intentionally, for reasons that will appear next. The entry in the squid.conf file used to generate this log format is the following:

logformat squid-extended %ts.%03tu ::: %6tr ::: %>a ::: %Ss/%03Hs ::: %<st ::: %rm ::: %ru ::: %un ::: %Sh/%<A ::: %mt ::: "%{Referer}>h" ::: "%{User-Agent}>h"
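Before mining a log written this way, it can help to confirm that every line really splits into the expected fields. The sample entry above has twelve fields separated by triple colons, so a minimal check, assuming that layout, might look like the following. `check_fields` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: check_fields is a hypothetical helper name.
# Reads log lines on stdin and reports how many do not split into the
# expected number of " ::: "-separated fields (default 12, as in the
# sample entry in this article).
check_fields() {
    awk -F ' ::: ' -v want="${1:-12}" '
        NF != want { bad++ }
        END { printf "%d of %d lines malformed\n", bad, NR }
    '
}

# Usage (path as used elsewhere in this article):
#   check_fields 12 < /usr/local/squid/logs/access.log
```

A nonzero count suggests that the separator appears inside a URL or header value, or that the log contains lines written with a different format.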

The second way to gather user-agent strings is to examine network traffic, perhaps using a tool like Httpry.
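As one hedged illustration of the traffic-based approach, the sketch below uses tcpdump rather than Httpry, whose options are not covered here. The helper name `extract_user_agents`, the interface name `eth0` and the port filter are assumptions for illustration, and the technique only sees unencrypted HTTP.

```shell
#!/bin/sh
# Sketch only: extract_user_agents is a hypothetical helper name.
# Reads packet payload text on stdin and prints the value of every
# User-Agent header line it finds, one per line.
extract_user_agents() {
    tr -d '\r' | grep -i '^User-Agent:' | sed 's/^[^:]*:[[:space:]]*//'
}

# Usage (needs capture privileges; eth0 and port 80 are assumptions):
#   tcpdump -l -n -A -s 0 -i eth0 'tcp dst port 80' | extract_user_agents | sort | uniq -c
```

Piping the result through `sort | uniq -c` gives the same kind of count that the proxy-log method produces, although without the tidy per-request fields a proxy log provides.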

Once you have logs, what can you do with them? Consider the following command that examines Squid proxy logs, extracts the source IP addresses and user-agents, counts unique appearances, and sorts them.

awk -F ':::' '{print $3 $12}' /usr/local/squid/logs/access.log | sort -k 2 | uniq -c

Here awk's -F option sets the field separator (FS) to triple colons. In my experience, "traditional" field separators like commas or pipes appear too frequently in HTTP requests to be useful for logging, but you are free to use whatever separator you like. An excerpt of the output from running this command on a small live network appears next. I describe a few interesting elements after each group of entries.

r200a:/root# awk -F ':::' '{print $3 $12}' /usr/local/squid/logs/access.log | sort -k 2 | uniq -c

14 192.168.2.104 "-"
7 192.168.2.104 "AVGINET8-WVSHX86 85FREE AVI=270.14.10/2429 BUILD=421 LOC=1033 LIC=8FREE--[...key obscured...] DIAG=90 OPF=0 PCA="
6 192.168.2.104 "AVGINET8-WVSHX86 85FREE AVI=270.14.11/2430 BUILD=421 LOC=1033 LIC=8FREE--[...key obscured...] DIAG=90 OPF=0 PCA="
10 192.168.2.104 "AVGINET8-WVSHX86 85FREE AVI=270.14.12/2431 BUILD=421 LOC=1033 LIC=8FREE--[...key obscured...] DIAG=90 OPF=0 PCA="
...edited...

The three entries above show 192.168.2.104 has been updating its AVG antivirus product.

4 192.168.2.107 "AVGINET8-WXPPX86 85FREE AVI=270.14.25/2450 BUILD=423 LOC=1033 LIC=8FREE--[...key obscured...] DIAG=380 OPF=0 PCA="

Now we see a different PC running AVG. It has a different LIC (license) key. Google searches for both keys reveal they are not unique to these systems.

8 192.168.2.104 "Adobe Update Manager 6"
1 192.168.2.104 "Client"
...edited...

The Adobe program is interesting because it must have checked local proxy settings to do its update. The "Client" entry is extremely interesting because it appears only once.

We can search the proxy logs for that entry:

r200a:/root# grep \"Client /usr/local/squid/logs/access.log

1255628697.525 ::: 226 ::: 192.168.2.104 ::: TCP_MISS/204 ::: 406 ::: GET ::: http://g.microsoft.com/_0sfdata/1?CG=%7B2CEDBFBC-DBA8-43AA-B1FD-CC8E6316E3E2%7D&DV=8.0.6001.9&OS=6.0.6002.2.0&BL=en-us&AA=45:15:52:22&AB=556&AC=177&AD=23&AE=104&AF=50&AG=428&AH=0&AI=17&AJ=200&AK=0&AL=6&NR=4&BD=0&NE=0&IU=0&SD=0&NO=0&BS=0&OE=0&UA=0&TP=0&TC=0&TE=0&NP=215 ::: - ::: DIRECT/207.46.216.54 ::: - ::: "-" ::: "Client"

We can see the system accessed g.microsoft.com at IP address 207.46.216.54, which falls within Microsoft's 207.46.0.0/16 netblock. So this appears to be related to a Microsoft application.
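Rare entries such as "Client" do not have to be spotted by eye. The following sketch lists every client and user-agent pair that appears no more than a chosen number of times. It assumes the twelve-field, triple-colon log layout shown earlier; `rare_agents` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: rare_agents is a hypothetical helper name.
# Reads log lines on stdin and prints "count<TAB>client<TAB>user-agent"
# for every client/user-agent pair seen at most MAX times (default 1),
# rarest first. Field 3 is the client IP and field 12 the user-agent.
rare_agents() {
    awk -F ' ::: ' -v max="${1:-1}" '
        NF >= 12 { seen[$3 "\t" $12]++ }
        END {
            for (pair in seen)
                if (seen[pair] <= max)
                    print seen[pair] "\t" pair
        }
    ' | sort -n
}

# Usage:
#   rare_agents 1 < /usr/local/squid/logs/access.log
```

A threshold of 1 is only a starting point; on a busy network a higher threshold, or a comparison against the previous day's list, may be more useful.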

16 192.168.2.104 "MSDW"

This entry is also obscure.

r200a:/root# grep \"MSDW /usr/local/squid/logs/access.log

1255516537.205 ::: 1185 ::: 192.168.2.104 ::: TCP_MISS/200 ::: 7640 ::: CONNECT ::: wer.microsoft.com:443 ::: - ::: DIRECT/65.55.53.156 ::: - ::: "-" ::: "MSDW"
1255517315.250 ::: 680 ::: 192.168.2.104 ::: TCP_MISS/200 ::: 7640 ::: CONNECT ::: wer.microsoft.com:443 ::: - ::: DIRECT/65.55.53.156 ::: - ::: "-" ::: "MSDW"
1255674327.981 ::: 210 ::: 192.168.2.104 ::: TCP_MISS/200 ::: 500 ::: GET ::: http://watson.microsoft.com/StageOne/Generic/MpTelemetry/80240016/BeginInstall/Install/1_1_1600_0/MpSigDwn_dll/1_1_1600_0/Windows%20Defender.htm?LCID=1033&OS=6.0.6002.2.00010300.2.0.3.18005&VID=1028&OEM=Dell&LOB=INS ::: - ::: DIRECT/65.55.53.190 ::: text/html ::: "-" ::: "MSDW"
...truncated...

Checking the logs, we see another Microsoft application, perhaps related to Dr. Watson and Windows Defender.
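The grep-and-read step used for "Client" and "MSDW" can be condensed into a summary of which hosts a given user-agent contacted. This is a minimal sketch under the same log-layout assumption (URL in field 7, quoted user-agent in field 12); `agent_destinations` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: agent_destinations is a hypothetical helper name.
# Reads log lines on stdin and, for the exact user-agent given as $1,
# prints "count<TAB>host" for each destination host, busiest first.
agent_destinations() {
    awk -F ' ::: ' -v ua="$1" '
        $12 == "\"" ua "\"" {
            dest = $7
            sub("^[a-z]+://", "", dest)   # drop the scheme, if any
            sub("[/:?].*$", "", dest)     # drop port, path and query
            hits[dest]++
        }
        END { for (d in hits) print hits[d] "\t" d }
    ' | sort -k1,1nr -k2
}

# Usage:
#   agent_destinations MSDW < /usr/local/squid/logs/access.log
```

Because CONNECT requests log only host:port, the same function handles both the HTTPS tunnels to wer.microsoft.com and the plain GET to watson.microsoft.com.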

2 192.168.2.107 "Python-urllib/2.5"
6 192.168.2.108 "Python-urllib/2.6"

These Python entries are probably not caused by a Windows application. Checking the logs, we see they come from Ubuntu systems.

r200a:/root# grep \"Python-urllib /usr/local/squid/logs/access.log

1256133834.436 ::: 188 ::: 192.168.2.107 ::: TCP_MISS/304 ::: 275 ::: GET ::: http://changelogs.ubuntu.com/meta-release-lts ::: - ::: DIRECT/91.189.90.132 ::: - ::: "-" ::: "Python-urllib/2.5"
1256133856.075 ::: 206 ::: 192.168.2.107 ::: TCP_MISS/304 ::: 275 ::: GET ::: http://changelogs.ubuntu.com/meta-release-lts ::: - ::: DIRECT/91.189.90.132 ::: - ::: "-" ::: "Python-urllib/2.5"
1256173578.838 ::: 187 ::: 192.168.2.108 ::: TCP_MISS/304 ::: 345 ::: GET ::: http://changelogs.ubuntu.com/meta-release ::: - ::: DIRECT/91.189.90.132 ::: - ::: "-" ::: "Python-urllib/2.6"
...truncated...

As you can see, you can learn a lot about a network simply by looking at user-agent strings. The very simple network used to generate the logs for this story offered more than 60 different entries for analysis, but I displayed only nine for the sake of brevity. User-agent string mining can be used passively to identify and track applications and systems, for both inventory and security purposes. Consider ways you can use user-agent strings for network monitoring!
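For the inventory use mentioned above, one possible starting point is a per-client listing of distinct user-agents. The sketch below again assumes the triple-colon log layout from this article; `agents_by_client` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: agents_by_client is a hypothetical helper name.
# Reads log lines on stdin and prints each client IP on its own line,
# followed by an indented list of the distinct user-agents it sent.
agents_by_client() {
    awk -F ' ::: ' 'NF >= 12 { print $3 "\t" $12 }' | sort -u |
    awk -F '\t' '
        $1 != prev { print $1; prev = $1 }
        { print "    " $2 }
    '
}

# Usage:
#   agents_by_client < /usr/local/squid/logs/access.log
```

Saving this output periodically and comparing it with the previous copy is a simple way to notice when a new application or operating system appears on a host.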




Copyright © 2010 Westwick-Farrow Pty Ltd. All rights reserved.