Tuesday, October 26, 2010

AWK and GREP are your friends!

I definitely plan to write up some lessons learned from the STP Conference, but I thought I would share this quick tidbit for performance testers.

Our website's traffic profile is constantly changing, so we inspect our access logs to determine the correct request mix for our performance testing. There are many ways to do this, and you can get as complex as you want. Typically we choose our peak traffic day.
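
As a quick way to find that peak day, a one-liner along these lines counts requests per day, assuming the standard common/combined log format where field 4 holds the bracketed timestamp:

awk '{ print substr($4, 2, 11) }' access.log | sort | uniq -c | sort -rn | head

The first column is the request count, so the top line is your peak traffic day.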

Say you have an access log and you want to know the ratio of GETs to POSTs.

This command should give you the total number of requests (access.log here is a placeholder for your log file name):

cat access.log | awk '{ print $7 }' | wc -l

This command should give you the total number of GETs:

cat access.log | grep '"GET' | awk '{ print $7 }' | wc -l

This command should give you the number of POSTs:

cat access.log | grep '"POST' | awk '{ print $7 }' | wc -l

Substitute "less" for the "wc -l" and you can page through all of the GET requests on your screen.
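
For example, this pages through every GET request URL:

cat access.log | grep '"GET' | awk '{ print $7 }' | less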

You can chain multiple grep stages to further refine your data:

cat access.log | grep '"GET' | grep ' 200 ' | awk '{ print $7 }' | wc -l

For my specific test I am only interested in GETs with a response code of 200. I could write these results to a file:

cat access.log | grep '"GET' | grep ' 200 ' | awk '{ print $7 }' | wc -l >> foo.txt
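
If you drop the wc -l and redirect instead, you capture the matching URLs themselves rather than just a count. That gives you the kind of flat file I mention below (urls.txt is just a placeholder name):

cat access.log | grep '"GET' | grep ' 200 ' | awk '{ print $7 }' > urls.txt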

Yes, some of you already know this stuff, but my main point is that you can learn a ton about your site traffic simply by dissecting the access logs.  You can generate a flat file that can be used by your performance tool of choice to randomly send requests to your system.  You can write a simple shell script that takes any access log and generates a quick breakdown of the request profile (a rough sketch follows).  This profile can be used to proportion your performance test traffic in a pattern similar to your current user base.
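
Here is a rough sketch of such a script. It assumes the common/combined log format, where the quoted request method is field 6; adjust the field numbers if your format differs:

#!/bin/sh
# profile.sh - print each request method, its count, and its share
# of total traffic. Pass the access log as the first argument.
LOG="$1"
TOTAL=$(wc -l < "$LOG")
awk -v total="$TOTAL" '
    { gsub(/"/, "", $6); count[$6]++ }   # strip the leading quote, tally by method
    END {
        for (m in count)
            printf "%-8s %8d %6.1f%%\n", m, count[m], 100 * count[m] / total
    }
' "$LOG"

Run it as "sh profile.sh access.log" and the percentages give you the proportions to mirror in your test mix.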


Go ahead and karate chop your access logs!