awk still rocks!

Last week I faced an unusual and challenging problem: I needed to extract multiline SOAP requests from a huge log file. At first I thought about using Java, then realised that loading a huge file in Java could be problematic. Next I considered Groovy or another interpreted language. Finally I found a very simple solution based on awk. I had written some awk scripts in the past for generating huge test files based on random criteria (such as CDR files for load testing), and I was simply amazed to see how easily an awk script could solve this problem.

Let's assume you have a huge file containing SOAP requests like the following.

   Payload: <?xml version="1.0" encoding="UTF-8"?><vivek xmlns="com.test.vivek">
<Service>
	<Template>abc service template</Template>
	<XRef/>
	<LogMessage/>
	<CallBackUrl/>
	<Delivery>
		<Synchronous/>
	</Delivery>
</Service>
<Model>
	<Head>
		<From>test@test.com</From>
		<Subject>Technical Service &amp; Repair</Subject>
	</Head>
	<ContactUs>
		<Sender>
			<FirstName1>vivek</FirstName1>
			<FirstName2>vivek</FirstName2>
			<LastName1>kumar</LastName1>
			<LastName2>kumar</LastName2>
			<PhoneNumber>9674174848</PhoneNumber>
		</Sender>
	</ContactUs>
</Model>
</vivek>

Here is the awk script to extract all such XML:

awk '/vivek/,/\/vivek>/' test.txt | awk '/Payload:/ {print "*****************"; print} !/Payload/ {print}'

The first awk command uses a range pattern to print every line from one matching vivek through the next one matching /vivek>. Its output is piped to a second awk command, which checks each line for the word Payload: if the line contains it, the script prints a row of asterisks followed by the line itself; otherwise it simply prints the line. The row of asterisks marks the boundary between successive matched requests.
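The same idea can probably be done in a single awk pass. The sketch below is only an illustration, not the script used above: the sample log contents (timestamps and INFO lines) are made up, and the patterns are tightened to <vivek and </vivek> on the assumption that the root element always looks like the example. A loose /vivek/ pattern would also start a range on any unrelated log line that happens to contain that word.

```shell
# Hypothetical sample log: two SOAP payloads surrounded by ordinary log lines.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2017-01-01 10:00:00 INFO request received
   Payload: <?xml version="1.0" encoding="UTF-8"?><vivek xmlns="com.test.vivek">
<Service>
  <Template>abc service template</Template>
</Service>
</vivek>
2017-01-01 10:00:01 INFO request processed
   Payload: <?xml version="1.0" encoding="UTF-8"?><vivek xmlns="com.test.vivek">
<Model/>
</vivek>
2017-01-01 10:00:02 INFO done
EOF

# One pass: the range pattern selects each request; inside the range,
# a separator is printed before any line that contains "Payload:".
out=$(awk '
/<vivek[ >]/, /<\/vivek>/ {
    if ($0 ~ /Payload:/) print "*****************"
    print
}' "$sample")

printf '%s\n' "$out"
rm -f "$sample"
```

For this sample, the output is the two requests, each preceded by a row of asterisks, with the INFO lines dropped. awk reads the file line by line, so the size of the log should not matter much.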

Here is the output of the command:

[Screenshot: awk.png]

Awk Reference

 

 
