What is awk?
Awk is a pattern scanning and processing language designed for working with text files such as system dumps and log files. Its support for regular expressions and pattern matching makes it very powerful for this kind of work. The name Awk comes from the initials of its original authors: Alfred V. Aho, Peter J. Weinberger and Brian W. Kernighan. Gawk (GNU Awk) is the GNU implementation of Awk.
Like many programming languages, Awk can handle variables, loops, conditional processing and arithmetic. The simple examples below show how to use awk to display various fields of information. They are intended as a quick introduction to awk only.
Basic Examples of Awk
Contents of the test file test.txt
one two three four five six
John 246810 team01 UK Birmingham FTE
Paul 135790 team02 UK Glasgow FTE
Marcus 049583 team03 DE Bremen PTE
Foxy 903485 team01 UK Aston PTE
Print all lines and fields in a file
Awk reads each line of your file or input and splits it into fields. By default, white space (spaces and tabs) is used to separate the fields. Each of these fields is then stored in its own variable. To display the entire line we use the variable $0; for field one we use $1, for field two $2, and so on.
john@sles01:~/testing> awk '{ print $0 }' test.txt
one two three four five six
John 246810 team01 UK Birmingham FTE
Paul 135790 team02 UK Glasgow FTE
Marcus 049583 team03 DE Bremen PTE
Foxy 903485 team01 UK Aston PTE
Print field one
john@sles01:~/testing> awk '{ print $1 }' test.txt
one
John
Paul
Marcus
Foxy
Print field two
john@sles01:~/testing> awk '{ print $2 }' test.txt
two
246810
135790
049583
903485
Print fields one and three
john@sles01:~/testing> awk '{ print $1,$3 }' test.txt
one three
John team01
Paul team02
Marcus team03
Foxy team01
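Alongside the numbered field variables, awk keeps a few built-in counters for each line. The sketch below is a small illustration rather than part of the original examples: it feeds two sample records (copied from test.txt into a shell variable named "sample", which is an assumption of this sketch) through awk and prints the record number NR, the field count NF, the first field and the last field $NF.

```shell
# Two sample records in the same layout as test.txt (held in a variable for this sketch).
sample='John 246810 team01 UK Birmingham FTE
Marcus 049583 team03 DE Bremen PTE'

# NR = current record number, NF = number of fields, $NF = the last field.
fields_out=$(printf '%s\n' "$sample" | awk '{ print NR, NF, $1, $NF }')
printf '%s\n' "$fields_out"
```

With these two records, each line has six fields, so $NF is the same as $6.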
Print only lines containing a certain string
The following example prints only lines that contain the string "team01":
john@sles01:~/testing> awk '/team01/ { print $0}' test.txt
John 246810 team01 UK Birmingham FTE
Foxy 903485 team01 UK Aston PTE
The following example prints fields one, two and three, but only for lines that contain the string "team01" (the pattern is matched against the whole line, not against a particular field):
john@sles01:~/testing> awk '/team01/ { print $1,"-",$2,"-",$3}' test.txt
John - 246810 - team01
Foxy - 903485 - team01
The following examples use the "$" anchor to print only lines that end with the specified string:
john@sles01:~/testing> awk '/FTE$/ { print $0 }' test.txt
John 246810 team01 UK Birmingham FTE
Paul 135790 team02 UK Glasgow FTE
john@sles01:~/testing> awk '/PTE$/ { print $0 }' test.txt
Marcus 049583 team03 DE Bremen PTE
Foxy 903485 team01 UK Aston PTE
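The patterns above are tested against the whole line. To restrict a test to one field, a pattern can compare that field directly. The following is a minimal sketch, assuming three records from test.txt held in a shell variable named "sample" (a name chosen only for this illustration): it prints a line only when field three is exactly "team01" and field six matches the regular expression ^PTE$.

```shell
# Three records in the same layout as test.txt (held in a variable for this sketch).
sample='John 246810 team01 UK Birmingham FTE
Paul 135790 team02 UK Glasgow FTE
Foxy 903485 team01 UK Aston PTE'

# $3 == "team01" compares field three as a string;
# $6 ~ /^PTE$/ applies a regular expression to field six only.
match_out=$(printf '%s\n' "$sample" | awk '$3 == "team01" && $6 ~ /^PTE$/ { print $1, $3, $6 }')
printf '%s\n' "$match_out"
```

Only the Foxy record satisfies both conditions, so only that line is printed.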
Field Separator
By default awk splits each input line into fields at white space. To change the field separator, use the "-F" flag. A simple way to demonstrate this is to process the "/etc/passwd" file, whose fields are separated by colons (":").
For example, issue the command: awk -F: '{ print $1,"-",$6,"-",$7 }' /etc/passwd
We can now see that the user name, home directory and login shell are displayed:
john - /home/john - /bin/bash
johnny - /home/johnny - /bin/bash
oracle - /home/oracle - /bin/bash
oralint - /home/oralint - /bin/bash
lol - /home/lol - /bin/bash
test - /home/test - /bin/bash
testuser - /home/testuser - /bin/bash
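The separator that print places between comma-separated items can be changed as well, through the OFS variable. The sketch below is an illustration only: the two passwd-style records are hypothetical (real entries differ from system to system) and are held in a shell variable named "passwd_sample" so the example does not depend on the local /etc/passwd.

```shell
# Hypothetical passwd-style records; real entries will differ from system to system.
passwd_sample='john:x:1000:100:John:/home/john:/bin/bash
oracle:x:1001:100:Oracle:/home/oracle:/bin/bash'

# -F: sets the input field separator.
# -v OFS=" - " sets the output separator that print inserts between its arguments.
fs_out=$(printf '%s\n' "$passwd_sample" | awk -F: -v OFS=' - ' '{ print $1, $6, $7 }')
printf '%s\n' "$fs_out"
```

Setting OFS once avoids repeating the "-" string literal between every pair of fields.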
Basic Arithmetic with awk
The examples below show very basic addition, subtraction, multiplication and division calculations:
john@sles01:~/testing> echo 10 2 | awk '{ print $1 + $2 }'
12
john@sles01:~/testing> echo 10 2 | awk '{ print $1 - $2 }'
8
john@sles01:~/testing> echo 10 2 | awk '{ print $1 * $2 }'
20
john@sles01:~/testing> echo 10 2 | awk '{ print $1 / $2 }'
5
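The division above happens to give a whole number. As a further sketch (the shell variable names here are chosen only for this illustration), the commands below show that awk division is floating point, that printf can fix the number of decimal places, and that the % operator gives the remainder.

```shell
# Division is floating point, so 10 / 4 is not truncated to 2.
div_out=$(echo 10 4 | awk '{ print $1 / $2 }')

# printf gives control over the number of decimal places.
fmt_out=$(echo 10 4 | awk '{ printf "%.2f\n", $1 / $2 }')

# The % operator returns the remainder of the division.
mod_out=$(echo 10 4 | awk '{ print $1 % $2 }')

printf '%s\n' "$div_out" "$fmt_out" "$mod_out"
```

This prints 2.5, 2.50 and 2 on separate lines.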
Basic Loop Example
Awk supports several loop types, including "while", "do while" and "for". A simple example of a while loop can be found below:
john@sles01:~/testing> awk 'BEGIN {
    x = 1
    # Loop until the break statement below ends it.
    while (1) {
        print "Count =", x
        if (x == 10)
            break
        x++
    }
}'
Count = 1
Count = 2
Count = 3
Count = 4
Count = 5
Count = 6
Count = 7
Count = 8
Count = 9
Count = 10
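For comparison, here is a sketch of the same kind of counter written with the other two loop types. The upper limit of 3 and the shell variable names are choices made for this illustration only.

```shell
# for loop: initialisation, test and increment sit on one line.
for_out=$(awk 'BEGIN {
    for (i = 1; i <= 3; i++)
        print "Count =", i
}')

# do while loop: the body always runs at least once, the test comes afterwards.
do_out=$(awk 'BEGIN {
    i = 1
    do {
        print "Count =", i
        i++
    } while (i <= 3)
}')

printf '%s\n' "$for_out" "$do_out"
```

Both loops print "Count = 1" through "Count = 3". Because each program consists only of a BEGIN block, awk does not wait for any input.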
Awk Scripting
An awk program can have up to three parts, each of them optional: a BEGIN block, which is executed once before any input is read; a main block, which is executed for every line of input; and an END block, which is executed after all of the input has been read.
A simple example of an awk script:
#!/usr/bin/awk -f
#
# Test awk script
#
BEGIN {
    print "--- I am a test awk file ---"
    count=0
}
{
    if ($3 == "team01") {
        print "team01 members found: "$1,"-",$3
        count=count+1
    }
}
END {
    print "------------------------------"
    printf("\tTotal Number of Records Processed:\t%d\n", NR)
    printf("\tNumber of team01 members found :\t%d\n", count)
}
john@sles01:~/testing> ls -l awk01
-rwxr-xr-x 1 john users 416 May 28 15:02 awk01
john@sles01:~/testing> ./awk01 test.txt
--- I am a test awk file ---
team01 members found: John - team01
team01 members found: Foxy - team01
------------------------------
Total Number of Records Processed: 5
Number of team01 members found : 2
In the above example script, the BEGIN block sets the count value to 0 ("count=0"). This is optional, since awk treats an unset variable as 0 in arithmetic, but it makes the intent clear. The middle section of the script then checks the third field, $3, on every line. If this field is exactly equal to our search criterion "team01", we increment count by one. Once every line of the input file has been scanned, the END block prints a simple summary of the results. The variable count contains the number of matches and the built-in awk variable NR contains the number of records processed.
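The same count can also be written more compactly by using the test as a pattern in front of the main block instead of an "if" inside it. The following is a sketch under that approach, not a replacement for the script above; the records from test.txt are held in a shell variable named "sample" (an assumption of this illustration) so that the example is self-contained.

```shell
# The five records of test.txt, held in a variable for this sketch.
sample='one two three four five six
John 246810 team01 UK Birmingham FTE
Paul 135790 team02 UK Glasgow FTE
Marcus 049583 team03 DE Bremen PTE
Foxy 903485 team01 UK Aston PTE'

# The pattern $3 == "team01" selects the lines; the action only counts them.
# The END block runs once after all input has been read.
script_out=$(printf '%s\n' "$sample" | awk '
    $3 == "team01" { count++ }
    END { printf "%d of %d records are in team01\n", count, NR }
')
printf '%s\n' "$script_out"
```

This prints "2 of 5 records are in team01", matching the totals reported by the script.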