AWK Tutorial – Day 1

The UNIX world has a wonderful list of powerful scripting languages. In this series, we are going to learn the basics of the good old AWK programming language. Unlike Perl and Python, it is not very familiar to many new users, which is why I selected it. Like any other language, AWK has simple and complex parts, but this series will only cover the basics and help you get started.

AWK is a small C-style, data-driven programming language. It is very useful for pattern scanning and for processing regularly formatted text: it searches files for lines that contain certain patterns, and when a line matches one of the patterns, it performs the specified actions on that line. AWK keeps processing input lines in this way until it reaches the end of the input file(s).
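
As a quick taste of the pattern-then-action idea described above, here is a minimal sketch. The three log-style input lines are made up for illustration; only the line matching the pattern /error/ triggers the action.

```shell
# Made-up input: three lines, only one contains the word "error".
# The program has the shape:  pattern { action }
printf 'ok: started\nerror: disk full\nok: done\n' | awk '/error/ {print}'
# error: disk full
```

Lines that do not match the pattern are simply skipped.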

AWK was developed by Alfred V. Aho, Peter J. Weinberger and Brian W. Kernighan in 1977 (at that time I was a newborn baby). Apart from the basic features of the language, many useful extensions (like network access from AWK) have been added by Bell Labs, the GNU Project and others. On GNU/Linux systems, AWK is usually available by default. All examples were tested with GNU AWK (gawk) and NOT with AWK/NAWK.

We will start with some simple AWK one-liner examples:

ZERO: Simple Echo

$ echo "Hello World" | awk '{print}'

Try: Instead of echo, pipe the output of cat file.

ONE: Print in reverse order

$ echo "5 10" | awk '{print $2 " and " $1}'
10 and 5

By default, AWK splits each input line on whitespace into fields, and each field can be accessed using the $N variable, where N is the column number. In ONE, we have only 2 fields: $1 is 5 and $2 is 10. $0 represents the whole input line.
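
To see the field variables in action, here is a small sketch with made-up input (three words chosen only for illustration). It shows $0 holding the whole line and that a run of several spaces still counts as a single separator by default.

```shell
# $0 is the whole input line.
echo "alpha beta gamma" | awk '{print "line: " $0}'
# line: alpha beta gamma

# Extra spaces between words do not create empty fields.
echo "alpha   beta gamma" | awk '{print $3 ", " $1}'
# gamma, alpha
```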

Try: Pass 5 fields and print only fields $2 and $4.

TWO: Print only File Size

Print only the file size column of the ls output:

$ ls -l | awk '{print $5}'

Try: Print only the month column of the ls output.
Try: Print the filename and file size columns of the ls output.

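THREE: List all user names

$ cat /etc/passwd | awk -F: '{print $1}'

$ awk -F: '{print $1}' /etc/passwd

In the passwd file, values are separated by ":". We indicate that with the -F option. Both commands give the same output; the second one lets AWK read the file directly, without cat.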
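
The -F option is not limited to ":". The sketch below uses made-up input to show the same idea on a passwd-style record ("alice" is a hypothetical user, not one from your system) and on a semicolon-separated string.

```shell
# A hypothetical passwd-style record: field 1 is the name, field 6 the home directory.
echo "alice:x:1001:1001:Alice:/home/alice:/bin/bash" | awk -F: '{print $1 " -> " $6}'
# alice -> /home/alice

# Any single character can be the separator; quote it so the shell leaves it alone.
echo "red;green;blue" | awk -F';' '{print $2}'
# green
```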

Try: Print only columns 2 and 4 from the string "1,2,3,4,5".

FOUR: Print all user names longer than 6 characters

$ awk -F: 'length($1) > 6 {print $1}' /etc/passwd

Try: Print all user names with length equal to 10.
Try: Print all non-empty lines from a file.

FIVE: Print root user details

$ awk -F: '$1 == "root" {print $0}' /etc/passwd

Try: Use the '!=' operator.

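SIX: Print all user names that start with 'a'

$ awk -F: '/^a/ {print $1}' /etc/passwd

The special characters '^' and '$' represent the start and end of the line. Here the pattern is matched against the whole line, and since each line in passwd begins with the user name, /^a/ selects the user names starting with 'a'.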
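
Here is a small sketch of both anchors on made-up input (three fruit names, chosen only for illustration), so you can see the effect without depending on what is in your passwd file.

```shell
# ^a : lines that start with "a"
printf 'apple\nbanana\navocado\n' | awk '/^a/ {print}'
# apple
# avocado

# a$ : lines that end with "a"
printf 'apple\nbanana\navocado\n' | awk '/a$/ {print}'
# banana
```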

Try: Print all names that start with 's'.
Try: Print all names that end with 'i'.

SEVEN: Print all users using the Bash shell

$ awk -F: '$7 ~ /bash$/ {print $1 " " $7}' /etc/passwd

The == and != operators compare strings literally and will not work with a regex. Use ~ (matches) and !~ (does not match) instead.
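
The difference is easy to see side by side. In this sketch the input "/bin/bash" is just a sample string: with == the text "bash$" is taken literally, so nothing matches, while ~ treats it as a regex.

```shell
# == does a plain string comparison: "/bin/bash" is not the string "bash$",
# so this prints nothing.
echo "/bin/bash" | awk '$1 == "bash$" {print "equal"}'

# ~ does a regex match: the field ends with "bash", so the action runs.
echo "/bin/bash" | awk '$1 ~ /bash$/ {print "match"}'
# match
```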

Try: Using '!~', print all users not using the Bash shell.

EIGHT: Arithmetic

$ echo "4 2" | awk '{print $1 "+" $2 "=" $1+$2}'

$ echo "4 2" | awk '{print $1 "/" $2 "=" $1/$2}'

$ echo "5 2" | awk '{print $1 "%" $2 "=" $1%$2}'

$ echo "5 2" | awk '{print ($1+$2)-($1%$2)}'

Try: Use -, *, ^ Operators.

NINE: Formatted output

$ echo "1 2.30 8 15 ILUGC" | awk '{printf "%2d, %2.0f, %f, %o, %X, %20s.\n", $1, $2, $2, $3, $4, $5}'
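
If the format string above looks cryptic, this smaller sketch may help. The input values are made up, and the square brackets are only there to make the padding visible: %3d pads an integer to width 3, %.2f keeps two decimals, %o and %X print octal and uppercase hex, and %-6s left-justifies a string in 6 columns.

```shell
# 255 is printed twice: once in octal (%o) and once in hexadecimal (%X).
echo "7 3.14159 255 awk" | awk '{printf "[%3d] [%.2f] [%o] [%X] [%-6s]\n", $1, $2, $3, $3, $4}'
# [  7] [3.14] [377] [FF] [awk   ]
```

Unlike print, printf does not add a newline on its own, which is why the format string ends with \n.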

Try: Use printf in any of the above examples and play.

TEN: Print the last field

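$ awk '{print $NF}' /etc/passwd

Irrespective of the number of fields present in each line, this command prints the last field. NF holds the number of fields in the current line, so $NF is the last one. Note that no -F option is given here, so the fields are split on whitespace rather than on ":".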
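
A short sketch of NF and $NF on made-up input: the first command prints the field count next to the last field for lines of different lengths, and the second combines $NF with -F: on a hypothetical passwd-style record ("alice" is not a real user) to pick out the shell.

```shell
# NF is the number of fields; $NF is the value of the last field.
printf 'a b c\nd e\n' | awk '{print NF ": " $NF}'
# 3: c
# 2: e

# With -F: the last field of a passwd-style record is the login shell.
echo "alice:x:1001:1001:Alice:/home/alice:/bin/bash" | awk -F: '{print $NF}'
# /bin/bash
```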

Next Session
