The Linux awk command is a powerful alternative to the cut command for slicing columnar data.
Advantages of awk
The cut command is one of my favorites, but here are some of the key advantages of awk:
Recognizes sequential white space (spaces and tabs) as a single field delimiter
Supports pattern matching
Supports complex programming logic
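As a rough illustration of the first advantage, the sketch below runs cut and awk against the same line. The sample line and the variable names (line, cut_result, awk_result) are made up for this example and are not part of the original article.

```shell
# Hypothetical sample line: fields separated by runs of spaces.
line='alpha   beta  gamma'

# cut treats every single space as a delimiter, so field 2 is empty here.
cut_result=$(printf '%s\n' "$line" | cut -d' ' -f2)

# awk collapses each run of spaces into one delimiter, so field 2 is "beta".
awk_result=$(printf '%s\n' "$line" | awk '{print $2}')

printf 'cut: [%s]\nawk: [%s]\n' "$cut_result" "$awk_result"
```

With cut, you would have to squeeze the repeated spaces first (for example with tr -s ' ') to get the same result.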
awk Syntax
The awk command is less a command than a scripting language. An awk statement consists of two parts: a pattern (a regular expression or other condition) and one or more actions. The entire awk statement is enclosed in single quotes, and the action is enclosed in curly braces. Here is the syntax:
awk 'pattern {action}' InputFile
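Either part of the statement may be left out. The following is a minimal sketch of both cases, assuming a small made-up input; the input text and variable names are hypothetical and chosen only for illustration.

```shell
# Hypothetical three-line input, used only for this illustration.
input=$(printf 'one 1\ntwo 2\nthree 3\n')

# Pattern only: the default action prints each matching line.
pattern_only=$(printf '%s\n' "$input" | awk '/^t/')

# Action only: with no pattern, the action runs on every line.
action_only=$(printf '%s\n' "$input" | awk '{print $1}')

printf '%s\n---\n%s\n' "$pattern_only" "$action_only"
```

The first command prints the two lines that start with "t"; the second prints the first field of all three lines.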
Slicing Data
By default, awk parses each input line into fields using whitespace (spaces or tabs) as the delimiter. You can access each field by using $ followed by the field number. Field numbering begins at 1. You can access the entire line using $0.
Take the following file, named sample.txt:
Field1-1 Field1-2 Field1-3
Field2-1 Field2-2 Field2-3
Field3-1 Field3-2 Field3-3
To use awk to extract the second field of each line:
$ awk '{print $2}' sample.txt
Field1-2
Field2-2
Field3-2
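You are not limited to one field per line. The sketch below prints the first and third fields together; it recreates sample.txt in a temporary directory so it can run on its own, and the names workdir and first_and_third are assumptions made for this example.

```shell
# Recreate the sample file in a temporary directory so the sketch is self-contained.
workdir=$(mktemp -d)
cat > "$workdir/sample.txt" <<'EOF'
Field1-1 Field1-2 Field1-3
Field2-1 Field2-2 Field2-3
Field3-1 Field3-2 Field3-3
EOF

# A comma between fields makes print join them with a single space.
first_and_third=$(awk '{print $1, $3}' "$workdir/sample.txt")
printf '%s\n' "$first_and_third"

# Clean up the temporary directory.
rm -r "$workdir"
```

Writing the fields in a different order, such as $3, $1, reorders the columns in the output, which cut cannot do.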
You can also add a pattern. Here the second field will only be printed if the first field is equal to Field2-1.
$ awk '$1=="Field2-1" {print $2}' sample.txt
Field2-2
If you prefer, you can use a regular expression (regex) as a pattern by enclosing the expression in forward slashes. Here the second field will only be printed if the line begins with Field2-1.
$ awk '/^Field2-1/ {print $2}' sample.txt
Field2-2
You can also print the entire line by using $0.
$ awk '/^Field2-1/ {print $0}' sample.txt
Field2-1 Field2-2 Field2-3
You can use the -F option to specify a delimiter other than whitespace, such as a comma or semicolon.
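A minimal sketch of the -F option with a comma delimiter follows. The comma-separated records and the variable name csv_result are invented for this example.

```shell
# Hypothetical comma-separated records; the names and values are made up.
csv_result=$(printf 'alice,30,London\nbob,25,Paris\n' | awk -F',' '{print $2}')

# Print the second field of each record.
printf '%s\n' "$csv_result"
```

Note that this simple split does not understand quoted CSV fields that contain commas, so it is best suited to plain delimited data.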