2. What is AWK
● Programming language
● Created by Aho, Weinberger, and Kernighan
● Well suited to processing CSV-like data
Year,Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec,Total
1951,,,,,,,1,,,1,,,2
1952,,,,,,1,1,1,,,,,3
1953,,,,,,1,,,1,,,,2
1954,,,,,,,,1,4,,,,5
1955,,,,,,,1,,1,2,,,4
1956,,,,1,,,,1,1,,,,3
1957,,,,,,,,,1,,,,1
1958,,,,,,,1,1,2,,,,4
1959,,,,,,,,2,1,1,,,4
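A minimal sketch of reading this kind of table with AWK. By default AWK splits fields on whitespace, so comma-separated input needs `-F,`. The variable name `data` and the threshold of 4 are chosen only for illustration; the rows are copied from the table above. `NR > 1` skips the header line, which would otherwise be compared as a string.

```shell
# Three rows from the table above, plus the header (variable name is hypothetical).
data='Year,Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec,Total
1951,,,,,,,1,,,1,,,2
1954,,,,,,,,1,4,,,,5
1955,,,,,,,1,,1,2,,,4'

# -F, sets the field separator to a comma; $1 is Year, $14 is Total.
# Print the year and total for every data row whose total is at least 4.
printf '%s\n' "$data" | awk -F, 'NR > 1 && $14 >= 4 { print $1, $14 }'
# prints:
# 1954 5
# 1955 4
```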
3. Conciseness
● Print the lines whose second column is
greater than 10
– awk '$2>10'
● Erase blank lines
– awk 'NF'
● Extract the second column
– awk '$0=$2'
4. Program
● column : field
● row : record
● pattern { action }
● pattern: condition
– NF >= 10 && /^[0-9]*$/
● action: C-like statements
– if, for, while, ...
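A small sketch of a full `pattern { action }` program. The pattern here is adapted from the one above (at least 3 fields, digits and spaces only) so that it matches the sample input, and the shell function name `sum_rows` is hypothetical. The action uses a C-like `for` loop to add up the fields of each matching record.

```shell
# sum_rows: for each record with 3 or more fields made only of digits
# and spaces, print the sum of its fields. (Hypothetical helper name.)
sum_rows() {
  awk 'NF >= 3 && /^[0-9 ]+$/ {
    sum = 0
    for (i = 1; i <= NF; i++) sum += $i   # $i is the i-th field
    print sum
  }'
}

# The blank line and the line containing "x" do not match the pattern.
printf '1 2 3\n\n4 5 x\n10 20 30 40\n' | sum_rows
# prints:
# 6
# 100
```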
5. Example 1
● $0: the whole current record
● $n: the nth field
● Problem: Print the lines whose second
column is greater than 10
– awk '$2>10'
– awk '$2>10{print $0}'
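Both forms can be tried on a small input; the sample rows below are made up for illustration. When the action is omitted, AWK prints the matching record, so the two commands should produce the same output.

```shell
# Hypothetical sample: a name and a number per line.
ex1_input='a 5
b 12
c 30'

# Pattern only: the default action prints the whole record.
printf '%s\n' "$ex1_input" | awk '$2>10'

# Same program with the action written out.
printf '%s\n' "$ex1_input" | awk '$2>10{print $0}'
# each command prints:
# b 12
# c 30
```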
6. Example 2
● NF: Number of fields
● false: 0 or the empty string
● true: anything else
● Problem: Erase blank lines
– awk 'NF'
– awk 'NF{print}'
– awk 'NF{print $0}'
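A quick check of the `NF` idiom on made-up input. With the default field separator, a line that holds only spaces also has `NF` equal to 0, so it is dropped along with truly empty lines.

```shell
# Hypothetical input: a word, an empty line, a spaces-only line, two words.
printf 'one\n\n   \ntwo words\n' | awk 'NF'
# prints:
# one
# two words
```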
7. Example 3
● Problem: Print the second field of each
record if it is not 0
– awk '$0=$2'
– awk '$2{$0=$2;print}'
– awk '$2{$0=$2;print $0}'
– awk '$2{print $2}'
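The shortest and the most explicit form can be compared on a made-up input. In `$0=$2` the assignment itself is the pattern: its value is the second field, so records whose second field is 0 or missing count as false and are not printed.

```shell
# Hypothetical sample: the second field is 3, 0, missing, and 7.
ex3_input='a 3
b 0
c
d 7'

# Assignment used as the pattern; the default action prints the new $0.
printf '%s\n' "$ex3_input" | awk '$0=$2'

# Explicit form: test $2, then print it.
printf '%s\n' "$ex3_input" | awk '$2{print $2}'
# each command prints:
# 3
# 7
```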
8. Conclusions
● AWK is well suited to handling CSV
or TSV
● Powerful C-like syntax
● Difficult to handle complex
data structures
– Tree, queue, ...