AWK
AWK (GNU AWK)
AWK is a text-processing language and command-line tool; gawk (GNU AWK) is the GNU implementation of it. It is used for text search, modification, and reporting tasks, and is widely used for extracting and manipulating data in files and streams.
Syntax
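As a rough sketch of the two usual invocation forms (assuming a POSIX shell with `awk` and `mktemp` available; the sample input and the temporary program file are made up for illustration):

```shell
# Form 1: the program text is the first non-option argument.
#   awk [options] 'program' [file ...]
inline=$(printf 'a b\nc d\n' | awk '{ print $2 }')
echo "$inline"

# Form 2: the program text is read from a file named with -f.
#   awk [options] -f program-file [file ...]
prog=$(mktemp)
printf '{ print $2 }\n' > "$prog"
from_file=$(printf 'a b\nc d\n' | awk -f "$prog")
echo "$from_file"
rm -f "$prog"
```

When no input file is named, awk reads standard input, which is why both commands above can be fed through a pipe.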
Options
- -F FS, --field-separator FS
Use FS for the input field separator (the value of the FS predefined variable).
- -f PROGRAM-FILE, --file PROGRAM-FILE
Read the awk program source from the file PROGRAM-FILE, instead of from the first command line argument.
- -mf NNN, -mr NNN
These flags set the maximum number of fields (-mf) and the maximum record size (-mr). They are ignored by gawk, since it has no predefined limits.
- -v VAR=VAL, --assign VAR=VAL
Assign the variable VAR the value VAL before program execution begins.
- -W traditional, --traditional
Use compatibility mode, where gawk extensions are turned off.
- -W lint, --lint
Issue warnings about dubious or non-portable awk constructs.
- -W lint-old, --lint-old
Warn about constructs not available in the original V.7 Unix version of awk.
- -W posix, --posix
Use POSIX compatibility mode, turning off gawk extensions and applying additional restrictions.
- -W re-interval, --re-interval
Allow interval expressions in regular expressions.
- -W source=PROGRAM-TEXT, --source PROGRAM-TEXT
Use PROGRAM-TEXT as the awk program source code, allowing command-line source code to be mixed with file-based code.
- --
Signals the end of options, so that further arguments are passed to the awk program itself.
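To illustrate the two most commonly used options, here is a small sketch. The colon-separated sample records and the variable name `min` are hypothetical, chosen only for this example:

```shell
# -F sets the input field separator; here the fields are colon-separated.
users=$(printf 'root:x:0\nalice:x:1000\n' | awk -F: '{ print $1 }')
echo "$users"

# -v assigns a variable before the program starts running.
# "min" is a made-up name for the threshold used in the pattern.
high=$(printf 'root:x:0\nalice:x:1000\n' | awk -F: -v min=1000 '$3 >= min { print $1 }')
echo "$high"
```

Passing values with -v keeps shell variables out of the quoted program text, which avoids awkward quoting.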
Program Structure
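An AWK program is a sequence of rules, each consisting of a pattern and an action. For every input record, awk runs the action of each rule whose pattern matches. A rule with no pattern matches every record, and a rule with no action prints the matching record.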
For example, to display lines from a file that contain the string "123", "abc", or "some text":
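One way this could look, as a sketch with a made-up sample file (the original command is not preserved here, so this is an illustration rather than the author's exact example):

```shell
# Sample input; the file contents are invented for illustration.
sample=$(mktemp)
printf 'order 123\nnothing here\nabc def\nsome text follows\n' > "$sample"

# One pattern-action rule per string; a line is printed when its pattern matches.
matches=$(awk '/123/ { print } /abc/ { print } /some text/ { print }' "$sample")
echo "$matches"
rm -f "$sample"
```

With separate rules, a line containing two of the strings is printed twice. A single rule with alternation, `/123|abc|some text/`, prints each matching line once.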
Special Patterns
- /Regular Expression/
Matches any input record that matches the regular expression.
- Pattern && Pattern
Logical AND between patterns.
- Pattern || Pattern
Logical OR between patterns.
- ! Pattern
Logical NOT for a pattern.
- Pattern ? Pattern : Pattern
Conditional expression (if, then, else): if the first pattern matches, the second is tested; otherwise the third is.
- Pattern1, Pattern2
A range of records, from one that matches Pattern1 through the next one that matches Pattern2.
- BEGIN
Executes before any input is read.
- END
Executes after all input has been read.
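The pattern forms above can be combined in one program. The following is a minimal sketch on invented input; the START and STOP marker lines are assumptions made for the example:

```shell
report=$(printf 'alpha 1\nSTART\nbeta 2\nSTOP\ngamma 30\n' | awk '
BEGIN            { print "begin" }        # runs before any input is read
/START/, /STOP/  { next }                 # range: skip records from START through STOP
/a/ && $2 > 10   { print "big:", $1 }     # AND of a regex match and a comparison
!/gamma/         { print "other:", $1 }   # NOT: records that do not match gamma
END              { print "end", NR }      # runs after all input has been read
')
echo "$report"
```

Rules are tried in order for each record, and `next` stops processing of the current record, so lines inside the range never reach the later rules.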
Built-in Variables
- CONVFMT
Format used when converting numbers to strings (default: %.6g).
- FS
Field separator for input (default is a single space, which splits fields on runs of spaces and tabs).
- NF
Number of fields in the current record.
- NR
Ordinal number of the current record.
- FNR
Ordinal number of the current record in the current file.
- FILENAME
Name of the current input file.
- RS
Input record separator (default is newline).
- OFS
Output field separator (default is space).
- ORS
Output record separator (default is newline).
- OFMT
Output format for numbers (default is %.6g).
- SUBSEP
Separator for multiple subscripts (default is \034).
- ARGC
Argument count (assignable).
- ARGV
Array of arguments, assignable; non-null members are taken as filenames.
- ENVIRON
Array of environment variables.
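A short sketch of several of these variables in use, on made-up colon-separated input:

```shell
# FS and OFS are set in BEGIN; NR, NF, and $NF are read for each record.
table=$(printf 'a:b:c\nd:e\n' | awk 'BEGIN { FS = ":"; OFS = "-" } { print NR, NF, $1, $NF }')
echo "$table"

# OFMT controls how print formats non-integer numbers.
rounded=$(awk 'BEGIN { OFMT = "%.2f"; print 3.14159 }')
echo "$rounded"
```

The commas in `print` are what insert OFS between the values; items placed side by side without commas are concatenated instead.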
Examples
- Print the fifth item from each line of an ls -l listing:
- Print the row number and the first item from each line:
- Print the first item and the third-last item from each line:
- Remove blank lines from a file:
- Print the length of the longest input line:
- Print seven random numbers from 0 to 100:
- Print the total number of bytes used by files:
- Print the average file size of all .PNG files in a directory:
- Count the lines in a file:
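The commands for the examples listed above are not preserved here, so the following is a sketch of how each one could be written, in the same order. It uses a canned listing in the style of ls -l output (an assumption, so that results are reproducible); in practice the same awk programs would read real `ls -l` output or a file.

```shell
# Canned ls -l style listing; names, owners, and sizes are invented.
listing='-rw-r--r-- 1 alice staff 100 Dec  1 10:00 a.PNG
-rw-r--r-- 1 alice staff 300 Nov  2 11:00 b.PNG
-rw-r--r-- 1 bob staff 50 Dec  3 12:00 notes.txt'

# 1. Fifth item of each line, which is the size column in this listing.
sizes=$(printf '%s\n' "$listing" | awk '{ print $5 }')

# 2. Row number and first item.
numbered=$(printf '%s\n' "$listing" | awk '{ print NR, $1 }')

# 3. First item and third-last item; assumes at least three fields per line.
third_last=$(printf '%s\n' "$listing" | awk '{ print $1, $(NF-2) }')

# 4. Remove blank lines: NF is 0, hence false, on empty or whitespace-only lines.
no_blanks=$(printf 'one\n\n  \ntwo\n' | awk 'NF')

# 5. Length of the longest input line.
longest=$(printf 'ab\nabcdef\nabc\n' | awk 'length($0) > max { max = length($0) } END { print max }')

# 6. Seven random integers from 0 to 100; srand() seeds from the time of day.
randoms=$(awk 'BEGIN { srand(); for (i = 1; i <= 7; i++) print int(101 * rand()) }')

# 7. Total number of bytes used by the listed files.
total=$(printf '%s\n' "$listing" | awk '{ sum += $5 } END { print "total bytes:", sum }')

# 8. Average size of the .PNG files.
avg=$(printf '%s\n' "$listing" | awk '/\.PNG$/ { sum += $5; n++ } END { if (n) print sum / n }')

# 9. Count the lines of input.
count=$(printf '%s\n' "$listing" | awk 'END { print NR }')

printf '%s\n' "$total" "average PNG size: $avg" "lines: $count"
```

Real `ls -l` output usually begins with a "total" line that has no fifth field, so a pattern such as `NR > 1` may be needed to skip it.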
Comparison with grep
grep searches files for lines that match a pattern, whereas awk performs actions on matching lines.
- Example: To search for the word "Dec" in a file:
- To search for the word "Dec" in the sixth field using awk:
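The two searches can be sketched side by side. The listing below is invented, with one file name that contains "Dec" so that the difference is visible:

```shell
# Canned ls -l style listing; the second line has "Dec" only in the file name.
listing='-rw-r--r-- 1 alice staff 100 Dec  1 10:00 a.PNG
-rw-r--r-- 1 alice staff 300 Nov  2 11:00 Dec-report.txt
-rw-r--r-- 1 bob staff 50 Dec  3 12:00 notes.txt'

# grep prints every line that contains "Dec" anywhere on the line.
grep_hits=$(printf '%s\n' "$listing" | grep 'Dec')
echo "$grep_hits"

# awk restricts the match to the sixth field, the month column in this sample.
awk_hits=$(printf '%s\n' "$listing" | awk '$6 ~ /Dec/')
echo "$awk_hits"
```

Here grep reports all three lines, while the awk version skips the November file whose name merely contains "Dec".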