awk command

 





📄 Sample Input File (Named: data.txt)

Alice 80 Math
Bob 70 English
Charlie 90 Science
David 65 History
Eve 85 Math



🧠 AWK Command Tutorial with Sample Inputs

Concept

Command / Syntax

Explanation

Output (on data.txt)

Basic Syntax

awk '{print}' data.txt

Prints all lines (default behavior)

Prints the entire file as-is

Print Specific Field

awk '{print $2}' data.txt

Prints the 2nd field

80, 70, 90, 65, 85 (one per line)

Print Multiple Fields

awk '{print $1, $3}' data.txt

Prints name and subject

Alice Math, Bob English, ... (one pair per line)

Field Separator

awk -F':' '{print $1}' /etc/passwd

Extracts usernames (only valid for : separated files)

Not applicable on data.txt

Condition-Based Match

awk '$2 > 80 {print $1, $2}' data.txt

If score > 80, print name and score

Charlie 90, Eve 85 (one per line)

Pattern Matching

awk '/Math/ {print $0}' data.txt

Prints lines that contain "Math" anywhere in the line (in data.txt, only the Math subject rows)

Alice 80 Math, Eve 85 Math (one per line)

BEGIN Block

awk 'BEGIN {print "Name Score Subject"} {print}' data.txt

Prints header first, then file lines

Name Score Subject, followed by all lines

END Block

awk '{print} END {print "Done"}' data.txt

Prints file lines, then “Done” at end

Prints lines + Done

Sum a Column

awk '{sum += $2} END {print "Total:", sum}' data.txt

Sums the 2nd field (scores)

Total: 390

Average of Column

awk '{sum += $2; n++} END {print "Avg:", sum/n}' data.txt

Calculates average score

Avg: 78

Formatted Output

awk '{printf "%-10s got %3d in %s\n", $1, $2, $3}' data.txt

Clean, aligned printing

Alice      got  80 in Math ...

Use System Variables

awk '{print NR, NF, $0}' data.txt

Shows line number (NR) and number of fields (NF)

1 3 Alice 80 Math...

Field Separator Output

awk 'BEGIN {OFS="|"} {print $1, $2, $3}' data.txt

Custom output separator

Alice|80|Math ...

Delete Empty Lines

awk 'NF' data.txt

Prints only lines that have at least one field, so empty and whitespace-only lines are skipped

All lines with content (our sample has no empty lines)

Count Occurrences

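awk '{count[$3]++} END {for (subj in count) print subj, count[subj]}' data.txt

Counts how many students are in each subject (the loop variable is named subj because sub is a reserved built-in function in awk)

Math 2, English 1, Science 1, History 1 (one per line; order may vary)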
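
The Pattern Matching and Condition-Based Match rows above differ in one way that is easy to miss: a bare /regex/ is tested against the whole line, while a field comparison looks at one column only. Here is a minimal sketch of that difference. It assumes a temporary copy of data.txt with one extra, hypothetical row (Mathew 75 Art) that is not part of the original sample.

```shell
# Temporary copy of the sample data plus one hypothetical row whose NAME contains "Math".
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Alice 80 Math
Bob 70 English
Charlie 90 Science
David 65 History
Eve 85 Math
Mathew 75 Art
EOF

# /Math/ is matched against the entire line, so the Mathew row is included.
loose=$(awk '/Math/ {print $1}' "$tmp")

# $3 == "Math" compares only the subject field.
strict=$(awk '$3 == "Math" {print $1}' "$tmp")

echo "regex on whole line:"; echo "$loose"
echo "exact field match:";   echo "$strict"
rm -f "$tmp"
```

When the intent is "subject is Math", the field comparison is usually the safer choice.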


Begin

The BEGIN block in awk is executed once, before any line of the input file is read. It is well suited to printing headers, setting variables such as FS and OFS, and preparing output formatting.


✅ Recap: Basic BEGIN Block

awk 'BEGIN {print "Name Score Subject"} {print}' data.txt


Output:

Name Score Subject
Alice 80 Math
Bob 70 English
Charlie 90 Science
David 65 History
Eve 85 Math



🎯 FAANG-Style Scenario-Based Questions Using BEGIN


🔹 Q1: Print a report with a title, current timestamp, and then the data

awk 'BEGIN {
  print "STUDENT REPORT"
  print "Generated on: " strftime("%Y-%m-%d %H:%M:%S")
  print "-----------------------------------"
}
{print $1, $2, $3}' data.txt


🧠 Why?
BEGIN helps you insert metadata (like date/time) before processing file content. Note that strftime() is not part of POSIX awk; gawk provides it as an extension.

Output Example:
STUDENT REPORT
Generated on: 2025-06-08 19:00:00
-----------------------------------
Alice 80 Math
Bob 70 English

...



🔹 Q2: Set a custom field separator before processing a CSV file

Input (students.csv):

Alice,80,Math
Bob,70,English
Charlie,90,Science


awk 'BEGIN {FS=","; OFS=" | "; print "Name | Score | Subject"} {print $1, $2, $3}' students.csv


🧠 Why?
Use BEGIN to define FS (Field Separator) and OFS (Output Field Separator) before data is parsed.

Output:

Name | Score | Subject
Alice | 80 | Math
Bob | 70 | English
Charlie | 90 | Science
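
One detail worth checking for yourself: OFS is only applied when print receives comma-separated items or when awk has to rebuild $0. The sketch below feeds two of the CSV rows from Q2 through both variants; the variable names (csv, unchanged, rebuilt) are just illustrative.

```shell
csv=$(printf 'Alice,80,Math\nBob,70,English\n')

# A bare print emits $0 untouched, so OFS is never used.
unchanged=$(printf '%s\n' "$csv" | awk 'BEGIN { FS = ","; OFS = " | " } { print }')

# Assigning a field to itself makes awk rebuild $0 with OFS between the fields.
rebuilt=$(printf '%s\n' "$csv" | awk 'BEGIN { FS = ","; OFS = " | " } { $1 = $1; print }')

echo "$unchanged"
echo "$rebuilt"
```

The $1 = $1 idiom is handy when a record has many fields and listing them all in print would be tedious.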



🔹 Q3: Print average marks, but initialize sum/count in BEGIN

awk 'BEGIN {sum=0; count=0}
{
  sum += $2; count++
}
END {
  print "Average Marks:", sum/count
}' data.txt


🧠 Why?
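awk already treats an unset variable as 0 (or an empty string), so the BEGIN initialization (sum=0, count=0) is optional. It is there to make the starting values explicit and the script easier to read.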

Output:

Average Marks: 78
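
To see how awk treats variables that were never assigned, here is a small sketch. The name unset_var is made up for the demonstration, and the guard against an empty input (count > 0) is an addition that the Q3 command does not have.

```shell
# An unassigned variable behaves as 0 in arithmetic and as "" in string context.
defaults=$(awk 'BEGIN { print unset_var + 0; print "[" unset_var "]" }')
echo "$defaults"

# Same average as Q3 without a BEGIN block; the guard avoids dividing by zero on empty input.
avg=$(printf 'Alice 80 Math\nBob 70 English\nCharlie 90 Science\nDavid 65 History\nEve 85 Math\n' |
  awk '{ sum += $2; count++ } END { if (count > 0) print "Average Marks:", sum / count }')
echo "$avg"
```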



🔹 Q4: Print a table with headers, aligned columns using printf

awk 'BEGIN {
  printf "%-10s %-6s %-10s\n", "Name", "Score", "Subject"
  print "------------------------------------"
}
{
  printf "%-10s %-6s %-10s\n", $1, $2, $3
}' data.txt


🧠 Why?
BEGIN is great for printing headers and formatting once, while the main block prints rows.

Output:

Name       Score  Subject
------------------------------------
Alice      80     Math
Bob        70     English

...



🔹 Q5: Initialize a counter or hash map in BEGIN block

awk 'BEGIN {
  subjects["Math"]=0; subjects["English"]=0; subjects["Science"]=0; subjects["History"]=0
}
{
  subjects[$3]++
}
END {
  for (s in subjects) print s, subjects[s]
}' data.txt


🧠 Why?
Pre-load categories or counters in BEGIN for controlled processing or initializing associative arrays.

Output (for-in visits array keys in an unspecified order, so the lines may appear in a different order):

Math 2
English 1
Science 1
History 1
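
The practical benefit of pre-loading is that a category with no matching rows still shows up with a count of 0. The sketch below assumes a hypothetical extra subject, Art, that never appears in data.txt, and pipes the result through sort so the order is predictable.

```shell
counts=$(printf 'Alice 80 Math\nBob 70 English\nCharlie 90 Science\nDavid 65 History\nEve 85 Math\n' |
  awk 'BEGIN {
         # Pre-load one subject that is absent from the input (hypothetical).
         subjects["Art"] = 0
       }
       { subjects[$3]++ }
       END { for (s in subjects) print s, subjects[s] }' |
  LC_ALL=C sort)   # for-in order is unspecified, so sort for stable output
echo "$counts"
```

Without the BEGIN block, Art would simply be missing from the report.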



💡 BONUS Tips to Master BEGIN

🔍 Scenario

✅ When to Use BEGIN

Add a title/header row

Before printing actual data rows

Set field/output delimiters

When working with CSV or custom-separated input

Initialize sums/counters/arrays

When you want explicit starting values (awk already defaults unset variables to 0 or empty)

Time-stamp reports

Use strftime() (a gawk extension) to inject the current timestamp

Preprocess config/static variables

Store thresholds, field indexes, etc.
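
As a rough illustration of the last row, the sketch below keeps a threshold and a field index in variables instead of hard-coding them. The names min and score_col are made up for this example; -v is the standard awk option for passing a value in from the shell, and BEGIN is used for the value that stays fixed inside the script.

```shell
passed=$(printf 'Alice 80 Math\nBob 70 English\nCharlie 90 Science\nDavid 65 History\nEve 85 Math\n' |
  awk -v min=80 '
    BEGIN { score_col = 2 }                      # static config: which column holds the score
    $score_col >= min { print $1, $score_col }   # keep rows at or above the threshold
  ')
echo "$passed"
```

Changing the cutoff then means editing only the -v min=... argument, not the program text.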



📘 What Are System Variables in AWK?

System (built-in) variables in awk give you metadata about records, fields, files, formatting, and processing status. You can use them in any block (BEGIN, pattern/action, or END), although per-record values such as NF and FILENAME are only meaningful once input is being read.


📊 Table of Popular awk System Variables

Variable

Meaning

Example Command

Explanation

NR

Number of Records (lines processed)

awk '{print NR, $0}' data.txt

Prints line number with each line

NF

Number of Fields in the current line

awk '{print NF, $0}' data.txt

Shows how many columns each line has

FS

Field Separator (input)

awk 'BEGIN {FS=","} {print $1}' file.csv

Sets the input field delimiter (default is any run of spaces or tabs)

OFS

Output Field Separator

awk 'BEGIN {OFS="|"} {print $1, $2}' data.txt

Sets the separator that print inserts between comma-separated items (default is a single space)

RS

Record Separator (input)

awk 'BEGIN {RS=""} {print $1}' para.txt

Changes how records are split (default is newline); RS="" selects paragraph mode, where blank lines separate records

ORS

Output Record Separator

awk 'BEGIN {ORS="--"} {print $1}' data.txt

Changes how lines are printed (default is newline)

FILENAME

Name of the current input file

awk '{print FILENAME, $0}' data.txt

Useful when processing multiple files

ARGC

Number of command-line arguments

awk 'BEGIN {print ARGC}' file1

Helpful in scripting, shows how many args are passed

ARGV

Array of command-line arguments

awk 'BEGIN {for (i in ARGV) print ARGV[i]}'

Lists all input arguments

ENVIRON

Environment variables (as associative array)

awk 'BEGIN {print ENVIRON["HOME"]}'

Access shell environment variables from inside awk

CONVFMT

Conversion format for numbers

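awk 'BEGIN {CONVFMT="%.2f"; x=5/3; s = x ""; print s}'

Format used when a non-integer number is converted to a string, as in concatenation (print on a number uses OFMT instead)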

OFMT

Format for output numbers

awk 'BEGIN {OFMT="%.1f"; print 2.357}'

Controls float formatting when printing numbers


🔍 Example Input File: data.txt

Alice 80 Math
Bob 70 English
Charlie 90 Science
David 65 History
Eve 85 Math



🧪 Practical Examples of System Variables


🔹 NR (Record Number)

awk '{print NR, $0}' data.txt


📌 Output:

1 Alice 80 Math
2 Bob 70 English

...



🔹 NF (Number of Fields)

awk '{print "Fields:", NF, "Line:", $0}' data.txt


📌 Useful to detect malformed lines or extra fields.


🔹 FS and OFS (Field Separators)

awk 'BEGIN{FS=","; OFS=" => "} {print $1, $2}' file.csv


If file.csv:

Alice,80
Bob,70


📌 Output:

Alice => 80
Bob => 70



🔹 FILENAME

awk '{print FILENAME, $0}' data.txt


📌 Output:

data.txt Alice 80 Math

...
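
FILENAME becomes more useful with several input files. The sketch below is an assumption-laden demo: it creates two throwaway files (a.txt and b.txt, names invented here) and also prints FNR, a standard awk variable not covered above that restarts at 1 for each file, next to NR, which keeps counting.

```shell
dir=$(mktemp -d)
printf 'Alice 80 Math\nBob 70 English\n' > "$dir/a.txt"
printf 'Charlie 90 Science\n'            > "$dir/b.txt"

# Columns: current file, overall record number, per-file record number, name.
multi=$(cd "$dir" && awk '{ print FILENAME, NR, FNR, $1 }' a.txt b.txt)
echo "$multi"
rm -rf "$dir"
```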



🔹 RS (Record Separator)

Input: paragraph.txt

This is para one.


This is para two.


awk 'BEGIN{RS=""} {print "Block:", NR, $0}' paragraph.txt


📌 Treats blank line as paragraph separator.
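
Paragraph mode pairs well with FS="\n", which turns each line of a block into its own field. Here is a minimal sketch, assuming a made-up input where every student record is spread over three lines:

```shell
report=$(awk 'BEGIN { RS = ""; FS = "\n" }
              { print NR ": " $1 " scored " $2 " in " $3 }' <<'EOF'
Alice
80
Math

Bob
70
English
EOF
)
echo "$report"
```

Each blank-line-separated block is one record, so NR counts blocks rather than lines.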


🔹 ENVIRON

awk 'BEGIN {print "User:", ENVIRON["USER"]}' 


📌 Outputs shell environment variable from within awk.


🔹 ARGV & ARGC

awk 'BEGIN {for (i=0; i<ARGC; i++) print "Arg", i, ":", ARGV[i]}' data.txt


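📌 Lists every command-line argument. ARGV[0] is the name of the awk command itself, and the input file data.txt is ARGV[1]; the program text is not included.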
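
Because ARGV[0] holds the command name, a loop over the real arguments normally starts at index 1. A small sketch follows; the file names are placeholders and do not need to exist, since a program with only a BEGIN block never opens its input.

```shell
# ARGC counts ARGV[0] too, so two file arguments give ARGC = 3.
args=$(awk 'BEGIN {
              print ARGC
              for (i = 1; i < ARGC; i++) print i, ARGV[i]
            }' data.txt other.txt)
echo "$args"
```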


🔹 OFMT and CONVFMT (Format floats)

awk 'BEGIN {OFMT="%.2f"; print 3.14159}' 


📌 Output:

3.14
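
OFMT and CONVFMT are easy to confuse, so here is a side-by-side sketch. It assumes two different formats purely to make the effect visible: print on a number uses OFMT, while turning the number into a string (here by concatenating it with "") uses CONVFMT.

```shell
fmt=$(awk 'BEGIN {
             OFMT    = "%.2f"    # used by print for non-integer numbers
             CONVFMT = "%.4f"    # used for number-to-string conversion
             x = 3.14159
             print x             # number printed directly: OFMT applies
             s = x ""            # concatenation converts x to a string: CONVFMT applies
             print s             # s is already a string, so it is printed as is
           }')
echo "$fmt"
```

Integer values are not affected by either variable; they are always printed as integers.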



🎓 FAANG-Ready Questions Using System Variables


Q1: How do you detect lines with fewer than expected fields?

awk 'NF < 3 {print "Bad line:", $0}' data.txt


🧠 Checks malformed lines (e.g., corrupted data file).
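
Since the sample data.txt has no bad lines, the check above prints nothing on it. The sketch below uses a deliberately broken, made-up input and adds NR so each report says where the problem is.

```shell
bad=$(printf 'Alice 80 Math\nBob 70\nCharlie 90 Science\n\nEve\n' |
  awk 'NF != 3 { print "line " NR ": expected 3 fields, got " NF }')
echo "$bad"
```

Using NF != 3 rather than NF < 3 also catches lines with extra fields.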


Q2: Print line number, filename, and number of fields.

awk '{print NR, FILENAME, NF, $0}' data.txt



Q3: How do you read environment variables in awk?

awk 'BEGIN {print "Home Dir:", ENVIRON["HOME"]}'



Q4: Read input records as paragraphs instead of lines.

awk 'BEGIN {RS=""} {print "Para:", NR; print $0}' para.txt



🧠 Hints and Memory Tricks

Mnemonic/Trick

Helps Remember

NR = Nth Record

Current line number

NF = Number Fields

Count of columns in line

FILENAME obvious

Current input file name

RS = Record Splitter

Changes how lines are broken (default: newline)

FS = Field Splitter

Changes how columns/fields are split (default: space/tab)

ENVIRON[] = env

Shell environment variables (like $HOME)

