Bash commands
The Unix programming paradigm is built around small programs that read and write plain text in simple, predictable ways. Chaining tools together with redirection and pipes makes complex workflows compact and expressive. This approach is powerful for quickly processing large text files, such as logs or text-based datasets, directly at the command line.
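For example, a single pipeline can count how often each error message appears in a log file (the file name output.log and the ERROR message format are hypothetical here; grep, sort, and uniq are standard tools):

# extract error messages, then count and rank them by frequency
grep -o 'ERROR: .*' output.log | sort | uniq -c | sort -rn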
Many excellent and free Unix and bash tutorials are available online, such as this intro or this interactive game. I encourage you to read people’s scripts and solutions on StackExchange; one can only learn a new language by reading good examples!
An incomplete list of commands, plus redirection
- Inspecting files: view the beginning or end of a file

  less file
  head -n 5 file
  tail -f output.log

- Regular expressions: search, filter, and edit text

  grep -E 'error|fail' file
  sed 's/:/ /g' file

- Column-based processing: extract columns, compute values, filter lines (a commented version of the second command follows this list)

  awk -F, '{print $1,$3,($1-$3)/$3*100}' file
  awk -v cut=100 'BEGIN{sum=0; n=0}{if ($1>cut){sum+=$1; n++}}END{printf "%i %.1f\n",n,sum/n}' file

- Redirection: chain smaller commands together, creating powerful tools on the fly

  echo "REMARK Manually created $(date)" > 1mbd_hsd.pdb
  grep -E "ATOM|CA" 1mbn.pdb | sed 's/HIS/HSD/g' >> 1mbd_hsd.pdb
  python run_simulation.py | tee output.log
  python run_simulation.py > output.log 2> error.log
  python run_simulation.py &> output_error.log
Example 1: Summarize many simulation log files
Suppose each log file contains a line like:

Final energy: -1342.883 kcal/mol
Extract the energies and sort the simulations by stability (most negative energy first):
grep "Final energy" run_*.log \
| sed 's/:/ /' \
| awk '{print $1, $3}' \
| sort -k2 -n
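Since awk tracks the current input file in its built-in FILENAME variable, the same result can be obtained without grep or sed (a sketch assuming the same log-line format as above):

# one awk command: print file name and energy, then sort by energy
awk '/Final energy/ {print FILENAME, $3}' run_*.log | sort -k2 -n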
Example 2: Simple data analysis on output files
Given a file energy.dat:
step energy
1 -502.1
2 -531.8
3 -530.4
4 -529.7
Count entries below a cutoff and compute the average:
awk -v cutoff=-531 'NR>1 && $2<cutoff {sum+=$2; count++} END {printf "N=%d mean=%.2f\n", count, sum/count}' energy.dat
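With the four data lines above and cutoff=-531, only -531.8 passes the filter, so this prints N=1 mean=-531.80. If no entry passes the cutoff, sum/count divides by zero; a guarded END block (a small variation on the command above) avoids that:

awk -v cutoff=-531 'NR>1 && $2<cutoff {sum+=$2; count++} END {if (count) printf "N=%d mean=%.2f\n", count, sum/count; else print "no entries below cutoff"}' energy.dat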
