UNIX & Linux Pipes & Utilities - tr, cut, sort, egrep

Creating UNIX & Linux Piped Commands

UNIX and Linux offer a very powerful feature known as pipes. With pipes it is often possible to carry out really complicated tasks without the need for shell scripting, variables, loops, etc.

Take the following example, which creates a dictionary file from a text document, but excludes one- and two-letter words:

tr -cs "A-Za-z" "[\012*]" < readme.txt | sort -uf | tr "[A-Z]" "[a-z]" | egrep -v "(^.$|^..$)" > dictionary

This single command line comprises five separate steps:

  • convert every run of non-letter characters in a file called readme.txt to a single newline, leaving one word per line
  • sort the resulting words, ignoring case and stripping out the duplicate lines
  • convert upper case letters to lower case letters
  • eliminate 1 and 2 character words
  • redirect the resulting output to a file called dictionary
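The steps above can be tried one at a time on a small piece of text. The following is only a sketch: the sample sentence and the function names (split_words, unique_words, lower_case, drop_short) are made up for this illustration, and it uses grep -E as a stand-in for egrep, since newer GNU grep releases print a warning when egrep is used.

```shell
#!/bin/bash

# Hypothetical sample text, used only for this illustration.
sample='The cat sat on the mat. A cat is an ANIMAL!'

# Step 1: turn every run of non-letters into a single newline (one word per line).
split_words() { tr -cs 'A-Za-z' '[\n*]'; }

# Step 2: sort ignoring case (-f) and keep one copy of each word (-u).
unique_words() { sort -uf; }

# Step 3: convert upper case letters to lower case.
lower_case() { tr 'A-Z' 'a-z'; }

# Step 4: drop lines that are exactly one or two characters long.
drop_short() { grep -Ev '^..?$'; }

# Step 5 would normally be "> dictionary"; here the result goes to the screen.
printf '%s\n' "$sample" | split_words | unique_words | lower_case | drop_short
```

Run against the sample sentence, this should print animal, cat, mat, sat and the, one per line. Removing stages from the right-hand end of the last line is a handy way to see what each step contributes.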

A pipe allows the output from one command to be fed to another command as its input.
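As a minimal sketch of that idea, the two approaches below produce the same sorted list. The fruit names and the temporary file name are invented for the example.

```shell
#!/bin/bash

# Without a pipe: save the output of the first command in a temporary file,
# then use that file as the input of the second command.
tmpfile="/tmp/fruit.$$"
printf 'pear\napple\nfig\n' > "$tmpfile"
sort < "$tmpfile"
rm -f "$tmpfile"

# With a pipe: the output of printf is fed straight into sort,
# so no temporary file is needed.
printf 'pear\napple\nfig\n' | sort
```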

There is no limit to how many commands can be daisy chained together in this fashion (although a maximum command line length will eventually kick in!).

For more information on the tr command take a look here.

For more information on the sort command take a look here.

For more information on the egrep command take a look here.

Of course a shell script could improve things in a couple of ways:

  • The complex command line can be hidden in a shell script with a simple one word script name typed to run it (although a shell alias could be used to do that too!)
  • A file name could be passed in to the shell script to create a dictionary file from any text file by specifying it on the command line when executing the script.

#!/bin/bash

# Shell Script - CreateDict
# Builds a dictionary file from the text file named in the first argument ($1).

tr -cs "A-Za-z" "[\012*]" < "$1" | sort -uf | tr "[A-Z]" "[a-z]" | egrep -v "(^.$|^..$)" > dictionary

The above script, called CreateDict, can be executed with a single argument being passed in (the pathname of a text file to be used as the source for a dictionary file). The reference to $1 in the shell script picks up the first argument passed in to the shell script. Assuming the script has been made executable and sits in a directory on the PATH, we would run it as follows:

$ CreateDict readme.txt
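To show a little more of how $1 behaves, here is a rough sketch that checks the argument before using it. The function name create_dict and the wording of the messages are invented for this example, and grep -E is used as a stand-in for egrep; it is one possible approach rather than a finished script.

```shell
#!/bin/bash

# Sketch only: create_dict is a hypothetical name, not part of the script above.
create_dict() {
    # $# holds the number of arguments; $1 is the first argument.
    if [ "$#" -ne 1 ]; then
        echo "usage: create_dict textfile" >&2
        return 1
    fi

    # Quoting "$1" keeps a file name containing spaces in one piece.
    if [ ! -r "$1" ]; then
        echo "create_dict: cannot read $1" >&2
        return 1
    fi

    # Same pipeline as before; the result is written to ./dictionary.
    tr -cs 'A-Za-z' '[\n*]' < "$1" | sort -uf | tr 'A-Z' 'a-z' | grep -Ev '^..?$' > dictionary
}
```

Once the function has been defined in the current shell, create_dict readme.txt behaves like the script, while create_dict on its own prints the usage message instead of waiting for input.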

Of course this script could be improved upon enormously, but that is a story for another blog!

Would you like to know more about the utilities used in these examples, or more about scripting? Email your questions to info@ptr.co.uk, or why not take a look at our UNIX training courses and Linux training courses.
