1.7: Viewing Files

Introduction

We know how to create directories and files (empty files, at least!) using the mkdir and touch commands respectively. In this chapter, let us look into the ways to view files. Though viewing a file seems like a simple task, as simple as double-clicking a file in Windows, Linux provides several commands to view files.

Since we have not yet discussed how to create our own files, we will use common text files available on the system, such as /etc/shells or /etc/passwd in the /etc directory. Another good candidate is ~/.bash_history, which may have around 500 lines.

Simplicity is one of the core philosophies of Unix-based systems. Instead of one complex command with many options, we have several simple commands, each catering to a specific way of viewing files.

Command List: View Files

The following commands can be used to view files in different ways such as

  1. view entire file on the screen
  2. view some records from the start / end of the file
  3. view one page at a time with options to scroll back and forth
  4. view octal, ascii, binary and hexadecimal representations
  5. fold longer records and view
  6. view live files being written by other processes without interfering with the writing
  7. format and view files with page number, header, date

…and more

Here is the list of commands we will be discussing in this chapter.

# Name Description
1 cat view file(s), displayed on the screen
2 tac view file(s) with records in reverse order; last record first (LIFO). May not be available in some OS versions
3 rev like cat, but the characters of each record are displayed in reverse order
4 nl like cat, adds line numbers as prefix
5 wc display number of lines, words and characters of the file(s)
6 head display first 10 lines of file(s) by default
7 tail display last 10 lines of file(s) by default
8 less display contents of a file, one page at a time. Use spacebar and b to page down and page up, and q to quit
9 more like the less command; older, with limited navigation features
10 od display files as ascii, octal or hexadecimal dump; useful in analysing binary files
11 fold wrap each record to fit a specific width
12 pr convert text file for printing with pagination and header

cat: view files

The cat command simply displays the contents of a file on the screen. It works well for small files but is not a good choice for large files, because the whole content scrolls past without pausing. If we provide several files, or a wildcard that matches several files, cat displays them one after another as a single stream, as if the files were concatenated. This is where the name comes from.

Here is an example

# /etc/shells: supported shells 
$ cat /etc/shells
# List of acceptable shells for chpass(1).
# Ftpd will not allow users to connect who are not using
# one of these shells.

/bin/bash
/bin/csh
/bin/ksh
/bin/sh
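
The concatenation behaviour can be tried with two small throwaway files. This is only a minimal sketch: the temporary directory and the file names part1.txt and part2.txt are made up for the illustration, and the > used to create the files is output redirection, which has not been covered yet.

```shell
# Work in a temporary directory (hypothetical file names).
demo_dir=$(mktemp -d)
printf 'first file\n'  > "$demo_dir/part1.txt"
printf 'second file\n' > "$demo_dir/part2.txt"

# The wildcard expands to both files; cat prints them back to back.
combined=$(cat "$demo_dir"/part*.txt)
printf '%s\n' "$combined"
# first file
# second file

rm -r "$demo_dir"
```

Nothing in the output marks where one file ends and the next begins, which is exactly the "single entity" effect.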

Commonly used options

Option Description
-n display line numbers
-b display line numbers; blank lines are not numbered
-e show end of the line as $

$ cat -n /etc/shells
     1	# List of acceptable shells for chpass(1).
     2	# Ftpd will not allow users to connect who are not using
     3	# one of these shells.
     4	
     5	/bin/bash
     6	/bin/csh
     7	/bin/ksh
     8	/bin/sh

# display '$' as end-of-line, number non-empty lines
$ cat -b -e /etc/shells
     1	# List of acceptable shells for chpass(1).$
     2	# Ftpd will not allow users to connect who are not using$
     3	# one of these shells.$
$
     4	/bin/bash$
     5	/bin/csh$
     6	/bin/ksh$
     7	/bin/sh$

Sometimes we cannot figure out what is wrong with the data because of non-printable characters (tab, newline, control characters, etc.) that are not visible on the screen. We can use the following options to make them visible; control characters are displayed with a ^ prefix and characters with the high bit set with an M- prefix.

Option Description
-v show non-printable characters except for TAB and newline
-T show tabs as ^I
-t same as -vT
-E show line endings as $
-e same as -vE
-A show all non-printable characters; same as -vET

The file non-printable-chars.txt has TAB, newline and other non-printable characters that are not visible by default.

$ cat non-printable-chars.txt
This is a standard text with alphabets and spaces
 is used to suspend a running process
The tab is used to display text in columnar format
For example
        1001    Smith   John    Male
        1002    Doe     Jane    Female

Non-printable characters

Let us use the -A option to show all non-printable characters at once. In the demo below:

  1. ^Z: shows the CTRL-Z that was originally invisible (-v)
  2. $: displayed in place of the newline (-E)
  3. ^I: displayed in place of the tab (-T)

$ cat -A non-printable-chars.txt
This is a standard text with alphabets and spaces$
^Z is used to suspend a running process$
The tab is used to display text in columnar format$
For example$
^I1001^ISmith^IJohn^IMale$
^I1002^IDoe^IJane^IFemale$
$
Non-printable characters $
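
If you do not have such a file at hand, a similar one can be built with printf. The sketch below is an assumption-laden illustration: the sample text is made up, and it uses the combined -vet options, which should behave like -A on GNU cat and are also accepted by BSD cat.

```shell
# Build one line with a TAB and one that starts with CTRL-Z (octal 032).
sample=$(mktemp)
printf 'id\tname\n\032 stray control character\n' > "$sample"

# -v shows control characters, -e adds $ at line ends, -t shows TAB as ^I.
visible=$(cat -vet "$sample")
printf '%s\n' "$visible"
# id^Iname$
# ^Z stray control character$

rm "$sample"
```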

tac: view files in the reverse order

The tac command displays the file in reverse order: last record first and first record last. The command name itself is cat spelled backwards. Yes, Unix has its own quirks and inside jokes!

Example

# run cat on the same file for comparison
$ tac /etc/shells
/bin/sh
/bin/ksh
/bin/csh
/bin/bash

# one of these shells.
# Ftpd will not allow users to connect who are not using
# List of acceptable shells for chpass(1).

rev: reverse lines

The rev command displays each record with its characters in reverse order. It is like the tac command, but instead of reversing the order of records, it reverses the characters within each record. In the upcoming chapters, we will discuss one of the powerful concepts in Unix-like systems, redirection and pipes, which can be used to combine tac and rev to reverse the entire file character by character.

$ rev /etc/shells
.)1(ssaphc rof sllehs elbatpecca fo tsiL #
gnisu ton era ohw tcennoc ot sresu wolla ton lliw dptF #
.sllehs eseht fo eno #

hsab/nib/
hsc/nib/
hsk/nib/
hs/nib/
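
As a small preview of combining tac and rev, the pipe symbol | feeds the output of one command into the next. This is a minimal sketch with made-up two-line input; pipes themselves are explained in a later chapter.

```shell
# Two short records on standard input, no file needed.
# tac swaps the record order, then rev reverses each record.
reversed=$(printf 'abc\ndef\n' | tac | rev)
printf '%s\n' "$reversed"
# fed
# cba
```

Each line still ends with its newline, so the result is the text reversed character by character rather than a strict byte-for-byte mirror of the file.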

nl: number the lines from files

The cat command has the -n and -b options to add line numbers to each record. The nl command provides more control, such as overriding the starting number and the increment, both of which default to 1. In addition, we can control the width and justification of the line numbers. Note that nl skips blank lines by default, like cat -b.

Option Description
-v N override starting number; defaults 1
-i N override increment, defaults 1
-w N width of the line numbers with leading spaces
-n OPT formatting options; ln: left-justified, rn: right-justified, rz: right-justified with leading zeroes

Here is a demo of all options listed above.

$ nl -v 10 -i 5 -w 7 -n rz  /etc/shells
0000010 # List of acceptable shells for chpass(1).
0000015 # Ftpd will not allow users to connect who are not using
0000020 # one of these shells.

0000025 /bin/bash
0000030 /bin/csh
0000035 /bin/ksh
0000040 /bin/sh
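
The same options can be tried without any file by piping a few made-up lines into nl. This is just a sketch; the three input words are placeholders.

```shell
# Start at 100, step by 10, zero-padded to 4 digits.
# nl separates the number from the text with a TAB by default.
numbered=$(printf 'alpha\nbeta\ngamma\n' | nl -v 100 -i 10 -w 4 -n rz)
printf '%s\n' "$numbered"
# 0100	alpha
# 0110	beta
# 0120	gamma
```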

wc: display line, word and character counts

The wc command displays the number of records, words and characters in a given file. In other words, it counts the newlines (\n), the whitespace-separated words and the characters, and displays them along with the file name provided as argument. This is a great way to get an idea of a file's size without even opening it. By default, wc displays all three statistics.

NOTE: If you count the characters manually, you may get a smaller number because wc includes the non-printable newline \n in the character count.

Option Description
-l display line count ONLY
-w display words count ONLY
-c display byte count ONLY; same as the character count for plain ASCII text (-m counts characters)

# display lines, words and chars count of /etc/shells
$ wc /etc/shells
  8  28 160 /etc/shells

# `wc FILE` is same as 'wc -lwc FILE' or 'wc -l -w -c FILE'
$ wc -lwc /etc/shells
  8  28 160 /etc/shells

# display lines count
$ wc -l /etc/shells
8 /etc/shells

# display chars count
$ wc -c /etc/shells
160 /etc/shells

# display words and chars count
$ wc -wc /etc/shells
 28 160 /etc/shells
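
The newline that wc includes in its character count is easy to check on tiny, made-up input. In this sketch the text is piped into wc instead of being read from a file, so no file name is printed.

```shell
# 'hello' has 5 visible characters, but the line also ends with a newline.
bytes=$(printf 'hello\n' | wc -c)
lines=$(printf 'one two\nthree\n' | wc -l)
words=$(printf 'one two\nthree\n' | wc -w)

# $(( ... )) strips any padding some wc versions put around the number.
echo "bytes=$((bytes)) lines=$((lines)) words=$((words))"
# bytes=6 lines=2 words=3
```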

head: display part of the records at the TOP

The head command displays the first 10 records from the given file. This is really helpful in viewing a sample of really big files.

Option Description
-c N display first N bytes
-N display first N lines, for example -4
-n N same as -N

For this demo, I have a names.txt file with 13 records, each holding a line number and a name.

$ head names.txt
  1. alan
  2. beth
  3. bob
...
...
  9. eion
 10. fred

Limit or extend the number of lines to be displayed

# override default MAX=10
$ head -n 3 names.txt
  1. alan
  2. beth
  3. bob
$ head -4 names.txt
  1. alan
  2. beth
  3. bob
  4. cathy

We can also limit the output by characters (bytes) instead of lines. This option is often used with the /dev/urandom device file, which generates an endless stream of random bytes.

# the prompt '$' is displayed after '3.'
# since the -c option truncates at mid-line
$ head -c25 names.txt
  1. alan
  2. beth
  3. $
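
Here is a small sketch of the /dev/urandom use case. It assumes a system that provides /dev/urandom, and it borrows od (covered later in this chapter) because random bytes are rarely printable.

```shell
# Take exactly 16 random bytes; head stops reading after that,
# so the endless device is not a problem.
count=$(head -c 16 /dev/urandom | wc -c)
echo "read $((count)) bytes"
# read 16 bytes

# View another 16 random bytes as a hex dump; the values differ on every run.
head -c 16 /dev/urandom | od -x
```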

tail: display part of the records from the BOTTOM

The tail command works much like head, only from the end of the file. It also has a couple of additional features.

We can watch a live file that is being updated by another process without interfering with it. This is very helpful for watching logs and output files of long-running processes, so that we can act on the outcome as it appears rather than waiting for the process to finish. The file is monitored until we terminate the watch using CTRL-C.

Another feature is to skip the first N-1 records and start at record N, which is useful for skipping header records while processing a delimited file.

We can use all the options mentioned for head; here are some additional options.

Option Description
-n +N display from the Nth line onwards
-f follow a live file; fails if the file does not exist
-F same as -f, but waits for the file to appear
--pid=PID used along with -f; terminates after the process PID exits

Display from the 10th record onwards

$ tail -n +10 names.txt
 10. fred
 11. freya
 12. felicity
 13. gaea
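
The header-skipping use of -n +N can be sketched with a small delimited file. The file contents below are hypothetical and created only for this illustration.

```shell
# Hypothetical comma-delimited file: one header line, two data records.
data=$(mktemp)
printf 'id,last,first\n1001,Smith,John\n1002,Doe,Jane\n' > "$data"

# Start the display at line 2, which skips the header on line 1.
records=$(tail -n +2 "$data")
printf '%s\n' "$records"
# 1001,Smith,John
# 1002,Doe,Jane

rm "$data"
```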

Live monitoring of file using -F or -f

$ ls
tr.txt   names.txt

# another process is periodically writing to 'tail.log'
# we can open this file and add text or write script to
# simulate the live file scenario

# tail -f FILE: fails if file not found
$ tail -f tail.log
tail: cannot open 'tail.log' for reading: No such file or directory
tail: no files remaining

# tail -F FILE: waits until the file is created
#   if the file being followed is renamed and a new one is created,
#   tail -F follows the new file
$ tail -F tail.log
tail: cannot open 'tail.log' for reading: No such file or directory
tail: 'tail.log' has appeared;  following new file
1
2
...
^C
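
One possible way to simulate the live file scenario is a short background loop that stands in for the other process. This is only a sketch: the loop, the temporary log file and the one-second delay are made up, and it assumes GNU tail, since --pid may not exist in other versions.

```shell
log=$(mktemp)

# Stand-in for "another process": append one line per second, in the background.
( for i in 1 2 3; do echo "line $i" >> "$log"; sleep 1; done ) &
writer=$!

# Follow the file; --pid makes tail stop on its own once the writer exits,
# so no CTRL-C is needed.
tail -f --pid="$writer" "$log"

# By the time tail returns, the writer has finished all three lines.
written=$(wc -l < "$log")
rm "$log"
```

The lines appear on the screen roughly one per second, and the prompt comes back shortly after the background loop ends.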

less: display one page at a time

The less command can be used to view files one screen at a time. It provides sub-commands to scroll up and down, page by page or line by line, and to quit when we no longer want to continue. Such a command is called a pager; less is also used by the man command and by other manual or help commands such as perldoc and pydoc.

An interactive pager is hard to demonstrate in print. You are welcome to try the following sub-commands with the man, more or less commands.

Sub-commands

  • f : go to next page
  • spacebar : same as f, go to next page
  • b : go to previous page
  • ENTER : scroll one line down
  • DOWN arrow : scroll one line down
  • UP arrow : scroll one line up
  • h : list sub-commands and get help
  • q : quit less command
  • /SRCH : search forward text or pattern
  • ?SRCH : search backward text or pattern
  • n : find next search with /SRCH, previous search with ?SRCH
  • N : find previous search with /SRCH, next search with ?SRCH
  • &SRCH : display only the lines that have SRCH text or pattern

more: display one page at a time

more is the older predecessor of the less command. We can use the same sub-commands such as f, b and q to move forward, move backward and quit. This command is listed for the sake of completeness; use the less command instead.

One notable difference between less and more is that more automatically exits when we reach the end of the file, whereas less needs an explicit q sub-command to quit browsing the file.

od: dump file in octal, hex and ascii formats

The od command displays the contents of a file, as individual characters or bytes, in octal, hexadecimal, ascii and other formats. By default, it takes 2 bytes at a time and displays the result in octal format.

The od command comes in handy for understanding dump files or data files that contain garbled characters that cannot be explained by viewing them with the naked eye.

For example, the word Hello_World will be displayed as 062510 066154 057557 067527 066162 000144. If we convert the octal numbers to hexadecimal, we get 6548 6C6C 5F6F 6F57 6C72 64. The ASCII characters for these hex values are eH ll _o oW lr d. If you swap the two bytes in each pair, you get He ll o_ Wo rl d => Hello_World. If you want to know why we need to swap, look up little-endian byte order. Or better, forget you read this and use one of the options to get an octal, hex or ascii dump byte by byte.
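
The default dump of Hello_World can be reproduced without creating a file. This sketch assumes a little-endian machine (most PCs and laptops); on a big-endian machine the bytes in each pair would not appear swapped. The exact spacing between columns may also vary between od versions.

```shell
# printf writes the 11 bytes without a trailing newline.
words=$(printf 'Hello_World' | od | head -n 1)
echo "$words"
# 0000000 062510 066154 057557 067527 066162 000144

# One byte per column: no swapping to do.
printf 'Hello_World' | od -c
```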

The first word in each line of the display is the offset, which starts at zero and is shown in octal format by default. We can use the following options to control the dump and offset formats.

Option Description
-a select named characters - newline and tab displayed as nl and ht
-c display printable chars and escape sequences with leading \
-b display each byte in octal format
-x display 2-byte units in hexadecimal format
-Ad display offset in decimal format, default is octal
-Ax display offset in hexadecimal format, default is octal
-j N skip N bytes from the start
-N N limit to N bytes of display

Sample File

# we have a sample file with one record that has some
# ascii chars, tab and newline.
$ cat sample.txt
Hello   World

# Let us check the tab and newline first; ^I: TAB, $: newline
$ cat -A sample.txt
Hello^IWorld$

Octal Dump

# octal format: ASCII 0o110 is H, ...
$ od -b sample.txt
0000000 110 145 154 154 157 011 127 157 162 154 144 012
0000014

# hexadecimal format: little endian
$ od -x sample.txt
0000000 6548 6c6c 096f 6f57 6c72 0a64
0000014

# ascii with named characters: ht = TAB, nl = newline
$ od -a sample.txt
0000000   H   e   l   l   o  ht   W   o   r   l   d  nl
0000014

# ascii with escape sequences: \t, \n
$ od -c sample.txt
0000000   H   e   l   l   o  \t   W   o   r   l   d  \n
0000014

Offset display format using -A: octal by default

# '-Ax' : offset in hexadecimal format - 00 0c
$ od -c -Ax sample.txt                                           
000000   H   e   l   l   o  \t   W   o   r   l   d  \n
00000c

# '-Ad' : offset in decimal format - 0 12
$ od -c -Ad sample.txt                                           
0000000   H   e   l   l   o  \t   W   o   r   l   d  \n
0000012

Skip (-j N) and limit (-N N) the dump

# limit to 5 bytes
$ od -N5 -a sample.txt
0000000   H   e   l   l   o
0000005

# skip 6 bytes
$ od -j6 -a sample.txt
0000006   W   o   r   l   d  nl
0000014

# skip 6 bytes, limit 5 bytes
$ od -j6 -N5 -a sample.txt
0000006   W   o   r   l   d
0000013

fold: wrap records to fit specific width

The fold command limits the maximum number of characters displayed per line; if a record has more characters, the rest is wrapped to the next line. Using this command, we can ensure that the display fits within the viewable area of the screen and there is no need to scroll left and right.

By default, fold limits each line to 80 characters. There are historical reasons for this: the terminals of the old days had a 24x80 display, 24 rows and 80 columns. We can override this.

Option Description
-w N limit width of record to N columns, default is 80
-s break at space if possible, not in the middle of the text

For this demo, I have created a file with one long record of 130 bytes: 10 words of 12 bytes each, delimited by spaces, plus the newline.

# single record - numbered
$ nl long_record.txt
1  A00000000001 A00000000002 A00000000003 A00000000004 A00000000005 A00000000006 A00000000007 A00000000008 A00000000009 A00000000010

# two recs split at 80 bytes, cut in the middle of A00000000007
$ fold  long_record.txt
A00000000001 A00000000002 A00000000003 A00000000004 A00000000005 A00000000006 A0
0000000007 A00000000008 A00000000009 A00000000010

# 'fold -s': do not cut in the middle, use available spaces for split
$ fold -s long_record.txt
A00000000001 A00000000002 A00000000003 A00000000004 A00000000005 A00000000006
A00000000007 A00000000008 A00000000009 A00000000010 

# 'fold -w 26': override default record width 80
$ fold -w26 long_record.txt
A00000000001 A00000000002
A00000000003 A00000000004
A00000000005 A00000000006
A00000000007 A00000000008
A00000000009 A00000000010 

pr: convert text files for printing

The pr command is used to convert files for printing. We can limit the number of lines per page and the page width. By default, the command adds a timestamp, the file name and the page number as header. Using options, we can override the date format and print a title instead of the file name.

The print-format output created by pr contains

  • a 5-line header with the header text
  • records formatted based on page width and lines per page
  • a 5-line footer with blank lines

Option Description
+M start with page M
+M:N print from page M to N
-l N limit N lines per page
-w N limit page width by N characters
-h TEXT override file name with TEXT in header
-D FMT override date format in header
-t omit header and footer lines; overrides -h

Sample File

# 'pr -t names.txt': prints same output; 
$ cat names.txt
  1. alan
  2. beth
  3. bob
...
...
 11. freya
 12. felicity
 13. gaea

Create print format file

# pr:
#   -l 15 : 15 lines per page (5 header + 5 records + 5 footer)
#   -w 30 : 30 character page width
#   Date format: '%d %b %Y'; example 10 Jul 2021
#   Title: -h "Sample Names"
$ pr -l 15 -h "Sample Names" -D "%d %b %Y" -w 30 names.txt


13 Jul 2021 Sample Names Page 1


  1. alan
  2. beth
  3. bob
  4. cathy
  5. clive
...
...
13 Jul 2021 Sample Names Page 3


 11. freya
 12. felicity
 13. gaea
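
The page arithmetic can be checked by counting output lines. The sketch below assumes GNU pr and uses a made-up 13-record file (name01 to name13) rather than the names.txt file above.

```shell
# 13 one-line records, named name01 .. name13 (hypothetical data).
names=$(mktemp)
printf 'name%02d\n' 1 2 3 4 5 6 7 8 9 10 11 12 13 > "$names"

# Each 15-line page holds 5 header + 5 record + 5 footer lines,
# so 13 records need 3 pages, and pr pads the last page to full length.
total=$(pr -l 15 -h "Sample Names" "$names" | wc -l)
echo "output lines: $((total))"
# output lines: 45

# With -t the header and footer disappear and only the records remain.
body=$(pr -t -l 15 "$names" | wc -l)
echo "records only: $((body))"
# records only: 13

rm "$names"
```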

Summary

Though viewing files seems like a simple task, since we are used to clicking and opening files in the Windows environment, we have discussed several commands here that are tailor-made to view files in different ways, such as viewing a sample, reversed lines or reversed characters. We will look into these commands again, as they are used in conjunction with other commands to perform tasks such as searching, filtering, replacing and reformatting text data, and other file manipulation activities such as sort, split, cut and join.