Iterating over a List in Python

Today we’ll learn how to iterate over a list in Python, using a word-counting program as the example. You can download Python for Mac, Linux, and Windows from the official Python website.

Your program will end up with a list of file handles that will need to be processed. Just as a for loop can iterate through the characters of a string, here we can use a for loop over the args.file inputs, which will be open file handles:

for fh in args.file:
    pass  # read each file here

You can give whatever name you like to the variable you use in your for loop, but I think it’s very important to give it a semantically meaningful name. Here the variable name fh reminds me that this is an open file handle. You saw in chapter 5 how to manually open() and read() a file. Here fh is already open, so we can use it directly to read the contents.
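The text does not show where args.file comes from, so here is a minimal sketch of one way to get a list of open file handles. It assumes the program uses argparse with argparse.FileType; the get_args() helper name and the fallback to standard input are illustrative choices, not something the text confirms.

```python
import argparse
import sys


def get_args(argv=None):
    """Parse arguments (hypothetical helper, assumed for illustration)."""
    parser = argparse.ArgumentParser(description='Emulate wc')
    parser.add_argument('file',
                        metavar='FILE',
                        nargs='*',                     # zero or more files
                        type=argparse.FileType('rt'),  # argparse opens each one
                        default=[sys.stdin],           # assumed fallback
                        help='Input file(s)')
    return parser.parse_args(argv)
```

With this in place, get_args(['fox.txt']).file would be a list containing one open handle whose name attribute is 'fox.txt'.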

There are many ways to read a file. The fh.read() method will give you the entire contents of the file in one go. If the file is large enough to exceed the available memory on your machine, your program will crash. I would recommend, instead, that you use another for loop on the fh. Python will understand this to mean that you wish to read each line of the file handle, one at a time:
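To see the difference between the two approaches without touching the disk, here is a small sketch that uses io.StringIO as a stand-in for an open file handle. The sample text is made up for the demonstration.

```python
import io

text = 'first line\nsecond line\nthird line\n'

# read() returns the whole contents as a single string.
whole = io.StringIO(text).read()

# Iterating over the handle yields one line at a time,
# and each line keeps its trailing newline.
lines = [line for line in io.StringIO(text)]
```

Only the second form keeps memory use proportional to the longest line rather than to the whole file.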

for fh in args.file: # ONE LOOP!
    for line in fh:  # TWO LOOPS!
        pass  # process the line here

That’s two levels of for loops, one for each file handle and then another for each line in each file handle. ONE LOOP! TWO LOOPS! I LOVE TO COUNT!

What you’re counting

The output for each file will be the number of lines, words, and bytes (like characters and whitespace), each of which is printed in a field eight characters wide, followed by a space and then the name of the file, which will be available to you via fh.name.
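One way to get fields eight characters wide is an f-string with a width in the format specifier. This is only a sketch; format_counts() is a hypothetical helper name, and in the real program the last argument would be fh.name.

```python
def format_counts(num_lines, num_words, num_bytes, name):
    """Right-align each count in a field 8 wide, then a space and the name."""
    return f'{num_lines:8}{num_words:8}{num_bytes:8} {name}'


row = format_counts(1, 9, 45, 'fox.txt')
```

Integers are right-aligned by default, so the columns line up the same way they do in the output of wc.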

Let’s take a look at the output from the standard wc program on my system. Notice that when it’s run with just one argument, it produces counts only for that file:

$ wc fox.txt
       1       9      45 fox.txt

The fox.txt file is short enough that you could manually verify that it does in fact contain 1 line, 9 words, and 45 bytes, which includes all the characters, spaces, and the trailing newline (see figure 1.1).

Figure 1.1 The fox.txt file contains 1 line of text, 9 words, and a total of 45 bytes.

When run with multiple files, the standard wc program also shows a “total” line:

$ wc fox.txt sonnet-29.txt
       1       9      45 fox.txt
      17     118     669 sonnet-29.txt
      18     127     714 total
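The “total” row is the column-wise sum of the rows above it, and it only appears when there is more than one file. Here is a rough sketch of that bookkeeping, using the numbers from the example output. The summarize() name and the tuple layout are assumptions made for the illustration.

```python
def summarize(counts):
    """Build output rows from (lines, words, bytes, name) tuples."""
    rows = []
    total_lines, total_words, total_bytes = 0, 0, 0
    for num_lines, num_words, num_bytes, name in counts:
        total_lines += num_lines
        total_words += num_words
        total_bytes += num_bytes
        rows.append(f'{num_lines:8}{num_words:8}{num_bytes:8} {name}')
    # Like wc, only show the total when more than one file was given.
    if len(counts) > 1:
        rows.append(f'{total_lines:8}{total_words:8}{total_bytes:8} total')
    return rows


rows = summarize([(1, 9, 45, 'fox.txt'), (17, 118, 669, 'sonnet-29.txt')])
```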

We are going to emulate the behavior of this program. For each file, you will need to create variables to hold the numbers of lines, words, and bytes. For instance, if you use the for line in fh loop that I suggest, you will need to have a variable like num_lines to increment on each iteration.

That is, somewhere in your code you will need to set a variable to 0 and then, inside the for loop, make it go up by 1. The idiom in Python is to use the += operator to add some value on the right side to the variable on the left side (as shown in figure 1.2):

num_lines = 0
for line in fh:
    num_lines += 1

Figure 1.2 The += operator will add the value on the right to the variable on the left.
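Because the counts are per file, the counter has to be reset inside the outer loop, before the inner loop starts. A small sketch, where io.StringIO objects stand in for the open handles in args.file and the sample text is invented:

```python
import io

# Stand-ins for the open file handles in args.file (assumed for the demo).
handles = [io.StringIO('one\ntwo\nthree\n'), io.StringIO('four\n')]

line_counts = []
for fh in handles:
    num_lines = 0          # reset for every file
    for line in fh:
        num_lines += 1     # same as num_lines = num_lines + 1
    line_counts.append(num_lines)
```

If num_lines = 0 were moved above the outer loop, the second file would report a running total instead of its own count.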

You will also need to count the number of words and bytes, so you’ll need similar num_words and num_bytes variables.

To get the words, we’ll use the str.split() method to break each line on whitespace. You can then use the length of the resulting list as the number of words. For the number of bytes, you can use the len() (length) function on the line and add that to a num_bytes variable. Note that len() counts characters, which equals the number of bytes only when the text is plain ASCII.
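Putting the three counters together for one file handle might look like the sketch below. The count() function name is hypothetical, and the sample sentence is only an assumed stand-in for fox.txt that happens to give the same 1, 9, and 45.

```python
import io


def count(fh):
    """Return (lines, words, bytes) for an open text file handle."""
    num_lines, num_words, num_bytes = 0, 0, 0
    for line in fh:
        num_lines += 1
        num_words += len(line.split())  # split() breaks on any whitespace
        num_bytes += len(line)          # characters; equals bytes for ASCII
    return num_lines, num_words, num_bytes


# Assumed stand-in for the contents of fox.txt.
fox = io.StringIO('The quick brown fox jumps over the lazy dog.\n')
fox_counts = count(fox)

# For non-ASCII text, characters and UTF-8 bytes are not the same number.
word = 'caf\u00e9'
num_chars = len(word)
num_utf8_bytes = len(word.encode('utf-8'))
```

If you need true byte counts for non-ASCII input, one option is to encode each line before taking its length, as the last line does.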

NOTE

Splitting the text on spaces doesn’t actually produce “words” because it won’t separate the punctuation, like commas and periods, from the letters, but it’s close enough for this program.
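You can see what the note means by splitting a short made-up sentence. The cleanup step at the end is a hypothetical extra, not something this program needs.

```python
import string

text = 'Hello, world. Bye.'

# Punctuation stays attached to the neighboring letters.
tokens = text.split()

# split() with no argument collapses runs of whitespace,
# while split(' ') produces empty strings between repeated spaces.
collapsed = 'a  b'.split()
literal = 'a  b'.split(' ')

# Hypothetical cleanup: strip punctuation from both ends of each token.
cleaned = [token.strip(string.punctuation) for token in tokens]
```

Either way the number of tokens is the same here, which is why plain split() is close enough for counting.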
