6. String Traversal

A lot of computations involve processing a collection one item at a time. For strings this means that we would like to process one character at a time. Often we start at the beginning, select each character in turn, do something to it, and continue until the end. This pattern of processing is called a traversal. In this lesson we’ll look at a few different ways to traverse a string: two using the for loop and one using the while loop.

The for Loop: By Item

We have previously seen that the for statement can iterate over the items of a sequence (a list of names in the case below).
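As a minimal sketch of that kind of loop, assuming an illustrative list of names (the names and the greeting text are placeholders, not taken from the original example):

```python
# Hypothetical list of names, chosen only for illustration.
names = ["Ana", "Ben", "Cho"]

greetings = []
for name in names:
    # name takes on each value in the list, one per pass through the body.
    greeting = "Hello, " + name + "!"
    greetings.append(greeting)
    print(greeting)
```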

Recall that the loop variable takes on each value in the sequence of names. The body is executed once for each name. The same was true for the sequence of integers created by the range function.

Since a string is simply a sequence of characters, the for loop iterates over each character automatically.
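A short sketch of iterating over a string directly, using the string “Go Spot Go” and the loop variable char. The names phrase and visited are assumptions added here so the result can be inspected after the loop:

```python
phrase = "Go Spot Go"

visited = []  # illustrative record of every character the loop sees
for char in phrase:
    # char is reassigned to the next character on each pass,
    # including the two blanks.
    visited.append(char)
    print(char)
```

After the loop, visited holds ten items, one per character of the string.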

The loop variable char is automatically reassigned to each character in the string “Go Spot Go” in turn. We will refer to this type of sequence iteration as iteration by item. Note that iterating this way processes the characters only one at a time, from left to right.

The for Loop: By Index

It is also possible to use the range function to systematically generate the indices (plural of index) of the characters. The for loop can then be used to iterate over these positions. These positions can be used together with the indexing operator to access the individual characters in the string.

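Consider a for loop that uses range(5) to step a variable named index through the positions of the string “apple”, printing the character at each position.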

The index positions in “apple” are 0, 1, 2, 3, and 4. This is exactly the sequence of integers produced by range(5). The first time through the for loop, index will be 0 and the “a” will be printed. Then index will be reassigned to 1 and “p” will be displayed. This repeats for all the range values up to but not including 5. Since “e” has index 4, the loop shows every character exactly once.
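The walk-through above can be sketched as follows. The variable name fruit is an assumption borrowed from the while loop discussion later in this lesson, and the shown list is added only so the result can be checked:

```python
fruit = "apple"  # assumed variable name for the string being traversed

shown = []
for index in range(5):
    # index takes the values 0, 1, 2, 3, 4; fruit[index] is the
    # character stored at that position.
    shown.append(fruit[index])
    print(fruit[index])
```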

In order to make the iteration more general, we can use the len function to provide the bound for range. This is a very common pattern for traversing any sequence by position. Make sure you understand why the range function behaves correctly when using len of the string as its parameter value.
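One way to see why len gives the right bound is to wrap the pattern in a small function and try it on strings of different lengths. The function name characters_by_index is hypothetical and exists only for this sketch:

```python
def characters_by_index(text):
    """Collect the characters of text by visiting each index position."""
    result = []
    # range(len(text)) yields 0 up to len(text) - 1, which are exactly
    # the valid index positions, so text[index] never goes out of bounds.
    for index in range(len(text)):
        result.append(text[index])
    return result


print(characters_by_index("apple"))
print(characters_by_index(""))  # an empty string gives an empty range
```

Because the bound comes from the string itself, the same loop works unchanged for a string of any length, including the empty string.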

You may also note that iteration by position allows the programmer to control the direction of the traversal by changing the sequence of index values. Recall that we can create ranges that count down as well as up, so a loop over a descending range of index values will print the characters from right to left.

Trace the values of index and satisfy yourself that they are correct. In particular, note the start and end of the range.
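A minimal sketch of a right-to-left traversal, assuming the string “apple” stored in a variable named fruit (the original listing may differ in detail). The range starts at the last valid index and stops before -1, stepping by -1:

```python
fruit = "apple"  # assumed example string

reversed_chars = []
# Start at len(fruit) - 1 (the last index), stop before -1 so that
# index 0 is included, and count down by 1 each time.
for index in range(len(fruit) - 1, -1, -1):
    reversed_chars.append(fruit[index])
    print(fruit[index])
```

Here index takes the values 4, 3, 2, 1, 0, so the characters come out as e, l, p, p, a.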

The while Loop: By Index

The while loop can also control the generation of the index values. Remember that the programmer is responsible for setting up the initial condition, making sure that the condition is correct, and making sure that something changes inside the body to guarantee that the condition will eventually fail and we avoid an infinite loop.

The loop condition is position < len(fruit), so when position is equal to the length of the string, the condition is false, and the body of the loop is not executed. The last character accessed is the one with the index len(fruit)-1, which is the last character in the string.
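The loop described above can be sketched like this. The names fruit and position come from the text; the string value “apple” and the seen list are assumptions added for illustration:

```python
fruit = "apple"  # assumed example string

position = 0     # initial condition: start at the first index
seen = []
while position < len(fruit):
    seen.append(fruit[position])
    print(fruit[position])
    # Advancing position guarantees the condition eventually
    # becomes false, so the loop cannot run forever.
    position = position + 1
```

When the loop ends, position equals len(fruit), and the last character accessed was fruit[len(fruit) - 1].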

Check your understanding

    How many times is the word HELLO printed by the following statements?

    s = "python rocks"
    for ch in s:
        print("HELLO")
    
  • 10
  • Iteration by item will process once for each item in the sequence.
  • 11
  • The blank is part of the sequence.
  • 12
  • Yes, there are 12 characters, including the blank.
  • Error, the for statement needs to use the range function.
  • The for statement can iterate over a sequence item by item.

    How many times is the word HELLO printed by the following statements?

    s = "python rocks"
    for ch in s[3:8]:
        print("HELLO")
    
  • 4
  • Slice returns a sequence that can be iterated over.
  • 5
  • Yes, the blank is part of the sequence returned by the slice.
  • 6
  • Check the result of s[3:8]. It does not include the item at index 8.
  • Error, the for statement cannot use slice.
  • Slice returns a sequence.

    How many times is the letter “o” printed by the following statements?

    s = "python rocks"
    for index in range(len(s)):
        if index % 2 == 0:
            print(s[index])
    
  • 0
  • The for loop visits each index but the selection only prints some of them.
  • 1
  • o is at positions 4 and 8, and both are even index positions.
  • 2
  • Yes, it will print all the characters in even index positions and the o character appears both times in an even location.
  • Error, the for statement cannot have an if inside.
  • The for statement can have any statements inside, including if as well as for.

    How many times is the letter “o” printed by the following statements?

    s = "python rocks"
    index = 1
    while index < len(s):
        print(s[index])
        index = index + 2
    
  • 0
  • Yes, index goes through the odd numbers starting at 1. o is at positions 4 and 8.
  • 1
  • o is at positions 4 and 8. index starts at 1, not 0.
  • 2
  • There are 2 o characters but index does not take on the correct index values.