2. Operations on Strings
In general, you cannot perform mathematical operations on strings, even if the strings look like numbers. The following are illegal (assuming that message has type str):
message - 1
"Hello" / 123
message * "Hello"
"15" + 2
Interestingly, the + operator does work with strings, but for strings, the + operator represents concatenation, not addition. Concatenation means joining the two operands by linking them end-to-end. For example:
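A minimal sketch of such a program follows. The name baked_good comes from the text; the names fruit and result, and the exact string values, are assumptions chosen to match the output described next.

```python
fruit = "banana"            # assumed variable name and value
baked_good = " nut bread"   # note the leading space inside the string

# + joins the two strings end-to-end; nothing is inserted between them
result = fruit + baked_good
print(result)
```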
The output of this program is banana nut bread. The space before the word “nut” is part of the string and is necessary to produce the space between the concatenated strings. Take out the space at the beginning of baked_good and run it again. You’ll see that the resulting output is banananut bread.
The * operator also works on strings. It performs repetition. For example, 'Fun'*3 is 'FunFunFun'. One of the operands has to be a string and the other has to be an integer.
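Repetition can be tried out with a short sketch like the one below. The variable names word and repeated are assumptions made for illustration.

```python
word = "Fun"

# The string is copied three times and the copies are joined together
repeated = word * 3
print(repeated)    # FunFunFun

# The integer may also be written first; the result is the same
print(3 * word)    # FunFunFun
```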
This interpretation of + and * makes sense by analogy with addition and multiplication. Just as 4*3 is equivalent to 4+4+4, we expect "Go"*3 to be the same as "Go"+"Go"+"Go", and it is. Note also that the order of operations for * and + is the same as it is for arithmetic: repetition is done before concatenation. If you want the concatenation to be done first, you will need to use parentheses.
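The precedence rule can be checked with a sketch like this one. The string "Packers" and the variable names are assumptions chosen only for illustration.

```python
name = "Packers"   # hypothetical value

# Repetition binds more tightly, so "Go" is repeated before it is appended
without_parens = name + "Go" * 3
print(without_parens)    # PackersGoGoGo

# Parentheses force the concatenation to happen first
with_parens = (name + "Go") * 3
print(with_parens)       # PackersGoPackersGoPackersGo
```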
The comparison operators also work on strings. To see if two strings are equal, you simply write a boolean expression using the equality operator (==).
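As a small sketch of an equality test, assuming a variable named word and a pair of made-up messages:

```python
word = "banana"

# == is True only when both strings contain exactly the same characters
is_banana = (word == "banana")

if is_banana:
    print("Yes, we have bananas!")
else:
    print("Yes, we have NO bananas!")
```

Changing the value of word to any other string, including "Banana" with a capital letter, sends the program down the else branch.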
Other comparison operations are useful for putting words in lexicographical order. This is similar to the alphabetical order you would use with a dictionary, except that all the uppercase letters come before all the lowercase letters.
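The sketch below illustrates this ordering with a made-up list of words. The built-in sorted function uses the same comparison as the < operator, so the capitalized word ends up first.

```python
words = ["banana", "apple", "Zebra", "cherry"]   # hypothetical sample data

# sorted() arranges the strings using the same rule as the < operator
ordered = sorted(words)
print(ordered)              # ['Zebra', 'apple', 'banana', 'cherry']

print("apple" < "banana")   # True
print("Zebra" < "apple")    # True: uppercase letters come first
```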
It is probably clear to you that the word “apple” would be less than (come before) the word “banana”. After all, “a” is before “b” in the alphabet. But what if we consider the words apple and Apple? Are they the same?
It turns out, as you recall from our discussion of variable names, that uppercase and lowercase letters are considered to be different from one another. The way the computer knows they are different is that each character is assigned a unique integer value. “A” is 65, “B” is 66, and “5” is 53. You can find out the so-called ordinal value for a given character by using the built-in function ord.
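A short sketch of ord in action, using the characters mentioned above plus a lowercase “a” for contrast (the variable names are assumptions):

```python
upper_a = ord("A")
lower_a = ord("a")

print(upper_a)     # 65
print(ord("B"))    # 66
print(ord("5"))    # 53
print(lower_a)     # 97

# Because 97 is greater than 65, "apple" is greater than "Apple"
print("apple" > "Apple")   # True
```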
When you compare two strings, Python works from left to right, converting each pair of corresponding characters into their ordinal values and comparing the integers. The first pair that differs decides the result. Since “a” (97) is greater than “A” (65), “apple” is greater than “Apple”.
Humans commonly ignore capitalization when comparing two words. However, computers do not. A common way to address this issue is to convert strings to a standard format, such as all lowercase, before performing the comparison.
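One way to do this is with the string method lower, as in the sketch below. The two sample strings and the variable names are assumptions made for illustration.

```python
first = "Apple"
second = "apple"

# A direct comparison treats the two capitalizations as different strings
exact_match = (first == second)
print(exact_match)       # False

# Converting both sides to lowercase makes the comparison ignore case
same_ignoring_case = (first.lower() == second.lower())
print(same_ignoring_case)    # True
```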
There is also a function called chr that does the reverse: it converts an integer into its character equivalent.
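The following sketch shows chr undoing what ord does. The particular integers and the variable names are chosen only for illustration.

```python
letter = chr(65)
print(letter)      # A
print(chr(66))     # B
print(chr(53))     # 5

# chr(32) is the space character, so it shows up only as a gap in the output
space = chr(32)
print("The character for 32 is", space, "!!!")
print(ord(" "))    # 32

# chr and ord are inverses of each other
print(chr(ord("a")))   # a
```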
One thing to note is that the space character has an ordinal value (32). Even though you don’t see it, it is an actual character. We sometimes call it a nonprinting character.
Check your understanding
What is printed by the following statements?
s = "python"
t = "rocks"
print(s + t)
- python rocks
  - Concatenation does not automatically add a space.
- python
  - The expression s+t is evaluated first, then the resulting string is printed.
- pythonrocks
  - Yes, the two strings are glued end to end.
- Error, you cannot add two strings together.
  - The + operator has different meanings depending on the operands, in this case, two strings.
What is printed by the following statements?
s = "python"
excl = "!"
print(s + excl * 3)
- python!!!
  - Yes, repetition has precedence over concatenation.
- python!python!python!
  - Repetition is done first.
- pythonpythonpython!
  - The repetition operator is working on the excl variable.
- Error, you cannot perform concatenation and repetition at the same time.
  - The + and * operators are defined for strings as well as numbers.
Evaluate the following comparison:
"Dog" < "Doghouse"
- True
  - Both match up to the g but Dog is shorter than Doghouse so it comes first in the dictionary.
- False
  - Strings are compared character by character.
Evaluate the following comparison:
"dog" < "Dog"
- True
  - d is greater than D according to the ord function (100 versus 68).
- False
  - Yes, upper case is less than lower case according to the ordinal values of the characters.
- They are the same word
  - Python is case sensitive, meaning that upper case and lower case characters are different.
Evaluate the following comparison:
"dog" < "Doghouse"
- True
  - d is greater than D.
- False
  - The length does not matter. Lower case d is greater than upper case D.