Tuples#

This chapter introduces one more built-in type, the tuple, and then shows how lists, dictionaries, and tuples work together. It also presents tuple assignment and a useful feature for functions with variable-length argument lists: the packing and unpacking operators.

In the exercises, we’ll use tuples, along with lists and dictionaries, to solve more word puzzles and implement efficient algorithms.

One note: There are two ways to pronounce “tuple”. Some people say “tuh-ple”, which rhymes with “supple”. But in the context of programming, most people say “too-ple”, which rhymes with “quadruple”.

Tuples are like lists#

A tuple is a sequence of values. The values can be any type, and they are indexed by integers, so tuples are a lot like lists. The important difference is that tuples are immutable.

To create a tuple, you can write a comma-separated list of values.

t = 'l', 'u', 'p', 'i', 'n'
type(t)
tuple

Although it is not necessary, it is common to enclose tuples in parentheses.

t = ('l', 'u', 'p', 'i', 'n')
type(t)
tuple

To create a tuple with a single element, you have to include a final comma.

t1 = 'p',
type(t1)
tuple

A single value in parentheses is not a tuple.

t2 = ('p')
type(t2)
str

Another way to create a tuple is the built-in function tuple. With no argument, it creates an empty tuple.

t = tuple()
t
()

If the argument is a sequence (string, list or tuple), the result is a tuple with the elements of the sequence.

t = tuple('lupin')
t
('l', 'u', 'p', 'i', 'n')

Because tuple is the name of a built-in function, you should avoid using it as a variable name.

Most list operators also work with tuples. For example, the bracket operator indexes an element.

t[0]
'l'

And the slice operator selects a range of elements.

t[1:3]
('u', 'p')

The + operator concatenates tuples.

tuple('lup') + ('i', 'n')
('l', 'u', 'p', 'i', 'n')

And the * operator duplicates a tuple a given number of times.

tuple('spam') * 2 
('s', 'p', 'a', 'm', 's', 'p', 'a', 'm')

The sorted function works with tuples – but the result is a list, not a tuple.

sorted(t)
['i', 'l', 'n', 'p', 'u']

The reversed function also works with tuples.

reversed(t)
<reversed at 0x7f56c0072110>

The result is a reversed object, which we can convert to a list or tuple.

tuple(reversed(t))
('n', 'i', 'p', 'u', 'l')

Based on the examples so far, it might seem like tuples are the same as lists.

But tuples are immutable#

If you try to modify a tuple with the bracket operator, you get a TypeError.

t[0] = 'L'
TypeError: 'tuple' object does not support item assignment

And tuples don’t have any of the methods that modify lists, like append and remove.

t.remove('l')
AttributeError: 'tuple' object has no attribute 'remove'

Recall that an “attribute” is a variable or method associated with an object – this error message means that tuples don’t have a method named remove.
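
Since a tuple can't be changed in place, the usual workaround is to build a new tuple from pieces of the old one. Here is a minimal sketch that reuses the tuple from the examples above; the name new_t is only for illustration.

```python
t = tuple('lupin')

# Concatenate a one-element tuple with a slice of the original.
# This creates a new tuple; it does not modify t.
new_t = ('L',) + t[1:]

print(new_t)  # ('L', 'u', 'p', 'i', 'n')
print(t)      # ('l', 'u', 'p', 'i', 'n')
```

The original tuple is unchanged, which is what we expect from an immutable type.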

Because tuples are immutable, they are hashable, which means they can be used as keys in a dictionary. For example, the following dictionary contains two tuples as keys that map to integers.

d = {}
d[1, 2] = 3
d[3, 4] = 7

We can look up a tuple in a dictionary like this:

d[1, 2]
3

Or if we have a variable that refers to a tuple, we can use it as a key.

t = (3, 4)
d[t]
7
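
Tuple keys are handy when a single value is not enough to identify an item. As a sketch, here is a dictionary that maps a (row, column) position to the mark on a game board. The names board and position are hypothetical and chosen only for this example.

```python
# Map (row, column) positions to the mark on that square
board = {}
board[0, 0] = 'X'
board[1, 1] = 'O'
board[0, 2] = 'X'

position = (1, 1)
print(board[position])   # O
print((2, 2) in board)   # False, nothing stored at that position
```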

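The values in a dictionary can be any type, including mutable ones like lists; only the keys have to be hashable. Here a tuple is the key and a list is the value.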

t = tuple('abc')
s = [1, 2, 3]
d = {t: s}
d
{('a', 'b', 'c'): [1, 2, 3]}

Tuple assignment#

You can put a tuple of variables on the left side of an assignment, and a tuple of values on the right.

a, b = 1, 2

The values are assigned to the variables from left to right – in this example, a gets the value 1 and b gets the value 2. We can display the results like this:

a, b
(1, 2)

More generally, if the left side of an assignment is a tuple, the right side can be any kind of sequence – string, list or tuple. For example, to split an email address into a user name and a domain, you could write:

email = 'monty@python.org'
username, domain = email.split('@')

The return value from split is a list with two elements – the first element is assigned to username, the second to domain.

username, domain
('monty', 'python.org')

The number of variables on the left and the number of values on the right have to be the same – otherwise you get a ValueError.

a, b = 1, 2, 3
ValueError: too many values to unpack (expected 2)

Tuple assignment is useful if you want to swap the values of two variables. With conventional assignments, you have to use a temporary variable, like this:

temp = a
a = b
b = temp

That works, but with tuple assignment we can do the same thing without a temporary variable.

a, b = b, a

This works because all of the expressions on the right side are evaluated before any of the assignments.
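
The same rule makes it possible to update two variables that depend on each other. The following sketch, which is not from the examples above, steps through the Fibonacci sequence; each new pair is computed from the old values of a and b.

```python
a, b = 0, 1
for _ in range(5):
    # The right side (b, a + b) is evaluated with the old values
    # of a and b before either variable is reassigned.
    a, b = b, a + b

print(a, b)  # 5 8
```

With two separate assignments, the second one would see the already-updated value of a and give a different result.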

We can also use tuple assignment in a for statement. For example, to loop through the items in a dictionary, we can use the items method.

d = {'one': 1, 'two': 2}

for item in d.items():
    key, value = item
    print(key, '->', value)
one -> 1
two -> 2

Each time through the loop, item is assigned a tuple that contains a key and the corresponding value.

We can write this loop more concisely, like this:

for key, value in d.items():
    print(key, '->', value)
one -> 1
two -> 2

Each time through the loop, a key and the corresponding value are assigned directly to key and value.

Tuples as return values#

Strictly speaking, a function can only return one value, but if the value is a tuple, the effect is the same as returning multiple values. For example, if you want to divide two integers and compute the quotient and remainder, it is inefficient to compute x//y and then x%y. It is better to compute them both at the same time.

The built-in function divmod takes two arguments and returns a tuple of two values, the quotient and remainder.

divmod(7, 3)
(2, 1)

We can use tuple assignment to store the elements of the tuple in two variables.

quotient, remainder = divmod(7, 3)
quotient
2
remainder
1
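
As a quick check, here is a sketch confirming that divmod returns the same values as the // and % operators, and that the two results can be combined to recover the original number. The variables x and y are just for illustration.

```python
x, y = 7, 3
quotient, remainder = divmod(x, y)

# divmod agrees with the two separate operators
assert (quotient, remainder) == (x // y, x % y)

# and the results satisfy x == quotient * y + remainder
print(quotient * y + remainder)  # 7
```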

Here is an example of a function that returns a tuple.

def min_max(t):
    return min(t), max(t)

max and min are built-in functions that find the largest and smallest elements of a sequence. min_max computes both and returns a tuple of two values.

min_max([2, 4, 1, 3])
(1, 4)

We can assign the results to variables like this:

low, high = min_max([2, 4, 1, 3])
low, high
(1, 4)

Argument packing#

Functions can take a variable number of arguments. A parameter name that begins with the * operator packs arguments into a tuple. For example, the following function takes any number of arguments and computes their arithmetic mean – that is, their sum divided by the number of arguments.

def mean(*args):
    return sum(args) / len(args)

The parameter can have any name you like, but args is conventional. We can call the function like this:

mean(1, 2, 3)
2.0
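
To see the packing directly, we can write a function that returns its packed arguments along with their type. This is a sketch; show_args is a hypothetical helper, not a built-in function.

```python
def show_args(*args):
    # args is always a tuple, even when no arguments are passed
    return args, type(args)

print(show_args(1, 2, 3))  # ((1, 2, 3), <class 'tuple'>)
print(show_args())         # ((), <class 'tuple'>)
```

The second call shows that args can be empty. For that reason, calling mean with no arguments would divide by zero.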

If you have a sequence of values and you want to pass them to a function as multiple arguments, you can use the * operator to unpack the sequence. For example, divmod takes exactly two arguments – if you pass a tuple as a single argument, you get an error.

t = (7, 3)
divmod(t)
TypeError: divmod expected 2 arguments, got 1

Even though the tuple contains two elements, it counts as a single argument. But if you unpack the tuple, it is treated as two arguments.

divmod(*t)
(2, 1)
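
Unpacking is not limited to tuples. As a small sketch, the * operator also works with a list or a string; the variable pair is only for illustration.

```python
pair = [7, 3]          # a list works as well as a tuple
print(divmod(*pair))   # (2, 1)

# Each character becomes a separate argument,
# so this is the same as print('a', 'b', 'c')
print(*'abc')          # a b c
```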

Packing and unpacking can be useful if you want to adapt the behavior of an existing function. For example, this function takes any number of arguments, removes the lowest and highest, and computes the mean of the rest.

def trimmed_mean(*args):
    low, high = min_max(args)
    trimmed = list(args)
    trimmed.remove(low)
    trimmed.remove(high)
    return mean(*trimmed)

First it uses min_max to find the lowest and highest elements. Then it converts args to a list so it can use the remove method. Finally it unpacks the list so the elements are passed to mean as separate arguments, rather than as a single list.

Here’s an example that shows the effect.

mean(1, 2, 3, 10)
4.0
trimmed_mean(1, 2, 3, 10)
2.5

This kind of “trimmed” mean is used in some sports with subjective judging – like diving and gymnastics – to reduce the effect of a judge whose score deviates from the others.
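
One detail worth checking is what happens when the lowest or highest value appears more than once. Because remove deletes only the first matching element, only one copy is dropped. The following sketch repeats the definitions from above so it can run on its own.

```python
def min_max(t):
    return min(t), max(t)

def mean(*args):
    return sum(args) / len(args)

def trimmed_mean(*args):
    low, high = min_max(args)
    trimmed = list(args)
    trimmed.remove(low)    # removes only the first matching element
    trimmed.remove(high)
    return mean(*trimmed)

# The lowest score appears twice, but only one copy is dropped,
# so the result is the mean of 1 and 2.
print(trimmed_mean(1, 1, 2, 3))  # 1.5
```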

Zip#

Tuples are useful for looping through the elements of two sequences and performing operations on corresponding elements. For example, suppose two teams play a series of seven games, and we record their scores in two lists, one for each team.

scores1 = [1, 2, 4, 5, 1, 5, 2]
scores2 = [5, 5, 2, 2, 5, 2, 3]

Let’s see how many games each team won. We’ll use zip, which is a built-in function that takes two or more sequences and returns a zip object, so-called because it pairs up the elements of the sequences like the teeth of a zipper.

zip(scores1, scores2)
<zip at 0x7f3e9c74f0c0>

We can use the zip object to loop through the values in the sequences pairwise.

for pair in zip(scores1, scores2):
    print(pair)
(1, 5)
(2, 5)
(4, 2)
(5, 2)
(1, 5)
(5, 2)
(2, 3)

Each time through the loop, pair gets assigned a tuple of scores. So we can assign the scores to variables, and count the victories for the first team, like this:

wins = 0
for team1, team2 in zip(scores1, scores2):
    if team1 > team2:
        wins += 1
        
wins
3

Sadly, the first team won only three games and lost the series.
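
To confirm the outcome of the series, we can extend the loop to count wins for both teams, as well as ties. This is a sketch with hypothetical counter names wins1, wins2, and ties.

```python
scores1 = [1, 2, 4, 5, 1, 5, 2]
scores2 = [5, 5, 2, 2, 5, 2, 3]

wins1 = wins2 = ties = 0
for team1, team2 in zip(scores1, scores2):
    if team1 > team2:
        wins1 += 1
    elif team2 > team1:
        wins2 += 1
    else:
        ties += 1

print(wins1, wins2, ties)  # 3 4 0
```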

If you have two lists and you want a list of pairs, you can use zip and list.

t = list(zip(scores1, scores2))
t
[(1, 5), (2, 5), (4, 2), (5, 2), (1, 5), (5, 2), (2, 3)]

The result is a list of tuples, so we can get the result of the last game like this:

t[-1]
(2, 3)
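
The examples so far use sequences with the same length. If the lengths differ, zip stops when the shortest sequence runs out, as this small sketch shows. The names here are only for illustration.

```python
short_list = [1, 2, 3]
long_string = 'abcde'

# The last two letters have no partner, so they are left out
pairs = list(zip(short_list, long_string))
print(pairs)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```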

If you have a list of keys and a list of values, you can use zip and dict to make a dictionary. For example, here’s how we can make a dictionary that maps from each letter to its position in the alphabet.

letters = 'abcdefghijklmnopqrstuvwxyz'
numbers = range(len(letters))
letter_map = dict(zip(letters, numbers))

Now we can look up a letter and get its index in the alphabet.

letter_map['a'], letter_map['z']
(0, 25)

In this mapping, the index of a is 0 and the index of z is 25.
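
Swapping the arguments to zip gives a mapping in the other direction, from each index to its letter. This is a sketch; number_map is a hypothetical name.

```python
letters = 'abcdefghijklmnopqrstuvwxyz'
numbers = range(len(letters))

# Keys come from the first argument, values from the second
number_map = dict(zip(numbers, letters))
print(number_map[0], number_map[25])  # a z
```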

If you need to loop through the elements of a sequence and their indices, you can use the built-in function enumerate.

enumerate('abc')
<enumerate at 0x7f3e9c620cc0>

The result is an enumerate object that loops through a sequence of pairs, where each pair contains an index (starting from 0) and an element from the given sequence.

for index, element in enumerate('abc'):
    print(index, element)
0 a
1 b
2 c
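
A typical use of enumerate is to record where something appears in a sequence. As a sketch, the following loop collects the indices of the letter a in a word; the variable names are only for illustration.

```python
word = 'banana'
positions = []
for index, letter in enumerate(word):
    if letter == 'a':
        positions.append(index)

print(positions)  # [1, 3, 5]
```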

Comparing and Sorting#

The relational operators work with tuples and other sequences. For example, if you use the < operator with tuples, it starts by comparing the first element from each sequence. If they are equal, it goes on to the next pair of elements, and so on, until it finds a pair that differ.

(0, 1, 2) < (0, 3, 4)
True

Subsequent elements are not considered – even if they are really big.

(0, 1, 2000000) < (0, 3, 4)
True
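
A case the examples above don't cover is what happens when one tuple runs out of elements before a difference is found. In Python, the shorter tuple is considered smaller. Here is a short sketch of that rule, along with max applied to a list of tuples.

```python
# (0, 1) is a prefix of (0, 1, 2), so it comes first
print((0, 1) < (0, 1, 2))   # True

# The second elements differ, and 2 > 1, so length doesn't matter
print((0, 2) < (0, 1, 2))   # False

# max uses the same comparison: first elements tie at 2,
# and 'b' > 'a' breaks the tie
print(max([(2, 'b'), (2, 'a'), (1, 'z')]))  # (2, 'b')
```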

This way of comparing tuples is useful for sorting a list of tuples, or finding the minimum or maximum. As an example, let’s find the most common letter in a word. In the previous chapter, we wrote value_counts, which takes a string and returns a dictionary that maps from each letter to the number of times it appears.

def value_counts(string):
    counter = {}
    for letter in string:
        if letter not in counter:
            counter[letter] = 1
        else:
            counter[letter] += 1
    return counter

Here is the result for the string banana.

counter = value_counts('banana')
counter
{'b': 1, 'a': 3, 'n': 2}

With only three items, we can easily see that the most frequent letter is a, which appears three times. But if there were more items, it would be useful to sort them automatically.

We can get the items from counter like this.

items = counter.items()
items
dict_items([('b', 1), ('a', 3), ('n', 2)])

The result is a dict_items object that behaves like a list of tuples, so we can sort it like this.

sorted(items)
[('a', 3), ('b', 1), ('n', 2)]

The default behavior is to use the first element from each tuple to sort the list, and use the second element to break ties.
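
The letters in banana are all different, so no ties occur in that example. To see the tie-breaking, here is a sketch with a made-up list in which two tuples share the same first element.

```python
pairs = [('b', 1), ('a', 2), ('a', 1)]

# The two tuples that start with 'a' are ordered by their second element
print(sorted(pairs))  # [('a', 1), ('a', 2), ('b', 1)]
```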

However, to find the items with the highest counts, we want to use the second element to sort the list. We can do that by writing a function that takes a tuple and returns the second element.

def second_element(t):
    return t[1]

Then we can pass that function to sorted as an optional argument called key, which indicates that this function should be used to compute the sort key for each item.

sorted_items = sorted(items, key=second_element)
sorted_items
[('b', 1), ('n', 2), ('a', 3)]

The sort key determines the order of the items in the list. The letter with the lowest count appears first, and the letter with the highest count appears last. So we can find the most common letter like this.

sorted_items[-1]
('a', 3)

If we only want the maximum, we don’t have to sort the list. We can use max, which also takes key as an optional argument.

max(items, key=second_element)
('a', 3)

To find the letter with the lowest count, we could use min the same way.
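
Here is a sketch of both variations, using the counts for banana written out as a literal dictionary so the example runs on its own. It also passes reverse=True to sorted, an optional argument that puts the items in decreasing order.

```python
def second_element(t):
    return t[1]

counter = {'b': 1, 'a': 3, 'n': 2}

# The item with the lowest count
print(min(counter.items(), key=second_element))  # ('b', 1)

# All items, from highest count to lowest
print(sorted(counter.items(), key=second_element, reverse=True))
# [('a', 3), ('n', 2), ('b', 1)]
```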

Inverting a dictionary#

Suppose you want to invert a dictionary so you can look up a value and get the corresponding key. For example, if you have a word counter that maps from each word to the number of times it appears, you could make a dictionary that maps from integers to the words that appear that number of times.

But there’s a problem – the keys in a dictionary have to be unique, but the values don’t. For example, in a word counter, there could be many words with the same count.

So one way to invert a dictionary is to create a new dictionary where the values are lists of keys from the original. As an example, let’s count the letters in parrot.

d = value_counts('parrot')
d
{'p': 1, 'a': 1, 'r': 2, 'o': 1, 't': 1}

If we invert this dictionary, the result should be {1: ['p', 'a', 'o', 't'], 2: ['r']}, which indicates that the letters that appear once are p, a, o, and t, and the letter that appears twice is r.

The following function takes a dictionary and returns its inverse as a new dictionary.

def invert_dict(d):
    new = {}
    for key, value in d.items():
        if value not in new:
            new[value] = [key]
        else:
            new[value].append(key)
    return new

The for statement loops through the keys and values in d. If the value is not already in the new dictionary, it is added as a key and associated with a list that contains a single element, the original key. Otherwise the original key is appended to the existing list.

We can test it like this:

invert_dict(d)
{1: ['p', 'a', 'o', 't'], 2: ['r']}

And we get the result we expected.

This is the first example we’ve seen where the values in the dictionary are lists. We will see more!
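
One use of the inverted dictionary is to find every letter that shares the highest count, including ties. The following sketch repeats invert_dict so it can run on its own, and writes the counts for parrot as a literal dictionary; the names counts, inverse, and highest are only for illustration.

```python
def invert_dict(d):
    new = {}
    for key, value in d.items():
        if value not in new:
            new[value] = [key]
        else:
            new[value].append(key)
    return new

counts = {'p': 1, 'a': 1, 'r': 2, 'o': 1, 't': 1}
inverse = invert_dict(counts)

# The keys of the inverse are counts, so max finds the highest count
highest = max(inverse)
print(highest, inverse[highest])  # 2 ['r']
```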

Debugging#

Lists, dictionaries and tuples are data structures. In this chapter we are starting to see compound data structures, like lists of tuples, or dictionaries that contain tuples as keys and lists as values. Compound data structures are useful, but they are prone to errors caused when a data structure has the wrong type, size, or structure. For example, if a function expects a list of integers and you give it a plain old integer (not in a list), it probably won’t work.

To help debug these kinds of errors, I wrote a module called structshape that provides a function, also called structshape, that takes any kind of data structure as an argument and returns a string that summarizes its structure. You can download it from https://raw.githubusercontent.com/AllenDowney/ThinkPython/v3/structshape.py.

We can import it like this.

from structshape import structshape

Here’s an example with a simple list.

t = [1, 2, 3]
structshape(t)
'list of 3 int'

Here’s a list of lists.

t2 = [[1,2], [3,4], [5,6]]
structshape(t2)
'list of 3 list of 2 int'

If the elements of the list are not the same type, structshape groups them by type.

t3 = [1, 2, 3, 4.0, '5', '6', [7], [8], 9]
structshape(t3)
'list of (3 int, float, 2 str, 2 list of int, int)'

Here’s a list of tuples.

s = 'abc'
lt = list(zip(t, s))
structshape(lt)
'list of 3 tuple of (int, str)'

And here’s a dictionary with three items that map integers to strings.

d = dict(lt) 
structshape(d)
'dict of 3 int->str'

If you are having trouble keeping track of your data structures, structshape can help.

Glossary#

pack: Collect multiple arguments into a tuple.

unpack: Treat a tuple (or other sequence) as multiple arguments.

zip object: The result of calling the built-in function zip; it can be used to loop through a sequence of tuples.

enumerate object: The result of calling the built-in function enumerate; it can be used to loop through a sequence of tuples.

sort key: A value, or function that computes a value, used to sort the elements of a collection.

data structure: A collection of values, organized to perform certain operations efficiently.

Exercises#

# This cell tells Jupyter to provide detailed debugging information
# when a runtime error occurs. Run it before working on the exercises.

%xmode Verbose

Ask a virtual assistant#

The exercises in this chapter might be more difficult than exercises in previous chapters, so I encourage you to get help from a virtual assistant. When you pose more difficult questions, you might find that the answers are not correct on the first attempt, so this is a chance to practice crafting good prompts and following up with good refinements.

One strategy you might consider is to break a big problem into pieces that can be solved with simple functions. Ask the virtual assistant to write the functions and test them. Then, once they are working, ask for a solution to the original problem.

For some of the exercises below, I make suggestions about which data structures and algorithms to use. You might find these suggestions useful when you work on the problems, but they are also good prompts to pass along to a virtual assistant.

Exercise#

In this chapter I said that tuples can be used as keys in dictionaries because they are hashable, and they are hashable because they are immutable. But that is not always true.

If a tuple contains a mutable value, like a list or a dictionary, the tuple is no longer hashable because it contains elements that are not hashable. As an example, here’s a tuple that contains two lists of integers.

list0 = [1, 2, 3]
list1 = [4, 5]

t = (list0, list1)
t
([1, 2, 3], [4, 5])

Write a line of code that appends the value 6 to the end of the second list in t. If you display t, the result should be ([1, 2, 3], [4, 5, 6]).

Try to create a dictionary that maps from t to a string, and confirm that you get a TypeError.

d = {t: 'this tuple contains two lists'}
TypeError: unhashable type: 'list'

For more on this topic, ask a virtual assistant, “Are Python tuples always hashable?”

Exercise#

In this chapter we made a dictionary that maps from each letter to its index in the alphabet.

letters = 'abcdefghijklmnopqrstuvwxyz'
numbers = range(len(letters))
letter_map = dict(zip(letters, numbers))

For example, the index of a is 0.

letter_map['a']
0

To go in the other direction, we can index the string letters. For example, the letter at index 1 is b.

letters[1]
'b'

We can use letter_map and letters to encode and decode words using a Caesar cipher.

A Caesar cipher is a weak form of encryption that involves shifting each letter by a fixed number of places in the alphabet, wrapping around to the beginning if necessary. For example, a shifted by 2 is c and z shifted by 1 is a.

Write a function called shift_word that takes as parameters a string and an integer, and returns a new string that contains the letters from the string shifted by the given number of places.

To test your function, confirm that “cheer” shifted by 7 is “jolly” and “melon” shifted by 16 is “cubed”.

Hints: Use the modulus operator to wrap around from z back to a. Loop through the letters of the word, shift each one, and append the result to a list of letters. Then use join to concatenate the letters into a string.

Exercise#

Write a function called most_frequent_letters that takes a string and prints the letters in decreasing order of frequency.

To get the items in decreasing order, you can use reversed along with sorted or you can pass reverse=True as a keyword argument to sorted.

Exercise#

In a previous exercise, we tested whether two strings are anagrams by sorting the letters in both words and checking whether the sorted letters are the same. Now let’s make the problem a little more challenging.

We’ll write a program that takes a list of words and prints all the sets of words that are anagrams. Here is an example of what the output might look like:

['deltas', 'desalt', 'lasted', 'salted', 'slated', 'staled']
['retainers', 'ternaries']
['generating', 'greatening']
['resmelts', 'smelters', 'termless']

Hint: For each word in the word list, sort the letters and join them back into a string. Make a dictionary that maps from this sorted string to a list of words that are anagrams of it.

Exercise#

Write a function called word_distance that takes two words with the same length and returns the number of places where the two words differ.

Hint: Use zip to loop through the corresponding letters of the words.

Exercise#

“Metathesis” is the transposition of letters in a word. Two words form a “metathesis pair” if you can transform one into the other by swapping two letters, like converse and conserve. Write a program that finds all of the metathesis pairs in the word list.

Hint: The words in a metathesis pair must be anagrams of each other.

Credit: This exercise is inspired by an example at http://puzzlers.org.