Digital Education Resources - Vanderbilt Libraries Digital Lab

Note: this is the fourth lesson in a beginner’s introduction to Python. For the whole schedule, see the Vanderbilt Python Working Group homepage

previous lesson on object-oriented programming in Python

The examples in this lesson can be run in a Google Colaboratory notebook. A Google account is required. Click on this link, then if necessary, click on “Open with Google Colaboratory”. From the file menu select `Save a copy in Drive....` That will create a copy of the notebook that you can run, edit, and save. You may have to enable popups in order for the copy to open in a new tab.

If you are interested in using Jupyter notebooks, the examples are available in this notebook.

The presentation for this lesson is here

Answers for last week’s challenge problem:

latte maker with scrolling text box

# Introduction to Data Structures

Python includes a variety of data structures. We will learn about the two most important ones: lists and dictionaries. In this lesson, we will start with lists.

# Lists

A list is a sequence of objects. The objects may be the same or different, but often are the same. The order of the list is important and items can be referenced by their position in the list, numbered from zero.

A list is created by putting the sequence in square brackets, separated by commas. In the following example, a list is assigned to a variable:

``````basket = ['apple', 'orange', 'banana', 'lemon', 'lime']
``````

To reference a particular item, write the variable name followed by square brackets containing the index (position) of the object in the sequence: `basket[2]`.

A slice of the list can be referenced using the following notation: `basket[1:4]`. Important note: when a range is specified in Python, the first number is included but the last number is not, so the last number must be one greater than the position of the final item you want. In this example, the items at positions 1 through 3 will be included. Since counting in Python is zero-based, that means that the slice will contain the second through fourth items.

To determine the count of items in a list, use the `len()` function. In this example, it would be `len(basket)`, which would have a value of 5.
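As a quick illustration of indexing, slicing, and `len()` working together, here is a minimal sketch using the same `basket` list. The variable names `first`, `firstTwo`, and `howManyTaken` are made up for this example.

```python
basket = ['apple', 'orange', 'banana', 'lemon', 'lime']

first = basket[0]               # index 0 is the first item, because counting starts at zero
firstTwo = basket[0:2]          # items at positions 0 and 1; position 2 is not included
howManyTaken = len(firstTwo)    # number of items in the slice

print(first)          # apple
print(firstTwo)       # ['apple', 'orange']
print(howManyTaken)   # 2
```

Notice that the length of a slice is the second number minus the first number.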

Try this

Predict what would happen, then run the following code:

``````basket = ['apple', 'orange', 'banana', 'lemon', 'lime']
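howMany = len(basket)  # count the items in the list
print(howMany)
lunch = basket[1:4]  # the slice described above
print(lunch)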
print(len(lunch))
``````

What is the difference between the last two things that were printed?

## Manipulating lists

To add an item to a list, use the `.append()` method. Here is an example:

``````basket.append('durian')
``````

Notice that there is no assignment with this method – you simply apply it and the list itself is changed.

A list can also be empty. You can create an empty list like this:

``````hungry = []
``````

You can then add items to the list using the `.append()` method.

To change an item in a list, just assign a new value to that item:

``````basket[1] = 'tangerine'
``````

To remove an item from the list using its value, use the `.remove()` method:

``````basket.remove('banana')
``````

You can also delete an item using its index number:

``````del basket[3]
``````

Two lists can be combined using the `+` operator.
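Here is a small sketch of what combining lists with `+` might look like. The list names `morning`, `afternoon`, and `allDay` are made up for this example.

```python
morning = ['apple', 'orange']
afternoon = ['banana']          # hypothetical second list

allDay = morning + afternoon    # + builds a new list from the two lists
print(allDay)                   # ['apple', 'orange', 'banana']
print(morning)                  # ['apple', 'orange']
```

Unlike `.append()`, the `+` operator leaves the original lists unchanged, so the result has to be assigned to a variable if you want to keep it.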

Try this

Predict what would happen, then run this code:

``````basket = ['apple', 'orange', 'banana', 'lemon', 'lime']
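bowl = ['mango', 'durian']  # a second list, made up for this exercise
lunch = basket + bowl  # combine the two lists
print(lunch)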
``````

This is an optional section. If you don’t want to delve into this, you can skip it. But if you skip this section, at least be aware that making copies of lists (i.e. assigning a list to a different variable) is a gotcha and does NOT work in the same way as making a copy of a simple object like a string or number.

As with user-defined objects, lists are complex objects composed of other objects. Assigning a list to another variable does NOT make a separate copy. Instead, the new variable refers to the same list as the original variable, so a change made through either variable shows up in both.

To actually make a copy of a list, use the `deepcopy()` function from the `copy` module. Try the following code and look carefully at the results:

``````import copy

oldName = "fred"
newName = oldName # make a copy of the old name by assigning it to a different variable
oldName = "joe" # change the original name to something else
print('old name:', oldName)
print('new name:', newName)

oldList = ['apple', 'banana', 'orange']
linkedList = oldList # assign the old list to a new list variable
linkedList[1] = 'durian' # change an item on the new list
print()
print('old list:', oldList)
print('linked list:', linkedList)

oldList = ['apple', 'banana', 'orange']
copiedList = copy.deepcopy(oldList) # copy the old list to a new list variable
copiedList[1] = 'durian' # change an item on the new list
print()
print('old list:', oldList)
print('copied list:', copiedList)
``````

Notice that when a list is assigned to a new variable, changes made using the new list variable affect the old list variable. That doesn’t happen with simple objects like strings. But when a list is copied into a new variable using the `deepcopy()` function, changes made using the new list variable don’t affect the old list variable.

This will be true for all of the complex compound data structures that we will be working with from here on.
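One way to check whether two variables refer to the same list is Python's `is` operator, which is not covered elsewhere in this lesson. The following is a minimal sketch rather than a required technique:

```python
import copy

oldList = ['apple', 'banana', 'orange']
linkedList = oldList                   # a second name for the same list
copiedList = copy.deepcopy(oldList)    # a separate, independent copy

print(linkedList is oldList)   # True: both names refer to one list
print(copiedList is oldList)   # False: the copy is a different list
print(copiedList == oldList)   # True: the copy has the same contents
```

The `==` operator compares the contents of the lists, while `is` asks whether they are the very same object.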

## Lists of lists

A list can contain any object, including other lists. In some programming languages, there are two-dimensional structures called arrays. To create an array-like structure in Python, make a list of lists. Here’s an example:

``````firstRow = [3, 5, 7, 9]
secondRow = [4, 11, -1, 5]
thirdRow = [-99, 0, 45, 0]
data = [firstRow, secondRow, thirdRow]
``````

An equivalent way to have created this list of lists would have been:

``````data = [[3, 5, 7, 9], [4, 11, -1, 5], [-99, 0, 45, 0]]
``````

To reference a list of lists, first reference the outer list position, then the inner position. For example, to refer to the first item in the third list, use `data[2][0]`.

Try this

Predict what would happen, then try:

``````data = [[3, 5, 7, 9], [4, 11, -1, 5], [-99, 0, 45, 0]]
print(data[2][0])
print(len(data))
print(data[1])
print(len(data[1]))
``````

Note: the `numpy` module extends Python’s capabilities by adding actual array objects that can be addressed in the notation that’s more typical in other programming languages (like `data[2,0]`). For more details, see this Software Carpentry lesson.

# String manipulations

## Escape sequences

Since some characters can’t be typed on some keyboards, or would otherwise have a special meaning inside a string, we can include them in strings by using an escape sequence. In Python, the backslash character `\` is used to escape some characters that follow, i.e. to give them a different meaning than they would have if the `\` weren’t there. We have seen this before with the newline character (“hard return” character) that makes a string go to the next line. We write it as `\n`. Although this escape sequence is written with two characters, `\` and `n`, it represents a single character, the “newline” character.

A few other important escaped characters are:

``````\'  for a single quote
\"  for a double quote
\\  to print the actual backslash character
\t  for a tab character
``````

Here are a few examples you can try:

``````windowsPath = 'Use this path: c:\\users\\baskauf\\data.json'
print(windowsPath)
quote1 = "He said \"What's goin' on!\" to me."
print(quote1)
quote2 = 'He said "What\'s goin\' on!" to me.'
print(quote2)
print()
table = 'col1\tcol2\tcol3\napple\torange\tpear'
print(table)
``````

In Python 3, all strings are composed of Unicode characters. Unicode allows us to print characters outside of the Roman alphabet and typical ASCII characters. To represent a Unicode character, we can write the escape sequence `\u` (for Unicode), followed by the four-digit hexadecimal number for that character. For example, to write the character for the Euro symbol, use `\u20ac`. Here is an example you can try:

``````statement = "It costs $25.00, but that's \u20ac21.82 !"
print(statement)
nobelPeacePrize = 'Dag Hammarskj\u00f6ld'
print(nobelPeacePrize)
box = '\u250e\u2512\n\u2516\u251a'
print(box)
``````
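If you want to go from a character back to its number, the built-in functions `ord()`, `hex()`, and `chr()` can help. They are not covered elsewhere in this lesson, so treat the following as an optional sketch:

```python
euro = '\u20ac'

print(len(euro))            # 1: the escape sequence stands for a single character
print(hex(ord(euro)))       # 0x20ac: ord() gives the character's number, hex() shows it in hexadecimal
print(chr(0x20ac) == euro)  # True: chr() turns the number back into the character
```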

## Slicing and dicing strings

Retrieving parts of strings uses the same notation as lists. (You can essentially think of a string as a list of characters.) So to get a particular character:

``````nobelPeacePrize = 'Dag Hammarskj\u00f6ld'
print(nobelPeacePrize[2])
``````

and to get part of a string, use:

``````nobelPeacePrize = 'Dag Hammarskj\u00f6ld'
print(len(nobelPeacePrize))
print(nobelPeacePrize[12:15])
``````

Notice that escaped characters count as a single character even if we write them as an escape sequence using several characters.

## Useful string methods

Here are some of the most important methods for strings:

``````.split()  split a string into a list based on a separator. Splits on any whitespace if no argument is given.
.capitalize()  capitalize the first letter of the string and make the rest lower case
.title()  capitalize the first letter of every word
.upper()  turn all letters to upper case
.lower()  turn all letters to lower case
.replace()  replace every occurrence of the first argument with the second
``````

To do more sophisticated things, you’ll need to learn to use regular expressions (beyond the scope of this lesson!).
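A few details of these methods are easy to miss, so here is a short sketch. The example string is made up and deliberately contains two spaces before the last word.

```python
phrase = 'much ado about  nothing'   # two spaces before "nothing"

firstOnly = phrase.capitalize()      # only the first letter of the string is capitalized
anySpace = phrase.split()            # no argument: runs of whitespace count as one separator
oneSpace = phrase.split(' ')         # every single space is a separator
zeros = phrase.replace('o', '0')     # every 'o' is replaced, not just the first one

print(firstOnly)   # Much ado about  nothing
print(anySpace)    # ['much', 'ado', 'about', 'nothing']
print(oneSpace)    # ['much', 'ado', 'about', '', 'nothing']
print(zeros)       # much ad0 ab0ut  n0thing
```

Notice that none of these methods change the original string. Each one returns a new value that you need to assign to a variable or use right away.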

Try this

Predict what will happen, then run the code.

``````play = 'the taming of the shrew'
shakespere = play.title()
wordList = play.split(' ')
shouting = play.upper()
silly = play.replace('shrew', 'Tyrannosaurus rex')
print('We went to see "' + shakespere + '".')
print('The third word in the phrase was "' + wordList[2] + '".')
print("Don't write your email subjects like this: " + shouting)
print('I wrote the modern version of "' + silly + '".')

``````

# Iterating using `for`

Python has several ways to control the flow through a script. We’ve already seen how `if...else...` can be used to make choices. Another very common task is to repeat some code multiple times. For example, suppose we want to do something with every item in a list. A list is iterable, meaning that you can step through the list and operate on each of the items in the sequence. Here’s an example:

``````basket = ['apple', 'orange', 'banana', 'lemon', 'lime']
for fruit in basket:
    print('I ate one ' + fruit)
print("I'm full now!")
``````

Each time the script iterates to another item in the list, it repeats the indented code below the `for` statement and the value of the iterator (`fruit` in this case) changes to the next item. Strings are also iterable:

``````word = 'supercalifragilisticexpialidocious'
print('Spell it out!')
for letter in word:
    print(letter)
print('That wore me out.')
``````

## Ranges

You can generate an iterable range of numbers using `range()`. The form of the numbers we use in `range()` is similar to the numbering in slices, although we separate them with commas. The first number is the starting number. The second number is where the range stops and is not itself included, so when counting up by one it is one more than the ending number. An optional third number can specify the step (e.g. 2 would generate every second number). The step can also be negative.

We can use a `for` statement to iterate through a range. Here are examples:

``````for count in range(1,11):
    print(count)
``````
``````print('Prepare to launch!')
for countDown in range(10,0,-1):
    print(countDown)
print('Lift off!')
``````
``````cheer = ''
for skipper in range(2, 10, 2):
    cheer = cheer + str(skipper) + ', '
cheer = cheer + 'who do we appreciate?'
print(cheer)
``````

Notice how we need to be careful that our second number goes one step beyond our intended range. Also notice in the last example that if we wanted to treat the integer that we generated as a string, we needed to convert it to a string using the `str()` function.
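If you are unsure which numbers a range will produce, one way to check is to turn it into a list with `list()` and print it. This sketch uses the same three ranges as the examples above; the variable names are made up.

```python
countUp = list(range(1, 11))
countDown = list(range(10, 0, -1))
evens = list(range(2, 10, 2))

print(countUp)     # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(countDown)   # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
print(evens)       # [2, 4, 6, 8]
```

In each case the second number itself never appears in the result.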

Ranges are often used to index list items when we want to iterate through a list, but have access to the index number. Here is an example:

``````basket = ['apple', 'orange', 'banana', 'lemon', 'lime']
print("Here's a list of the fruit in the basket:")
for fruitNumber in range(0, len(basket)):
    print(str(fruitNumber+1) + ' ' + basket[fruitNumber])
print('You can see that there are ' + str(len(basket)) + ' fruits in the basket.')
``````

Notice several things:

1. The number of items in the list, `len(basket)` (5), is one more than the index of the last item in the list, `basket[4]`. Since a range stops one number before its end value, a range that ends at `len(basket)` covers the entire list.
2. I had to add 1 to the `fruitNumber` as it iterated because Python counts starting from zero and I wanted to start from one.
3. I had to use the `str()` function each time I wanted to concatenate one of the integer numbers to other strings.
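As an aside, Python also has a built-in `enumerate()` function that hands you the index and the item at the same time. It is a different technique from the range-based loop described above, and it is not needed for the exercises in this lesson, so take this as an optional sketch. The list name `numbered` is made up for the example.

```python
basket = ['apple', 'orange', 'banana', 'lemon', 'lime']

numbered = []
for fruitNumber, fruit in enumerate(basket):   # fruitNumber starts at zero
    numbered.append(str(fruitNumber + 1) + ' ' + fruit)

for line in numbered:
    print(line)
```

The first line printed is `1 apple` and the last is `5 lime`, the same as with the range-based version.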

Try this

Answers are at the bottom of the page

1. Here is a list of countries, their GDPs based on purchasing power parity, and their populations:

``````economicData = [['Qatar', 357338000, 2569804], ['United States', 20412870000, 322179605], ['Egypt', 1292750000, 95688681], ['Haiti', 20794000, 10847334]]
``````

A. Print the list of data for Egypt.

B. Print the population of Qatar.

C. Print the names of the countries using a `for` loop.

D. Print the GDP per capita (GDP PPP divided by population) using a `for` loop that iterates over a range.

2. Here is a list of the days of the week:

``````days = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']
``````

A. Using the list of days, print `Monday` and `Friday`.

B. Using the list days and a `for` loop that iterates over a range, print the weekdays (Monday through Friday).

# Homework

The answers are at the end

1. In a famous story, the young mathematician Karl Gauss’s teacher assigned him the task of adding all of the numbers from 1 to 100, with the intention of keeping him busy for a while. It didn’t work because in a few moments, Gauss calculated the answer, 5050, using some clever thinking. However, if Gauss were in school now, he could just write a Python script to do the calculation. Write a script using `range()` to add all the numbers from 1 up to any number that you choose. Note: if you use the `input()` function to get the person’s number, you’ll need to use the `int()` function to turn the entered string into an integer number.

2. The game Yahtzee involves rolling five dice and trying to get “poker hands” like three of a kind, a straight, etc. You can simulate the rolling of a die using a function from the `random` module:

``````import random as r
randomNumber = r.randrange(1, 7)
print(randomNumber)
``````

A. Simulate the rolling of five dice as follows:

• create an empty list
• run a `for` loop five times
• each loop, generate a random number and append it to the list
• print the list.

B. Put the code you just wrote into a function called `throwDice`. The function takes the number of dice to roll as a parameter and returns the list of numbers representing the dice rolls. (In this problem, the number of dice will always be 5, but you should make the function general.) In the main script, ask the users how many times to roll the dice and create a loop that rolls and prints each roll.

C. Create a function called `isYatzee`. That function should take the list of dice rolls as a parameter and return `True` or `False` depending on whether the throw was a Yahtzee (all 5 dice the same) or not. Modify your main script so that the dice throw is only printed if it’s a Yahtzee. Generally, how many times do you need to roll before you start getting Yahtzees?

# Challenge problems

1. Dealing cards. Here is some code that generates a deck of cards as a list:

``````import random

def makeDeck():
    suits = ['hearts', 'spades', 'clubs', 'diamonds']
    deck = []

    # generate the deck of cards
    for suit in suits:
        deck.append('A ' + suit)
        for num in range(2,11):
            deck.append(str(num) + ' ' + suit)
        deck.append('J ' + suit)
        deck.append('Q ' + suit)
        deck.append('K ' + suit)
    return deck

newDeck = makeDeck()
print(newDeck)
print(random.choice(newDeck)) # select a single card at random and print it
random.shuffle(newDeck) # randomize all of the cards in the deck and replace them in the deck list
print(newDeck)
``````

The code after the `makeDeck()` function shows how the `random.choice()` and `random.shuffle()` functions can be used to pick a random card and to randomize the order of the cards in the deck.

a. Use this function to write a script that “deals” a five card poker hand by printing five random cards from the deck. Note that after each card is printed, it has to be removed from the deck so that when the next card is printed, there isn’t any chance that you’ll get the same one a second time.

b. Instead of just printing the five cards, use `.append()` to add them to another list called `hand`. Print the whole hand list.

c. Can you figure out how to check for any kind of poker hands? How about a flush?

2. a. Print the words of “Stopping by Woods on a Snowy Evening” in reverse order. You can get the poem as a string here. You will need to iterate using an index rather than iterating the words directly.

b. Concatenate all of the words with spaces between them. Can you put line breaks and stanzas in what you think are the right places?

# Answers to “Try this”

1.A.

``````economicData = [['Qatar', 357338000, 2569804], ['United States', 20412870000, 322179605], ['Egypt', 1292750000, 95688681], ['Haiti', 20794000, 10847334]]
print(economicData[2])
``````

B.

``````economicData = [['Qatar', 357338000, 2569804], ['United States', 20412870000, 322179605], ['Egypt', 1292750000, 95688681], ['Haiti', 20794000, 10847334]]
print(economicData[0][2])
``````

C.

``````economicData = [['Qatar', 357338000, 2569804], ['United States', 20412870000, 322179605], ['Egypt', 1292750000, 95688681], ['Haiti', 20794000, 10847334]]
for country in economicData:
    print(country[0])
``````

D.

``````economicData = [['Qatar', 357338000, 2569804], ['United States', 20412870000, 322179605], ['Egypt', 1292750000, 95688681], ['Haiti', 20794000, 10847334]]
for countryNumber in range(0, len(economicData)):
    country = economicData[countryNumber]
    print(country[0], country[1]/country[2])
``````

2. A.

``````days = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']
print(days[1])
print(days[5])
``````

B.

``````days = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']
for day in range(1,6):
    print(days[day])
``````

# Answers to homework

1.

``````enteredString = input("What's the upper number? ")
myNumber = int(enteredString)
total = 0 # named total so that Python's built-in sum() function isn't hidden
for number in range(1, myNumber + 1): # don't forget to add one to the upper range
    total += number # this does the same thing as
                    # total = total + number
print(total)
``````
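The clever thinking usually attributed to Gauss is that the numbers from 1 to n add up to n(n+1)/2. As a rough check, this sketch compares the loop with that formula. It uses a fixed upper number of 100 instead of `input()`, and the variable names are made up for the example.

```python
upperNumber = 100   # fixed value for this example instead of using input()

total = 0
for number in range(1, upperNumber + 1):
    total += number

formula = upperNumber * (upperNumber + 1) // 2   # // divides and keeps the result an integer

print(total)              # 5050
print(formula)            # 5050
print(total == formula)   # True
```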

2.A.

``````import random as r
throw = []
for roll in range(0,5):
    die = r.randrange(1, 7)
    throw.append(die)
print(throw)
``````

B.

``````import random as r

def throwDice(numberDice):
    throw = []
    for roll in range(0, numberDice):
        die = r.randrange(1, 7)
        throw.append(die)
    return throw

# main script
rolls = int(input("How many times to roll? ")) # don't forget to turn the input string into a number
for roll in range(0,rolls):
    print(throwDice(5))
``````

C.

``````import random as r

def throwDice(numberDice):
    throw = []
    for roll in range(0, numberDice):
        die = r.randrange(1, 7)
        throw.append(die)
    return throw

# There are a number of ways to create the following function. This is only one.
def isYatzee(throw):
    allSame = True
    for die in throw:
        if die != throw[0]:  # check each die against the first die
            allSame = False  # if any die is different, they aren't all the same
    return allSame           # allSame only remains True if no die is different

# main script
rolls = int(input("How many times to roll? ")) # don't forget to turn the input string into a number
for roll in range(0,rolls):
    throw = throwDice(5)
    if isYatzee(throw):
        print(throw)
``````
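To get a feel for how many rolls to ask for, it may help to work out the odds. This sketch assumes five fair, independent six-sided dice. The variable names are made up for the example.

```python
outcomes = 6 ** 5    # number of equally likely ways five dice can land
yahtzees = 6         # one way for each face: all ones, all twos, and so on

probability = yahtzees / outcomes
rollsPerYahtzee = outcomes // yahtzees

print(outcomes)          # 7776
print(rollsPerYahtzee)   # 1296
print(probability)
```

So under these assumptions, on average about one throw in 1296 is all the same number. A run of a few thousand throws will usually print a few, but any single run can differ quite a bit.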

next lesson on dictionaries and JSON

Revised 2019-09-20