Digital Education Resources - Vanderbilt Libraries Digital Lab

Note: this is the fourth lesson in a beginner’s introduction to Python. For the whole schedule, see the Vanderbilt Python Working Group homepage

previous lesson on object-oriented programming in Python

The examples in this lesson can be run in a Google Colaboratory notebook. A Google account is required. Click on this link, then if necessary, click on “Open with Google Colaboratory”. From the file menu select Save a copy in Drive.... That will create a copy of the notebook that you can run, edit, and save. You may have to enable popups in order for the copy to open in a new tab.

If you are interested in using Jupyter notebooks, the examples are available in this notebook.

The presentation for this lesson is here

Answers for last week’s challenge problem:

latte maker with scrolling text box

Introduction to Data Structures

Python includes a variety of data structures. We will learn about the two most important ones: lists and dictionaries. In this lesson, we will start with lists.

Lists

A list is a sequence of objects. The objects may be of the same type or of different types, but often they are all the same type. The order of the list is important and items can be referenced by their position in the list, numbered from zero.

A list is created by putting the sequence in square brackets, separated by commas. In the following example, a list is assigned to a variable:

basket = ['apple', 'orange', 'banana', 'lemon', 'lime']

To reference a particular item, write the variable name followed by square brackets containing the index (position) of the object in the sequence: basket[2].

A slice of the list can be referenced using the following notation: basket[1:4]. Important note: in Python, a range includes its first number but stops just before its last number, so the last number is one greater than the index of the last item included. In this example, the items at indices 1 through 3 will be included. Since counting in Python is zero-based, that means that the slice will contain the second through fourth items.

To determine the count of items in a list, use the len() function. In this example, it would be len(basket), which would have a value of 5.
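
As a quick check of the slicing rule, here is a minimal sketch using the same basket list. The variable names start, stop, and middle are made up for this illustration.

```python
basket = ['apple', 'orange', 'banana', 'lemon', 'lime']

start = 1
stop = 4
middle = basket[start:stop]   # the items at indices 1, 2, and 3

print(basket[2])              # a single item: banana
print(middle)                 # ['orange', 'banana', 'lemon']
print(len(middle))            # 3, which is stop - start
```

The number of items in a slice is the second number minus the first, which is one reason the "stop just before the end" rule is convenient.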

Try this

Predict what would happen, then run the following code:

basket = ['apple', 'orange', 'banana', 'lemon', 'lime']
howMany = len(basket)
print(howMany)
lunch = basket[1:3]
print(lunch)
print(len(lunch))
print(basket[4])
print(basket[0:howMany])
print(basket[0])
print(basket[0:1])

What is the difference between the last two things that were printed?

Manipulating lists

To add an item to a list, use the .append() method. Here is an example:

basket.append('durian')

Notice that there is no assignment with this method – you simply apply it and the list itself is changed.

A list can also be empty. You can create an empty list like this:

hungry = []

You can then add items to the list using the .append() method.

To change an item in a list, just assign a new value to that item:

basket[1] = 'tangerine'

To remove an item from the list using its value, use the .remove() method:

basket.remove('banana')

You can also delete an item using its index number:

del basket[3]
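
The two ways of deleting behave differently when a value appears more than once. The following sketch (the snacks list is invented for illustration) shows that .remove() takes out only the first matching item, while del takes out whatever is at the given position.

```python
snacks = ['apple', 'lime', 'apple', 'lemon']

snacks.remove('apple')   # removes only the first 'apple'
print(snacks)            # ['lime', 'apple', 'lemon']

del snacks[0]            # removes the item at index 0, whatever it is
print(snacks)            # ['apple', 'lemon']
```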

Two lists can be combined using the + operator.
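
Unlike .append(), the + operator does not change either of the lists it combines. A minimal sketch, using short made-up lists:

```python
lunchBag = ['sandwich', 'cookie']
basket = ['apple', 'orange']

lunch = lunchBag + basket    # builds a new, longer list
print(lunch)                 # ['sandwich', 'cookie', 'apple', 'orange']
print(lunchBag)              # ['sandwich', 'cookie']  (unchanged)
print(basket)                # ['apple', 'orange']     (unchanged)
```

Because the result is a new list, it has to be assigned to a variable if you want to keep it.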

Try this

Predict what would happen, then run this code:

basket = ['apple', 'orange', 'banana', 'lemon', 'lime']
print(basket)
basket[1] = 'tangerine'
print(basket)
basket.remove('banana')
print(basket)
basket.append('durian')
print(basket)
del basket[0]
print(basket)
lunchBag = ['sandwich', 'cookie']
lunch = lunchBag + basket
print(lunch)

Advanced topic: copying lists

This is an optional section. If you don’t want to delve into this, you can skip it. But if you skip this section, at least be aware that making copies of lists (i.e. assigning a list to a different variable) is a gotcha and does NOT work in the same way as making a copy of a simple object like a string or number.

Like user-defined objects, lists are complex objects composed of other objects. Assigning a list to another variable does NOT make a separate copy. Instead, both variable names refer to the same list object.

To actually make a copy of a list, use the deepcopy() function from the copy module. Try the following code and look carefully at the results:

import copy

oldName = "fred"
newName = oldName # make a copy of the old name by assigning it to a different variable
oldName = "joe" # change the original name to something else
print('old name:', oldName)
print('new name:', newName)

oldList = ['apple', 'banana', 'orange']
linkedList = oldList # assign the old list to a new list variable
linkedList[1] = 'durian' # change an item on the new list
print()
print('old list:', oldList)
print('linked list:', linkedList)

oldList = ['apple', 'banana', 'orange']
copiedList = copy.deepcopy(oldList) # copy the old list to a new list variable
copiedList[1] = 'durian' # change an item on the new list
print()
print('old list:', oldList)
print('copied list:', copiedList)

Notice that when a list is assigned to a new variable, changes made using the new list variable affect the old list variable. That doesn’t happen with simple objects like strings, which can’t be changed in place: assigning "joe" to oldName makes that variable refer to a different string and leaves newName alone. But when a list is copied into a new variable using the deepcopy() function, changes made using the new list variable don’t affect the old list variable.

This will be true for all of the complex compound data structures that we will be working with from here on.
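
One way to see what is going on is Python's is operator, which tests whether two names refer to the very same object. This is only a sketch that reuses the variable names from the example above.

```python
import copy

oldList = ['apple', 'banana', 'orange']
linkedList = oldList                  # a second name for the same list
copiedList = copy.deepcopy(oldList)   # a separate list with the same contents

print(linkedList is oldList)    # True: one list, two names
print(copiedList is oldList)    # False: two different lists
print(copiedList == oldList)    # True: the contents are equal

linkedList.append('durian')     # changes the one shared list
print(oldList)                  # ['apple', 'banana', 'orange', 'durian']
print(copiedList)               # ['apple', 'banana', 'orange']
```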

Lists of lists

A list can contain any object, including other lists. In some programming languages, there are two-dimensional structures called arrays. To create an array-like structure in Python, make a list of lists. Here’s an example:

firstRow = [3, 5, 7, 9]
secondRow = [4, 11, -1, 5]
thirdRow = [-99, 0, 45, 0]
data = [firstRow, secondRow, thirdRow]

An equivalent way to have created this list of lists would have been:

data = [[3, 5, 7, 9], [4, 11, -1, 5], [-99, 0, 45, 0]]

To reference a list of lists, first reference the outer list position, then the inner position. For example, to refer to the first item in the third list, use data[2][0].

Try this

Predict what would happen, then try:

data = [[3, 5, 7, 9], [4, 11, -1, 5], [-99, 0, 45, 0]]
print(data[2][0])
print(len(data))
print(data[1])
print(len(data[1]))
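
The double-index notation can be read as two steps: the first index picks a whole inner list, and the second index picks an item from it. Here is a small sketch of that idea (the variable name thirdRow is just for illustration), which also shows that an item in an inner list can be replaced by assignment.

```python
data = [[3, 5, 7, 9], [4, 11, -1, 5], [-99, 0, 45, 0]]

thirdRow = data[2]        # the outer index picks a whole row
print(thirdRow)           # [-99, 0, 45, 0]
print(thirdRow[3])        # 0
print(data[2][3])         # 0, the same item reached in one step

data[1][2] = 100          # replace one item in the second row
print(data[1])            # [4, 11, 100, 5]
```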

Note: the numpy module extends Python’s capabilities by adding actual array objects that can be addressed in the notation that’s more typical in other programming languages (like data[2,0]). For more details, see this Software Carpentries lesson.

String manipulations

Escape sequences

Since some characters can’t be typed on some keyboards, we can include them in strings by using an escape sequence. In Python, the backslash character \ is used to escape some characters that follow, i.e. to make them have a different meaning than if the \ weren’t there. We have seen this before with the newline character (“hard return” character) that makes a string go to the next line. We write it as \n. Although this escape sequence is written with two characters, \ and n, it represents a single character, the “newline” character.

A few other important escaped characters are:

\'  for a single quote
\"  for a double quote
\\  to print the actual backslash character
\t  for a tab character

Here are a few examples you can try:

windowsPath = 'Use this path: c:\\users\\baskauf\\data.json'
print(windowsPath)
quote1 = "He said \"What's goin' on!\" to me."
print(quote1)
quote2 = 'He said "What\'s goin\' on!" to me.'
print(quote2)
print()
table = 'col1\tcol2\tcol3\napple\torange\tpear'
print(table)

In Python 3, all strings are composed of Unicode characters. Unicode allows us to print characters outside of the Roman alphabet and typical ASCII characters. To represent a Unicode character, we can write the escape sequence \u (for Unicode), followed by the four-digit hexadecimal number for that character. For example, to write the character for the Euro symbol, use \u20ac. Here is an example you can try:

statement = "It costs $25.00, but that's \u20ac21.82 !"
print(statement)
nobelPeacePrize = 'Dag Hammarskj\u00f6ld'
print(nobelPeacePrize)
box = '\u250e\u2512\n\u2516\u251a'
print(box)

Slicing and dicing strings

Retrieving parts of strings uses the same notation as lists. (You can essentially think of a string as a list of characters.) So to get a particular character:

nobelPeacePrize = 'Dag Hammarskj\u00f6ld'
print(nobelPeacePrize[2])

and to get part of a string, use:

nobelPeacePrize = 'Dag Hammarskj\u00f6ld'
print(len(nobelPeacePrize))
print(nobelPeacePrize[12:15])

Notice that escaped characters count as a single character even if we write them as an escape sequence using several characters.
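
To confirm this, len() can be applied to strings that contain escape sequences. A minimal sketch (the variable names newline, euro, and name are chosen for illustration):

```python
newline = '\n'
euro = '\u20ac'
name = 'Dag Hammarskj\u00f6ld'

print(len(newline))   # 1
print(len(euro))      # 1
print(len(name))      # 16
print(name[13])       # ö
```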

Useful string methods

Here are some of the most important methods for strings:

.split()  split a string into a list based on a separator. Splits by any whitespace if no argument.
.capitalize()  capitalize the first letter of the string and make the rest lower case
.title()  capitalize the first letter of every word
.upper()  capitalize all letters
.lower()  turn all letters to lower case
.replace()  replace every occurrence of the first argument with the second

To do more sophisticated things, you’ll need to learn to use regular expressions (beyond the scope of this lesson!).
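
The following sketch shows a few details of these methods that are easy to miss. The example strings are made up. Note that each method returns a new string (or list) and leaves the original string unchanged.

```python
phrase = 'to be or not to be'

capitalized = phrase.capitalize()        # only the first letter of the string
titled = phrase.title()                  # the first letter of every word
replaced = phrase.replace('be', 'see')   # every 'be', not just the first one

print(capitalized)                   # To be or not to be
print(titled)                        # To Be Or Not To Be
print(replaced)                      # to see or not to see
print('hello WORLD'.capitalize())    # Hello world
print('a  b'.split())                # ['a', 'b']
print('a  b'.split(' '))             # ['a', '', 'b']
print(phrase)                        # to be or not to be
```

Splitting with no argument treats a run of spaces as one separator, while splitting on ' ' produces an empty string between two adjacent spaces.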

Try this

Predict what will happen, then run the code.

play = 'the taming of the shrew'
shakespeare = play.title()
wordList = play.split(' ')
shouting = play.upper()
silly = play.replace('shrew', 'Tyrannosaurus rex')
print('We went to see "' + shakespeare + '".')
print('The third word in the phrase was "' + wordList[2] + '".')
print("Don't write your email subjects like this: " + shouting)
print('I wrote the modern version of "' + silly + '".')

Iterating using for

Python has several ways to control the flow through a script. We’ve already seen how if...else... can be used to make choices. Another very common task is to repeat some code multiple times. For example, suppose we want to do something with every item in a list. A list is iterable, meaning that you can step through the list and operate on each of the items in the sequence. Here’s an example:

basket = ['apple', 'orange', 'banana', 'lemon', 'lime']
for fruit in basket:
    print('I ate one ' + fruit)
print("I'm full now!")

Each time the script iterates to another item in the list, it repeats the indented code below the for statement and the value of the iterator (fruit in this case) changes to the next item. Strings are also iterable:

word = 'supercalifragilisticexpialidocious'
print('Spell it out!')
for letter in word:
    print(letter)
print('That wore me out.')
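
A for loop is often combined with .append() to build a new list from an existing one. Here is a minimal sketch along those lines. The shortNames list and the five-letter cutoff are invented for this illustration.

```python
basket = ['apple', 'orange', 'banana', 'lemon', 'lime']
shortNames = []                  # start with an empty list

for fruit in basket:
    if len(fruit) <= 5:          # keep only the short names
        shortNames.append(fruit)

print(shortNames)                # ['apple', 'lemon', 'lime']
```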

Ranges

You can generate an iterable range of numbers using range(). The form of the numbers we use in range() is similar to the numbering in slices, although we separate them with commas. The first number is the starting number and the second number is one more than the ending number. An optional third number can specify the step (e.g. 2 would generate every second number). The step can also be negative.
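
A convenient way to see which numbers a range will produce is to turn it into a list with the built-in list() function. A small sketch (the variable names are for illustration only):

```python
countUp = list(range(1, 11))
countDown = list(range(10, 0, -1))
evens = list(range(2, 10, 2))

print(countUp)      # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(countDown)    # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
print(evens)        # [2, 4, 6, 8]
```

In each case the second number itself is never included.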

We can use a for statement to iterate through a range. Here are examples:

for count in range(1,11):
    print(count)
print('Prepare to launch!')
for countDown in range(10,0,-1):
    print(countDown)
print('Lift off!')
cheer = ''
for skipper in range(2, 10, 2):
    cheer = cheer + str(skipper) + ', '
cheer = cheer + 'who do we appreciate?'
print(cheer)

Notice how we need to be careful that our second number goes one step beyond our intended range. Also notice in the last example that if we wanted to treat the integer that we generated as a string, we needed to convert it to a string using the str() function.

Ranges are often used to index list items when we want to iterate through a list, but have access to the index number. Here is an example:

basket = ['apple', 'orange', 'banana', 'lemon', 'lime']
print("Here's a list of the fruit in the basket:")
for fruitNumber in range(0, len(basket)):
    print(str(fruitNumber+1) + ' ' + basket[fruitNumber])
print('You can see that there are ' + str(len(basket)) + ' fruits in the basket.')

Notice several things:

  1. Because the number of items in the list, len(basket) (5), is one more than the index of the last item, basket[4], range(0, len(basket)) covers the entire list: a range stops one number before its second number.
  2. I had to add 1 to the fruitNumber as it iterated because Python counts starting from zero and I wanted to start from one.
  3. I had to use the str() function each time I wanted to concatenate one of the integer numbers to other strings.
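
As an aside, Python's built-in enumerate() function is a common alternative to indexing with a range. It is not the approach used in this lesson, so treat the following as an optional sketch: on each pass through the loop it supplies both the index and the item. The numbered list is made up for illustration.

```python
basket = ['apple', 'orange', 'banana', 'lemon', 'lime']
numbered = []

for fruitNumber, fruit in enumerate(basket):
    # fruitNumber still counts from zero, so 1 is added as before
    numbered.append(str(fruitNumber + 1) + ' ' + fruit)

print(numbered)   # ['1 apple', '2 orange', '3 banana', '4 lemon', '5 lime']
```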

Try this

Answers are at the bottom of the page

1. Here is a list of countries, their GDPs based on purchasing power parity, and their populations:

economicData = [['Qatar', 357338000, 2569804], ['United States', 20412870000, 322179605], ['Egypt', 1292750000, 95688681], ['Haiti', 20794000, 10847334]]

A. Print the list of data for Egypt.

B. Print the population of Qatar.

C. Print the names of the countries using a for loop.

D. Print the GDP per capita (GDP PPP divided by population) using a for loop that iterates over a range.

2. Here is a list of the days of the week:

days = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']

A. Using the list of days, print Monday and Friday.

B. Using the list days and a for loop that iterates over a range, print the weekdays (Monday through Friday).

Homework

The answers are at the end

1. In a famous story, the young mathematician Carl Friedrich Gauss’s teacher assigned him the task of adding all of the numbers from 1 to 100, with the intention of keeping him busy for a while. It didn’t work because in a few moments, Gauss calculated the answer, 5050, using some clever thinking. However, if Gauss were in school now, he could just write a Python script to do the calculation. Write a script using range() to add all the numbers from 1 up to any number that you choose. Note: if you use the input() function to get the person’s number, you’ll need to use the int() function to turn the entered string into an integer number.

2. The game Yahtzee involves rolling five dice and trying to get “poker hands” like three of a kind, a straight, etc. You can simulate the rolling of a die using a function from the random module:

import random as r
randomNumber = r.randrange(1, 7)
print(randomNumber)

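A. Simulate the rolling of five dice as follows: create an empty list, use a for loop to roll a die five times, append each roll to the list, and then print the list.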

B. Put the code you just wrote into a function called throwDice. The function takes the number of dice to roll as a parameter and returns the list of numbers representing the dice rolls. (In this problem, the number of dice will always be 5, but you should make the function general.) In the main script, ask the users how many times to roll the dice and create a loop that rolls and prints each roll.

C. Create a function called isYatzee. That function should take the list of dice rolls as a parameter and return True or False depending on whether the throw was a Yahtzee (all 5 dice the same) or not. Modify your main script so that the dice throw is only printed if it’s a Yahtzee. Generally, how many times do you need to roll before you start getting Yahtzees?

Challenge problems

1. Dealing cards Here is some code that generates a deck of cards as a list:

import random

def makeDeck():
    suits = ['hearts', 'spades', 'clubs', 'diamonds']
    deck = []
    
    # generate the deck of cards
    for suit in suits:
        deck.append('A ' + suit)
        for num in range(2,11):
            deck.append(str(num) + ' ' + suit)
        deck.append('J ' + suit)
        deck.append('Q ' + suit)
        deck.append('K ' + suit)
    return deck

newDeck = makeDeck()
print(newDeck)
print(random.choice(newDeck)) # select a single card at random and print it
random.shuffle(newDeck) # randomize all of the cards in the deck and replace them in the deck list
print(newDeck)

The code after the makeDeck() function shows how the random.choice() and random.shuffle() functions can be used to randomize the cards in the deck.

a. Use this function to write a script that “deals” a five card poker hand by printing five random cards from the deck. Note that after each card is printed, it has to be removed from the deck so that when the next card is printed, there isn’t any chance that you’ll get the same one a second time.

b. Instead of just printing the five cards, use .append() to add them to another list called hand. Print the whole hand list.

c. Can you figure out how to check for any kind of poker hands? How about a flush?

2. a. Print the words of “Stopping by Woods on a Snowy Evening” in reverse order. You can get the poem as a string here. You will need to iterate using an index rather than iterating the words directly.

b. Concatenate all of the words with spaces between them. Can you put line breaks and stanzas in what you think are the right places?

Iterating: Try This answers

1.A.

economicData = [['Qatar', 357338000, 2569804], ['United States', 20412870000, 322179605], ['Egypt', 1292750000, 95688681], ['Haiti', 20794000, 10847334]]
print(economicData[2])

B.

economicData = [['Qatar', 357338000, 2569804], ['United States', 20412870000, 322179605], ['Egypt', 1292750000, 95688681], ['Haiti', 20794000, 10847334]]
print(economicData[0][2])

C.

economicData = [['Qatar', 357338000, 2569804], ['United States', 20412870000, 322179605], ['Egypt', 1292750000, 95688681], ['Haiti', 20794000, 10847334]]
for country in economicData:
    print(country[0])

D.

economicData = [['Qatar', 357338000, 2569804], ['United States', 20412870000, 322179605], ['Egypt', 1292750000, 95688681], ['Haiti', 20794000, 10847334]]
for index in range(0, len(economicData)): # iterate over a range, as the question asks
    country = economicData[index]
    print(country[0], country[1]/country[2])

2. A.

days = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']
print(days[1])
print(days[5])

B.

days = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']
for day in range(1,6):
    print(days[day])

Homework answers

1.

enteredString = input("What's the upper number? ")
myNumber = int(enteredString)
total = 0 # named total so that it doesn't hide Python's built-in sum() function
for number in range(1, myNumber + 1): # don't forget to add one to the upper range
    total += number # this does the same thing as
    # total = total + number
print(total)
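
The story does not spell out Gauss's shortcut, but it is usually described as pairing the numbers (1 with 100, 2 with 99, and so on), which gives the formula n(n+1)/2 for the sum of 1 through n. The following sketch, with the upper number fixed at 100 instead of being entered by the user, checks the loop against that formula. The variable name shortcut is made up for illustration.

```python
myNumber = 100

total = 0
for number in range(1, myNumber + 1):
    total += number

# 50 pairs that each add up to 101
shortcut = myNumber * (myNumber + 1) // 2

print(total)       # 5050
print(shortcut)    # 5050
```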

2.A.

import random as r
throw = []
for roll in range(0,5):
    die = r.randrange(1, 7)
    throw.append(die)
print(throw)

B.

import random as r

def throwDice(numberDice):
    throw = []
    for roll in range(0, numberDice):
        die = r.randrange(1, 7)
        throw.append(die)
    return throw

# main script
rolls = int(input("How many times to roll? ")) # don't forget to turn the input string into a number
for roll in range(0,rolls):
    print(throwDice(5))

C.

import random as r

def throwDice(numberDice):
    throw = []
    for roll in range(0, numberDice):
        die = r.randrange(1, 7)
        throw.append(die)
    return throw

# There are a number of ways to create the following function. This is only one.
def isYatzee(throw):
    allSame = True
    for die in throw:
        if die != throw[0]:  # check each die against the first die
            allSame = False  # if any die is different, they aren't all the same
    return allSame           # allSame only remains True if no die is different

# main script
rolls = int(input("How many times to roll? ")) # don't forget to turn the input string into a number
for roll in range(0,rolls):
    throw = throwDice(5)
    if isYatzee(throw):
        print(throw)
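
To compare your results with what probability predicts, here is a rough calculation. It assumes fair six-sided dice and a single throw of all five dice with no re-rolls (unlike the real game). The first die can be anything, and each of the other four must match it. The variable names sides and chance are made up for illustration.

```python
from fractions import Fraction

sides = 6
numberDice = 5

# each of the remaining four dice has a 1 in 6 chance of matching the first
chance = Fraction(1, sides) ** (numberDice - 1)

print(chance)        # 1/1296
print(1 / chance)    # 1296, the average number of throws per Yahtzee
```

So with a few thousand rolls you should usually see a few printed throws, while with a few hundred you will often see none.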

next lesson on dictionaries and JSON


Revised 2019-09-20

License: CC BY 4.0.
Credit: "Vanderbilt Libraries Digital Lab - www.library.vanderbilt.edu"