Digital Education Resources - Vanderbilt Libraries Digital Lab
Previous lesson: Reading and writing CSV files
This lesson introduces vectorized computing by working with NumPy arrays. A NumPy array is a data structure that holds a single type of data and can have one, two, or more dimensions. The lesson shows how to specify single elements and slices of an array and how to carry out several simple types of calculations. Image manipulations are used as a practical example to illustrate these principles.
Learning objectives
At the end of this lesson, the learner will be able to:
use the array() function to generate an ndarray from a list.
use the where() function.
Total video time: 52m 45s
Lesson Jupyter notebook at GitHub
Recommended reference: Python for Data Analysis, 2nd Edition by Wes McKinney via Vanderbilt’s catalog / direct link to O’Reilly (VUNet login required)
Code example:
price_list = [3.89, 14.78, 20.01, 99.62, 0.47]
euro_conversion_factor = 0.93
# Use a for loop to multiply each item by the conversion factor
euro_list = []
for price in price_list:
    euro_list.append(price * euro_conversion_factor)
print(euro_list)
Code example:
import numpy as np
price_list = [3.89, 14.78, 20.01, 99.62, 0.47]
euro_conversion_factor = 0.93
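# Convert the list to a NumPy array, then multiply every element at once
price_array = np.array(price_list)
euro_array = price_array * euro_conversion_factor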
print(euro_array)
The procedural approach requires the operation to be applied to each item one at a time.
The vectorized approach automatically applies the operation to all elements of the array.
Operations on NumPy arrays are often 10 to 100 times faster than the equivalent loop over a “normal” Python list, although the actual speedup depends on the operation and the size of the data.
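The speed difference can be checked with a rough timing sketch like the one below. The test data (200,000 identical prices) and the function names are made up for illustration and are not part of the lesson. The timings will vary from machine to machine, so no particular ratio should be expected.

```python
import timeit

import numpy as np

# Hypothetical test data: 200,000 prices (not from the lesson)
price_list = [1.0] * 200_000
price_array = np.array(price_list)
euro_conversion_factor = 0.93


def convert_with_loop():
    # Procedural: multiply the items one at a time
    euro_list = []
    for price in price_list:
        euro_list.append(price * euro_conversion_factor)
    return euro_list


def convert_with_numpy():
    # Vectorized: a single operation on the whole array
    return price_array * euro_conversion_factor


# Run each version 5 times and report the total elapsed time
loop_seconds = timeit.timeit(convert_with_loop, number=5)
numpy_seconds = timeit.timeit(convert_with_numpy, number=5)
print(f"loop:  {loop_seconds:.4f} s")
print(f"NumPy: {numpy_seconds:.4f} s")
```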
One dimension is like a list:
[ 3.89 14.78 20.01 99.62 0.47]
Two dimensions are like a table:
[[34506 35446 40190 43824 46456]
[45369 46894 43901 44870 45978]
[21554 28745 34369 43593 53982]]
More than two dimensions are possible.
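As a small sketch of an array with more than two dimensions, the example below builds a three-dimensional array from nested lists. The name cube and its values are invented for illustration.

```python
import numpy as np

# Hypothetical data: two "tables", each with 3 rows and 4 columns
cube = np.array([
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
    [[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]],
])

print(cube.ndim)      # number of dimensions: 3
print(cube.shape)     # size along each dimension: (2, 3, 4)
print(cube[1, 0, 2])  # second table, first row, third column: 15
```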
Code example:
annual_sales_cars = [34506, 35446, 40190, 43824, 46456]
annual_sales_trucks = [45369, 46894, 43901, 44870, 45978]
annual_sales_suvs = [21554, 28745, 34369, 43593, 53982]
annual_sales_array = np.array([annual_sales_cars, annual_sales_trucks, annual_sales_suvs])
print(annual_sales_array)
print(annual_sales_array.ndim)
print(annual_sales_array.shape)
print(annual_sales_array.dtype)
Image from “Python for data analysis: data wrangling with pandas, NumPy, and IPython” by Wes McKinney, Chapter 4
Example:
print(annual_sales_array)
print(annual_sales_array[0, 2])
print(annual_sales_array[0])
print(annual_sales_array[:, 2])
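The indexing patterns above also accept ranges and negative positions. The sketch below rebuilds the same sales array so that it runs on its own, and it also illustrates that selecting a row gives a view of the original data rather than a copy.

```python
import numpy as np

annual_sales_array = np.array([
    [34506, 35446, 40190, 43824, 46456],  # cars
    [45369, 46894, 43901, 44870, 45978],  # trucks
    [21554, 28745, 34369, 43593, 53982],  # SUVs
])

# Rows 1 and 2 (trucks and SUVs), first two columns
print(annual_sales_array[1:, :2])

# Last column of every row
print(annual_sales_array[:, -1])

# A selected row is a view: changing it also changes the original array
first_row = annual_sales_array[0]
first_row[0] = 0
print(annual_sales_array[0, 0])  # now 0
```

When an independent copy is needed, calling .copy() on the selection avoids modifying the original array.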
Code examples:
# Convert annual sales to daily sales
daily_sales_array = annual_sales_array/365
print(daily_sales_array)
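# Select the trucks row, then label each value by comparing it to a threshold
trucks = annual_sales_array[1]
assessment = np.where(trucks >= 45000, 'good year', 'bad year')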
print(assessment)
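The condition passed to np.where() is itself a vectorized operation that produces a Boolean array. The sketch below, which redefines the trucks values so that it runs on its own, shows that intermediate array and two other ways to use it. The name good_year is introduced here for illustration.

```python
import numpy as np

trucks = np.array([45369, 46894, 43901, 44870, 45978])

# The comparison produces a Boolean array, one value per element
good_year = trucks >= 45000
print(good_year)  # [ True  True False False  True]

# A Boolean array can select elements or be counted (True counts as 1)
print(trucks[good_year])  # [45369 46894 45978]
print(good_year.sum())    # 3

# np.where() picks the first value where True, the second where False
assessment = np.where(good_year, 'good year', 'bad year')
print(assessment)
```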
Code example:
import skimage.data as data
import matplotlib.pyplot as plt
camera = data.camera()
print(type(camera))
print(camera)
plt.imshow(camera, cmap='gray')
plt.show();
Code examples:
# Negative
inverse_camera = 255 - camera
plt.imshow(inverse_camera, cmap='gray')
plt.show();
# Threshold (black and white)
threshold_camera = data.camera()
# Method 1: Set pixels less than mid-gray to black
threshold_camera[threshold_camera < 128] = 0
# Set remaining pixels to white
threshold_camera[threshold_camera > 0] = 255
plt.imshow(threshold_camera, cmap='gray')
plt.show();
# Method 2: Screen pixels using np.where()
threshold_camera = np.where(camera < 128, 0, 255)
plt.imshow(threshold_camera, cmap='gray')
plt.show();
Thanks to Sanjay Mishra for these examples of making simple manipulations to images using NumPy.
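The two thresholding methods can be compared on a tiny array without loading an image. In the sketch below, the 2 x 3 array tiny is a made-up stand-in for the camera image, assuming 8-bit gray values as in the lesson.

```python
import numpy as np

# Hypothetical 2 x 3 "image" of 8-bit gray values
tiny = np.array([[0, 100, 127], [128, 200, 255]], dtype=np.uint8)

# Method 1: Boolean indexing, modifying a copy in place
method1 = tiny.copy()
method1[method1 < 128] = 0
method1[method1 > 0] = 255

# Method 2: np.where() builds a new array
method2 = np.where(tiny < 128, 0, 255)

# Both methods give the same pixel values
print(np.array_equal(method1, method2))  # True

# The data types can differ: method 1 keeps uint8, while
# np.where() with plain integers returns a default integer type
print(method1.dtype, method2.dtype)
```

If the 8-bit type matters for later processing, method 2 can be converted with .astype(np.uint8).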
The np.sum() function takes an array as its argument.
Practice problem 1 (1m28s)
Practice problem 2 (2m28s)
Practice problem 3 (3m00s)
Practice problem 4 (7m01s)
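As a brief sketch of the np.sum() function mentioned before the practice problems, the example below sums the sales array from earlier in the lesson, rebuilt here so that the code runs on its own. The axis argument is an addition not covered in the text above. It controls whether the sum is taken over the whole array, down each column, or across each row.

```python
import numpy as np

annual_sales_array = np.array([
    [34506, 35446, 40190, 43824, 46456],  # cars
    [45369, 46894, 43901, 44870, 45978],  # trucks
    [21554, 28745, 34369, 43593, 53982],  # SUVs
])

# Sum of every element in the array
print(np.sum(annual_sales_array))          # 609677

# Sum down each column (one total per column)
print(np.sum(annual_sales_array, axis=0))  # [101429 111085 118460 132287 146416]

# Sum across each row (one total per vehicle type)
print(np.sum(annual_sales_array, axis=1))  # [200422 227012 182243]
```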
Next lesson: pandas Series
Revised 2021-01-31
License: CC BY 4.0.
Credit: "Vanderbilt Libraries Digital Lab - www.library.vanderbilt.edu"