Digital Education Resources - Vanderbilt Libraries Digital Lab
This module shows how to load data from and save data to a local file or a file in Google Drive, as well as how to retrieve data from the web. It discusses how simple Python data structures like lists and dictionaries can be combined to create tabular data structures. It also introduces a workhorse two-dimensional data structure used in data science with Python: pandas DataFrames.
This module includes an optional lesson that introduces another commonly used Python multidimensional data structure: NumPy arrays.
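As a rough illustration of how lists and dictionaries can be combined into a table, the sketch below treats a list of dictionaries as rows and columns and round-trips it through CSV text. It is not taken from the lessons: the sample data and variable names are hypothetical, and an in-memory buffer stands in for a local file.

```python
import csv
import io

# Hypothetical sample data: each dictionary is one row, each key is a column name.
rows = [
    {"item": "notebook", "location": "shelf A", "quantity": 3},
    {"item": "stapler", "location": "shelf B", "quantity": 1},
]

# Look up a single cell: second row, "location" column.
cell = rows[1]["location"]

# Pull one column out of the table with a list comprehension.
quantities = [row["quantity"] for row in rows]

# Write the table as CSV text (the buffer stands in for a file opened for writing).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["item", "location", "quantity"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()

# Read the CSV text back into a list of dictionaries.
# The csv module returns every value as a string, so quantity 3 comes back as "3".
restored = list(csv.DictReader(io.StringIO(csv_text)))
```

The same list of dictionaries can also be passed to `pandas.DataFrame(rows)` to build a DataFrame whose columns are the dictionary keys.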
Recommended reference: Python for Data Analysis, 2nd Edition by Wes McKinney, available via Vanderbilt’s catalog or by direct link to O’Reilly (VUNet login required). Free online documentation is available via the pandas website.
Total video time: 5h 25m for all parts, but most users will spend about 4h
Data from files: estimated 55 minutes
Complex data structures and functions: 30 minutes
Reading and writing CSV files: 34 minutes
Optional lesson on NumPy arrays: 53 minutes
Pandas series and data frames: 61 minutes, but not all videos apply to all users
DataFrame manipulation: 40 minutes
Rearranging and combining DataFrames: 35 minutes
Revised 2021-01-31
If you have any questions about these lessons, please contact Steve Baskauf at steve.baskauf@vanderbilt.edu
License: CC BY 4.0.
Credit: "Vanderbilt Libraries Digital Lab - www.library.vanderbilt.edu"