Survey Data with Pandas
This is the landing page for a tutorial at PyCon US 2025.
Links to the notebooks and setup instructions are below.
Abstract
Survey data analysis is a cornerstone of data science, whether you’re analyzing customer feedback, tracking election polls, or studying social trends. This tutorial introduces powerful tools from Pandas and StatsModels for extracting meaningful insights from survey data. Using real-world examples from the General Social Survey (GSS), we’ll explore how political beliefs have evolved in the United States over the past 50 years. Through hands-on exercises, you’ll master essential data science workflows: from data loading and validation to exploration, visualization, modeling, and effective communication of results.
Prerequisites
This tutorial is designed for Python users who are familiar with:
Basic Python programming
Fundamental data analysis concepts
Basic statistics
No prior experience with Pandas or survey data analysis is required.
Run the notebook
You have two options to run the notebook:
Practice Version (Recommended for learning):
Solution Version (For reference):
Note: The notebook uses data from the General Social Survey (GSS), which will be automatically downloaded when you run the notebook. The GSS is a nationally representative survey of adults in the United States, conducted since 1972, making it an excellent resource for studying social trends and attitudes.
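The notebook's own download code is not reproduced on this page. As a rough sketch of the general "download only if the file is not already present" pattern, something like the following could be used. The function name `download_if_missing` and the URL in the usage comment are hypothetical placeholders, not the tutorial's actual helper or data location.

```python
from pathlib import Path
from urllib.request import urlretrieve


def download_if_missing(url, dest):
    """Download url to dest unless dest already exists.

    Returns True if a download happened, False if the file was already there.
    This is an illustrative helper, not the function used in the notebooks.
    """
    dest = Path(dest)
    if dest.exists():
        # Reuse the local copy so repeated runs do not hit the network.
        return False
    # Create the target directory if needed, then fetch the file.
    dest.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(url, str(dest))
    return True


# Hypothetical usage (placeholder URL and filename):
# download_if_missing("https://example.com/gss_extract.csv", "data/gss_extract.csv")
```

Checking for the file first means rerunning the notebook does not download the data again.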
Running Locally
If you prefer to run the notebooks on your local machine, follow these steps:
Clone the repository:
git clone https://github.com/AllenDowney/SurveyDataPandas.git
cd SurveyDataPandas
Set up the environment:
# Create a conda environment
make create_environment
# Activate the environment
conda activate SurveyDataPandas
# Install required packages
make requirements
If you use another environment manager, you can look in requirements.txt to see what packages you need.
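If you manage the environment yourself, one way to see which packages are still missing is to compare requirements.txt against what is installed. The sketch below assumes requirements.txt contains simple lines such as `pandas` or `pandas>=2.0`. The helper names are illustrative and not part of the repository, and lines that point at URLs or local paths are not handled.

```python
import re
from importlib import metadata


def parse_requirement_names(text):
    """Extract bare package names from requirements-style text."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # Keep only the leading name, ignoring version specifiers and extras.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def find_missing(names):
    """Return the names that have no installed distribution."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


def report_missing(path="requirements.txt"):
    """Print which packages listed in the given file are not installed."""
    with open(path) as f:
        required = parse_requirement_names(f.read())
    missing = find_missing(required)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

Calling `report_missing()` from the repository root prints either the names that still need to be installed or "none".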
Start Jupyter:
jupyter notebook
Open the notebook:
Navigate to the notebooks directory
Open test_notebook.ipynb
If the code in test_notebook.ipynb runs with no errors, your setup is ready to go!