Astronomy Pilot Data Carpentry Workshop

The Carpentries

Online

November 16-20, 2020

1:00 pm - 5:00 pm

Instructors: Allen Downey, Azalee Bostroem

Helpers: Rudy Montez, Erin Becker

General Information

Data Carpentry develops and teaches workshops on the fundamental data skills needed to conduct research. Its target audience is researchers who have little to no prior computational experience, and its lessons are domain specific, building on learners' existing knowledge to enable them to quickly apply skills learned to their own research. Participants will be encouraged to help one another and to apply what they have learned to their own research problems.

For more information on what we teach and why, please see our paper "Good Enough Practices for Scientific Computing".

The Astronomy Curriculum Development Committee (ACDC) is excited to announce a community test drive of the pilot Data Carpentry workshop for Astronomers. The astronomy-tailored curriculum is designed to provide astronomers with essential skills for data-intensive analysis and visualization. The curriculum focuses on building complex SQL queries using Astroquery, working with the retrieved data in Astropy Tables and Pandas DataFrames, storing the data locally for future use, and communicating the results with clear and compelling figures using Matplotlib.

This test drive is designed to vet and improve the curriculum before the lessons are prepared for wider distribution. In particular, we seek participation from folks in the Astronomy community at all stages of their education and careers. Participants are expected to have knowledge equivalent to the Software Carpentry Python Curriculum: the ability to write a function in Python, familiarity with Python built-in types such as lists and dictionaries, and the ability to navigate directories using the command line. In addition, we welcome participants who are familiar with the concepts presented and would like to provide comprehensive feedback on the lessons.

Where: This training will take place online. The instructors will provide you with the information you will need to connect to this meeting.

When: November 16-20, 2020.

Requirements: Participants need a laptop running a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) on which they have administrative privileges. They should have a few specific software packages installed (listed below). Participants are also expected to have the Python and command-line background described above.

Accessibility: We are dedicated to providing a positive and accessible learning environment for all. Please notify the instructors in advance of the workshop if you require any accommodations or if there is anything we can do to make this workshop more accessible to you.

Contact: Please email downey@allendowney.com or abostroem@gmail.com for more information.

Roles: To learn more about the roles at the workshop (who will be doing what), refer to our Workshop FAQ.


Code of Conduct

Everyone who participates in Carpentries activities is required to conform to the Code of Conduct. This document also outlines how to report an incident if needed.


Surveys

Please be sure to complete these surveys before and after the workshop.

For question 3 "Which of the following workshops are you attending?" please choose "I don't know".

Pre-workshop Survey

Post-workshop Survey


Schedule

Each day we will aim for 2.5 hours of instruction, broken into cycles of 25 minutes of instruction, 5 minutes of written feedback, and a 5-minute break. We will then take a 15-minute break and finish the day with a more intensive structured feedback session. As this is our first time presenting this material, the days listed below are only approximate.

Session      Topic                                Instructor
Monday       ADQL queries and Astropy Tables      Allen
Monday       Complex queries and coordinates      Allen
Tuesday      Plotting, Pandas, and filtering      Azalee
Wednesday    Advanced queries and transforms      Allen
Thursday     Uploading tables and ADQL joins      Azalee
Thursday     Selection with photometry data       Allen
Friday       Making compelling figures            Azalee

Setup

Install the videoconferencing client

If you haven't used Zoom before, go to the official website to download and install the Zoom client for your computer.

If you have used it before, make sure you have the most recent client installed. In particular, make sure you have a version that allows you to self-select a breakout room.

Set up your workspace

As in other Carpentries workshops, you will learn by "coding along" with the Instructors. To do this, you will need both a window running a Jupyter notebook and a window for the Zoom video conference client. In order to see both at once, we recommend using one of the following setup options:

This blog post includes detailed information on how to set up your screen to follow along during the workshop.

Jupyter Notebook

For this workshop, we encourage you to work in a Jupyter notebook. If you are not familiar with Jupyter, you can run a tutorial by clicking here. Then select “Try Classic Notebook”. It will open a notebook with instructions for getting started.

You will need to install Python, Jupyter, and some additional libraries. If you don’t already have Jupyter, we recommend installing Anaconda, which is a Python distribution that contains everything you need to run the workshop code. It is easy to install on Windows, Mac, and Linux, and because it does a user-level install, it will not interfere with other Python installations.

Information about installing Anaconda is here.

If you have the choice of Python 2 or 3, choose Python 3.
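
If you already have a Python installation and are not sure which version it provides, a quick check along the following lines can be run at any Python prompt or in a notebook cell. This is only a sketch: the function name is_python3 is ours for illustration and is not part of the workshop materials.

```python
import sys

def is_python3():
    """Return True if the running interpreter is Python 3 or later."""
    return sys.version_info[0] >= 3

# Show the full version number, then a short verdict.
print("Python version:", sys.version.split()[0])
print("Python 3 detected" if is_python3() else "This is not Python 3")
```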

There are two ways to get the libraries you need: you can install them in an existing Conda environment, or you can create a new Conda environment.

Installing libraries in an existing environment is simpler, but if you use the same environment for many projects, it will get big, complicated, and prone to package conflicts.

Install libraries in an existing Conda environment

Most of the libraries we need can be installed using Conda, by running the following commands in a Terminal.

If you are on a Mac or Linux machine, you should be able to use any Terminal. If you are on Windows, you might have to use the Anaconda Prompt, which you can find under the Start menu.

conda install jupyter numpy scipy pandas matplotlib seaborn libopenblas
conda install -c conda-forge astropy astroquery astro-gala python-wget

In addition, there’s one library we can’t install with Conda, so we have to use pip:

pip install pyia

Create a new Conda environment

To create a new Conda environment, you’ll need to download an environment file from our repository. On Mac or Linux, you can download it using wget on the command line:

wget https://raw.githubusercontent.com/AllenDowney/AstronomicalData/main/environment.yml

Or you can download it using this link.

In a Terminal or Anaconda Prompt, make sure you are in the folder where environment.yml is stored, and run:

conda env create -f environment.yml

Then, to activate the environment you just created, run:

conda activate AstronomicalData

Run Jupyter

Before you launch Jupyter, download this notebook, which contains code to test your environment.

Or you can use wget to download it on the command line, like this:

wget https://raw.githubusercontent.com/AllenDowney/AstronomicalData/main/test_setup.ipynb

To start Jupyter, run:

jupyter notebook

Jupyter should launch your default browser or open a tab in an existing browser window. If not, the Jupyter server should print a URL you can use. For example, when I launch Jupyter, I get:

$ jupyter notebook
[I 10:03:20.115 NotebookApp] Serving notebooks from local directory: /home/username
[I 10:03:20.115 NotebookApp] 0 active kernels
[I 10:03:20.115 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/
[I 10:03:20.115 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

In this case, the URL is http://localhost:8888. When you start your server, you might get a different URL. Whatever it is, if you paste it into a browser, you should see a home page with a list of directories.

Now open the notebook you downloaded and run the cells that contain import statements. If they work and you get no error messages, you are all set.

If you get error messages about missing packages, you can install the packages you need using Conda or pip.
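
To see at a glance which packages are missing, something like the following sketch may help. It is not part of the test notebook. The module names are our assumption, inferred from the install commands above (for example, we assume astro-gala is imported as gala and python-wget as wget); the import cells in the notebook remain the authoritative list.

```python
import importlib.util

# Import names assumed from the install commands; adjust to match
# the import statements in the test notebook if they differ.
WORKSHOP_MODULES = ["numpy", "scipy", "pandas", "matplotlib", "seaborn",
                    "astropy", "astroquery", "gala", "wget", "pyia"]

def find_missing(module_names):
    """Return the top-level module names that cannot be located for import."""
    missing = []
    for name in module_names:
        # find_spec returns None when no installed module has this name.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing

missing = find_missing(WORKSHOP_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed modules were found.")
```

Note that this only checks whether each module can be located. A module that is found but fails while importing will still raise an error in the notebook, so running the import cells remains the real test.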

At the end of the notebook, you’ll be asked to copy and paste a line of code from our Slack workspace to the Jupyter notebook and run it. The reason for this test is that some environments convert “straight” quotation marks to “smart” quotation marks, which has the effect of breaking Python code. If you encounter this problem, you might have to check your system settings to turn off this “feature”.
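
To make the problem concrete, here is a small sketch, not taken from the workshop notebook, that detects "smart" quotation marks in a pasted line and replaces them with straight ones. The helper names has_smart_quotes and straighten_quotes are hypothetical, and the pasted line is a made-up example.

```python
# Map common "smart" quotation marks to their straight equivalents.
SMART_TO_STRAIGHT = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
}

def has_smart_quotes(text):
    """Return True if text contains any of the smart quotes listed above."""
    return any(ch in SMART_TO_STRAIGHT for ch in text)

def straighten_quotes(text):
    """Return text with the smart quotes replaced by straight quotes."""
    return text.translate(str.maketrans(SMART_TO_STRAIGHT))

# A made-up example of a line damaged by quote conversion.
pasted = "print(\u201chello\u201d)"
print(has_smart_quotes(pasted))   # True
print(straighten_quotes(pasted))  # print("hello")
```

Replacing every smart quote also changes any that were meant to appear inside a string, so turning off the conversion in your system settings is the more reliable fix.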

If you run into problems with these instructions, let us know and we will make corrections. Good luck!