Transportation Data Tools #1: OnTheMap

Finding reliable, robust, and interesting data is the first step in any data analysis project, so I thought I’d spend a few posts highlighting some of my favorite data sources out there. One tool I find very underrated in the transportation planning world is the US Census Bureau Center for Economic Studies’ Longitudinal Employer-Household Dynamics (LEHD) data and its visualization tool, OnTheMap. It is best used for studying where people who live in a place are commuting to, or where people who commute to a place are living. Say you want to know where people in your city or neighborhood are commuting to, and how you can improve transportation options for them. You can load a custom geography of your city into the tool (using a shapefile, GPS coordinates, or a polygon drawn in Google MyMaps), make a few selections,

Source: author’s own screenshot, onthemap.ces.census.gov

and it quickly spits out a map and list of where people living in your city are commuting to. As in the example below, you can also do the reverse and see where people are commuting from. There are also a few options to filter workers by income, industry, and age.

Source: author’s own screenshot, onthemap.ces.census.gov

If you’re like me and want more control over how your maps look, you can then export the resulting data into your own GIS software to make it prettier and analyze further.

For the analysis shown here, I wanted to look at people who work downtown and earn very low wages (under $15,000 per year). These include restaurant and hotel workers, among others. Where do those people live? It was relevant because my agency’s commuter rail, Sounder, primarily serves people commuting to downtown Seattle, and I wanted to understand how well it is serving people with very low incomes. How many of them live within the areas from which Sounder riders typically commute (outlined in black)? It turns out, not very many. They tend to live in either South Seattle, or east or north of the city, outside the extent of this map. So Sounder won’t help them much, but we run other transportation services from those areas and now we have a better sense of some of those communities’ needs.

Note that in this case I had the tool organize workers’ home locations by ZIP code, but you can also choose census tracts or larger areas, depending on how granular you want to get. You can also export an Excel file of the data, allowing for endless additional analyses.
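
As one example of that kind of follow-up analysis, here is a minimal Python sketch that computes what share of workers live inside a chosen set of ZIP codes, such as the area a rail line typically draws riders from. It assumes the export has been saved as a CSV with two columns named `zip` and `workers`. Those column names, the function name, and the sample numbers are all made up for illustration, so adjust them to match whatever your actual export contains.

```python
import csv
import io


def share_in_catchment(csv_file, catchment_zips, zip_col="zip", count_col="workers"):
    """Return the fraction of workers whose home ZIP is in catchment_zips.

    The column names are assumptions; rename them to match your own export.
    """
    total = 0
    inside = 0
    for row in csv.DictReader(csv_file):
        count = int(row[count_col])
        total += count
        if row[zip_col].strip() in catchment_zips:
            inside += count
    # Avoid dividing by zero when the file has no data rows.
    return inside / total if total else 0.0


# Made-up sample data standing in for a real export (not actual OnTheMap output).
sample = io.StringIO(
    "zip,workers\n"
    "00001,120\n"
    "00002,80\n"
    "00003,200\n"
)

share = share_in_catchment(sample, catchment_zips={"00001", "00002"})
print(f"{share:.0%} of workers live in the catchment area")
```

To run this on a real file, pass an open file handle in place of `sample`, for example `open("my_export.csv", newline="")`, where the file name is again just a placeholder.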

If you’ve used this tool, I’d be curious to hear your thoughts as well!

Give Back by Contributing to an Open Source Project

One of the cool things about the data science/tech scene is that many innovations being developed right now are open source, meaning the code is posted freely online so others can use it and benefit from it. This also means there are lots of chances to contribute to that code and be part of this work, even as someone new to coding. Contributing gives you practice and helps you meet people with similar interests.

There are lots of tutorials out there on how to use GitHub, like this one: https://buff.ly/3png6VD. These tutorials teach you how to view someone’s code on GitHub and propose changes to it. That’s the core of what GitHub is, and how people collaborate. When you’re ready to venture out, it’s just a matter of finding a project you want to contribute to.

Here are a few planning and transportation-related repositories I like:
– List of transit-related resources from CUTR at USF: https://buff.ly/3prPcMn
– List of planning-related resources from APA Technology Division: https://buff.ly/3fOpWg9
– OneBusAway: https://buff.ly/3vXTWvU

Still feeling a little lost? Here’s another good overview: http://www.firsttimersonly.com

How to Deepen Your Knowledge of Python

When I started my data science journey, one piece of advice that was intimidating but came highly recommended was to start an independent project, share my work as soon as possible, and invite feedback. It doesn’t have to be perfect. I got an account on GitHub, a website where people share and collaborate on code. I started publishing my progress on a project to determine what factors influence pedestrian and bike traffic at Myrtle Edwards Park in Seattle:

https://github.com/kellytdunn/Seattle-parks

Photo by Dlanor S on Unsplash

I am still working on my first big project, and perhaps will always be improving it, but I update it when I reach a new milestone. It’s been helpful for me to share it with others and get over my fear of looking like a beginner. Case in point: please have a look at my still-unpolished project and share your feedback with me, and then share with me what you have!

Additionally, I started immersing myself in all things data analysis and statistics, since Python is really just one tool among many to work with data. Absorb as much as you can about this world. Podcasts are a great way to add a little learning into your household chores or morning routine. Here are a few that have given me some great insight into this field:


– The Artists of Data Science (https://buff.ly/3wCSKO9)
– SuperDataScience (https://buff.ly/2L1HQic)
– The Data Chief (https://buff.ly/3fuhra8)

Any other favorite podcasts out there?