GeoPandas 101: Hello World!

Welcome to the first post in our GeoPandas 101 series! A common tradition in the programming world is to start with a "Hello World" program. We'll follow that tradition by setting up our environment and running our first GeoPandas script.

Setup and Installation

Setting up an environment for GeoPandas can be a tedious process, but we'll use Google Colab, which provides a ready-made, OS-independent platform. Colab allows you to run Python code without purchasing expensive hardware or worrying about the complicated process of package installation.

Get Started with Colab

  1. Visit Google Colab.
  2. Click on the "New notebook" button to create a fresh notebook.
  3. You now have a Jupyter notebook, an interactive code editor that lets you run one piece of code at a time and see the result immediately. This makes it easy to experiment.
  4. Close the "Release Notes" tab on the right of the screen if you see it.
  5. Click the "Untitled" text in the top left of the page to rename your notebook.

Hello World

To run the "Hello World" program, place your cursor in the space labeled "Start coding or generate with AI" and type:

print("Hello World")

That space is called a code cell. You can run the code by pressing Ctrl + Enter on your keyboard or clicking the play button on the left of the code cell. The code may take a few seconds to complete because Colab needs to instantiate a virtual machine for your session. Subsequent code executions won't take as long.

Import GeoPandas

The next step is to import GeoPandas. Just as you import numpy as np or pandas as pd, we will use the gpd alias for GeoPandas.

import geopandas as gpd
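
If the import raises a ModuleNotFoundError, the package is missing from the runtime. The sketch below is one way to check before importing, using only the standard library. The helper name package_status is made up for illustration, and the `!pip install` hint assumes a notebook environment such as Colab.

```python
from importlib import metadata, util


def package_status(name):
    """Describe whether a top-level package is installed (hypothetical helper)."""
    if util.find_spec(name) is None:
        return f"{name} is not installed; in a notebook, try: !pip install {name}"
    try:
        return f"{name} {metadata.version(name)}"
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata (e.g. a standard-library module).
        return f"{name} is importable (version metadata unavailable)"


print(package_status("geopandas"))
```

Package names are lowercase here for the same reason the import statement uses lowercase: Python module names are case-sensitive.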

Get the Data

[Screenshot: copying the path to the shapefile]

For this series, we will use the data bundle from the Introduction to PostGIS workshop. Here's the direct download link. After downloading, extract the contents.

Colab sessions are ephemeral, so any dataset you create or upload in a session is lost once the session closes. We will therefore keep our data in more persistent storage: Google Drive. It takes a little more setup but pays off in the long run.

  1. On the same Google account you used with Colab, open Drive.
  2. Right-click "My Drive" in the left sidebar and select "New Folder." I will name mine GeoPandas_101.
  3. Open your new folder by clicking it in the sidebar, then click the "New" button and select "Upload folder."
  4. Navigate to the location of your extracted folder and look for a folder named data. Select it and click "Upload." You will see a progress bar on the bottom right. Wait for it to finish.
  5. Return to your Colab notebook, open the Files panel in the left sidebar, and click the "Mount Drive" icon. When prompted, grant Colab access to your Drive.
  6. Expand the drive folder to reveal your data folder and its contents.
  7. Within the data folder, look for the file named nyc_census_blocks.shp, right-click it, and select "Copy path."
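
The mounting and browsing steps above can also be done from a code cell. The sketch below is an assumption-laden equivalent: google.colab.drive.mount is Colab's own API, while the helper names mount_drive_if_colab and find_shapefiles are invented here, and the folder path assumes you named things as in this post.

```python
from pathlib import Path


def mount_drive_if_colab(mount_point="/content/drive"):
    """Mount Google Drive when running inside Colab; return True if mounted."""
    try:
        from google.colab import drive  # this module only exists inside Colab
    except ImportError:
        print("Not running in Colab, so there is no Drive to mount.")
        return False
    drive.mount(mount_point)  # asks for authorization on first use
    return True


def find_shapefiles(folder):
    """Return the sorted names of the .shp files directly inside `folder`."""
    folder = Path(folder)
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.glob("*.shp"))


mount_drive_if_colab()
# Assumed location: adjust it to match the path you copied from the file browser.
print(find_shapefiles("/content/drive/My Drive/GeoPandas_101/data"))
```

If the printed list is empty, the folder path is probably different from the one assumed here; the "Copy path" step gives you the exact value.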

Read Data into GeoPandas

Store the path you copied in a variable and pass it to GeoPandas' read_file function, assigning the result to a variable named gdf (short for GeoDataFrame). Pandas has a separate reader for each file format (read_csv, read_excel, read_json, read_html), whereas GeoPandas uses a single entry point: read_file detects and reads the vector formats supported by GDAL/OGR, such as shapefile, GeoJSON, and GeoPackage.

# Paste the path you copied from the Colab file browser
data_path = '/content/drive/My Drive/GeoPandas_101/data/nyc_census_blocks.shp'
gdf = gpd.read_file(data_path)

Since a GeoDataFrame is essentially a pandas DataFrame with a geometry column, we can use the head method to inspect it.

gdf.head()

Visualize Data with GeoPandas

We can create a quick visualization using the plot method:

gdf.plot()

[Screenshot: output of gdf.plot()]

Save Data to Drive

Finally, we can save our file to Drive using the to_file method:

gdf.to_file('/content/drive/My Drive/GeoPandas_101/data/nyc_census_blocks_saved.shp')

You can open your Drive folder to confirm the new file.
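
You can also confirm the result from a code cell. A shapefile is really a set of companion files that share one name: .shp holds the geometry, .shx the index, .dbf the attributes, and usually .prj the coordinate reference system. The sketch below lists which of them exist; the helper name shapefile_parts is hypothetical, and the path assumes the folder layout used in this post.

```python
from pathlib import Path


def shapefile_parts(shp_path):
    """Return the sorted suffixes of all files that share the .shp file's stem."""
    shp_path = Path(shp_path)
    if not shp_path.parent.is_dir():
        return []
    return sorted(p.suffix for p in shp_path.parent.glob(shp_path.stem + ".*"))


saved = "/content/drive/My Drive/GeoPandas_101/data/nyc_census_blocks_saved.shp"
print(shapefile_parts(saved))  # an empty list means nothing was found at that path
```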

Summary

Here's what we achieved in this post:

  • Set up GeoPandas in a Colab environment.
  • Uploaded spatial data to Drive and accessed it from Colab.
  • Read spatial data into GeoPandas.
  • Visualized spatial data using GeoPandas.
  • Saved spatial data from GeoPandas to disk.

I'll see you in the next post!

External Resources