GeoPandas 101: Basic Geometric Operations in GeoPandas


Welcome back to our GeoPandas for beginners series. In this post, you’ll learn how to perform basic geometry operations in GeoPandas using NYC blocks data. We’ll cover calculating area and length, finding bounds and centroids, buffering, and simplifying geometries. Feel free to experiment boldly; breaking things along the way is all part of the learning process!

Introduction to GeoPandas Geometries

Before diving into geometry operations, let’s briefly introduce how GeoPandas handles geometries.

  • In GeoPandas, spatial data is stored in a GeoSeries. 
  • GeoSeries is essentially a pandas Series where each entry is a geometry object (e.g., points, lines, or polygons). 
  • The GeoSeries has several attributes and methods for performing geometric operations. 
  • A GeoDataFrame is a pandas DataFrame with at least one GeoSeries column, one of which is the active geometry that attributes such as .area operate on. If you remove the geometry column from a GeoDataFrame, what is left is a basic pandas DataFrame.
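
To make the relationship between these objects concrete, here is a minimal sketch built from two hand-made shapes rather than the NYC data. The names shapes, gdf, and plain are made up for illustration.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point, Polygon

# A GeoSeries holds one shapely geometry per entry.
shapes = gpd.GeoSeries([Point(0, 0), Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])])

# A GeoDataFrame is a DataFrame whose geometry column is a GeoSeries.
gdf = gpd.GeoDataFrame({"name": ["a point", "a square"]}, geometry=shapes)

# Dropping the geometry column leaves only the ordinary attribute columns.
plain = gdf.drop(columns="geometry")

print(isinstance(shapes, pd.Series))  # True: a GeoSeries is a pandas Series
print(type(gdf.geometry).__name__)    # GeoSeries
print(list(plain.columns))            # ['name']
```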

Loading the NYC Blocks Data

If you followed our previous post, you should already have the NYC blocks data saved in your Google Drive. Load the dataset into a GeoDataFrame as demonstrated earlier, and you'll be ready to perform geometric operations!

import geopandas as gpd
blocks = gpd.read_file("/content/drive/MyDrive/geopandas_101/data/nyc_census_blocks.shp")

1. Measuring Area, Length, and Bounds

GeoPandas provides direct access to several essential geometry properties:

Area

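For polygonal geometries, .area returns the area of each shape in the squared units of the data's coordinate reference system. Points and lines have an area of 0.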

blocks['area'] = blocks.geometry.area

Length

For LineString geometries, .length gives the total length, and for polygons, it returns the perimeter.

blocks['length'] = blocks.geometry.length

Bounds

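The bounds attribute gives a bounding box around each geometry as a table with four columns: minx, miny, maxx, and maxy. Because it returns four columns rather than one, it cannot be assigned to a single column, so keep it in its own variable.

bounds = blocks.geometry.bounds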
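
A quick way to build confidence in these three properties is to try them on a shape whose answers you can work out by hand. The sketch below uses a made-up 4 by 3 rectangle instead of the NYC blocks; rect is an illustrative name.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# A 4 x 3 rectangle with its lower-left corner at the origin.
rect = gpd.GeoSeries([Polygon([(0, 0), (4, 0), (4, 3), (0, 3)])])

print(rect.area.iloc[0])    # 12.0 (4 * 3)
print(rect.length.iloc[0])  # 14.0 (perimeter: 4 + 3 + 4 + 3)

# .bounds returns a DataFrame with one row per geometry
# and the columns minx, miny, maxx, maxy.
print(rect.bounds)
```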

2. Finding the Centroid

The centroid is the geometric center of a geometry, useful for labeling or visualization.

blocks['centroid'] = blocks.geometry.centroid

For example, if you’re mapping NYC blocks, centroids can help with label placement.

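import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Select a single block as a one-row GeoDataFrame and draw its outline
block = blocks.iloc[50:51, :]
block.plot(ax=ax, color='lightblue', edgecolor='black')

# Overlay the block's centroid as a red marker
block.centroid.plot(ax=ax, color='red')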
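
One thing to keep in mind when using centroids for labels: for concave shapes the centroid can fall outside the polygon. The sketch below shows this with a made-up L-shaped polygon and compares it with representative_point(), a GeoSeries method that returns a point guaranteed to lie within the geometry. The names l_shape and label_point are illustrative.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# An L-shaped polygon: a 4 x 1 base with a 1 x 3 upright on its left end.
l_shape = Polygon([(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)])
shapes = gpd.GeoSeries([l_shape])

centroid = shapes.centroid.iloc[0]
label_point = shapes.representative_point().iloc[0]

print(centroid)                       # roughly POINT (1.357 1.357)
print(l_shape.contains(centroid))     # False: the centroid sits in the empty corner
print(l_shape.contains(label_point))  # True
```

For compact shapes such as most city blocks the two points land close together, so the centroid is usually fine.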

3. Calculating Distance Between Geometries

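To find the distance between two geometries, use the .distance() method. It returns the shortest distance between the two shapes, in the units of the data's coordinate reference system. Here's how you can calculate the distance from each Brooklyn block to a single block in the Bronx (here, simply the second Bronx row).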

bronx = blocks[blocks.BORONAME == 'The Bronx']

rand_bronx_geom = bronx.iloc[1, :].geometry

blocks[blocks.BORONAME == 'Brooklyn'].distance(rand_bronx_geom)
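
To see what .distance() measures, it helps to try it on shapes with obvious answers. This sketch uses two made-up unit squares and a point; squares and target are illustrative names. Note that the distance runs from the nearest edge, not from the centroid, and is 0 when the geometries touch or overlap.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Two unit squares: one at x = 0..1 and one at x = 4..5.
squares = gpd.GeoSeries([box(0, 0, 1, 1), box(4, 0, 5, 1)])
target = Point(3, 0)

# One distance per geometry in the GeoSeries, measured to the nearest edge.
distances = squares.distance(target)
print(distances.tolist())  # [2.0, 1.0]

# A point inside a polygon is at distance 0 from it.
inside = squares.iloc[0].distance(Point(0.5, 0.5))
print(inside)  # 0.0
```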

4. Buffering Geometries

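Creating a buffer around a geometry is a common way to define a zone of influence, such as a 20 metre zone around a block. The distance passed to .buffer() is in the units of the data's coordinate reference system, so buffer(20) means 20 metres only when that system uses metres.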

fig, ax = plt.subplots()

blocks[blocks.BORONAME == 'Brooklyn'].iloc[50:51, :].buffer(20).plot(ax=ax, color='red')

blocks[blocks.BORONAME == 'Brooklyn'].iloc[50:51, :].plot(ax=ax)
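
If you want to check what a buffer does to a shape's size, a small sketch with a made-up 10 by 10 square can help. A positive distance grows the shape and rounds its corners, while a negative distance shrinks it. The names square, grown, and shrunk are illustrative.

```python
import geopandas as gpd
from shapely.geometry import box

# A 10 x 10 square with an area of 100.
square = gpd.GeoSeries([box(0, 0, 10, 10)])

# Growing by 2 adds a 2-unit strip along each side plus rounded corners,
# so the area is a little under 100 + 80 + pi * 2**2.
grown = square.buffer(2)
print(round(grown.area.iloc[0], 1))

# Shrinking by 2 leaves a 6 x 6 square.
shrunk = square.buffer(-2)
print(round(shrunk.area.iloc[0], 1))  # 36.0
```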

5. Geometry Simplification

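For large datasets, simplifying geometries can reduce computational demands. Simplifying reduces the number of vertices, which is useful for plotting and web applications. The tolerance is given in the units of the data and controls how far the simplified outline may deviate from the original: the larger the tolerance, the fewer vertices remain.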

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 6))

blocks.iloc[16:17, :].plot(ax=ax1, color='red')

ax1.set_title("Original Geometry")

blocks.iloc[16:17, :].simplify(tolerance=15).plot(ax=ax2, color='red')

ax2.set_title("Simplified Geometry")
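
The plots show the visual effect; to see the effect on the vertex count, you can compare the coordinates before and after. This is a minimal sketch on a made-up, slightly dented square, and the names dented, n_before, and n_after are illustrative.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# A 10 x 10 square whose bottom and top edges each have a small 0.1-unit dent.
dented = gpd.GeoSeries(
    [Polygon([(0, 0), (5, 0.1), (10, 0), (10, 10), (5, 9.9), (0, 10)])]
)

# A tolerance of 0.5 is larger than the dents, so they can be dropped.
simplified = dented.simplify(tolerance=0.5)

# Count coordinates on the outer ring (the closing point is included).
n_before = len(dented.iloc[0].exterior.coords)
n_after = len(simplified.iloc[0].exterior.coords)
print(n_before, n_after)

# The overall shape, and therefore the area, barely changes.
print(dented.area.iloc[0], simplified.area.iloc[0])
```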

Putting It All Together

Here’s a complete example that loads data, calculates geometric properties, and performs basic operations:

import geopandas as gpd
from shapely.geometry import Point

# Load data (replace the placeholder path with the location of your shapefile)
blocks = gpd.read_file('path_to_nyc_blocks_data.shp')

# Calculate area and length in the units of the data's CRS
blocks['area'] = blocks.geometry.area
blocks['length'] = blocks.geometry.length

# Bounds come back as a four-column DataFrame (minx, miny, maxx, maxy),
# so keep them in a separate variable rather than a single column
bounds = blocks.geometry.bounds

# Find centroids
blocks['centroid'] = blocks.geometry.centroid

# Distance to a reference point (placeholder coordinates in the data's CRS)
reference_point = Point(0, 0)
blocks['distance_to_ref'] = blocks.geometry.distance(reference_point)

# Buffering and simplifying
blocks['buffered'] = blocks.geometry.buffer(10)
blocks['simplified'] = blocks.geometry.simplify(tolerance=0.1)

Conclusion

Mastering these basic geometric operations is a powerful step in spatial analysis. From measuring distances to creating buffer zones, each of these tools forms the foundation for more advanced tasks in GeoPandas. In the next post, you'll explore Coordinate Reference Systems (CRS), an essential part of any spatial project. See you there!