Welcome back to our GeoPandas for beginners series. In this post, you’ll learn how to perform basic geometry operations in GeoPandas using NYC blocks data. We’ll cover calculating area and length, finding bounds and centroids, buffering, and simplifying geometries. Feel free to experiment boldly; breaking things along the way is all part of the learning process!
Introduction to GeoPandas Geometries
Before diving into geometry operations, let’s briefly introduce how GeoPandas handles geometries.
- In GeoPandas, spatial data is stored in a GeoSeries.
- GeoSeries is essentially a pandas Series where each entry is a geometry object (e.g., points, lines, or polygons).
- The GeoSeries has several attributes and methods for performing geometric operations.
- A GeoDataFrame is simply a DataFrame that has a GeoSeries. If you remove the GeoSeries column from a GeoDataFrame, it becomes a basic pandas DataFrame.
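To make the points above concrete, here is a minimal sketch using made-up toy data rather than the NYC blocks. The column name and coordinates are assumptions chosen purely for illustration.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point, Polygon

# Toy data: the names and coordinates are made up for illustration.
gdf = gpd.GeoDataFrame(
    {"name": ["a point", "a square"]},
    geometry=[Point(1, 1), Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])],
)

# The geometry column is a GeoSeries whose entries are shapely objects.
geoms = gdf.geometry
print(type(geoms).__name__)
print(geoms.geom_type.tolist())

# Selecting only the non-geometry columns leaves plain tabular data.
attrs = gdf[["name"]]
print(type(attrs).__name__)
```

Printing the types is a quick way to check which kind of object you are holding at each step.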
Loading the NYC Blocks Data
If you followed our previous post, you should already have the NYC blocks data loaded into your Google Drive. Simply load the dataset into a GeoDataFrame as demonstrated earlier, and you’ll be ready to perform geometric operations!
import geopandas as gpd
blocks = gpd.read_file("/content/drive/MyDrive/geopandas_101/data/nyc_census_blocks.shp")
1. Measuring Area, Length, and Bounds
GeoPandas provides direct access to several essential geometry properties:
Area
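For polygonal geometries, .area returns the area of each shape in the squared units of the layer's CRS (for example, square metres for a metre-based CRS). Points and lines have an area of zero.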
blocks['area'] = blocks.geometry.area
Length
For LineString geometries, .length gives the total length, and for polygons, it returns the perimeter.
blocks['length'] = blocks.geometry.length
Bounds
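The bounds attribute gives a bounding box around each geometry, providing the minimum and maximum x and y coordinates. It returns a DataFrame with four columns (minx, miny, maxx, maxy), so it can't be assigned to a single column; keep it as its own DataFrame instead.
bounds = blocks.geometry.bounds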
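If you want to check these three properties against numbers you can verify by hand, a small sketch with two made-up shapes (a 2 x 3 rectangle and a 3-4-5 line) may help. The shapes are assumptions for illustration and are not part of the NYC data.

```python
import geopandas as gpd
from shapely.geometry import LineString, Polygon

# A 2 x 3 rectangle and a straight line from (0, 0) to (3, 4).
shapes = gpd.GeoSeries([
    Polygon([(0, 0), (2, 0), (2, 3), (0, 3)]),
    LineString([(0, 0), (3, 4)]),
])

areas = shapes.area      # rectangle: 2 * 3; a line has no area
lengths = shapes.length  # rectangle: perimeter; line: its length
bounds = shapes.bounds   # DataFrame with columns minx, miny, maxx, maxy

print(areas.tolist())    # [6.0, 0.0]
print(lengths.tolist())  # [10.0, 5.0]
print(bounds)
```

Note that bounds has one row per geometry and four columns, which is why it is kept as a separate DataFrame here.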
2. Finding the Centroid
The centroid is the geometric center of a geometry, useful for labeling or visualization.
blocks['centroid'] = blocks.geometry.centroid
For example, if you’re mapping NYC blocks, centroids can help with label placement.
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
blocks.iloc[50:51, :].plot(ax=ax, color='lightblue', edgecolor='black')
blocks.iloc[50:51, :].centroid.plot(ax=ax, color='red')
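One thing worth knowing for label placement: the centroid of a concave shape can land outside the shape itself. The sketch below uses two made-up polygons (a square and an L shape) to show this, along with representative_point(), which returns a point guaranteed to lie within each geometry.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# Made-up shapes for illustration: a 4 x 4 square and a thin L shape.
square = Polygon([(0, 0), (4, 0), (4, 4), (0, 4)])
l_shape = Polygon([(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)])
shapes = gpd.GeoSeries([square, l_shape])

centroids = shapes.centroid
# Element-wise check: is each centroid inside its own polygon?
centroid_inside = centroids.within(shapes)
print(centroid_inside.tolist())  # [True, False]

# representative_point() always falls within the geometry.
rep_points = shapes.representative_point()
rep_inside = rep_points.within(shapes)
print(rep_inside.tolist())  # [True, True]
```

For fairly regular shapes like most city blocks, the centroid is usually fine; the alternative matters mainly for irregular or C-shaped polygons.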
3. Calculating Distance Between Geometries
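To find the distance between two geometries, use the .distance() method. Here’s how you can calculate the distance from each Brooklyn block to a single block in the Bronx (the second row of the Bronx subset). The result is expressed in the units of the layer's CRS.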
bronx = blocks[blocks.BORONAME == 'The Bronx']
rand_bronx_geom = bronx.iloc[1, :].geometry
blocks[blocks.BORONAME == 'Brooklyn'].distance(rand_bronx_geom)
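To see what .distance() returns on numbers you can check by eye, here is a small sketch with made-up unit squares standing in for blocks. The shapes and the variable names are assumptions for illustration only.

```python
import geopandas as gpd
from shapely.geometry import box

# box(minx, miny, maxx, maxy): three unit squares along the x axis.
squares = gpd.GeoSeries([
    box(0, 0, 1, 1),
    box(3, 0, 4, 1),
    box(0.5, 0, 1.5, 1),
])
target = box(6, 0, 7, 1)

# Distance is measured between the closest edges, not between centroids.
distances = squares.distance(target)
print(distances.tolist())  # [5.0, 2.0, 4.5]

# Index label of the geometry closest to the target.
nearest_index = distances.idxmin()
print(nearest_index)  # 1

# Geometries that touch or overlap are at distance zero.
overlap_distance = squares.iloc[0].distance(squares.iloc[2])
print(overlap_distance)  # 0.0
```

The result is a regular pandas Series aligned with the original index, so methods like idxmin() or sort_values() work as usual.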
4. Buffering Geometries
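Creating a buffer around a geometry is common for defining zones of influence, like a 20 metre zone around a block. The buffer distance is in the units of the layer's CRS, so buffer(20) means 20 metres only if the CRS uses metres.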
fig, ax = plt.subplots()
blocks[blocks.BORONAME == 'Brooklyn'].iloc[50:51, :].buffer(20).plot(ax=ax, color='red')
blocks[blocks.BORONAME == 'Brooklyn'].iloc[50:51, :].plot(ax=ax)
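To get a feel for what buffer() produces, the sketch below buffers a made-up point and a made-up square by 2 units. Both shapes are assumptions for illustration; the takeaway is that a buffer always yields a polygon and that round corners are approximated with short straight segments.

```python
import math

import geopandas as gpd
from shapely.geometry import Point, box

# A point at the origin and a 10 x 10 square.
shapes = gpd.GeoSeries([Point(0, 0), box(0, 0, 10, 10)])
buffered = shapes.buffer(2)

# Buffering a point gives a (nearly) circular polygon.
print(buffered.geom_type.tolist())  # ['Polygon', 'Polygon']

# The circle is approximated by segments, so its area is slightly under pi * r^2.
circle_area = buffered.area.iloc[0]
print(circle_area, math.pi * 2 ** 2)

# The square grows by 2 units on every side.
square_bounds = buffered.bounds.iloc[1].tolist()
print(square_bounds)
```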
5. Geometry Simplification
For large datasets, simplifying geometries can reduce computational demands. Simplifying reduces the number of vertices, useful for plotting and web applications.
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 6))
blocks.iloc[16:17, :].plot(ax=ax1, color='red')
ax1.set_title("Original Geometry")
blocks.iloc[16:17, :].simplify(tolerance=15).plot(ax=ax2, color='red')
ax2.set_title("Simplified Geometry")
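If you want to see the effect of the tolerance in numbers rather than in a plot, counting vertices before and after is a simple check. The sketch below uses two made-up lines: one with tiny wiggles that fall under the tolerance and one with large zigzags that do not.

```python
import geopandas as gpd
from shapely.geometry import LineString

# Made-up lines for illustration: tiny wiggles vs. large zigzags.
wiggly = LineString([(0, 0), (1, 0.01), (2, 0), (3, 0.01), (4, 0)])
zigzag = LineString([(0, 0), (1, 5), (2, 0), (3, 5), (4, 0)])
lines = gpd.GeoSeries([wiggly, zigzag])

# Vertices that deviate less than the tolerance from the simplified line are dropped.
simplified = lines.simplify(tolerance=0.1)

before = [len(geom.coords) for geom in lines]
after = [len(geom.coords) for geom in simplified]
print(before)  # [5, 5]
print(after)   # [2, 5]
```

Like buffer distances, the tolerance is in CRS units, so a sensible value depends on the scale of your data.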
Putting It All Together
Here’s a complete example that loads data, calculates geometric properties, and performs basic operations:
import geopandas as gpd
from shapely.geometry import Point
# Load data
blocks = gpd.read_file('path_to_nyc_blocks_data.shp')
# Calculate area and length (in CRS units)
blocks['area'] = blocks.geometry.area
blocks['length'] = blocks.geometry.length
# Bounds come back as a DataFrame with minx, miny, maxx, maxy columns
bounds = blocks.geometry.bounds
# Find centroids
blocks['centroid'] = blocks.geometry.centroid
# Distance to a reference point (Point(0, 0) is a placeholder; use coordinates in the data's CRS)
reference_point = Point(0, 0)
blocks['distance_to_ref'] = blocks.geometry.distance(reference_point)
# Buffering and simplifying (distance and tolerance are in CRS units)
blocks['buffered'] = blocks.geometry.buffer(10)
blocks['simplified'] = blocks.geometry.simplify(tolerance=0.1)
Conclusion
Mastering these basic geometric operations is a powerful step in spatial analysis. From measuring distances to creating buffer zones, each of these tools forms the foundation for more advanced tasks in GeoPandas. In the next post, you'll explore Coordinate Reference Systems (CRS), an essential part of any spatial project. See you in the next post!