Street-Block GNN Prediction
GitHub
2025
Graph Neural Network
Geometric Deep Learning
Geospatial Data Science
Tools:
Python + PyTorch
PostgreSQL Database
QGIS
Abstract
This project extends the Street-Block Machine Learning study under the same fundamental premise: how much urban structure can be inferred solely from street-block geometry, independent of socioeconomic or land-use data? Using street-block data from the United States Census Bureau,
the study focuses on the top 10 U.S. metropolitan areas by GDP, ensuring both economic relevance
and geometric diversity. The ground-truth label is the UR20 attribute from census street-block data,
framing the task as a binary classification problem (urban vs. rural). While the previous ML approach
relied on four handcrafted geometric features—area, isoperimetric quotient, rectangularity,
and solidity—this project asks a more fundamental question: can a model learn urbanity directly
from geometry itself? Street blocks are therefore represented as graphs derived from polygon
topology and geometry, and a Graph Neural Network (GNN) is trained to classify urban versus
rural blocks based purely on learned geometric representations. By replacing feature engineering
with end-to-end geometric learning, the project evaluates the expressive limits of deep learning
for urban form inference and clarifies what information is intrinsically embedded in street-block geometry alone.
Data Sources
Original Data Source:
United States Census Bureau
Study Scope: Top 10 U.S. Metropolitan Areas by GDP
https://www2.census.gov/geo/tiger/TIGER2025/TABBLOCK20/
Spatial Extent: Blocks within a 30-mile radius from each city center
Workflow
QGIS:
Loaded shapefiles, clipped 30-mile radius per city, calculated base geometric attributes, and exported processed layers.
PostgreSQL/PostGIS: Stored city-wise processed street-block geometries and UR20 labels, enabling consistent querying and reproducible splits.
Python + PyTorch Geometric: Loaded 10 cities, converted blocks into graph representations, trained a GNN on 9 cities, used Houston as a fixed (non-random) validation set, and tested generalization on a new unseen city: Philadelphia.
Street Block to Graph
Street blocks can be modeled either as rasterized images or as graphs. An image representation forces a grid-based
approximation and often loses precise corner-to-corner structure, while a graph preserves the true topology of the
polygon, capturing the explicit relationships between corners (nodes) and edges (connections).
Each street block is
converted into a planar graph where polygon vertices become nodes and consecutive boundary segments become edges.
Node features are stored in an N×5 matrix, where each of the N vertices contains five attributes: (x, y) coordinates,
turning angle θ at the vertex, and the lengths of the two adjacent boundary edges (l1, l2).
The conversion pipeline standardizes vertex ordering, removes duplicate/near-duplicate points, computes angles and edge lengths from the cleaned polygon boundary, builds the edge index from sequential vertex connectivity (including wrap-around from last to first), and outputs a graph object consisting of x ∈ ℝ^(N×5) and edge_index ∈ ℕ^(2×E) for GNN training.
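The conversion described above can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: `polygon_to_graph` is a hypothetical helper, duplicate/near-duplicate points are assumed to be removed beforehand, and the signed-angle convention for θ is an assumption (the text does not specify signed vs. unsigned turning angles).

```python
import numpy as np

def polygon_to_graph(coords):
    """Convert an ordered polygon boundary into a planar graph.

    coords: (N, 2) array of vertex coordinates in boundary order,
            with the closing (repeated) vertex already dropped.
    Returns:
        x          : (N, 5) node features [x, y, theta, l1, l2]
        edge_index : (2, N) sequential connectivity with wrap-around
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    prev_pts = np.roll(coords, 1, axis=0)   # previous vertex (wraps around)
    next_pts = np.roll(coords, -1, axis=0)  # next vertex (wraps around)
    v_in = coords - prev_pts                # incoming boundary edge vectors
    v_out = next_pts - coords               # outgoing boundary edge vectors
    l1 = np.linalg.norm(v_in, axis=1)       # length of previous edge
    l2 = np.linalg.norm(v_out, axis=1)      # length of next edge
    # Signed turning angle at each vertex (assumption: signed convention)
    theta = np.arctan2(
        v_in[:, 0] * v_out[:, 1] - v_in[:, 1] * v_out[:, 0],  # 2D cross product
        (v_in * v_out).sum(axis=1),                           # dot product
    )
    x = np.column_stack([coords, theta, l1, l2])                   # (N, 5)
    edge_index = np.stack([np.arange(n), (np.arange(n) + 1) % n])  # (2, N)
    return x, edge_index
```

For a unit square the helper yields four nodes with θ = π/2 at each corner and unit-length adjacent edges, matching the N×5 feature layout described above.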
image vs graph
graph structure
Neural Network Iteration
The deep learning model is built around a Graph Neural Network (GNN) that learns hierarchical geometric
representations of street-block graphs. In Iteration 1, node embeddings are projected through a graph
convolution layer into an N × 64 latent space, followed by a graph-level readout and a multilayer
perceptron (MLP) with architecture 64 → 128 → 256 → 1 for binary classification. This configuration
prioritizes rapid convergence and controlled model capacity to establish a stable performance baseline.
In Iteration 2, model capacity is increased to test whether richer geometric abstractions improve generalization. Graph convolution layers output N × 128 node embeddings, and the downstream MLP is restructured as 128 → 128 → 128 → 128 → 1, emphasizing deeper but more uniform feature transformations. Across both iterations, the network is trained end-to-end using supervised learning with a binary classification objective, global pooling to aggregate node-level information into block-level embeddings, and consistent training protocols to isolate the impact of architectural changes. This iterative setup enables a direct comparison between compact and high-capacity graph representations and clarifies the trade-off between expressiveness and overfitting when learning urban form purely from geometry.
Neural Network Model Building Process (Graph Convolutional + Neural Network)
Treating Imbalanced Data
Imbalanced data: by count vs by area
The dataset exhibits a severe class imbalance: urban blocks account for approximately 98% of all street blocks,
while rural blocks represent a small minority. However, when evaluated by total area rather than count, the
imbalance is less extreme—urban blocks cover roughly twice the total area of rural blocks, reflecting the
much larger size of rural street blocks. This discrepancy highlights that block count alone is a misleading
indicator of spatial dominance.
Synthetic oversampling methods such as SMOTE were intentionally not used. Street-block geometry is highly sensitive to small perturbations, and generating synthetic samples risks producing geometries that are topologically invalid or physically implausible, which would corrupt the learning signal in a graph-based model.
Instead, class imbalance is addressed by imposing area-based weighting during loss computation, so that misclassification penalties are proportional to the spatial footprint of each block rather than the raw number of samples. This approach aligns the optimization objective with spatial reality, reduces bias toward over-predicting the majority urban class, and preserves the integrity of original geometries. By separating data representation from error weighting, the model maintains consistent graph inputs while correcting imbalance at the learning stage, enabling more meaningful generalization to unseen cities.
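One way the area-based weighting could look in code is sketched below. The function name and the exact normalization (weights as area shares) are assumptions; the text specifies only that misclassification penalties are proportional to each block's spatial footprint.

```python
import torch
import torch.nn.functional as F

def area_weighted_bce(logits, labels, areas):
    """Binary cross-entropy where each block's loss term is weighted by
    its share of total area, so one large misclassified rural block costs
    as much as many small urban blocks of equal combined footprint.

    logits, labels, areas: 1-D tensors of equal length (one entry per block).
    """
    weights = areas / areas.sum()  # normalize footprints to area shares
    per_block = F.binary_cross_entropy_with_logits(
        logits, labels, reduction="none"
    )
    return (weights * per_block).sum()
```

Because the weighting lives entirely in the loss, the graph inputs stay untouched, which is exactly the separation of data representation from error weighting described above.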
Final Result and Implication
Final Prediction (Philadelphia street block)
The deep learning approach achieves 94.55% accuracy, a marginal improvement over the 93.74% achieved by the previous logistic regression model. A closer examination of the confusion matrices reveals where this
difference originates. Logistic regression strongly favors the majority class, predicting urban blocks far more
aggressively due to the overwhelming class count imbalance. This behavior results in very high urban recall but
weak rural recognition, with a large number of rural blocks incorrectly classified as urban.
The neural network demonstrates a more balanced error distribution. While overall accuracy gains are modest,
the NN improves both precision and recall for the minority rural class, reducing false urban predictions and
yielding a cleaner separation between urban and rural blocks. This indicates that the GNN is better at
extracting subtle geometric signals that distinguish large rural blocks from dense urban fabrics, even under severe imbalance.
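The per-class comparison above can be made concrete with a small helper that derives precision and recall from a 2×2 confusion matrix; under heavy imbalance these expose minority-class behavior that overall accuracy hides. The counts in the usage note are purely illustrative, not the project's actual confusion matrices.

```python
def urban_rural_metrics(cm):
    """Per-class precision and recall from a 2x2 confusion matrix.

    cm[i][j] = number of blocks with true class i predicted as class j,
    with classes ordered (rural, urban).
    """
    metrics = {}
    for i, name in enumerate(("rural", "urban")):
        tp = cm[i][i]                                 # correct for this class
        fp = sum(cm[j][i] for j in range(2)) - tp     # others predicted as it
        fn = sum(cm[i]) - tp                          # this class missed
        metrics[name] = {
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "recall": tp / (tp + fn) if tp + fn else 0.0,
        }
    return metrics
```

For example, a hypothetical matrix `[[50, 50], [10, 890]]` yields 94% accuracy but only 0.5 rural recall, illustrating how a majority-biased model can score high overall while missing half of the rural blocks.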
Despite this improvement, the results also expose a clear limitation. Across multiple training epochs and
architectural iterations, the loss and error plateau quickly and do not decrease substantially. This suggests
that street-block geometry alone has a bounded explanatory power for urban classification. While the GNN is
more expressive than logistic regression, it is also less efficient and more computationally expensive,
delivering only incremental gains. The outcome implies that deep learning does not fundamentally overcome
the information ceiling imposed by geometry-only inputs; rather, it slightly refines classification within
those limits. This reinforces the conclusion that urbanity is only partially encoded in block shape, and
that richer contextual or semantic data would be required for significant performance gains beyond this threshold.