Geospatial Feature Engineering with H3 for Risk Modeling

How Uber's hexagonal indexing system turns raw lat/lon coordinates into high-signal features for fraud detection, credit risk, and beyond.

Tags: geospatial, feature-engineering, fraud-detection, h3, machine-learning

Your fraud model receives a transaction at coordinates 40.7128, -74.0060. What does it do with those numbers? Nothing useful, most likely. Raw latitude and longitude are continuous floats that are essentially unique per event. A model cannot learn that "this neighborhood is risky" when every observation has a different address on the number line.

The fix is spatial aggregation: bucket nearby coordinates into regions, then compute features per region. The question is how to bucket. Rectangular lat/lon grids are the naive choice, but they introduce distortion: cells near the poles cover less ground than cells at the equator, corner-adjacent cells are farther apart than edge-adjacent ones, and the boundaries create artifacts in spatial statistics.
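A quick back-of-the-envelope calculation makes the distortion concrete. This is a minimal sketch assuming a spherical Earth of radius 6371 km and a hypothetical 0.01° grid step; `cell_size_km` is an illustrative helper, not part of any library.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation (assumption)

def cell_size_km(lat_deg, step_deg=0.01):
    """Approximate (width, height) in km of a lat/lon grid cell at a given latitude."""
    km_per_degree = math.pi * EARTH_RADIUS_KM / 180.0
    height = step_deg * km_per_degree
    # Meridians converge toward the poles, so east-west width shrinks with cos(lat).
    width = step_deg * km_per_degree * math.cos(math.radians(lat_deg))
    return width, height

for lat in (0.0, 40.7, 60.0):
    width, height = cell_size_km(lat)
    print(f"lat {lat:5.1f}: {width:.3f} km wide x {height:.3f} km tall")

# In a square grid, the center of a diagonal neighbor is sqrt(2) times
# farther away than the center of an edge neighbor.
diagonal_ratio = math.sqrt(2)
```

At 60° latitude the cell is half as wide as at the equator, so "one cell" covers half the ground, and any count-per-cell feature silently changes meaning with latitude.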

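Hexagonal grids largely avoid these problems. Every hex has six neighbors, each sharing an edge and each at essentially the same center-to-center distance, so there is no corner-only adjacency and no diagonal ambiguity. H3 cell areas still vary somewhat across the globe, and every resolution includes 12 pentagons, but the distortion is far milder than on a lat/lon grid. This is why Uber built H3, a hierarchical hexagonal indexing system, and why it is widely used for geospatial feature engineering in production ML systems.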

Enter H3: Hexagonal Hierarchical Indexing

H3 does one thing and does it well: it maps any point on Earth to a hexagonal cell at a chosen resolution. You pass in a latitude, a longitude, and a resolution level (0 through 15), and you get back a 64-bit index, typically written as a 15-character hexadecimal string, that uniquely identifies that cell. Resolution 0 gives you 122 base cells covering the planet. Resolution 15 gives you cells with sub-meter edge lengths. Each step to a finer resolution multiplies the cell count by roughly seven.
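The indexing call can be sketched as follows. The sketch assumes h3-py 4.x, where the point-to-cell function is `h3.latlng_to_cell(lat, lng, res)`; it is imported lazily so the rest of the example runs without the library. `index_events` and `toy_to_cell` are hypothetical helpers, and the toy indexer is a rounded lat/lon square used only to make the example executable, not a hexagon.

```python
from collections import Counter

def h3_indexer():
    """Return h3-py's point-to-cell function (assumes h3-py 4.x is installed)."""
    import h3  # lazy import: nothing else in this sketch depends on it
    return h3.latlng_to_cell  # called as (lat, lng, resolution) -> cell id string

def index_events(events, resolution, to_cell):
    """Attach a cell id to each event dict using the supplied indexer."""
    return [
        {**event, "cell": to_cell(event["lat"], event["lng"], resolution)}
        for event in events
    ]

def toy_to_cell(lat, lng, resolution):
    """Stand-in indexer for demonstration only: a rounded lat/lon square."""
    digits = max(resolution - 7, 0)
    return f"r{resolution}:{round(lat, digits)}:{round(lng, digits)}"

events = [
    {"txn_id": 1, "lat": 40.7128, "lng": -74.0060},
    {"txn_id": 2, "lat": 40.7131, "lng": -74.0055},
    {"txn_id": 3, "lat": 40.7580, "lng": -73.9855},
]
indexed = index_events(events, resolution=9, to_cell=toy_to_cell)
events_per_cell = Counter(e["cell"] for e in indexed)
```

With the library installed, passing `to_cell=h3_indexer()` would swap in real H3 cells without changing the rest of the code. The first two transactions, a few dozen meters apart, land in the same bucket; the third does not.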

For risk modeling, resolutions 7 through 10 hit the sweet spot. Resolution 9 produces hexes of roughly 0.1 km², small enough to capture neighborhood-level variation, large enough to accumulate meaningful transaction counts.
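The resolution arithmetic is easy to check. The sketch below uses the closed form for the number of H3 cells at resolution r, 2 + 120 · 7^r (it follows from 122 base cells, with each hexagon splitting into seven children and each of the 12 pentagons into six), and divides an approximate Earth surface area by it. The surface-area constant and the function names are assumptions for illustration.

```python
EARTH_SURFACE_KM2 = 510_065_622  # approximate total surface area of Earth

def h3_cell_count(resolution):
    """Total number of H3 cells at a resolution (hexagons plus 12 pentagons)."""
    return 2 + 120 * 7 ** resolution

def avg_cell_area_km2(resolution):
    """Average cell area, assuming cells tile the whole surface evenly."""
    return EARTH_SURFACE_KM2 / h3_cell_count(resolution)

for res in range(7, 11):
    print(res, h3_cell_count(res), round(avg_cell_area_km2(res), 4))
```

Resolution 9 comes out at about 0.105 km² per cell, consistent with the "roughly 0.1 km²" figure above, and each coarser level is about seven times larger.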

Drag the slider below to see how the same set of points maps to different hex sizes. At resolution 7, a handful of large hexes cover the area. By resolution 11, hundreds of tiny cells carve out individual blocks.

[Interactive: H3 Resolution Explorer, slider from resolution 7 to 11, shown at resolution 9]

The chart below simulates a city-scale risk heatmap at resolution 9. Each hexagon represents an H3 cell, colored by its computed risk score, from low risk through moderate to high risk. Notice how the spatial clustering emerges naturally: high-risk cells tend to neighbor other high-risk cells.

Adaptive Resolution: Matching Density to Data

A single resolution is wasteful. In a dense urban core, resolution 9 gives you thousands of transactions per hex, plenty of statistical power. But apply the same resolution to a rural area and most cells contain zero or one observation. The features you compute for those cells are noise.

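Adaptive resolution solves this by using fine hexes where data is dense and coarse hexes where it is sparse. H3's hierarchy does the heavy lifting: cellToParent maps any cell to its coarser ancestor, so sparse cells can be merged upward until they hold enough observations. The related compactCells function replaces any complete group of seven child cells with their parent, but it is purely geometric and knows nothing about observation counts, so density-based coarsening needs a count threshold on top. The result is a mixed-resolution tessellation where every cell has enough observations to produce stable aggregate features.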

The chart below illustrates the difference. On the left, a uniform grid wastes resolution on sparse areas. On the right, adaptive compaction ensures every cell has meaningful observation counts.

In the uniform grid, every cell is the same size (resolution 9). The dense core has plenty of observations, but edge cells contain zero or single-digit counts, and any features computed from those sparse cells are noise. The adaptive grid mixes three resolutions: fine resolution 10 in the dense center, medium resolution 9 in the surrounding ring, and coarse resolution 8 on the sparse edges. Each step to a coarser resolution multiplies the cell area by roughly 7, so sparse regions pool observations over a larger area until there are enough for reliable statistics.
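One way to sketch threshold-based coarsening is shown below. It is not H3's own algorithm: `coarsen_sparse_cells` is a hypothetical helper that takes the parent lookup as a parameter (with h3-py 4.x this could wrap `h3.cell_to_parent`), and the dotted cell ids are a toy hierarchy standing in for real H3 indexes.

```python
from collections import Counter

def coarsen_sparse_cells(cells, parent_of, min_count, max_steps):
    """Reassign events in under-populated cells to their parent cell.

    cells      -- one cell id per event
    parent_of  -- function mapping a cell id to its parent's id
    min_count  -- observations a cell needs to be kept at its current size
    max_steps  -- how many resolution levels we are willing to climb
    """
    assigned = list(cells)
    for _ in range(max_steps):
        counts = Counter(assigned)
        sparse = {cell for cell, n in counts.items() if n < min_count}
        if not sparse:
            break  # every cell already meets the threshold
        assigned = [parent_of(c) if c in sparse else c for c in assigned]
    return assigned

def toy_parent(cell):
    """Toy hierarchy: 'A.1.2' -> 'A.1' -> 'A' (top-level ids map to themselves)."""
    return cell.rsplit(".", 1)[0]

cells = ["A.1.1"] * 5 + ["A.1.2"] * 2 + ["A.1.3"] * 2
adaptive = coarsen_sparse_cells(cells, toy_parent, min_count=3, max_steps=2)
```

Here the dense cell keeps its fine resolution while the two sparse siblings are pooled into their shared parent. One design caveat: the parent then describes only the events from its sparse children, not the dense child that stayed fine, which is worth remembering when interpreting its statistics.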

The Feature Engineering Pipeline

With hex indexes assigned, the transformation from raw coordinates to ML-ready features follows a straightforward pipeline:

  1. Index. Convert each event's lat/lon to an H3 hex index at a fine starting resolution (e.g., resolution 10).
  2. Adapt resolution. Count observations per hex. Any cell below a minimum threshold (e.g., 30 transactions) gets coarsened: replace the fine index with its parent via cellToParent. Repeat upward through resolutions until every cell meets the threshold or you reach the coarsest resolution you are willing to use. compactCells is not a substitute here: it merges only complete sets of seven siblings and ignores observation counts. This is the adaptive resolution step from the previous section, ensuring stable statistics everywhere.
  3. Aggregate. Group events by their (now possibly mixed-resolution) hex index and compute statistics: fraud rate, transaction velocity, average amount, unique card count, time-of-day distribution.
  4. Expand neighborhoods. Use gridDisk(h3Index, k) to pull in features from surrounding hexes. gridDisk returns the origin cell plus every cell within k steps, so k=1 yields 7 cells (6 neighbors) and k=2 yields 19 (18 neighbors). The "average fraud rate in adjacent hexes" is a powerful spatial lag feature that captures risk contagion.
  5. Join back. Attach hex-level features to each individual event as contextual signals.
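Steps 3 and 5 above can be sketched in plain Python. The field names (`cell`, `is_fraud`, `amount`) and the `hex_*` feature names are assumptions for illustration, and the cell ids are placeholders rather than real H3 indexes. For brevity the sketch aggregates over all events; a production version would restrict each aggregate to a trailing window.

```python
from collections import defaultdict

def aggregate_by_cell(events):
    """Compute per-cell transaction count, fraud rate, and mean amount."""
    buckets = defaultdict(list)
    for event in events:
        buckets[event["cell"]].append(event)
    features = {}
    for cell, rows in buckets.items():
        n = len(rows)
        features[cell] = {
            "hex_txn_count": n,
            "hex_fraud_rate": sum(r["is_fraud"] for r in rows) / n,
            "hex_avg_amount": sum(r["amount"] for r in rows) / n,
        }
    return features

def join_back(events, features):
    """Attach each event's cell-level features as extra columns."""
    return [{**event, **features[event["cell"]]} for event in events]

events = [
    {"txn_id": 1, "cell": "hexA", "is_fraud": 1, "amount": 10.0},
    {"txn_id": 2, "cell": "hexA", "is_fraud": 0, "amount": 30.0},
    {"txn_id": 3, "cell": "hexB", "is_fraud": 0, "amount": 50.0},
]
cell_features = aggregate_by_cell(events)
enriched = join_back(events, cell_features)
```

Because the join key is just the cell id, the same code works whether the ids are all at one resolution or a mix produced by the coarsening step.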

The result: every transaction now carries not just its own attributes, but a rich spatial context. "This card was used in a hex where 8% of transactions in the past 30 days were fraudulent, surrounded by hexes averaging 5% fraud rate" is enormously more informative than "this transaction happened at 40.7128, -74.0060."
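The neighbor part of that sentence is a spatial lag feature, which might be computed along these lines. The sketch makes two assumptions: it pools counts across neighbors (a count-weighted rate, rather than a plain average of per-cell rates), and it takes the neighbor lookup as a parameter, which with h3-py 4.x could wrap `h3.grid_disk(cell, 1)`. The adjacency table and cell ids are toy data.

```python
def neighbor_fraud_rate(cell, cell_stats, neighbors_of):
    """Pooled fraud rate over a cell's neighbors, excluding the cell itself.

    cell_stats   -- {cell id: (transaction count, fraud count)}
    neighbors_of -- function returning nearby cell ids (may include `cell`)
    Returns None when no neighbor has any recorded transactions.
    """
    txns = frauds = 0
    for nb in neighbors_of(cell):
        if nb == cell or nb not in cell_stats:
            continue  # skip the origin cell and neighbors with no data
        n, f = cell_stats[nb]
        txns += n
        frauds += f
    return frauds / txns if txns else None

# Toy adjacency standing in for a k=1 disk around cell "c".
adjacency = {"c": ["c", "n1", "n2", "n3"], "lonely": ["lonely"]}
cell_stats = {"c": (100, 8), "n1": (50, 5), "n2": (150, 5)}

lag = neighbor_fraud_rate("c", cell_stats, adjacency.__getitem__)
```

Pooling counts keeps a neighbor with three transactions from weighing as much as one with three thousand; averaging per-cell rates instead is a legitimate alternative when every cell already meets a minimum count.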

The chart below visualizes this transformation. The faint scatter shows raw transaction coordinates, a cloud of points with no discernible structure. The larger hex centroids show the aggregated view, where marker size encodes transaction volume and color encodes fraud rate.

Toggle the legend to isolate each layer. The raw transactions are noise. The hex aggregation reveals structure: a clear hotspot in the center-right of the grid where both transaction density and fraud rate peak.

Impact on Model Performance

In a gradient-boosted fraud detection model trained on 12 months of transaction data, H3-derived spatial features consistently rank among the top predictors. The chart below uses simulated data to illustrate a typical pattern seen in production fraud models, not results from a specific experiment.

In this simulated ranking, four of the top six features are spatial. Direct H3 aggregates like hex_fraud_rate_30d and hex_txn_velocity_30d capture "what happens in this cell." Derived spatial features like hex_neighbor_fraud_rate and distance_from_home_hex capture relational context: how this cell compares to its surroundings and to the cardholder's typical location. The non-spatial features are still valuable, but the spatial context provides signal that transaction-level features alone do not capture.

Beyond Fraud: Other Domains

The H3 feature engineering pattern transfers directly to any domain where events have coordinates:

  • Credit risk. Aggregate default rates, income proxies, and property values per hex to build neighborhood-level credit features. Particularly valuable in markets where bureau data is sparse.
  • Insurance. Claim frequency and severity by hex for auto, property, and health insurance. H3's hierarchical resolution matches naturally to the urban/rural density divide in insurance portfolios.
  • Mobility and logistics. Demand forecasting, surge pricing, and route optimization. Uber built H3 for exactly this use case. The hexagonal grid's uniform adjacency makes pathfinding and diffusion models cleaner than on rectangular grids.
  • Public health. Disease surveillance, environmental exposure modeling, and resource allocation. The adaptive resolution capability is critical here: dense urban areas need fine-grained monitoring while rural regions can use coarser cells.

The common thread: raw coordinates are high-cardinality noise. Hexagonal aggregation transforms them into stable, interpretable, high-signal features. The hierarchy lets you match resolution to data density. And the uniform adjacency makes spatial neighborhood features trivially computable.

Key Takeaways

  • Raw lat/lon coordinates are nearly useless as ML features. They are unique per event and carry no generalizable signal. Spatial aggregation into discrete cells is required to make location data learnable.
  • Hexagonal grids beat rectangular ones. Uniform adjacency, no corner effects, no edge distortion. H3 provides a battle-tested implementation with a clean API and hierarchical resolution support.
  • Adaptive resolution prevents sparse-cell noise. Use fine hexes in dense areas and coarse hexes in sparse areas. Set a minimum observation threshold per cell and coarsen with cellToParent until it is met; compactCells merges only complete sibling sets and does not consider counts.
  • Spatial lag features are as important as direct aggregates. The fraud rate in surrounding hexes (via gridDisk) often outranks the fraud rate in the transaction's own hex. Risk rarely respects cell boundaries.
  • Guard against target encoding leakage. Hex-level fraud rates are powerful precisely because they encode the target, which means temporal and same-row leakage are real risks. Use trailing windows and leave-one-out encoding.
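The leakage guard in the last takeaway can be sketched as a trailing-window encoder. This is a minimal illustration, not a production implementation: `trailing_fraud_rate` and the field names (`cell`, `ts`, `is_fraud`) are hypothetical, and it assumes fraud labels are known as of each event's timestamp, which real chargeback delays often violate.

```python
from datetime import datetime, timedelta

def trailing_fraud_rate(events, window_days=30):
    """For each event, the fraud rate of earlier same-cell events in the window.

    Only events strictly before the current one are counted, so an event's own
    label never feeds its own feature. Output is sorted by timestamp.
    """
    window = timedelta(days=window_days)
    feature = f"hex_fraud_rate_{window_days}d"
    history = {}  # cell id -> list of (timestamp, is_fraud) seen so far
    out = []
    for event in sorted(events, key=lambda e: e["ts"]):
        past = [
            label
            for ts, label in history.get(event["cell"], [])
            if ts < event["ts"] and event["ts"] - ts <= window
        ]
        rate = sum(past) / len(past) if past else None
        out.append({**event, feature: rate})
        history.setdefault(event["cell"], []).append((event["ts"], event["is_fraud"]))
    return out

t0 = datetime(2024, 1, 1)
events = [
    {"txn_id": 1, "cell": "hexA", "ts": t0, "is_fraud": 1},
    {"txn_id": 2, "cell": "hexA", "ts": t0 + timedelta(days=10), "is_fraud": 0},
    {"txn_id": 3, "cell": "hexA", "ts": t0 + timedelta(days=35), "is_fraud": 0},
]
encoded = trailing_fraud_rate(events)
```

The first event has no history and gets `None`, the second sees only the earlier fraud, and the third sees only the second event because the first has aged out of the 30-day window. The linear scan per event is fine for a sketch; at scale a sorted structure or a windowed database query would replace it.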