Geospatial Technology and Visualization for Transport

Research document for the Global Intelligence System for Transport (GIST). Last updated: 2026-02-09

1. Overview

A Global Intelligence System for Transport requires geospatial infrastructure that can ingest, store, query, and visualize transport data at planetary scale in real time. This document covers the relevant standards, spatial data infrastructure patterns, and visualization technologies.

2. OGC Standards Relevant to Transport

The Open Geospatial Consortium (OGC) defines interoperability standards for geospatial data. The following are most relevant to a global transport data system.

2.1 OGC API Family (Modern REST-based Standards)

The OGC is transitioning from legacy XML-based web service standards (WMS, WFS, WCS) to modern RESTful, JSON-based APIs:

OGC API - Features (successor to WFS): RESTful API for querying and serving vector geospatial features. JSON/GeoJSON output. Supports filtering (CQL2), spatial queries, pagination. Ideal for serving stop/station/route data.
OGC API - Tiles: Serves map tiles (raster or vector), either pre-generated or produced on demand. Supports Mapbox Vector Tile (MVT) format. Efficient for large-scale basemaps and transport network overlays.
OGC API - Maps: Dynamic map rendering API. Less relevant for GIST (prefer vector tiles).
OGC API - Processes: Server-side geoprocessing. Could be used for on-demand spatial analysis (e.g., isochrone generation, network analysis).
OGC API - Records (successor to CSW): Metadata catalog API. Relevant for cataloging transport datasets across the GIST system.
OGC API - EDR (Environmental Data Retrieval): Querying data along transport corridors (e.g., weather conditions along a route).
OGC SensorThings API: IoT sensor data standard. Highly relevant for real-time transport sensor data (vehicle positions, traffic sensors, environmental sensors). Supports MQTT for real-time streaming.
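
To illustrate the OGC API - Features query model described above, the following minimal sketch builds an "items" request URL with a bounding box, a page size, and a CQL2 text filter. The base URL, the collection name "stops", and the "mode" property are hypothetical; `bbox` and `limit` come from Part 1 of the standard, while `filter` and `filter-lang` assume the server also implements Part 3 (Filtering).

```python
from urllib.parse import urlencode


def features_items_url(base_url, collection, bbox=None, limit=100, cql2_filter=None):
    """Build an OGC API - Features 'items' URL (sketch, not a full client)."""
    params = {"limit": limit}
    if bbox is not None:
        # bbox order: min lon, min lat, max lon, max lat (CRS84 by default)
        params["bbox"] = ",".join(str(v) for v in bbox)
    if cql2_filter is not None:
        # Filtering is optional; only servers implementing Part 3 accept it
        params["filter"] = cql2_filter
        params["filter-lang"] = "cql2-text"
    return f"{base_url.rstrip('/')}/collections/{collection}/items?{urlencode(params)}"


# Hypothetical endpoint and collection, for illustration only
url = features_items_url(
    "https://example.org/ogcapi",
    "stops",
    bbox=(4.7, 52.3, 5.0, 52.4),
    limit=50,
    cql2_filter="mode = 'tram'",
)
print(url)
```

A client would then follow the `next` link in each response to page through results rather than computing offsets itself.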

2.2 Legacy OGC Standards (Still Widely Used)

WMS (Web Map Service): Serves rendered map images. Still used by many government mapping agencies.
WFS (Web Feature Service): Serves vector features as GML/XML. Still used by INSPIRE and many European spatial data infrastructures.
WCS (Web Coverage Service): Serves raster/grid data (elevation, satellite imagery).
GML (Geography Markup Language): XML encoding for geographic features. Used by NeTEx, INSPIRE, and many European standards.

2.3 OGC Standards for Transport-Specific Use Cases

CityGML: 3D city model standard. Includes transportation module for roads, railways, waterways in 3D. Relevant for 3D transport visualization.
IndoorGML: Indoor spatial data model. Relevant for in-station navigation and GTFS-Pathways.
Moving Features: Standard for representing objects that move through time and space. Directly relevant to vehicle tracking, vessel tracking, flight paths.

3. EU INSPIRE Directive

3.1 Overview

The INSPIRE Directive (2007/2/EC) establishes a Spatial Data Infrastructure (SDI) for Europe. It mandates that EU member states publish spatial data in interoperable formats using OGC standards.

3.2 Transport-Relevant INSPIRE Themes

Transport Networks (Annex I, Theme 7): Road, rail, water, air, cable transport networks. Defines a common data model for European transport infrastructure. Published via WMS/WFS services.
Addresses (Annex I, Theme 5): Relevant for geocoding transport locations.
Administrative Units (Annex I, Theme 4): Governance boundaries affecting transport jurisdiction.
Hydrography (Annex I, Theme 8): Waterway networks.
Land Use / Land Cover: Context for transport planning.
Utility and Government Services: Public service facilities including transport hubs.

3.3 INSPIRE Transport Network Data Model

The INSPIRE Transport Networks schema defines:

Network elements: Nodes (junctions), Links (road/rail/water segments), Link Sequences
Properties: Form of way, functional class, number of lanes, speed limits, restrictions
Intermodal connections: How different transport networks connect
Temporal attributes: Valid from/to dates for network changes

Format: GML 3.2.1, served via WFS 2.0. Some countries also provide GeoJSON, GeoPackage alternatives.
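
The network elements and temporal attributes listed above can be sketched as plain data classes. This is a deliberately simplified, hypothetical model for illustration: the class and field names are assumptions and do not reproduce the actual INSPIRE application schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Node:
    """A junction in the network (hypothetical, simplified)."""
    node_id: str
    lon: float
    lat: float


@dataclass
class Link:
    """A road/rail/water segment between two nodes."""
    link_id: str
    start_node: str
    end_node: str
    form_of_way: str
    speed_limit_kmh: Optional[int] = None
    valid_from: Optional[date] = None
    valid_to: Optional[date] = None  # None means "still valid"

    def is_valid_on(self, day: date) -> bool:
        """True if the link is part of the network on the given day."""
        if self.valid_from is not None and day < self.valid_from:
            return False
        if self.valid_to is not None and day > self.valid_to:
            return False
        return True


@dataclass
class LinkSequence:
    """An ordered run of links, e.g. a named road."""
    sequence_id: str
    link_ids: List[str] = field(default_factory=list)


link = Link("L1", "N1", "N2", "motorway", speed_limit_kmh=100,
            valid_from=date(2020, 1, 1), valid_to=date(2023, 12, 31))
print(link.is_valid_on(date(2022, 6, 1)), link.is_valid_on(date(2024, 1, 1)))
```

Keeping validity dates on each link lets a consumer reconstruct the network as it existed on any given day, which matters when matching historical operational data to infrastructure.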

3.4 Relevance to GIST

INSPIRE provides a harmonized European spatial data layer for transport infrastructure. However:

Implementation quality varies significantly across member states
Update frequency is often poor (annual or less)
Focus is on infrastructure, not services/timetables/real-time
Useful as a base network layer, but must be supplemented with operational data from NeTEx, SIRI, DATEX II, etc.

4. Spatial Data Infrastructure Patterns for Global Systems

4.1 Architecture Patterns

Federated SDI: Each data source maintains its own services; a central catalog discovers and mediates access. This is the INSPIRE model. Pros: Data sovereignty, no centralization bottleneck. Cons: Query performance depends on weakest source, inconsistent quality.

Centralized Data Lake: All data is ingested, transformed, and stored centrally. A single system serves all queries. Pros: Consistent quality, fast queries, unified schema. Cons: Storage and compute costs, data freshness challenges, governance complexity.

Hybrid (Recommended for GIST): Core datasets are centralized and harmonized; real-time data is federated with caching; metadata catalog provides discovery across all sources. This balances performance with data freshness and sovereignty.

4.2 Spatial Databases for Transport

PostGIS (PostgreSQL extension):

The standard open-source spatial database
Supports geometry and geography types, spatial indexes (GiST, SP-GiST), spatial functions (ST_Distance, ST_Intersects, ST_Buffer, ST_Within, etc.)
pgRouting extension adds network routing capabilities (Dijkstra, A*, driving distance, TSP)
Handles tables with millions of features efficiently when spatially indexed
Can be combined with the TimescaleDB extension for time-series workloads
Excellent integration with GTFS data (multiple tools import GTFS into PostGIS)
Recommended as primary spatial database for GIST
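
As a sketch of how the spatial functions above might be used, the helper below builds a parameterized "stops near a point" query. It assumes a hypothetical table `stops` with columns `stop_id`, `stop_name`, and a `geom` column of type geography, and uses `%s` placeholders in the style of common PostgreSQL drivers. `ST_DWithin` is used for the filter because it can use the spatial index.

```python
def nearby_stops_query(lon, lat, radius_m, limit=20):
    """Return (sql, params) for stops within radius_m metres of a point.

    Assumes a hypothetical table stops(stop_id, stop_name, geom geography).
    """
    sql = """
        SELECT stop_id, stop_name,
               ST_Distance(geom, ST_SetSRID(ST_MakePoint(%s, %s), 4326)::geography) AS distance_m
        FROM stops
        WHERE ST_DWithin(geom, ST_SetSRID(ST_MakePoint(%s, %s), 4326)::geography, %s)
        ORDER BY distance_m
        LIMIT %s
    """
    # ST_MakePoint takes (x, y), i.e. (longitude, latitude)
    return sql, (lon, lat, lon, lat, radius_m, limit)


sql, params = nearby_stops_query(4.9003, 52.3791, 500)
print(params)
```

Passing values as parameters rather than formatting them into the string keeps the query safe to run with user-supplied coordinates.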

DuckDB + Spatial extension:

In-process analytical database with spatial support
Excellent for analytical queries over large datasets (billions of rows)
Reads Parquet, CSV, GeoJSON, GeoPackage natively
Column-oriented storage for fast aggregation
Good for analytics pipelines but not for transactional/real-time workloads
Recommended for analytical queries and batch processing in GIST

Apache Sedona (formerly GeoSpark):

Distributed spatial computing on Apache Spark/Flink
Handles planetary-scale spatial data
Supports spatial joins, queries, and processing at massive scale
Relevant if GIST needs to process very large spatial datasets (e.g., all global AIS data)

H3 (Uber's Hexagonal Hierarchical Spatial Index):

Not a database but a spatial indexing system
Divides Earth into hierarchical hexagonal cells at multiple resolutions
Excellent for aggregating point data (vehicle positions, trip origins/destinations)
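Available in most modern tools, typically via extensions or add-ons (e.g., the h3-pg extension for PostgreSQL/PostGIS, a community extension for DuckDB, and the CARTO Analytics Toolbox for BigQuery)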
Recommended as spatial indexing layer for aggregation and visualization in GIST

Elasticsearch / OpenSearch with Geo:

Full-text search + geospatial queries
Good for geographic search (find stops near a point, within a polygon)
Real-time indexing
Relevant for GIST search functionality

4.3 Cloud-Native Spatial Data Formats

The spatial data ecosystem is rapidly moving to cloud-native formats that support HTTP range requests (no need to download entire files):

Format	Type	Description	Use Case for GIST
GeoParquet	Vector	Parquet files with geometry columns	Analytical queries, data lake storage
FlatGeobuf	Vector	Binary format with spatial index	Fast feature streaming for visualization
PMTiles	Tiles	Single-file tile archive	Serverless map tile serving
Cloud-Optimized GeoTIFF (COG)	Raster	GeoTIFF with internal tiling	Satellite imagery, terrain data
Zarr	Array	Chunked, compressed array storage	Weather/environmental data
GeoArrow	Vector	Apache Arrow with geometry	In-memory analytics, inter-process transfer

4.4 Spatial Data Catalogs

STAC (SpatioTemporal Asset Catalog): Primarily for Earth observation data, but the catalog pattern is applicable to transport datasets. JSON-based metadata with spatial/temporal extent.
CKAN: Open-source data catalog used by many government open data portals that host transport datasets (e.g., data.gov).
OGC API - Records: Standard for spatial data catalog services.

5. Real-Time Geospatial Visualization Technologies

5.1 Deck.gl

Developer: Originally by Uber's visualization team, now part of the vis.gl / OpenVisualization project, which moved from the Linux Foundation's Urban Computing Foundation to the OpenJS Foundation.

Architecture: WebGL2/WebGPU-powered visualization framework for large-scale data. React-friendly but also works standalone.

Key capabilities for transport:

ScatterplotLayer: Vehicle/stop positions (millions of points)
LineLayer / ArcLayer: Origin-destination flows, route visualization
PathLayer: Vehicle trajectories, route geometries
GeoJsonLayer: General-purpose geospatial rendering
TripsLayer: Animated vehicle movements along paths over time
HexagonLayer / H3HexagonLayer: Spatial aggregation and heatmaps
TileLayer / MVTLayer: Efficient rendering of tiled data
IconLayer: Station icons, vehicle icons
TextLayer: Labels
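
TripsLayer reads, for each trip, a path and a parallel list of timestamps (through its `getPath` and `getTimestamps` accessors). The sketch below shows one way a backend might group raw vehicle position samples into that shape before sending them to the browser. The input tuple layout and the output field names (`vehicle_id`, `path`, `timestamps`) are assumptions chosen for this example.

```python
from collections import defaultdict


def to_trips(samples):
    """Group (vehicle_id, timestamp_s, lon, lat) samples into per-vehicle trips."""
    by_vehicle = defaultdict(list)
    for vehicle_id, ts, lon, lat in samples:
        by_vehicle[vehicle_id].append((ts, lon, lat))

    trips = []
    for vehicle_id in sorted(by_vehicle):
        # Order each vehicle's points by time so path and timestamps line up
        points = sorted(by_vehicle[vehicle_id])
        trips.append({
            "vehicle_id": vehicle_id,
            "path": [[lon, lat] for _, lon, lat in points],
            "timestamps": [ts for ts, _, _ in points],
        })
    return trips


samples = [
    ("bus-1", 20, 4.90, 52.38),
    ("bus-1", 10, 4.89, 52.37),
    ("tram-7", 15, 4.88, 52.36),
]
trips = to_trips(samples)
print(trips[0])
```

Timestamps are often shifted to be relative to the start of the animation window before rendering, because large absolute epoch values can lose precision on the GPU.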

Performance: Can render millions of data points at 60fps using GPU acceleration. Supports WebGL instanced rendering and binary data transfer.

Data integration:

Reads GeoJSON, binary formats, Arrow/Parquet (via loaders.gl)
Supports tiled data loading (tiles loaded on demand as user pans/zooms)
Real-time data update via efficient state management

Strengths: Extremely high performance, rich layer library, well-maintained, large community, excellent for transport-specific visualizations (trips, flows, networks).

Limitations: WebGL/GPU requirement (not all devices), learning curve, requires custom development (not a turnkey solution).

Assessment for GIST: Primary recommendation for data visualization layer. The TripsLayer, H3 integration, and massive point rendering capability make it ideal for a global transport intelligence system.

5.2 Kepler.gl

Developer: Originally by Uber, now part of vis.gl / Open Visualization Collaboration.

Architecture: Built on top of Deck.gl and MapLibre GL JS. Provides a complete visual analytics application with GUI.

Key capabilities:

No-code/low-code geospatial visualization
Drag-and-drop data import (CSV, GeoJSON, Arrow)
Multiple layer types (point, arc, line, hexbin, heatmap, trip, polygon, cluster, grid, icon, S2, H3)
Time playback for temporal data (trip animations)
Filters and cross-filtering
Split map view
Map styles and basemap switching
Export to image and HTML

Strengths: Fastest path from data to visualization, excellent for exploration and prototyping, shareable visualizations, no coding required for basic use.

Limitations: Less customizable than raw Deck.gl, not designed for production embedded applications (though embeddable as React component), limited interactivity model, not ideal for real-time streaming data.

Assessment for GIST: Excellent for data exploration, prototyping, and analyst-facing dashboards. Could serve as the visual analytics component for internal data exploration. For the production-facing Transport Angel interface, custom Deck.gl development offers more control.

5.3 MapLibre GL JS

Origin: Community fork of Mapbox GL JS after Mapbox changed its license from BSD to proprietary (December 2020). MapLibre is fully open source (BSD-3-Clause).

Architecture: WebGL-based vector map rendering engine. Renders Mapbox Vector Tiles (MVT) using a style specification (compatible with Mapbox Style Spec).

Key capabilities:

High-performance vector tile rendering
Style-driven cartography (programmatic map styling via JSON)
3D terrain and buildings
Smooth animations and camera control
Marker and popup support
Custom layers (integrate with Deck.gl)
Globe view (3D globe rendering)
Right-to-left text rendering (important for a global system)
Localization support

Ecosystem:

MapLibre GL JS (web)
MapLibre Native (iOS, Android)
MapLibre RS (Rust rendering engine, in development)
Large plugin ecosystem
Compatible with MapTiler, Stadia Maps, Jawg, and self-hosted tile servers

Integration with Deck.gl: Deck.gl can be used as a MapLibre custom layer, combining MapLibre's basemap rendering with Deck.gl's data visualization layers. This is the recommended architecture for GIST.

Strengths: Fully open source, high performance, beautiful cartography, mature and well-maintained, large community, native mobile support.

Limitations: Vector tile rendering requires tile infrastructure (or use cloud tile services), style spec is complex, WebGL requirement.

Assessment for GIST: Primary recommendation for basemap rendering. MapLibre GL JS + Deck.gl is the strongest open-source stack for real-time transport visualization.

5.4 Other Visualization Tools

MapTiler: Commercial tile hosting and processing with generous free tier. Provides pre-built basemap tiles, geocoding, and SDKs. Uses MapLibre under the hood.

Mapbox: Commercial platform (no longer open source). Offers more features and better performance than MapLibre in some areas, but is proprietary and expensive at scale.

Leaflet: Lightweight, simple map library. Good for basic maps. Not suitable for GIST's performance requirements (no WebGL, struggles with large datasets).

OpenLayers: Full-featured open-source web map library. Supports many formats and projections. More complex API than Leaflet. Good OGC standards support. Less performant than MapLibre for vector tiles.

CesiumJS: 3D globe visualization. WebGL-based. Excellent for 3D terrain, flight paths, satellite tracking. Relevant if GIST needs 3D globe visualization for aviation/maritime.

Felt: Commercial collaborative mapping platform. Good for team workflows but not suitable for embedded/custom applications.

6. Real-Time Data Streaming Infrastructure

6.1 Streaming Patterns for Transport Data

Real-time transport visualization requires a streaming data pipeline:

Data Sources --> Ingestion --> Processing --> Serving --> Visualization
(GTFS-RT,     (Kafka,       (Flink,      (WebSocket, (Deck.gl,
 SIRI,        Pulsar,       Spark        SSE,        MapLibre)
 AIS,         MQTT)         Streaming,   HTTP/2)
 ADS-B)                     Kafka
                            Streams)

6.2 Message Brokers

Apache Kafka:

Industry standard for high-throughput event streaming
Excellent for ingesting multiple transport data feeds
Supports exactly-once semantics, compaction, replay
Topic-per-source or topic-per-region partitioning for transport data
Recommended as primary message broker for GIST

Apache Pulsar:

Alternative to Kafka with built-in multi-tenancy
Built-in tiered storage (hot/warm/cold)
Native geo-replication (relevant for a global system)

MQTT:

Lightweight IoT messaging protocol
Used by OGC SensorThings API
Relevant for vehicle/sensor telemetry
Lower overhead than Kafka for edge devices

Redis Streams:

In-memory stream processing
Very low latency
Good for real-time caching layer between backend and WebSocket server

6.3 Client-Side Real-Time Delivery

WebSocket: Full-duplex communication. Best for continuous real-time updates (vehicle positions). Maintains persistent connection.

Server-Sent Events (SSE): One-directional push from server. Simpler than WebSocket. Good for alerts, status updates. Auto-reconnect built in.
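
To make the SSE wire format concrete, the following minimal sketch serializes one event as the `text/event-stream` frame a server would write to the response body. The event name "alert" and the payload fields are hypothetical.

```python
import json


def sse_frame(data, event=None, event_id=None):
    """Serialize one Server-Sent Event as a text/event-stream frame."""
    lines = []
    if event_id is not None:
        # The browser echoes the last id back as Last-Event-ID when reconnecting
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    payload = json.dumps(data, separators=(",", ":"))
    for line in payload.splitlines():
        lines.append(f"data: {line}")
    # A blank line terminates the event
    return "\n".join(lines) + "\n\n"


frame = sse_frame({"route": "12", "status": "delayed"}, event="alert", event_id=7)
print(frame, end="")
```

Setting an `id` on each event is what makes the built-in auto-reconnect useful: the server can resume from the last event the client saw.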

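HTTP/2 and HTTP/3 streaming: Multiplexed streaming over HTTP, for example SSE or streamed responses sharing a single connection. HTTP/2 Server Push itself is not a WebSocket alternative: it was designed for preloading assets, does not deliver messages to application code, and Chrome has removed support for it. WebTransport over HTTP/3 is the emerging alternative to WebSocket for some use cases.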

gRPC Streaming: Protocol Buffer-based bidirectional streaming. Efficient binary format. Good for service-to-service communication, less common for browser clients (requires gRPC-web proxy).

6.4 Spatial Streaming Patterns

Geospatial subscription: Clients subscribe to updates within a geographic bounding box or polygon. As vehicles enter/exit the area, subscriptions automatically route relevant updates.
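
A minimal sketch of bounding-box subscription matching follows. The client identifiers and the in-memory dictionary are hypothetical; a real system would likely keep subscriptions in a spatial index, and this version does not handle boxes that cross the antimeridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def contains(self, lon, lat):
        """True if the point lies inside the box (edges inclusive)."""
        return (self.min_lon <= lon <= self.max_lon
                and self.min_lat <= lat <= self.max_lat)


def matching_subscribers(subscriptions, lon, lat):
    """Return the ids of clients whose subscribed box contains the position."""
    return sorted(client for client, box in subscriptions.items()
                  if box.contains(lon, lat))


subscriptions = {
    "client-a": BBox(4.7, 52.3, 5.0, 52.4),    # hypothetical viewport
    "client-b": BBox(13.0, 52.3, 13.8, 52.7),  # hypothetical viewport
}
print(matching_subscribers(subscriptions, 4.89, 52.37))
```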

Spatial partitioning: Data is partitioned by geographic region (e.g., H3 cells, S2 cells, geohash prefixes) in the message broker. Consumers subscribe to relevant partitions based on the user's viewport.
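
As an illustration of prefix-based partitioning, the sketch below implements standard geohash encoding in pure Python and derives a partition key from a short prefix. Geohash is used here only because it needs no external library; H3 or S2 cells would play the same role. The `positions.` topic naming is a hypothetical convention.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash(lat, lon, precision=5):
    """Encode a position as a geohash string of the given length."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def partition_key(lat, lon, prefix_len=3):
    """Hypothetical topic/partition name derived from a geohash prefix."""
    return f"positions.{geohash(lat, lon, prefix_len)}"


print(partition_key(57.64911, 10.40744))
```

Because a longer geohash always starts with the shorter one for the same point, a consumer can subscribe at a coarse prefix and still receive everything published under finer cells.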

Level-of-detail streaming: At low zoom levels, send aggregated data (heatmaps, cluster counts). At high zoom levels, send individual features. Adapts data volume to viewport.
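
The level-of-detail idea can be sketched as a single function that returns either individual features or per-cell counts depending on zoom. The zoom threshold, the fixed one-degree grid, and the payload shape are assumptions for this example; in GIST the grid would more likely be H3 cells at a resolution tied to zoom.

```python
import math
from collections import Counter


def lod_payload(positions, zoom, detail_zoom=12, cell_deg=1.0):
    """Return individual features at high zoom, aggregated cell counts otherwise."""
    if zoom >= detail_zoom:
        return {"type": "features",
                "items": [{"lon": lon, "lat": lat} for lon, lat in positions]}

    # Count positions per grid cell; a cell is identified by its integer indices
    counts = Counter(
        (math.floor(lon / cell_deg), math.floor(lat / cell_deg))
        for lon, lat in positions
    )
    return {"type": "clusters",
            "items": [{"lon": (cx + 0.5) * cell_deg,   # cell centre
                       "lat": (cy + 0.5) * cell_deg,
                       "count": n}
                      for (cx, cy), n in sorted(counts.items())]}


positions = [(4.89, 52.37), (4.91, 52.38), (13.40, 52.52)]
print(lod_payload(positions, zoom=5))
print(len(lod_payload(positions, zoom=14)["items"]))
```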

7. Tile Infrastructure for Global Scale

7.1 Vector Tile Pipeline

For serving transport network data at global scale:

Data processing: Convert transport network data to vector tiles ahead of time using tippecanoe (originally by Mapbox, now maintained by Felt), or generate them on demand from PostGIS with Martin.
Tile storage: Store as PMTiles (single file on cloud storage) or in MBTiles (SQLite).
Tile serving:
- PMTiles on S3/R2/GCS: Serverless, uses HTTP range requests. No tile server needed. Very cost-effective.
- Martin (Rust): Dynamic vector tile server from PostGIS. Real-time tile generation from database queries.
- pg_tileserv (PostGIS): Another PostGIS tile server option.
- TileServer GL: Serves pre-generated tiles and can render raster tiles from vector sources.
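
Whichever serving option is used, clients address tiles by zoom/x/y in the Web Mercator (XYZ) scheme. The sketch below shows the standard conversion from longitude/latitude to tile indices and how a viewport maps to the set of tiles to request. The function names are illustrative, and viewports crossing the antimeridian are not handled.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) indices of the XYZ tile containing a position."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so positions on the map edge still fall in a valid tile
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_for_viewport(west, south, east, north, zoom):
    """List the (z, x, y) tiles covering a viewport."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # y grows southwards
    x_max, y_max = lonlat_to_tile(east, south, zoom)
    return [(zoom, x, y)
            for x in range(x_min, x_max + 1)
            for y in range(y_min, y_max + 1)]


print(lonlat_to_tile(0.0, 0.0, 1))
print(tiles_for_viewport(-10, -10, 10, 10, 1))
```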

7.2 Recommended Tile Architecture for GIST

Static layers (transport networks, stop locations, administrative boundaries): Pre-generate as PMTiles, serve from cloud object storage. Update periodically (daily/weekly).
Dynamic layers (vehicle positions, real-time status): Serve directly from PostGIS via Martin, or push to clients via WebSocket as GeoJSON/binary features.
Basemap: Use MapTiler or self-hosted OpenMapTiles for basemap tiles.

8. Geocoding and Search

8.1 Open-Source Geocoding

Pelias (Linux Foundation): Modular open-source geocoder. Supports multiple data sources (OSM, Who's on First, OpenAddresses, GeoNames). Elasticsearch-based.
Nominatim: OSM's geocoder. PostgreSQL-based. Good quality but single-source (OSM only).
Photon: Geocoder using OSM data, built on Elasticsearch (OpenSearch in recent releases). Fast, supports reverse geocoding.

8.2 Transport-Specific Search

For GIST, users need to search for:

Stop/station names (fuzzy matching, multilingual)
Route names and numbers
Place names and addresses
Operator names

Recommendation: Use Elasticsearch/OpenSearch with custom analyzers for transport entity search. Index stop/station names from GTFS/NeTEx alongside place names from OSM/geocoding services. Support multilingual search with language-specific analyzers.
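
As a sketch of what such a search request might look like, the function below builds a query body combining fuzzy name matching with an optional distance filter. The field names (`name`, `name_translations.*`, `location`) and the index layout are assumptions; `multi_match`, `fuzziness`, and `geo_distance` are standard query DSL constructs in Elasticsearch and OpenSearch.

```python
def stop_search_query(text, lat=None, lon=None, radius_km=5, size=10):
    """Build a search body for stop/station names (hypothetical field names)."""
    bool_query = {
        "must": [{
            "multi_match": {
                "query": text,
                # Boost the primary name over translated names
                "fields": ["name^2", "name_translations.*"],
                "fuzziness": "AUTO",
            }
        }]
    }
    if lat is not None and lon is not None:
        # Restrict results to stops near the user; 'location' must be a geo_point
        bool_query["filter"] = [{
            "geo_distance": {
                "distance": f"{radius_km}km",
                "location": {"lat": lat, "lon": lon},
            }
        }]
    return {"query": {"bool": bool_query}, "size": size}


body = stop_search_query("Centraal", lat=52.378, lon=4.900)
print(body)
```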

9. Recommended Technology Stack for GIST Geospatial Infrastructure

Component	Technology	Rationale
Basemap rendering	MapLibre GL JS	Open source, high performance, mobile support
Data visualization	Deck.gl	GPU-accelerated, transport-specific layers
Spatial database	PostGIS	Standard, mature, routing support (pgRouting)
Analytical database	DuckDB + Spatial	Fast analytics, cloud-native formats
Spatial indexing	H3	Hexagonal aggregation, multi-resolution
Tile serving	PMTiles (static) + Martin (dynamic)	Serverless + real-time
Real-time streaming	Apache Kafka + WebSocket	High throughput + browser delivery
Search	Elasticsearch / OpenSearch	Full-text + geospatial
Geocoding	Pelias or Photon	Open source, multilingual
Tile processing	Tippecanoe	Industry standard for tile generation
Data formats	GeoJSON (API), GeoParquet (storage), FlatGeobuf (streaming), MVT (tiles)	Best format for each use case

10. Scalability Considerations for Global System

10.1 Data Volume Estimates

Data Source	Approximate Volume	Update Frequency
GTFS Schedule (global)	~50-100 GB (all agencies)	Weekly/monthly
GTFS-RT (global)	~10-50 GB/day	Every 15-60 seconds
AIS (global)	~20-50 GB/day	Every 2-30 seconds
ADS-B (global)	~5-20 GB/day	Every 1-5 seconds
GBFS (global)	~1-5 GB/day	Every 30-60 seconds
DATEX II (European)	~5-20 GB/day	Every 1-5 minutes
OSM transport data	~10-20 GB (extract)	Minutely diffs available
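
To put the daily volumes in perspective, the following arithmetic converts the streaming rows of the table into a sustained average ingest rate. It uses only the estimates above, takes 1 GB as 1000 MB, and excludes the GTFS Schedule and OSM rows because they are not expressed per day.

```python
# (low, high) estimates in GB/day, copied from the table above
STREAMS_GB_PER_DAY = {
    "GTFS-RT": (10, 50),
    "AIS": (20, 50),
    "ADS-B": (5, 20),
    "GBFS": (1, 5),
    "DATEX II": (5, 20),
}
SECONDS_PER_DAY = 86_400


def sustained_mb_per_s(gb_per_day):
    """Average throughput in MB/s for a daily volume in GB (1 GB = 1000 MB)."""
    return gb_per_day * 1000 / SECONDS_PER_DAY


low = sum(lo for lo, _ in STREAMS_GB_PER_DAY.values())
high = sum(hi for _, hi in STREAMS_GB_PER_DAY.values())
print(f"{low}-{high} GB/day ~ {sustained_mb_per_s(low):.2f}-{sustained_mb_per_s(high):.2f} MB/s")
```

These are averages over a full day; peak-hour rates will be higher, so capacity planning should leave headroom above them.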

10.2 Key Scaling Strategies

Spatial partitioning: Partition data by geographic region (continental, national, or H3 cell level).
Temporal partitioning: Hot data (real-time, last 24 hours) in fast storage; warm data (last 30 days) in analytical storage; cold data (historical) in archival storage.
Tile-based delivery: Pre-compute and cache tiles to avoid per-request computation.
Edge caching: Use CDN for static/semi-static data (schedule data, basemap tiles).
Viewport-aware loading: Only load data within the user's current viewport and zoom level.
Progressive loading: Load coarse data first, refine as user zooms in.
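
The temporal partitioning rule above can be expressed as a small classification function. The tier names and the 24-hour and 30-day thresholds follow the text; treating the boundaries as inclusive is an assumption of this sketch.

```python
from datetime import datetime, timedelta, timezone


def storage_tier(event_time, now):
    """Classify a record as hot, warm, or cold by its age."""
    age = now - event_time
    if age <= timedelta(hours=24):
        return "hot"    # real-time, fast storage
    if age <= timedelta(days=30):
        return "warm"   # analytical storage
    return "cold"       # archival storage


now = datetime(2026, 2, 9, 12, 0, tzinfo=timezone.utc)
for age in (timedelta(hours=1), timedelta(days=7), timedelta(days=90)):
    print(age, storage_tier(now - age, now))
```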

11. References

OGC API standards: https://ogcapi.ogc.org/
INSPIRE Geoportal: https://inspire-geoportal.ec.europa.eu/
Deck.gl: https://deck.gl/
Kepler.gl: https://kepler.gl/
MapLibre: https://maplibre.org/
H3: https://h3geo.org/
PMTiles: https://protomaps.com/docs/pmtiles
Martin tile server: https://martin.maplibre.org/
Tippecanoe: https://github.com/felt/tippecanoe
GeoParquet: https://geoparquet.org/
Pelias geocoder: https://github.com/pelias/pelias
OGC SensorThings API: https://www.ogc.org/standard/sensorthings/
Apache Kafka: https://kafka.apache.org/
PostGIS: https://postgis.net/
DuckDB Spatial: https://duckdb.org/docs/extensions/spatial.html