Open Source Tools
Frameworks, libraries, and platforms for transport data
Open-Source Tools for Transport Data Systems
Research document for the Global Intelligence System for Transport (GIST)
Last updated: 2026-02-09
1. Overview
This document catalogs the open-source tools, platforms, and libraries relevant to building a global transport intelligence system. These tools span data processing, routing, visualization, data aggregation, and analysis.
2. Routing and Trip Planning Engines
2.1 OpenTripPlanner (OTP)
Repository: https://github.com/opentripplanner/OpenTripPlanner License: LGPL v3 Language: Java
Description: The leading open-source multimodal trip planner. Originally developed by OpenPlans, now maintained by a consortium of transit agencies and consultants (Entur, IBI Group, Conveyal, Leonard, and others).
Capabilities:
- Multimodal routing: transit + walking + cycling + car
- Consumes GTFS, GTFS-RT, NeTEx, OSM data
- Isochrone generation (reachable area calculations)
- Park-and-ride, bike-and-ride routing
- Accessibility-aware routing (wheelchair routing)
- Real-time trip updates (GTFS-RT integration)
- Flexible transit / demand-responsive routing (GTFS-Flex support)
- Fare calculation
- GraphQL API (OTP2); the older REST API from OTP1 is legacy
Architecture (OTP2):
- Graph-based routing on a street + transit network graph
- Built from GTFS/NeTEx + OSM data; the built graph can be saved to disk and reloaded at startup
- In-memory graph (can be large: national-scale graphs require substantial RAM)
- RAPTOR algorithm for transit routing (fast, range-based)
- A* for street routing
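To make the round-based idea behind RAPTOR concrete, here is a minimal sketch on a toy timetable. It is a simplified trip scan written for illustration, not OTP's implementation: the stop names, times, and the function `earliest_arrivals` are all hypothetical, and footpath transfers and minimum transfer times are ignored.

```python
from math import inf

# Hypothetical toy timetable: trip -> ordered (stop, arrival, departure), in minutes.
TRIPS = {
    "T1": [("A", 0, 0), ("B", 10, 11), ("C", 20, 20)],
    "T2": [("B", 15, 15), ("D", 30, 30)],
    "T3": [("C", 25, 25), ("D", 28, 28)],
}


def earliest_arrivals(trips, origin, depart_at, max_rounds=3):
    """Earliest arrival per stop; round k allows at most k vehicle boardings."""
    prev = {origin: depart_at}
    for _ in range(max_rounds):
        current = dict(prev)
        for stops in trips.values():
            boarded = False
            for stop, arrival, departure in stops:
                # Once on board, every later stop of the trip can be improved.
                if boarded and arrival < current.get(stop, inf):
                    current[stop] = arrival
                # Boarding may only use arrivals known from the previous round.
                if not boarded and prev.get(stop, inf) <= departure:
                    boarded = True
        if current == prev:  # no stop improved, so further rounds cannot help
            break
        prev = current
    return prev


result = earliest_arrivals(TRIPS, "A", 0)
```

With one round only stops on T1 are reachable; the second round adds D via a transfer at C. A production engine scans routes rather than individual trips and adds transfer legs between rounds.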
Deployment: Single JAR file. Can be containerized (Docker). Memory requirements scale with geographic coverage (city: 2-4 GB, country: 8-32 GB, continent: requires graph splitting).
Relevance to GIST:
- Could serve as the routing engine for Transport Angel's journey planning
- Can be deployed per-region with multiple instances
- NeTEx support makes it suitable for European data
- Active development, well-maintained
- Does NOT aggregate data (requires pre-built feeds)
Limitations:
- Memory-intensive for large geographic areas
- Graph build time can be long (hours for large regions)
- Not designed as a data aggregation platform
- Limited analytical/statistical capabilities
2.2 Valhalla
Repository: https://github.com/valhalla/valhalla License: MIT Language: C++
Description: Open-source routing engine for road/path networks. Originally developed by Mapzen, now maintained by community (with contributors from Mapbox, Microsoft, and others).
Capabilities:
- Turn-by-turn routing (driving, cycling, walking, multimodal)
- Isochrone generation
- Map matching (snap GPS traces to road network)
- Time-distance matrices (origin-destination calculations)
- Elevation-aware routing
- Optimized route (traveling salesman)
- Very fast and memory-efficient
Data sources: OpenStreetMap, GTFS (for multimodal), custom data via Valhalla tiles.
Relevance to GIST: Excellent for road/path routing, isochrone calculations, and distance matrix computations. More performant than OTP for road-only routing. Could complement OTP (Valhalla for road network, OTP for transit).
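As a rough illustration of the isochrone capability, the sketch below builds a JSON body for Valhalla's `/isochrone` endpoint. The field names (`locations`, `costing`, `contours`, `polygons`) follow Valhalla's documented request format as best understood here and should be checked against the deployed version; the helper name and the Berlin coordinates are hypothetical.

```python
import json


def isochrone_request(lat, lon, minutes, costing="pedestrian"):
    """Build a request body for a Valhalla isochrone query (sketch only)."""
    return {
        "locations": [{"lat": lat, "lon": lon}],
        "costing": costing,
        # One contour per travel-time budget, in minutes.
        "contours": [{"time": m} for m in sorted(minutes)],
        # Ask for polygons rather than line strings.
        "polygons": True,
    }


request = isochrone_request(52.52, 13.405, [20, 10])
body = json.dumps(request)
```

The body would be sent with an HTTP POST to the service; no network call is made here so the example stays self-contained.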
2.3 OSRM (Open Source Routing Machine)
Repository: https://github.com/Project-OSRM/osrm-backend License: BSD-2-Clause Language: C++
Description: Ultra-fast routing engine for road networks, using Contraction Hierarchies and Multi-Level Dijkstra. Originally by KIT (Karlsruhe Institute of Technology).
Capabilities:
- Among the fastest open-source road routing engines for point-to-point queries
- Turn-by-turn navigation
- Distance matrices
- Map matching
- Trip optimization
Limitations: Road network only (no transit), limited customization of routing profiles compared to Valhalla.
Relevance to GIST: Useful for fast road distance/time calculations at scale (e.g., calculating accessibility metrics for thousands of origins).
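For the many-origins use case, OSRM's table service takes all coordinates in the URL path. The following sketch only assembles such a URL; the base URL, the helper name, and the sample coordinates are assumptions for illustration, and the parameter names should be verified against the OSRM HTTP API documentation for the version in use.

```python
from urllib.parse import urlencode


def osrm_table_url(base_url, coords, sources=None, destinations=None, profile="driving"):
    """coords is a list of (lon, lat) pairs; OSRM expects longitude first."""
    path = ";".join(f"{lon:.6f},{lat:.6f}" for lon, lat in coords)
    params = {"annotations": "duration"}
    # sources/destinations are indices into coords, separated by semicolons.
    if sources is not None:
        params["sources"] = ";".join(map(str, sources))
    if destinations is not None:
        params["destinations"] = ";".join(map(str, destinations))
    return f"{base_url}/table/v1/{profile}/{path}?{urlencode(params, safe=';,')}"


url = osrm_table_url(
    "http://localhost:5000",  # hypothetical local osrm-routed instance
    [(13.38886, 52.517037), (13.397634, 52.529407)],
    sources=[0],
    destinations=[0, 1],
)
```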
2.4 Navitia
Repository: https://github.com/hove-io/navitia License: AGPL v3 Language: C++ (core), Python (API wrapper)
Description: Open-source transit platform developed by Hove (formerly Kisio/Keolis Digital). Powers transit routing for many French and European cities.
Capabilities:
- Multimodal journey planning
- Isochrone computation
- Stop/line/network data API
- Real-time updates (SIRI, GTFS-RT)
- NeTEx and GTFS ingestion
- Comprehensive REST API (places, journeys, departures, arrivals, line reports)
Relevance to GIST: Full-featured transit API platform, not just a routing engine. Could serve as a data serving layer for transit information. AGPL license may be a consideration.
2.5 r5 (Rapid Realistic Routing on Real-world and Reimagined networks)
Repository: https://github.com/conveyal/r5 License: MIT Language: Java
Description: High-performance routing engine by Conveyal, designed for transportation planning analysis (accessibility analysis, scenario comparison).
Capabilities:
- Extremely fast for computing travel times from many origins to many destinations
- Designed for accessibility analysis (how many jobs/people reachable within X minutes?)
- Scenario analysis (what if we add a new transit line?)
- GTFS and OSM-based
- Not designed as an API routing engine (batch analysis focus)
Relevance to GIST: Excellent for analytical queries about transport accessibility and equity. Could power "Transport Angel" analytical features (e.g., "How accessible is this neighborhood by transit?").
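The "how many jobs are reachable within X minutes" question is usually answered with a cumulative-opportunities measure. The sketch below computes it from a precomputed travel-time matrix; it does not call r5, and the zone names, travel times, and job counts are invented for illustration.

```python
def cumulative_accessibility(travel_times, opportunities, cutoff_minutes):
    """Sum opportunities at destinations reachable within the cutoff.

    travel_times[origin][destination] is in minutes; unreachable pairs
    are simply omitted from the inner dict.
    """
    return {
        origin: sum(
            opportunities.get(dest, 0)
            for dest, minutes in row.items()
            if minutes <= cutoff_minutes
        )
        for origin, row in travel_times.items()
    }


# Hypothetical inputs: two origin zones, two job centres.
travel_times = {
    "north": {"cbd": 25, "port": 50},
    "south": {"cbd": 40, "port": 15},
}
jobs = {"cbd": 12000, "port": 3000}

within_30 = cumulative_accessibility(travel_times, jobs, 30)
within_45 = cumulative_accessibility(travel_times, jobs, 45)
```

Comparing the result before and after a network change gives a simple scenario metric; overlaying it with demographic data gives an equity view.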
3. Data Aggregation and Catalogs
3.1 Transitland
Website: https://www.transit.land/ Repository: https://github.com/interline-io/transitland-lib (core library) Operator: Interline Technologies License: Various (library: GPLv3, data: varies by feed)
Description: The most comprehensive open transit data platform. Aggregates GTFS feeds from thousands of transit agencies worldwide.
Key components:
- Transitland Atlas: Curated registry of GTFS feed URLs (DMFR format). Currently tracks 2,500+ feeds from 60+ countries.
- Transitland API v2: GraphQL and REST API for querying aggregated transit data (stops, routes, agencies, departures, operators).
- Transitland Onestop IDs: Stable identifiers for transit entities across feeds and time. Format: `o-9q9-bayarearapidtransit` (operator), `s-9q9p1wrkkk-downtownberkeley` (stop), `r-9q9-blue` (route).
- transitland-lib: Go library for GTFS processing, validation, and database import.
Data model: Feeds --> Operators --> Routes --> Stops --> Schedules. Links to Onestop IDs for cross-feed identification.
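The Onestop ID examples above share a three-part shape: an entity-type prefix, a geohash, and a name. A minimal parser under that assumption could look like the following; the function name is hypothetical, and only the three prefixes shown in this document are handled.

```python
ENTITY_TYPES = {"o": "operator", "s": "stop", "r": "route"}


def parse_onestop_id(onestop_id):
    """Split a Onestop ID into entity type, geohash, and name."""
    # maxsplit=2 keeps any further hyphens inside the name component.
    prefix, geohash, name = onestop_id.split("-", 2)
    if prefix not in ENTITY_TYPES:
        raise ValueError(f"unknown Onestop ID prefix: {prefix!r}")
    return {"type": ENTITY_TYPES[prefix], "geohash": geohash, "name": name}


stop = parse_onestop_id("s-9q9p1wrkkk-downtownberkeley")
```

The geohash prefix means IDs for nearby entities sort together, which is one reason the scheme is attractive as a model for GIST identifiers.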
Relevance to GIST:
- Critical resource: Transitland's feed registry is the best starting point for discovering global GTFS feeds
- Onestop IDs could serve as (or inspire) GIST's stable entity identifier system
- API provides a model for how to serve aggregated transit data
- Data can be imported into GIST's own database
3.2 MobilityData Tools
Organization: https://mobilitydata.org/ GitHub: https://github.com/MobilityData
Key tools:
- GTFS Validator (https://github.com/MobilityData/gtfs-validator): Canonical open-source GTFS validation tool. Java-based. Validates both structure and content quality. Run by MobilityData, the GTFS governance organization. Essential for GIST's data quality pipeline.
- GBFS Validator (https://github.com/MobilityData/gbfs-validator): Validates GBFS feeds against the specification.
- GTFS Realtime Validator (https://github.com/MobilityData/gtfs-realtime-validator): Validates GTFS-RT feeds against both the spec and a corresponding GTFS Schedule feed.
- Mobility Database (https://database.mobilitydata.org/): Catalog of GTFS and GBFS feeds worldwide. Community-maintained. Successor to OpenMobilityData/TransitFeeds.
Relevance to GIST: MobilityData tools are essential for data quality. The Mobility Database is a key source for feed discovery alongside Transitland.
3.3 OpenMobilityData (formerly TransitFeeds)
Website: https://openmobilitydata.org/ (legacy: transitfeeds.com)
Description: Community-maintained catalog of GTFS and GTFS-RT feeds. Being superseded by MobilityData's Mobility Database but still contains many feeds.
3.4 National Access Points (NAPs)
Each EU country operates a National Access Point for transport data. Key ones:
| Country | NAP | Key Data |
|---|---|---|
| France | transport.data.gouv.fr | GTFS, NeTEx, GBFS, DATEX II |
| Germany | Mobilithek (formerly MDM/mCLOUD) | GTFS, NeTEx, DATEX II |
| Netherlands | NDOV/OVapi | NeTEx, SIRI, GTFS |
| Norway | Entur | NeTEx, SIRI (national journey planner) |
| Sweden | Trafiklab | GTFS, NeTEx, SIRI |
| UK | Bus Open Data Service (BODS) | TransXChange, SIRI, GTFS |
| Spain | NAP-Spain | NeTEx, DATEX II |
| Finland | Finap/Digitransit | GTFS, GBFS |
| Belgium | transportdata.be | NeTEx, SIRI |
Relevance to GIST: NAPs are authoritative sources for European transport data. Some provide APIs; others provide file downloads. Essential for building European coverage.
4. Street and Infrastructure Data Tools
4.1 OpenStreetMap Ecosystem
Core tools:
- Overpass API (https://overpass-api.de/): Query engine for extracting specific OSM data. Supports complex spatial and attribute queries. Essential for extracting transport infrastructure from OSM.
  - Example: Find all bus stops in Berlin: `[out:json];area["name"="Berlin"]->.a;node["highway"="bus_stop"](area.a);out body;`
- osmium (https://osmcode.org/osmium-tool/): Fast command-line tool for working with OSM data. Filter, extract, merge, sort OSM files. C++ with Python bindings (pyosmium).
- osm2pgsql (https://osm2pgsql.org/): Import OSM data into PostGIS. The standard tool for creating a routable/queryable OSM database. Supports Flex output with custom Lua transformations.
- osm2pgrouting: Import OSM into pgRouting-ready schema (nodes and edges with cost attributes).
- Geofabrik extracts (https://download.geofabrik.de/): Pre-extracted OSM data by region in PBF and Shapefile formats. Updated daily.
- OpenMapTiles (https://openmaptiles.org/): Schema and tooling for creating vector tiles from OSM data. Includes transport-relevant layers (transportation, transit).
- Protomaps (https://protomaps.com/): Modern OSM-based tile system using PMTiles format. Serverless tile serving.
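The Berlin bus stop query in the Overpass entry can be generalized to other areas. The sketch below builds the query text and a form-encoded POST body for the public interpreter endpoint; the helper names and the `timeout` setting are choices made for this example, and nothing is sent over the network.

```python
from urllib.parse import urlencode

OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def bus_stop_query(area_name):
    """Overpass QL for all bus stop nodes inside a named area."""
    # Escape backslashes and double quotes so the name stays inside its string.
    safe_name = area_name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "[out:json][timeout:60];"
        f'area["name"="{safe_name}"]->.a;'
        'node["highway"="bus_stop"](area.a);'
        "out body;"
    )


def overpass_request(area_name):
    """Return (url, form-encoded body) for a POST to the interpreter."""
    return OVERPASS_URL, urlencode({"data": bus_stop_query(area_name)})


url, body = overpass_request("Berlin")
```

Matching areas by name alone can return several regions with the same name; for production use a more specific filter (for example an administrative level or a relation ID) is likely needed.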
4.2 SharedStreets
Repository: https://github.com/sharedstreets Organization: SharedStreets (Open Transport Partnership)
Description: A shared reference system for streets. Provides stable identifiers for street segments that work across different geographic databases (OSM, government road networks, commercial maps).
Key concepts:
- SharedStreets Reference IDs: Deterministic IDs derived from geometry, allowing matching across different data sources
- SharedStreets Geometry IDs: Based on the physical geometry of roads
- SharedStreets Intersection IDs: Stable intersection identifiers
Tools:
- sharedstreets-js: JavaScript library for generating and working with SharedStreets references
- sharedstreets-builder: Generate SharedStreets tiles from OSM
Relevance to GIST: SharedStreets provides a way to link street-level data from different sources (traffic data from DATEX II, infrastructure from OSM, shared mobility from MDS) using stable identifiers. Useful for cross-referencing street-level data.
4.3 OpenAddresses
Website: https://openaddresses.io/ Repository: https://github.com/openaddresses/openaddresses
Description: Global open dataset of address points. Over 1 billion addresses from official government sources.
Relevance to GIST: Address-level geocoding for transport facilities, origins, and destinations.
5. Real-Time Data Processing Tools
5.1 GTFS-RT Processing
- gtfs-realtime-bindings: Official Protocol Buffer bindings for GTFS-RT in Java, Python, JavaScript, Go, .NET, Ruby, PHP. Essential for parsing GTFS-RT feeds.
- gtfs-realtime-validator: MobilityData's validator for checking GTFS-RT feed quality (see Section 3.2).
- transitclock (https://github.com/TheTransitClock/transitclock): Open-source arrival time prediction system. Ingests GTFS-RT vehicle positions and generates improved arrival predictions using historical data and machine learning.
5.2 AIS Data Processing
- pyais (https://github.com/M0r13n/pyais): Python library for decoding AIS messages. Supports NMEA sentence parsing and AIS message decoding.
- libais (https://github.com/schwehr/libais): C++ library with Python bindings for AIS message decoding. More performant than pyais for high-volume processing.
- AISHub (https://www.aishub.net/): Community platform for sharing AIS data from local receivers.
- OpenSky Network (https://opensky-network.org/): Open ADS-B/Mode S data network for aviation. Provides APIs and historical data. The aviation equivalent of AIS data aggregation.
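AIS messages arrive wrapped in NMEA sentences, and a cheap first filter before handing them to a decoder such as pyais or libais is to verify the NMEA checksum (the XOR of every character between the leading `!` or `$` and the `*`). The sketch below uses only the standard library; the function names and the sample payload are hypothetical.

```python
from functools import reduce


def nmea_checksum(body):
    """XOR of all characters between the start delimiter and the '*'."""
    return reduce(lambda acc, ch: acc ^ ord(ch), body, 0)


def is_valid_sentence(sentence):
    """Check the two-digit hex checksum that follows the '*'."""
    if len(sentence) < 4 or sentence[0] not in "!$" or "*" not in sentence:
        return False
    body, _, given = sentence[1:].rpartition("*")
    try:
        # Only the first two characters matter; a trailing CR/LF is ignored.
        return nmea_checksum(body) == int(given[:2], 16)
    except ValueError:
        return False


# Build a syntactically valid sentence around a made-up payload.
body = "AIVDM,1,1,,A,HYPOTHETICALPAYLOAD,0"
sentence = f"!{body}*{nmea_checksum(body):02X}"
```

Dropping corrupted sentences early keeps decoder error handling simple when ingesting high-volume feeds.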
5.3 Stream Processing
- Apache Flink (https://flink.apache.org/): Distributed stream processing framework. Excellent for complex event processing over transport data streams (delay detection, anomaly detection, pattern matching). Supports SQL over streams.
- Apache Kafka Streams: Lightweight stream processing library built into Kafka. Good for simpler transformations (format conversion, filtering, aggregation).
- Benthos / Redpanda Connect (https://github.com/redpanda-data/connect): Declarative stream processing. Good for connecting diverse transport data sources to processing pipelines with minimal code.
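To illustrate the kind of aggregation these frameworks perform, here is a plain-Python sketch of a tumbling-window mean delay per route. It is not Flink or Kafka Streams code; the event shape `(epoch_seconds, route_id, delay_seconds)` and the function name are assumptions, and a real stream job would also have to handle late and out-of-order events.

```python
from collections import defaultdict


def mean_delay_per_window(events, window_seconds=300):
    """Group events into fixed windows and average the delay per route."""
    sums = defaultdict(lambda: [0, 0])  # (window_start, route) -> [total, count]
    for ts, route_id, delay in events:
        window_start = ts - ts % window_seconds
        bucket = sums[(window_start, route_id)]
        bucket[0] += delay
        bucket[1] += 1
    return {key: total / count for key, (total, count) in sorted(sums.items())}


# Hypothetical delay events.
events = [
    (0, "M1", 60),
    (120, "M1", 180),
    (310, "M1", 30),
    (20, "B7", 0),
]
means = mean_delay_per_window(events)
```

A delay-detection rule could then flag any window whose mean exceeds a threshold.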
6. Transport Analysis and Planning Tools
6.1 Conveyal Analysis
Repository: https://github.com/conveyal/analysis-ui (frontend), https://github.com/conveyal/r5 (backend) License: MIT
Description: Web-based accessibility analysis platform by Conveyal. Computes travel time from any point to every other point in a region, enabling accessibility analysis.
Capabilities:
- Regional accessibility analysis
- Scenario comparison (modify transit networks, see impact)
- Isochrone generation
- Equity analysis (overlay with demographic data)
- Uses r5 engine for fast computation
Relevance to GIST: Could power analytical features of Transport Angel (accessibility analysis, equity metrics, scenario planning).
6.2 A/B Street
Repository: https://github.com/a-b-street/abstreet License: Apache 2.0 Language: Rust
Description: Traffic simulation and street redesign tool. Allows editing streets (add bike lanes, change parking, modify signals) and simulating the impact.
Relevance to GIST: Interesting for scenario analysis and urban transport planning features, but more of a simulation tool than a data platform.
6.3 MATSim
Repository: https://github.com/matsim-org/matsim-libs License: GPL v2 Language: Java
Description: Multi-Agent Transport Simulation. Large-scale agent-based transport demand simulation. Used in academic and planning contexts.
Relevance to GIST: Relevant for transport modeling and simulation, less for real-time data aggregation.
6.4 urbanaccess
Repository: https://github.com/UDST/urbanaccess License: AGPL v3 Language: Python
Description: Python library for computing transit/walk accessibility metrics from GTFS data. Creates integrated network datasets from GTFS and OSM for accessibility analysis using Pandana.
6.5 peartree
Repository: https://github.com/kuanb/peartree License: MIT Language: Python
Description: Converts GTFS feeds into directed networkx graphs, enabling network analysis of transit systems (graph theory metrics, connectivity analysis).
6.6 gtfs-segments
Repository: https://github.com/UTEL-UIUC/gtfs_segments License: MIT Language: Python
Description: Creates route segments from GTFS data for analysis, visualization, and comparison.
7. GTFS Data Libraries and Tools
7.1 Python
- gtfs-kit (https://github.com/mrcagney/gtfs_kit): Python library for analyzing GTFS feeds. Load, validate, compute statistics, visualize.
- partridge (https://github.com/remix/partridge): Fast, memory-efficient GTFS loading in Python. Filters by date, route, agency.
- gtfs-realtime-bindings: Python protobuf bindings for GTFS-RT.
- mobility-db-api: Python client for the MobilityData catalog.
7.2 JavaScript / TypeScript
- gtfs (npm package by BlinkTag, developed in the node-gtfs repository): Node.js GTFS import, export, and manipulation. Imports GTFS to SQLite.
- gtfs-to-geojson: Convert GTFS shapes and stops to GeoJSON.
- gtfs-to-html: Generate HTML timetables from GTFS data.
7.3 Go
- transitland-lib (https://github.com/interline-io/transitland-lib): Go library for GTFS, GTFS-RT, GBFS processing. Used by Transitland.
7.4 Rust
- gtfs-structures (https://github.com/rust-transit/gtfs-structure): Rust library for parsing GTFS feeds.
7.5 Database Loaders
- gtfs-via-postgres (https://github.com/public-transport/gtfs-via-postgres): Import GTFS into PostgreSQL with full SQL query support.
- gtfsdb (https://github.com/OpenTransitTools/gtfsdb): Python/SQLAlchemy-based GTFS database loader.
8. NeTEx and SIRI Tools
Open-source tooling for NeTEx and SIRI is more limited than for GTFS:
8.1 NeTEx Tools
- netex-java-model (https://github.com/entur/netex-java-model): Java JAXB model for NeTEx, maintained by Entur (Norway). Essential for parsing NeTEx XML in Java.
- chouette (https://github.com/enroute-mobi/chouette): French open-source platform for managing and converting transit data. Supports NeTEx, GTFS, Neptune. Can convert between formats. Maintained by enRoute (formerly AFIMB/Cerema).
- netex-validator-java (https://github.com/entur/netex-validator-java): NeTEx validation tool by Entur.
- greenlight (Entur): NeTEx validation service (Nordic profile).
8.2 NeTEx <--> GTFS Conversion
- netex-to-gtfs (various implementations): Entur maintains tools for converting Nordic NeTEx to GTFS. Chouette supports NeTEx-to-GTFS conversion.
- gtfs-to-netex: Less common direction, but chouette supports it. Lossy conversion (GTFS lacks many NeTEx concepts).
8.3 SIRI Tools
- siri-java-model (Entur): Java JAXB model for SIRI.
- siri-sx-to-gtfs-rt: Convert SIRI Situation Exchange to GTFS-RT alerts.
- onebusaway-siri (https://github.com/OneBusAway/onebusaway-siri): Java SIRI client library.
8.4 TransXChange Tools
- transxchange2gtfs (https://github.com/planarnetwork/transxchange2gtfs): Convert UK TransXChange data to GTFS. TypeScript.
- Various UK-specific converters maintained by UK DfT and ITO World.
9. Visualization and Dashboards
9.1 Transport-Specific Visualization
- Kepler.gl (https://kepler.gl/): Geospatial visualization tool (see geospatial-tech.md for details). Particularly good for transport data exploration.
- Unfolded Studio (now Foursquare Studio): Commercial evolution of Kepler.gl with collaboration features.
- Deck.gl (https://deck.gl/): Low-level visualization library (see geospatial-tech.md). The TripsLayer is specifically designed for vehicle movement visualization.
- Grafana + Geospatial plugins: Open-source dashboarding with map panels. Good for monitoring transport system health (delays, vehicle counts, availability). Grafana Geomap panel supports WMS, GeoJSON, and various basemaps.
- Apache Superset (https://superset.apache.org/): Open-source BI/analytics platform with geospatial visualization support (Deck.gl integration). Good for building transport analytics dashboards.
9.2 Timetable and Schedule Visualization
- gtfs-to-html (BlinkTag): Generate HTML timetables from GTFS feeds.
- transit-map (https://github.com/juliuste/transit-map): Generate schematic transit maps.
- d3-tube-map: D3.js plugin for schematic tube/metro maps.
9.3 Accessibility Visualization
- Mapnificent (https://www.mapnificent.net/): Visualizes transit travel time isochrones for cities worldwide. Based on GTFS data.
10. Data Quality and Validation
10.1 GTFS Validation
- MobilityData GTFS Validator (canonical): https://github.com/MobilityData/gtfs-validator
- gtfsvtor (Mecatran): Alternative GTFS validator
- transport-validator (French NAP): Validator used by transport.data.gouv.fr
10.2 Spatial Data Validation
- GDAL/OGR (https://gdal.org/): The Swiss Army knife for geospatial data. Format conversion (ogr2ogr), validation, reprojection, spatial operations. Supports nearly every geospatial format. Essential tool for GIST.
- Shapely (Python): Geometric operations and validation.
- Turf.js (JavaScript): Geospatial analysis functions for the browser/Node.js.
10.3 General Data Quality
- Great Expectations (https://greatexpectations.io/): Python-based data quality framework. Define expectations (e.g., "all stops have valid coordinates", "every trip's route_id references an existing route"), validate data against them.
- Soda (https://www.soda.io/): Data quality monitoring with SQL-based checks.
- dbt tests: Built-in and custom data quality tests in the dbt transformation framework.
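The two example expectations named above can be expressed as plain checks before committing to a framework. The sketch below uses only the standard library rather than Great Expectations, Soda, or dbt; the sample rows and function names are invented. Note that GTFS allows empty coordinates for some `location_type` values, which this simplified check would flag.

```python
import csv
import io

# Hypothetical stops.txt content with one out-of-range and one empty coordinate.
STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
S1,Central,52.5200,13.4050
S2,Harbour,95.0000,13.4000
S3,Airport,,
"""


def stops_with_invalid_coordinates(stops_csv):
    """Return stop_ids whose coordinates are missing or out of range."""
    bad = []
    for row in csv.DictReader(io.StringIO(stops_csv)):
        try:
            lat, lon = float(row["stop_lat"]), float(row["stop_lon"])
            ok = -90 <= lat <= 90 and -180 <= lon <= 180
        except ValueError:  # empty or non-numeric value
            ok = False
        if not ok:
            bad.append(row["stop_id"])
    return bad


def trips_with_unknown_routes(trips, route_ids):
    """Return trip_ids whose route_id is not in the set of known routes."""
    return [t["trip_id"] for t in trips if t["route_id"] not in route_ids]


invalid_stops = stops_with_invalid_coordinates(STOPS_TXT)
orphan_trips = trips_with_unknown_routes(
    [{"trip_id": "T1", "route_id": "R1"}, {"trip_id": "T2", "route_id": "R9"}],
    {"R1"},
)
```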
11. Journey Planning Platforms
11.1 Digitransit (Finland)
Repository: https://github.com/HSLdevcom/digitransit-ui License: Various (EUPL, MIT)
Description: Open-source national journey planner platform developed by HSL (Helsinki Regional Transport). Powers Finland's national journey planner.
Architecture: OTP2 for routing, Pelias for geocoding, GTFS/GBFS data, React frontend with MapLibre.
Relevance to GIST: A complete, production-grade open-source journey planner. Demonstrates how to build a national-scale system. Good architectural reference.
11.2 OneBusAway
Repository: https://github.com/OneBusAway License: Apache 2.0
Description: Open-source platform for real-time transit information. Originally developed at University of Washington. Provides APIs, web interface, and mobile apps for transit arrival information.
Relevance to GIST: Good model for real-time transit information delivery. Supports GTFS, GTFS-RT, SIRI.
11.3 OpenTripPlanner (discussed in Section 2.1)
11.4 Entur (Norway)
Open-source components: https://github.com/entur
- Journey planner (OTP2-based)
- NeTEx/SIRI tooling
- Stop place registry (Tiamat backend, Abzu frontend)
- Data import/export tools
Relevance to GIST: Entur is the most comprehensive national open-source transport data platform. Excellent model for a national-to-global system. Their NeTEx tooling is the best available.
12. Data Format Conversion Tools
| Source Format | Target Format | Tool | Notes |
|---|---|---|---|
| GTFS | NeTEx | netex-java-model, chouette | Lossy: NeTEx has many more fields |
| NeTEx | GTFS | chouette, entur tools | Lossy: significant detail lost |
| TransXChange | GTFS | transxchange2gtfs | Good for UK bus data |
| OSM | Routing graph | osm2pgsql, osm2pgrouting, Valhalla | Multiple approaches |
| OSM | Vector tiles | OpenMapTiles, tippecanoe | Standard pipeline |
| GTFS | GeoJSON | gtfs-to-geojson | Shapes and stops |
| GeoJSON | Vector tiles | tippecanoe | Standard pipeline |
| Shapefile | GeoJSON/GeoPackage | ogr2ogr (GDAL) | Universal converter |
| SIRI | GTFS-RT | siri-to-gtfs-rt converters | Partial mapping |
| AIS NMEA | JSON/CSV | pyais, libais | Message decoding |
| Any spatial | Any spatial | GDAL/OGR (ogr2ogr) | Universal Swiss army knife |
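As a small worked instance of one row in the table, the sketch below converts GTFS `stops.txt` rows into a GeoJSON FeatureCollection using only the standard library. It is an illustration of the mapping, not the gtfs-to-geojson tool; the sample stops and the function name are hypothetical.

```python
import csv
import io
import json

# Hypothetical stops.txt content.
STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
S1,Central,52.5200,13.4050
S2,Harbour,52.5100,13.3900
"""


def stops_to_geojson(stops_csv):
    """Map each GTFS stop to a GeoJSON Point feature."""
    features = []
    for row in csv.DictReader(io.StringIO(stops_csv)):
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {
                "type": "Point",
                "coordinates": [float(row["stop_lon"]), float(row["stop_lat"])],
            },
            "properties": {"stop_id": row["stop_id"], "stop_name": row["stop_name"]},
        })
    return {"type": "FeatureCollection", "features": features}


collection = stops_to_geojson(STOPS_TXT)
geojson_text = json.dumps(collection)
```

The resulting text could be fed to tippecanoe to continue along the GeoJSON to vector tiles row of the table.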
13. Emerging and Noteworthy Projects
13.1 Overture Maps Foundation
Website: https://overturemaps.org/ Members: Amazon, Meta, Microsoft, TomTom, and others.
Description: Building an open map dataset combining multiple sources (OSM, commercial contributors, government data). Releases include:
- Transportation theme (road network with stable segment IDs)
- Places theme
- Buildings theme
- Administrative boundaries theme
Format: GeoParquet, distributed via cloud storage.
Relevance to GIST: Overture's transportation theme provides a high-quality road network with stable IDs (similar to SharedStreets concept). Could serve as a reference network layer for GIST.
13.2 Protomaps
Website: https://protomaps.com/ Repository: https://github.com/protomaps
Description: Open-source map stack: PMTiles (serverless tile format) + basemap styles + tile generation. Designed for simplicity and cost-effective self-hosting.
Relevance to GIST: PMTiles format is ideal for GIST's tile serving infrastructure. Serverless, cost-effective, simple.
13.3 felt/tippecanoe
Repository: https://github.com/felt/tippecanoe License: BSD-2-Clause
Description: Build vector tilesets from GeoJSON features. The standard tool for generating optimized vector tiles. Now maintained by Felt.
Relevance to GIST: Essential for generating transport network vector tiles.
13.4 DuckDB Spatial
Repository: https://github.com/duckdb/duckdb_spatial
Description: Spatial extension for DuckDB. Brings spatial operations to DuckDB's analytical query engine. Reads/writes GeoJSON, GeoPackage, Shapefile, GeoParquet.
Relevance to GIST: Ideal for analytical spatial queries (aggregation, statistics, batch processing). Complements PostGIS (which is better for transactional/real-time queries).
13.5 Martin
Repository: https://github.com/maplibre/martin License: MIT / Apache 2.0 Language: Rust
Description: PostGIS vector tile server. Serves Mapbox Vector Tiles directly from PostGIS tables and functions. Very fast.
Relevance to GIST: Enables real-time vector tile generation from the PostGIS transport database. Key component for dynamic map visualization.
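Vector tile servers address tiles by zoom, column, and row in the Web Mercator tiling scheme. The sketch below converts a coordinate to those indices with the standard slippy-map formula and formats a tile URL in the `/{source}/{z}/{x}/{y}` pattern; the base URL and the source name `stops` are hypothetical, and the URL pattern should be confirmed against the Martin documentation.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) tile indices containing a WGS84 coordinate."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that lon = 180 or extreme latitudes stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_url(base_url, source, lon, lat, zoom):
    """Format a tile URL for the tile that contains the coordinate."""
    x, y = lonlat_to_tile(lon, lat, zoom)
    return f"{base_url}/{source}/{zoom}/{x}/{y}"


url = tile_url("http://localhost:3000", "stops", 0.0, 0.0, 1)
```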
14. Tool Selection Matrix for GIST
14.1 Core Infrastructure
| Need | Primary Tool | Alternative | Rationale |
|---|---|---|---|
| Spatial database | PostGIS | -- | Standard, mature, extensions (pgRouting, pgvector) |
| Analytical database | DuckDB Spatial | -- | Fast analytics, GeoParquet native |
| Message broker | Apache Kafka | Redis Streams | High throughput, replay, partitioning |
| Search engine | Elasticsearch/OpenSearch | -- | Full-text + geospatial |
| Orchestration | Dagster | Apache Airflow | Modern, data-aware |
| Transformation | dbt | -- | SQL-based, testable |
| Basemap | MapLibre GL JS | -- | Open source, high performance |
| Visualization | Deck.gl | -- | GPU-accelerated, transport layers |
| Tile serving | PMTiles + Martin | -- | Static + dynamic |
| Geocoding | Pelias | Photon | Modular, multi-source |
14.2 Data Processing
| Need | Primary Tool | Alternative | Rationale |
|---|---|---|---|
| GTFS loading | gtfs-via-postgres / transitland-lib | partridge (Python) | Direct PostgreSQL import |
| GTFS validation | MobilityData GTFS Validator | -- | Canonical validator |
| NeTEx processing | netex-java-model + chouette | -- | Best available NeTEx tooling |
| OSM processing | osmium + osm2pgsql | -- | Standard pipeline |
| Format conversion | GDAL/OGR | -- | Universal converter |
| AIS decoding | pyais / libais | -- | AIS message parsing |
| Tile generation | tippecanoe | -- | Standard tool |
| Stream processing | Kafka Streams / Flink | -- | Simple vs complex needs |
14.3 Routing and Analysis
| Need | Primary Tool | Alternative | Rationale |
|---|---|---|---|
| Transit routing | OpenTripPlanner 2 | Navitia | Widest adoption, NeTEx support |
| Road routing | Valhalla | OSRM | More features, still fast |
| Accessibility analysis | r5 (Conveyal) | -- | Purpose-built for this |
| Data quality | Great Expectations + dbt tests | -- | Comprehensive quality framework |
14.4 AI / LLM
| Need | Primary Tool | Alternative | Rationale |
|---|---|---|---|
| Agent framework | LangGraph | LlamaIndex | Stateful agents, tool calling |
| Vector store | pgvector | Qdrant | Same DB as spatial data |
| Text-to-SQL | Vanna.ai + custom | LangChain SQL tools | Learns from your schema |
| LLM | Claude / GPT-4 | Open-source (Llama 3) | Quality for transport queries |
15. Data Source Discovery Checklist
When building GIST, discover data sources through:
- Transitland Atlas: https://www.transit.land/feeds -- global GTFS feed registry
- MobilityData Mobility Database: https://database.mobilitydata.org/ -- global GTFS/GBFS catalog
- EU National Access Points: Listed per country (see Section 3.4)
- OpenStreetMap: Global infrastructure data
- GBFS feeds: Listed on gbfs.org and MobilityData
- MarineTraffic / AISHub: AIS vessel data
- OpenSky Network: ADS-B aviation data
- OAG / Cirium: Aviation schedule data (commercial)
- Overture Maps: Open road network data
- Geofabrik: OSM regional extracts
- National statistical offices: Transport statistics
- Wikidata: Transport entity metadata and identifiers
16. References
- OpenTripPlanner: https://www.opentripplanner.org/
- Valhalla: https://github.com/valhalla/valhalla
- Transitland: https://www.transit.land/
- MobilityData: https://mobilitydata.org/
- Entur: https://developer.entur.org/
- Digitransit: https://digitransit.fi/en/
- SharedStreets: https://sharedstreets.io/
- Overture Maps: https://overturemaps.org/
- Conveyal: https://conveyal.com/
- OneBusAway: https://onebusaway.org/
- GDAL: https://gdal.org/
- Martin: https://martin.maplibre.org/
- Protomaps: https://protomaps.com/
- DuckDB: https://duckdb.org/
- Apache Superset: https://superset.apache.org/
- Kepler.gl: https://kepler.gl/
- Deck.gl: https://deck.gl/
- OSM tools: https://wiki.openstreetmap.org/wiki/Software