Nate Wessel edited this page Dec 9, 2017 · 7 revisions

What it is

The goal of this project is to use real-time transit fleet location data to create a 'retroactive' GTFS 'schedule' of an agency's observed transit service. Real-time location data is pulled from the NextBus API, which is currently used by quite a few agencies.

How the code works

The NextBus API reports all fleet vehicle locations that have updated since a given time. We poll the API about every 10 seconds, requesting at time t2 all updates since an earlier time t1. These reports are stored in a PostGIS database. Vehicle location reports are partitioned into distinct, ordered trips and blocks. A trip is an ordered sequence of vehicle reports, basically a GPS trace. Left unbroken, these traces would get very long and double back on themselves many times, so we start a new trip whenever a vehicle does one of the following:

  • Goes off the radar for more than some amount of time
  • Changes its headsign
  • Changes its route_id
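
The three rules above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Report` class, the `split_into_trips` function, and the 180-second `MAX_GAP_SECONDS` threshold are all hypothetical names and values chosen for the example (the text only says "some amount of time").

```python
from dataclasses import dataclass

# Hypothetical threshold; the real value is not stated on this page.
MAX_GAP_SECONDS = 180


@dataclass
class Report:
    """One vehicle location report (field names are assumptions)."""
    vehicle_id: str
    route_id: str
    headsign: str
    timestamp: float  # seconds since epoch
    lon: float
    lat: float


def split_into_trips(reports, max_gap=MAX_GAP_SECONDS):
    """Split one vehicle's time-ordered reports into a list of trips."""
    trips = []
    current = []
    for report in reports:
        if current:
            prev = current[-1]
            starts_new_trip = (
                report.timestamp - prev.timestamp > max_gap  # off the radar
                or report.headsign != prev.headsign          # headsign changed
                or report.route_id != prev.route_id          # route changed
            )
            if starts_new_trip:
                trips.append(current)
                current = []
        current.append(report)
    if current:
        trips.append(current)
    return trips
```

Each returned trip is a list of consecutive reports that share a route_id and headsign and have no gap longer than `max_gap`.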

Blocks are sequences of consecutive trips. A new block is started only if a vehicle fails to report its location in a timely manner.

Once a trip ends, it is sent off for processing. First the trip is cleaned and simplified by removing redundant, co-located points at the start or end. These mostly come about because of long in-station dwell times. Next, the trip is map-matched using OSRM and data from OpenStreetMap. The data we've been working with have a 20-second delay between location updates, and map-matching lets us estimate a more realistic route geometry.
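
As an illustration of the cleaning step, here is one way the redundant points at either end of a trip might be trimmed. It is only a sketch under assumptions: the function names, the `(lon, lat, timestamp)` tuple layout, and the 10-meter tolerance are hypothetical, and keeping the last point of the starting cluster and the first point of the ending cluster is a design choice made here, not something the page specifies.

```python
import math


def haversine_m(a, b):
    """Great-circle distance in meters between two (lon, lat, ...) tuples."""
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(h))


def trim_dwell_points(points, tolerance_m=10.0):
    """Drop co-located points at the start and end of a time-ordered trip.

    points: list of (lon, lat, timestamp) tuples.
    tolerance_m: hypothetical radius within which points count as co-located.
    """
    if len(points) < 3:
        return list(points)
    # Advance past points that sit on top of the first point, keeping the
    # last of them (the one closest in time to the actual departure).
    start = 0
    while (start + 1 < len(points)
           and haversine_m(points[0], points[start + 1]) <= tolerance_m):
        start += 1
    # Walk back from the end, keeping the first point of the final cluster
    # (the one closest in time to the actual arrival).
    end = len(points) - 1
    while (end - 1 > start
           and haversine_m(points[-1], points[end - 1]) <= tolerance_m):
        end -= 1
    return points[start:end + 1]
```

Points in the middle of the trip are left alone, since the text only mentions trimming at the start or end.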

We get schedule data for each route from the NextBus API. From this we find the set of stops that are expected for the given route_id and headsign/direction. We now have a trip path with points in space and in time (from the location report time), and a set of stops. Any stop within x meters of the trip geometry is 'snapped' to the nearest point on the line, and the time at that point is interpolated from the nearest usable vehicle location timestamps.
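
The snapping and interpolation step might look something like the sketch below. It makes several simplifying assumptions that are not from the source: coordinates are already in a projected, meter-based system, every vertex of the trip geometry carries a timestamp, and time is interpolated linearly along each segment. The function name `snap_stop_time` and its `max_dist` parameter (standing in for "x meters") are hypothetical.

```python
import math


def snap_stop_time(stop, trace, max_dist):
    """Snap a stop to a timed polyline and interpolate the time there.

    stop: (x, y) in projected meters.
    trace: list of (x, y, t) vertices in time order.
    Returns (distance, time) for the nearest point on the line, or None
    if the stop is farther than max_dist from the whole trace.
    """
    best = None
    for (x1, y1, t1), (x2, y2, t2) in zip(trace, trace[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg_len_sq = dx * dx + dy * dy
        if seg_len_sq == 0:
            frac = 0.0  # degenerate segment: both vertices coincide
        else:
            # Fraction along the segment of the perpendicular projection,
            # clamped so the snapped point stays on the segment.
            frac = ((stop[0] - x1) * dx + (stop[1] - y1) * dy) / seg_len_sq
            frac = max(0.0, min(1.0, frac))
        px, py = x1 + frac * dx, y1 + frac * dy
        dist = math.hypot(stop[0] - px, stop[1] - py)
        if best is None or dist < best[0]:
            best = (dist, t1 + frac * (t2 - t1))
    if best is None or best[0] > max_dist:
        return None
    return best
```

For a trace that passes the same stop more than once, this picks only the single closest approach; a real implementation would likely need to handle that case more carefully.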

Stop times and trips are stored in the PostGIS DB, from which they can be extracted as a regular GTFS dataset using the included scripts.
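
To give a sense of what the extracted output looks like, the sketch below writes a few stop times in the layout of a GTFS stop_times.txt file. It is not one of the included scripts: the function names and the input row layout are hypothetical, and it assumes times have already been converted to seconds since the start of the service day. The column names are the standard GTFS ones.

```python
import csv
import io


def gtfs_time(seconds):
    """Format seconds since the start of the service day as HH:MM:SS.

    GTFS allows hours of 24 or more for trips that run past midnight.
    """
    seconds = int(round(seconds))
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def write_stop_times(rows, out):
    """Write rows of (trip_id, stop_id, stop_sequence, seconds) as CSV.

    The observed time is used for both arrival and departure, which is a
    simplification made for this example.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["trip_id", "arrival_time", "departure_time",
                     "stop_id", "stop_sequence"])
    for trip_id, stop_id, stop_sequence, seconds in rows:
        t = gtfs_time(seconds)
        writer.writerow([trip_id, t, t, stop_id, stop_sequence])


# Example usage with made-up identifiers:
buffer = io.StringIO()
write_stop_times([("trip_1", "stop_A", 1, 3661),
                  ("trip_1", "stop_B", 2, 90000)], buffer)
```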
