A Python 3.9+ library to build GTFS feeds from basic route information. Inspired by Conveyal's geom2gtfs. Makes naive timetables, but they are often good enough for preliminary work.
- Alex Raichev (maintainer), 2014-09
To use as a library in your own project (called something other than make_gtfs
), make a Python 3.9+ virtual environment for your project, then run poetry add make_gtfs
.
To develop the make_gtfs
repo, Git clone it, make a Python 3.9+ virtual environment, then run poetry install --no-root && pre-commit install
.
Use as a library, or use from the command line by typing make_gtfs --help
and following the instructions.
Make GTFS uses the following files to build a GTFS feed.
meta.csv
(required). A CSV file containing network metadata. The CSV file contains the following columns.agency_name
(required): string; the name of the transport agencyagency_url
(required): string; a fully qualified URL for the transport agencyagency_timezone
(required): string; timezone where the transit agency is located; timezone names never contain the space character but may contain an underscore; refer to http://en.wikipedia.org/wiki/List_of_tz_zones for a list of valid valuesstart_date
,end_date
(required): strings; the start and end dates for which all this network information is valid formated as YYYYMMDD strings
shapes.geojson
(required). A GeoJSON file containing route shapes. The file comprises one feature collection of LineString features, where each feature's properties contains at least the attributeshape_id
. Each LineString should represent the run of one representive trip of a route. In particular, the LineString should not traverse the same section of road many times, unless you want a trip to actually do that.service_windows.csv
(required). A CSV file containing service window information. A service window is a time interval and a set of days of the week during which all routes have constant service frequency, e.g. Saturday and Sunday 07:00 to 09:00. The CSV file contains the following columns.service_window_id
(required): string; a unique identifier for a service windowstart_time
,end_time
(required): string; the start and end times of the service window in HH:MM:SS format where the hour is less than 24monday
,tuesday
,wednesday
,thursday
,friday
,saturday
,sunday
(required); 0 or 1; indicates whether the service is active on the given day (1) or not (0)
frequencies.csv
(required). A CSV file containing route frequency information. The CSV file contains the following columns.route_short_name
(required): string; a unique short name for the route, e.g. '51X'route_long_name
(required): string; full name of the route that is more descriptive thanroute_short_name
route_type
(required): integer; the GTFS type of the routeservice_window_id
(required): string; a service window ID for the route taken from the fileservice_windows.csv
direction
(required): 0, 1, or 2; indicates whether the route travels in the direction of its shape (1), or in the reverse direction of its shape (0), or in both directions (2); in the latter case, trips will be created that travel in both directions along the route's shape, each direction operating at the given frequency; otherwise, trips will be created that travel in only the given directionfrequency
(required): integer; the frequency of the route during the service window in vehicles per hour.shape_id
(required): string; a shape ID that is listed inshapes.geojson
and corresponds to the linestring of the (route, direction, service window) tuplespeed
(optional): float; the average speed of the route in kilometers per hour
Missing speed values will be filled with values from the library's dictionary SPEED_BY_RTYPE.
speed_zones.geojson
(optional). A GeoJSON file of Polygons representing speed zones for routes. The file consists of one feature collection of Polygon features (in WGS84 coordinates), each with the propertiesspeed_zone_id
(required): string; a unique identifier of the zone polygon; can be re-used if the polygon is re-usedroute_type
(required): integer; a GTFS route type to which the zone appliesspeed
(required): positive float; the average speed in kilometers per hour of routes of that route type that travel within the zone; overrides route speeds infrequencies.csv
within the zone.
stops.csv
(optional). A CSV file containing all the required and optional fields ofstops.txt
in the GTFS.
Basically,
routes.txt
is created fromfrequencies.csv
.agency.txt
is created frommeta.csv
.calendar.txt
is created in a dumb way with exactly one all-week service that applies to all trips.shapes.txt
is created fromshapes.geojson
.stops.txt
is created fromstops.csv
if given. Otherwise it is created by making a pair of stops for each shape, one stop at each endpoint of the shape and then deleting stops with duplicate coordinates. Note that this yields only one stop for each shape that is a loop.trips.txt
andstop_times.txt
are created by taking each route, service window, and direction, and running a set of trips starting on the hour and operating at the route's speed and frequency specified for that service window. If the route direction is 2, then two sets of trips in opposing directions will be created, each operating at the route's frequency. Assign stops to each trip as follows. Collect all stops in the built filestops.txt
that are within a fixed distance of the traffic side (e.g. the right hand side for USA agency timezones and the left hand side for New Zealand agency timezones) of the trip shape. If the trip has no nearby stops, then do not make stop times for that trip.- Once validated, write these files to disk by running command
feed.write("gtfsfile.zip")
.
See data/auckland
for example files and play with the Jupyter notebook at notebooks/examples.ipynb
.
On Github pages here.
- This project's development status is Alpha. Alex and Danielle use this project for work and change it breakingly when it suits their needs.
- This project uses semantic versioning.
- Thanks to MRCagney for periodically funding this project.
- Switched from Poetry to UV for project management.
- Bumped to Python 3.10+.
- Fixed some Pandas deprecation warnings.
- Updated dependencies.
- Upgraded Python to 3.9+.
- Tweaked some validators.
- Updated README.
- Updated dependencies and pre-commit hooks.
- Removed most type coercion in validation. Probably more instructive for the user that way.
- Fixed Issue 11.
- Bugfix: Changed
make_stop_points
to correctly respect theoffset
parameter.
- Bugfix: Created proper default speed zones when creating ProtoFeeds without given speed zones.
- Clarified README docs some.
- Bugfix: Propogated
stop_offset
parameter inbuild_feed
down the function chain.
- Speeded up
make_stop_points
whenoffset
is zero.
- Offset built stops to the traffic side of each shape.
- Breaking change: renamed some function parameters.
- Extended
build_stops()
andbuild_feed()
to to build a specified number of equally spaced stops on each built shape or to build stops with a specified spacing on each built shape. More specifically, stops will be built on a shape and not also on its antiparallel clone, if that exists. That way we avoid building duplicate stops.
- Removed the option to set default speeds by route type as overly complex.
- Added speed zones to override route speeds in user-specified geographic zones.
- Refactored to use a dataclass and updated the docstrings, adding some type hints.
- Added the ability to specify default speeds by route type in
meta.csv
. - Simplified validation with Pandera schemas.
- Updated dependencies.
- Updated dependencies and removed version caps.
- Replaced Travis CI with Github Actions.
- Re-added support for Python 3.8.
- Upgraded to Python 3.9, dropped support for Python <3.9, and updated dependencies.
- Switched to Python 3.6+ and Poetry.
- Bugfix: Change
build_feed()
to use GTFS Kit'sdrop_zombies()
method to delete unnecessary stops etc.
- Extended to handle optional input stops
- Wrote ProtoFeed validation
- Modularized code more
- Restructured code and used GTFSTK, Click, Pytest, Pipenv
- Fixed a bug in
create_stop_times()
that crashed when given a zero frequency.
- Added direction field and renamed
routes.csv
tofrequencies.csv
. - Simplified the code some too.
- Eliminated stops and trips for routes that have no linestrings
- Changed from headways to frequencies and replaced
config.json
with CSV files
- Changed
config.json
spec to account for active days
- Finished writing first set of tests and packaged